<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llm/bedrock.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Oracle Cloud Infrastructure Generative AI

Oracle Cloud Infrastructure (OCI) Generative AI is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases and are available through a single API.
Using the OCI Generative AI service you can access ready-to-use pretrained models, or create and host your own fine-tuned custom models based on your own data on dedicated AI clusters. Detailed documentation of the service and API is available __[here](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm)__ and __[here](https://docs.oracle.com/en-us/iaas/api/#/en/generative-ai/20231130/)__.

This notebook explains how to use OCI's Generative AI models with LlamaIndex.

## Setup

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
%pip install llama-index-llms-oci-genai

In [None]:
!pip install llama-index

You will also need to install the OCI SDK.

In [None]:
!pip install -U oci

## Basic Usage

Using LLMs offered by OCI Generative AI with LlamaIndex only requires you to initialize the `OCIGenAI` interface with your OCI service endpoint, model ID, compartment OCID, and authentication method.
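
The arguments listed above can be gathered in one place before the LLM is created. The following is a minimal sketch, not part of LlamaIndex or the OCI SDK: `build_llm_kwargs` and `ENDPOINT_TEMPLATE` are hypothetical names, the endpoint pattern is inferred from the URLs used in this notebook, and the OCID check is only a rough prefix test.

```python
from typing import Optional

# Pattern inferred from the endpoints used in this notebook (assumption).
ENDPOINT_TEMPLATE = "https://inference.generativeai.{region}.oci.oraclecloud.com"


def build_llm_kwargs(
    region: str,
    model: str,
    compartment_id: str,
    auth_type: Optional[str] = None,
    auth_profile: Optional[str] = None,
) -> dict:
    """Collect keyword arguments for OCIGenAI (hypothetical helper)."""
    # Rough sanity check only: the notebook uses tenancy and compartment OCIDs.
    if not compartment_id.startswith(("ocid1.tenancy.", "ocid1.compartment.")):
        raise ValueError(f"Not a tenancy or compartment OCID: {compartment_id!r}")
    kwargs = {
        "model": model,
        "service_endpoint": ENDPOINT_TEMPLATE.format(region=region),
        "compartment_id": compartment_id,
    }
    # Only pass authentication settings that were given explicitly,
    # so the library defaults apply otherwise.
    if auth_type is not None:
        kwargs["auth_type"] = auth_type
    if auth_profile is not None:
        kwargs["auth_profile"] = auth_profile
    return kwargs


kwargs = build_llm_kwargs(
    region="us-chicago-1",
    model="cohere.command",
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    auth_type="SECURITY_TOKEN",
    auth_profile="MY_PROFILE",  # placeholder profile name
)
print(kwargs["service_endpoint"])
# llm = OCIGenAI(**kwargs)  # requires valid OCI credentials
```

Keeping the arguments in a dictionary makes it easy to reuse the same endpoint and credentials across the cells that follow.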

#### Call `complete` with a prompt

In [2]:
from llama_index.llms.oci_genai import OCIGenAI

llm = OCIGenAI(
    model="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
    auth_type="SECURITY_TOKEN",
    auth_profile="BoatOc1",  # replace the value with the right profile name
)

resp = llm.complete("Paul Graham is ")
print(resp)

an American computer scientist, entrepreneur, and investor. He is best known for his work in computer science, his founding of the startup accelerator Y Combinator, and his essays on startup company culture and entrepreneurship.

Graham was born in Cambridge, Massachusetts in 1964. He studied at the Harvard University Computer Laboratory and received his Bachelor's degree in philosophy from Harvard in 1986. He then attended the University of Cambridge and received a Master's degree in philosophy in 1988.

Graham's career in computer science began in the early 1990s when he started working on Lisp, a programming language. He was the author of the widely acclaimed book "On Lisp" and has been known for his advocacy of the use of Lisp in modern software development.

In 1995, Graham started his first company, ViaWeb, which was one of the first e-commerce platforms and was acquired by Yahoo! in 1998.

In 2005, Graham founded Y Combinator, a startup accelerator that has provided early fundin

#### Call `chat` with a list of messages

In [3]:
from llama_index.llms.oci_genai import OCIGenAI
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a pirate with a colorful personality"),
    ChatMessage(role="user", content="Tell me a story"),
]

llm = OCIGenAI(
        model="meta.llama-2-70b-chat",
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1"  # replace the value with the right profile name
)

resp = llm.chat(messages)
print(resp)

ServiceError: {'target_service': 'generative_ai_inference', 'status': 400, 'code': '400', 'opc-request-id': '139823F14D2545A098281F92212887EF/549593E68778548C52CD1C2A5448FA0F/023DE18DC1537F91FBC30F9909248DB2', 'message': '{"object":"error","message":"Conversation roles must alternate user/assistant/user/assistant/...","type":"BadRequestError","param":null,"code":400}', 'operation_name': 'chat', 'timestamp': '2024-05-06T19:27:12.088604+00:00', 'client_version': 'Oracle-PythonSDK/2.126.2+preview.1.8984', 'request_endpoint': 'POST https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/chat', 'logging_tips': 'To get more info on the failing request, refer to https://docs.oracle.com/en-us/iaas/tools/python/latest/logging.html for ways to log the request/response details.', 'troubleshooting_tips': 'See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_400__400_400 for more information about resolving this error. If you are unable to resolve this generative_ai_inference issue, please contact Oracle support and provide them this full error message.'}
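
The error above reports that conversation roles must alternate user/assistant, and the message list starts with a `system` message. One possible workaround is to fold the system text into the first user turn before sending. The sketch below uses plain `(role, content)` tuples instead of `ChatMessage` so that it runs without credentials; `fold_system_into_user` and `roles_alternate` are hypothetical helpers, and whether this resolves the error for a given model is an assumption.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def roles_alternate(messages: List[Message]) -> bool:
    """Return True if roles go user, assistant, user, assistant, ..."""
    expected = "user"
    for role, _ in messages:
        if role != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True


def fold_system_into_user(messages: List[Message]) -> List[Message]:
    """Merge leading system messages into the first user message."""
    system_parts: List[str] = []
    rest: List[Message] = []
    for role, content in messages:
        if role == "system" and not rest:
            system_parts.append(content)
        else:
            rest.append((role, content))
    if system_parts and rest and rest[0][0] == "user":
        # Prepend the system text to the first user turn.
        rest[0] = ("user", "\n\n".join(system_parts + [rest[0][1]]))
    elif system_parts:
        # No leading user turn to attach to: send the system text as one.
        rest.insert(0, ("user", "\n\n".join(system_parts)))
    return rest


original = [
    ("system", "You are a pirate with a colorful personality"),
    ("user", "Tell me a story"),
]
fixed = fold_system_into_user(original)
print(roles_alternate(original), roles_alternate(fixed))
```

The folded list can then be converted back to `ChatMessage` objects before calling `llm.chat`.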

## Streaming

Using the `stream_complete` method

In [9]:
from llama_index.llms.oci_genai import OCIGenAI

llm = OCIGenAI(
        model="meta.llama-2-70b-chat", 
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1"  # replace the value with the right profile name
)

resp = llm.stream_complete("Paul Graham is ", is_stream=False, stop=["people"])
for r in resp:
    print(r.delta, end="")

100% correct. The problem is not the number of hours worked, it's the quality of those hours.

I've worked 80 hour weeks before, and I've worked 40 hour weeks. Guess which ones were more productive? The 40 hour weeks, hands down.

The reason is that when you work too many hours, you start to get burned out. You become less productive, less focused, and less creative. You start to make mistakes, and you start to resent your work.

On the other hand, when you work a reasonable number of hours, you have time to rest, recharge, and refocus. You have time to think, to reflect, and to learn. You have time to enjoy your work, and to enjoy your life outside of work.

So, don't worry about the number of hours you work. Worry about the quality of those hours. Make sure you're working smart, not hard. Make sure you're taking breaks, getting enough sleep, and taking care of yourself.

And above all, remember that work is just a part of your life. It's not the only thing that matters. Take time to 
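
The loop in the cell above prints each `delta` as it arrives. If the full text is also needed afterwards, the deltas can be collected while printing. The sketch below is a minimal illustration that runs offline: `fake_stream` is a hypothetical stand-in for the generator returned by `stream_complete`, and only the `delta` attribute used in this notebook is relied on.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator, List


def fake_stream(chunks: Iterable[str]) -> Iterator[SimpleNamespace]:
    """Stand-in for a streaming response: yields objects with a .delta field."""
    for chunk in chunks:
        yield SimpleNamespace(delta=chunk)


def collect_stream(stream: Iterable) -> str:
    """Print deltas as they arrive and return the concatenated text."""
    parts: List[str] = []
    for r in stream:
        print(r.delta, end="", flush=True)
        parts.append(r.delta)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect_stream(fake_stream(["Paul ", "Graham ", "is ", "an essayist."]))
```

With a real model, `collect_stream(llm.stream_complete("Paul Graham is "))` would be used in the same way.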

Using the `stream_chat` method

In [2]:
from llama_index.llms.oci_genai import OCIGenAI
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are a pirate with a colorful personality"),
    ChatMessage(role="user", content="Tell me a story"),
]

llm = OCIGenAI(
        model="cohere.command-r",
        service_endpoint="https://ppe.inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1",  # replace the value with the right profile name
)

resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")

Ahoy, matey! Ye listen to this tale, and ye'll hear the story of a fearsome pirate named Black Bart. 

Black Bart was a fearsome pirate who sailed the seven seas in search of treasure and glory. He and his crew had heard tales of a mysterious island hidden away in the vast ocean, said to be bursting to the seams with riches beyond compare. The island was guarded by a terrible monster, and many a pirate had set sail to find it, only to disappear without a trace. 

But Black Bart was not a pirate to be deterred by mere rumors. He and his crew set sail on their ship, the Wicked Wench, to find this island and claim the treasure for themselves. They sailed for many a moonlit night, braving the dangers of the high seas, until one fateful day when they spotted a mysterious figurehead looming on the horizon. It was the island, and the monster that guarded it - a giant sea serpent!

Black Bart and his crew prepared their weapons, ready for battle. Bart himself brandished his fearsome cutlass, a

In [5]:
from llama_index.llms.oci_genai import OCIGenAI
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="chatbot", content="You are a pirate with a colorful personality"),
    ChatMessage(role="user", content="Tell me a story"),
]

llm = OCIGenAI(
        model="ocid1.generativeaiendpoint.oc1.us-chicago-1.amaaaaaabgjpxjqaxfey5jyxfjlaydari5lhs4sinyxspvjcrdk5xi67s6ha",
        service_endpoint="https://ppe.inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.compartment.oc1..aaaaaaaabma2uwi3rcrlx5qxsihcr2k4ehf7jxer6p6c6ngga2zlhkgir3ka",
        provider="cohere",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1",  # replace the value with the right profile name
        context_size=128000
)

resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")

Once upon a time, there was a fearsome pirate captain named Crimson Jack. He sailed the Seven Seas in search of treasure and glory, striking fear into the hearts of all who crossed his path. With a loyal crew by his side, Captain Jack embarked on countless adventures, always seeking the thrill of the chase and the shine of gold.

One day, Captain Jack received a mysterious map from an old sea dog in a tavern. The map purported to show the location of a legendary treasure, a fortune that could make him the richest pirate in history. With a gleam in his eye, Captain Jack set sail immediately, eager to claim the bounty as his own.

As he followed the map's clues, Captain Jack encountered treacherous storms, cunning enemies, and treacherous traps. He battled rival pirates, outwitted treacherous mermaids, and braved the depths of the ocean to uncover the secrets of the treasure's location. Along the way, he made new allies, including a clever navigator named Lily and a fearsome warrior name
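
The cell above passes a dedicated endpoint OCID as `model` together with `provider="cohere"`, whereas the earlier cells pass names such as `cohere.command-r` with no provider. A plausible reading is that the prefix before the first dot names the provider, and that an OCID carries no such hint. The sketch below illustrates that idea only; `infer_provider` is a hypothetical helper, and how the library actually resolves the provider is an assumption.

```python
from typing import Optional


def infer_provider(model: str, provider: Optional[str] = None) -> str:
    """Guess the model provider from a model ID (illustrative only)."""
    if provider is not None:
        # An explicit provider always wins.
        return provider
    if model.startswith("ocid1."):
        # An endpoint OCID does not contain the provider name.
        raise ValueError("Pass provider explicitly when model is an endpoint OCID")
    # Names used in this notebook look like "<provider>.<model-name>".
    return model.split(".", 1)[0]


print(infer_provider("cohere.command-r"))
print(infer_provider("meta.llama-2-70b-chat"))
print(infer_provider("ocid1.generativeaiendpoint.oc1.us-chicago-1.example", provider="cohere"))
```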

In [3]:
from llama_index.llms.oci_genai import OCIGenAI
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="chatbot", content="You are a pirate with a colorful personality"),
    ChatMessage(role="user", content="Tell me a story"),
]

llm = OCIGenAI(
        model="cohere.command-r-plus",
        service_endpoint="https://ppe.inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaaumuuscymm6yb3wsbaicfx3mjhesghplvrvamvbypyehh5pgaasna",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1",  # replace the value with the right profile name
)

resp = llm.stream_chat(messages, is_stream=False, stop_sequences=["pirate"])
for r in resp:
    print(r.delta, end="")

ServiceError: {'target_service': 'generative_ai_inference', 'status': 400, 'code': '400', 'opc-request-id': '44947AB9CBA146988CCF5063288BAE91/3AB3F356D373DEB38829D813BA2FBE50/716635362178B7A9B772AE2CB609E4EA', 'message': 'Entity with key cohere.command-r-plus not found', 'operation_name': 'chat', 'timestamp': '2024-05-06T19:18:43.216613+00:00', 'client_version': 'Oracle-PythonSDK/2.126.2+preview.1.8984', 'request_endpoint': 'POST https://ppe.inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/chat', 'logging_tips': 'To get more info on the failing request, refer to https://docs.oracle.com/en-us/iaas/tools/python/latest/logging.html for ways to log the request/response details.', 'troubleshooting_tips': 'See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_400__400_400 for more information about resolving this error. If you are unable to resolve this generative_ai_inference issue, please contact Oracle support and provide them this full error message.'}

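## Embeddings

OCI Generative AI embedding models are available through `OCIGenAIEmbeddings`, which takes the same endpoint, compartment, and authentication arguments as `OCIGenAI`.

In [4]: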
from llama_index.embeddings.oci_genai import OCIGenAIEmbeddings

embeddings = OCIGenAIEmbeddings(
                model="cohere.embed-english-light-v2.0", 
                service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
                compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q", 
                auth_type="SECURITY_TOKEN",
                auth_profile="BoatOc1"
            )

response = embeddings.get_text_embedding_batch(["Hello", "World"])
print(response)

response = embeddings.get_query_embedding("This is a query in English.")
print(response)

[[0.1508789, -0.93115234, 1.6337891, -1.9023438, -1.3085938, 0.6777344, 1.234375, 0.9926758, -0.047058105, -1.0683594, -1.046875, 1.6503906, 1.0292969, 0.82373047, -0.56396484, 0.50878906, 0.32788086, 0.5263672, 0.39819336, -1.1289062, 0.09631348, 1.234375, -0.38378906, -0.57470703, 1.0185547, 3.5957031, 7.0273438, -1.3271484, 1.4804688, 0.23937988, -1.3779297, 0.3696289, 2.5722656, -0.50390625, -0.5678711, 1.296875, 1.1767578, -0.26635742, 0.6738281, 1.6728516, 0.85595703, -0.4621582, 0.37109375, 0.8847656, -1.2285156, 0.013969421, -1.1816406, -2.0664062, -0.029022217, 1.5166016, -0.26953125, 0.22094727, -0.48535156, 0.34960938, -0.16833496, 0.93359375, 1.2314453, -0.8022461, 0.62060547, 1.5029297, -0.15258789, -0.048217773, -0.7607422, -0.5786133, -0.5698242, -1.1728516, -0.51171875, -1.8417969, -1.4238281, -1.0898438, -1.1708984, -0.6333008, 0.24951172, 0.9091797, -0.6616211, -0.6542969, 0.20202637, 0.8520508, 0.34301758, -1.5048828, 1.6074219, 1.2011719, 1.2900391, -0.08959961, -1.

In [18]:
from llama_index.embeddings.oci_genai import OCIGenAIEmbeddings

embeddings = OCIGenAIEmbeddings(
                model="ocid1.generativeaiendpoint.oc1.us-chicago-1.amaaaaaabgjpxjqarn6ervzltqwrk7foq5h7oktqwclq5v7omhmlvmdfhe4q", 
                service_endpoint="https://ppe.inference.generativeai.us-chicago-1.oci.oraclecloud.com",
                compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q", 
                auth_type="SECURITY_TOKEN",
                auth_profile="BoatOc1"
            )

response = embeddings.get_text_embedding_batch(["Hello", "World"])
print(response)

response = embeddings.get_query_embedding("This is a query in English.")
print(response)

[[0.01626587, -0.0013523102, -0.0033874512, -0.003446579, 0.01058197, 0.008651733, 0.0044937134, 0.012504578, 0.010406494, 0.0062408447, 0.027404785, -0.008636475, 0.0095825195, 0.0065841675, -0.01260376, -0.01071167, 0.033325195, 0.0027503967, -0.0064048767, -0.0101623535, 0.0042304993, 0.002670288, -0.01777649, 0.013549805, 0.011260986, 0.018035889, -0.008613586, -0.034698486, 0.02583313, -0.00066423416, -0.0006942749, 0.003129959, 0.027694702, -0.002527237, -0.010391235, -0.014564514, 9.1671944e-05, 0.008735657, -0.0069236755, 0.024475098, 0.001701355, -0.008171082, 0.010597229, -0.00024986267, 0.01374054, 0.009010315, 0.036132812, 0.020950317, -0.017791748, -0.015220642, 0.0048828125, -0.0057678223, 0.0041656494, 0.20019531, -0.007446289, -0.014556885, 0.0016622543, -0.025344849, 0.009407043, -0.02418518, -0.017211914, 0.00617218, 0.0031261444, 0.005584717, 0.025970459, -0.0071411133, -0.0017232895, -0.015464783, 0.004108429, -0.0028686523, -0.009719849, -0.009521484, -0.014541626,
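
The vectors printed above are typically compared with cosine similarity. The following sketch shows the computation in plain Python so it runs without credentials; the three short vectors are toy values made up for illustration, not real model output.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (assumed values).
doc_a = [0.1, -0.9, 1.6]
doc_b = [0.2, -0.8, 1.5]
query = [-1.0, 0.5, 0.0]

print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, query))
```

With real embeddings, the result of `get_query_embedding` would be compared against each vector returned by `get_text_embedding_batch`.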

## Configure Model

In [None]:
from llama_index.llms.oci_genai import OCIGenAI

llm = OCIGenAI(
        model="cohere.command", 
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="MY_OCID",
)

resp = llm.complete("Paul Graham is ")
print(resp)

## Authentication

The authentication methods supported for LlamaIndex are equivalent to those used with other OCI services and follow the __[standard SDK authentication](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/sdk_authentication_methods.htm)__ methods, specifically API key, session token, instance principal, and resource principal.

API key is the default authentication method. The following example demonstrates how to use a different authentication method (session token):

In [3]:
from llama_index.llms.oci_genai import OCIGenAI

llm = OCIGenAI(
    model="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
    auth_type="SECURITY_TOKEN",
    auth_profile="BoatOc1",  # replace with your profile name
)

resp = llm.complete("Paul Graham is ")
print(resp)

When `auth_type="RESOURCE_PRINCIPAL"` is used in an environment that does not define `OCI_RESOURCE_PRINCIPAL_VERSION`, creating the client fails with:

ValueError: ('Could not authenticate with OCI client. Please check if ~/.oci/config exists. \n            If INSTANCE_PRINCIPAL or RESOURCE_PRINCIPAL is used, please check the specified \n            auth_profile and auth_type are valid.', OSError('OCI_RESOURCE_PRINCIPAL_VERSION is not defined'))
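
The `ValueError` shown in this section wraps an `OSError` saying that `OCI_RESOURCE_PRINCIPAL_VERSION` is not defined. One way to avoid it is to check for that variable before choosing an authentication type. The sketch below is only an illustration: `choose_auth_kwargs` is a hypothetical helper, and falling back to a session-token profile or to the library default is an assumed policy rather than anything the library prescribes.

```python
import os
from typing import Mapping, Optional


def choose_auth_kwargs(env: Mapping[str, str], profile: Optional[str] = None) -> dict:
    """Pick auth_type/auth_profile arguments for OCIGenAI (illustrative)."""
    if "OCI_RESOURCE_PRINCIPAL_VERSION" in env:
        # The variable named in the error is present, so resource
        # principal authentication has a chance of working here.
        return {"auth_type": "RESOURCE_PRINCIPAL"}
    if profile is not None:
        # Assumed fallback: a config profile holding a session token.
        return {"auth_type": "SECURITY_TOKEN", "auth_profile": profile}
    # Pass nothing and let the library use its default (API key).
    return {}


auth_kwargs = choose_auth_kwargs(os.environ, profile="MY_PROFILE")  # placeholder profile
print(auth_kwargs)
# llm = OCIGenAI(model=..., service_endpoint=..., compartment_id=..., **auth_kwargs)
```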

In [5]:
from llama_index.llms.oci_genai import OCIGenAI

llm = OCIGenAI(
        model="meta.llama-2-70b-chat", 
        service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaasz6cicsgfbqh6tj3xahi4ozoescfz36bjm3kucc7lotk2oqep47q",
        auth_type="SECURITY_TOKEN", 
        auth_profile="BoatOc1"  # replace the value with the right profile name
)

resp = llm.stream_complete("Paul Graham is ", stop=["person"])
for r in resp:
    print(r.delta, end="")

100% correct. The only way to get good at programming is to program. A lot.

I've been programming for over 20 years, and I can tell you that the only way to get good at it is to keep writing code. The more code you write, the better you'll get. It's just like any other skill - the more you practice, the better you'll become.

I've seen a lot of people try to learn programming by reading books or taking courses, but they never actually write any code. They might understand the theory, but they don't have any practical experience. And that's what matters - being able to apply what you've learned to real-world problems.

So, if you want to get good at programming, don't worry too much about the theory. Just start writing code. Start with simple programs and gradually work your way up to more complex projects. You'll learn by doing, and you'll get better with each line of code you write.

And don't be afraid to make mistakes. Making mistakes is a natural part of the learning process, and 
