# Oracle OCI Generative AI Tracing with Openlayer

This notebook demonstrates how to use Openlayer tracing with the Oracle Cloud Infrastructure (OCI) Generative AI service.

## Setup

Before running this notebook, ensure you have:
1. An OCI account with access to the Generative AI service
2. The OCI CLI configured, or an OCI config file set up
3. An Openlayer account with an API key and an inference pipeline ID
4. The required packages installed:
   - `pip install oci`
   - `pip install openlayer`

## Configuration

### Openlayer Setup
Set these environment variables before running:
```bash
export OPENLAYER_API_KEY="your-api-key"
export OPENLAYER_INFERENCE_PIPELINE_ID="your-pipeline-id"
```
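
A small guard can catch missing credentials before any request is sent. The sketch below is only an illustration: `missing_env_vars` is a hypothetical helper, not part of the Openlayer SDK, and it checks nothing beyond the two variable names listed above.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED_VARS = ("OPENLAYER_API_KEY", "OPENLAYER_INFERENCE_PIPELINE_ID")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; not part of the Openlayer SDK.
    """
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real process environment.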

### OCI Setup
Make sure your OCI configuration is properly set up. You can either:
- Use the default OCI config file (`~/.oci/config`)
- Set up environment variables
- Use instance principal authentication (when running on OCI compute)


In [None]:
# Install required packages (uncomment if needed)
# !pip install oci openlayer

# Set Openlayer credentials for this session
import os

# setdefault keeps values already exported in the shell and only fills in
# the placeholders below when a variable is unset.
os.environ.setdefault("OPENLAYER_API_KEY", "your-openlayer-api-key-here")
os.environ.setdefault("OPENLAYER_INFERENCE_PIPELINE_ID", "your-inference-pipeline-id-here")

# NOTE: Replace the placeholders with your actual Openlayer API key and inference pipeline ID!

In [None]:
import oci
from oci.generative_ai_inference import GenerativeAiInferenceClient
from oci.generative_ai_inference.models import Message, ChatDetails, GenericChatRequest

# Import the Openlayer tracer
from openlayer.lib.integrations import trace_oci_genai

## Initialize OCI Client

Set up the OCI Generative AI client with your configuration.
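
The endpoint used in this notebook embeds the region name in its host. A minimal sketch of building that URL from a region identifier follows. It assumes other regions use the same host pattern as the `us-chicago-1` endpoint in this notebook and that the service is offered in the chosen region. `inference_endpoint` is a hypothetical helper, not an OCI SDK function.

```python
def inference_endpoint(region: str) -> str:
    """Build an inference endpoint URL from an OCI region identifier.

    Assumption: the host pattern matches the us-chicago-1 endpoint used in
    this notebook. Hypothetical helper, not part of the OCI SDK.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must be a non-empty string")
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com"


print(inference_endpoint("us-chicago-1"))
```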


In [None]:
# Configuration - Update these values for your environment
COMPARTMENT_ID = "your-compartment-ocid-here"  # Replace with your compartment OCID
ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"  # Replace with your region's endpoint

# Load OCI configuration
config = oci.config.from_file()  # Uses default config file location
# Alternatively, you can specify a custom config file:
# config = oci.config.from_file("~/.oci/config", "DEFAULT")

# Create the OCI Generative AI client
client = GenerativeAiInferenceClient(config=config, service_endpoint=ENDPOINT)


## Apply Openlayer Tracing

Wrap the OCI client with Openlayer tracing to automatically capture all interactions.

The `trace_oci_genai()` function accepts an optional `estimate_tokens` parameter:
- `estimate_tokens=True` (default): estimates token counts when the OCI response does not provide them
- `estimate_tokens=False`: returns `None` for token fields that are missing from the response

OCI responses can be either `CohereChatResponse` or `GenericChatResponse`; both contain usage information when it is available.
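
The fallback behaviour described above can be pictured with a small sketch. This notebook does not say how Openlayer estimates tokens, so the four-characters-per-token heuristic here is purely an assumption, and both function names are hypothetical.

```python
from typing import Optional


def estimate_token_count(text: str, chars_per_token: int = 4) -> int:
    """Rough character-based estimate.

    Assumption for illustration only; not Openlayer's actual method.
    """
    if not text:
        return 0
    return max(1, len(text) // chars_per_token)


def resolve_token_count(
    reported: Optional[int], text: str, estimate_tokens: bool = True
) -> Optional[int]:
    """Prefer the count reported in the response; otherwise estimate or return None."""
    if reported is not None:
        return reported
    return estimate_token_count(text) if estimate_tokens else None


print(resolve_token_count(None, "Hello from OCI", estimate_tokens=True))
print(resolve_token_count(None, "Hello from OCI", estimate_tokens=False))
```

The point of the sketch is the decision order: a reported count always wins, and the flag only matters when the response carries no usage data.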


In [None]:
# Apply Openlayer tracing to the OCI client
# With token estimation enabled (default)
traced_client = trace_oci_genai(client, estimate_tokens=True)

# Alternative: Disable token estimation to get None values when tokens are not available
# traced_client = trace_oci_genai(client, estimate_tokens=False)

## Example 1: Non-Streaming Chat Completion

Simple chat completion without streaming.


In [None]:
# Create a chat request
chat_request = GenericChatRequest(
    messages=[Message(role="user", content="Hello! Can you explain what Oracle Cloud Infrastructure is?")],
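    # NOTE: Cohere models may need CohereChatRequest rather than GenericChatRequest;
    # if this call is rejected, try the Llama model ID used in the later examples.
    model_id="cohere.command-r-plus",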
    max_tokens=200,
    temperature=0.7,
    is_stream=False,  # Non-streaming
)

chat_details = ChatDetails(compartment_id=COMPARTMENT_ID, chat_request=chat_request)

# Make the request - the tracer will automatically capture it
response = traced_client.chat(chat_details)
response

## Example 2: Streaming Chat Completion

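Chat completion with streaming enabled, so the response arrives in chunks as tokens are generated. The loop in this example accumulates the chunk contents into a single string.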
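
The accumulation logic can be tried offline with stand-in chunks before calling the service. The sketch below mirrors the attribute path used in this example's loop (`chunk.data.choices[0].delta.content`). Whether live stream chunks have exactly that shape is not verified here, and `collect_stream_text` and `fake_chunk` are hypothetical helpers.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Concatenate delta.content from chunks shaped like chunk.data.choices[0].delta."""
    parts = []
    for chunk in chunks:
        data = getattr(chunk, "data", None)
        choices = getattr(data, "choices", None)
        if not choices:
            continue  # Skip chunks without any choices
        delta = getattr(choices[0], "delta", None)
        content = getattr(delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk for local testing (assumed shape, not an OCI type)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(data=SimpleNamespace(choices=[SimpleNamespace(delta=delta)]))


chunks = [fake_chunk("Hello"), fake_chunk(None), SimpleNamespace(data=None), fake_chunk(", world")]
print(collect_stream_text(chunks))
```

Collecting the pieces in a list and joining once avoids repeated string concatenation on long responses.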


In [None]:
# Create a streaming chat request
streaming_chat_request = GenericChatRequest(
    messages=[
        Message(role="system", content="You are a helpful AI assistant that provides concise, informative answers."),
        Message(role="user", content="Tell me a short story about cloud computing and AI working together."),
    ],
    model_id="meta.llama-3.1-70b-instruct",
    max_tokens=300,
    temperature=0.8,
    is_stream=True,  # Enable streaming
)

streaming_chat_details = ChatDetails(compartment_id=COMPARTMENT_ID, chat_request=streaming_chat_request)

# Make the streaming request
streaming_response = traced_client.chat(streaming_chat_details)

# Process the streaming response
full_content = ""
for chunk in streaming_response:
    if hasattr(chunk, "data") and hasattr(chunk.data, "choices"):
        if chunk.data.choices and hasattr(chunk.data.choices[0], "delta"):
            delta = chunk.data.choices[0].delta
            if hasattr(delta, "content") and delta.content:
                full_content += delta.content

full_content

## Example 3: Custom Parameters and Error Handling

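This example sets additional sampling parameters (`top_p`, `frequency_penalty`, `presence_penalty`, and `stop`). The traced client is called the same way regardless of which parameters are set.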
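
The code cell in this example does not itself catch failures. One possible pattern is sketched below, using a stand-in client so it runs without OCI credentials. `safe_chat` and `FailingClient` are hypothetical names; with a real client, `error_types` could be narrowed to the OCI SDK's `oci.exceptions.ServiceError`.

```python
def safe_chat(client, details, error_types=(Exception,)):
    """Call client.chat(details) and return (response, None) or (None, error).

    Hypothetical wrapper for illustration; not part of the OCI or Openlayer SDKs.
    """
    try:
        return client.chat(details), None
    except error_types as exc:
        return None, exc


class FailingClient:
    """Stand-in client that always fails, for local testing only."""

    def chat(self, details):
        raise RuntimeError("simulated service failure")


result, error = safe_chat(FailingClient(), details=None)
if error is not None:
    print(f"Request failed: {error}")
```

Returning the error instead of re-raising keeps a notebook cell from aborting, at the cost of requiring the caller to check the second value.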


In [None]:
# Advanced parameters example
advanced_request = GenericChatRequest(
    messages=[Message(role="user", content="Write a creative haiku about artificial intelligence.")],
    model_id="meta.llama-3.1-70b-instruct",
    max_tokens=100,
    temperature=0.9,  # High creativity
    top_p=0.8,
    frequency_penalty=0.2,  # Reduce repetition
    presence_penalty=0.1,
    stop=["\n\n"],  # Stop at double newline
    is_stream=False,
)

advanced_details = ChatDetails(compartment_id=COMPARTMENT_ID, chat_request=advanced_request)

response = traced_client.chat(advanced_details)
response