# OpenAI & OCI LLMs with LangChain: Educational Demo

Welcome! This notebook demonstrates how to call LLMs from several providers (OpenAI, Meta, Cohere, xAI, and others) through the OCI OpenAI-compatible API via LangChain.

If you have questions, use the following Slack channels for support and collaboration:
- `#generative-ai-users`
- `#igiu-innovation-lab` (for peer interaction)
- `#igiu-ai-learning` (if you encounter errors)

*This notebook is designed for education: code cells are annotated with instructional blocks, and there are exercises throughout for active learning.*

## Supported LLM Models

Here are model IDs you can use with the OCI OpenAI-compatible API (see [Oracle Docs](https://docs.oracle.com/en-us/iaas/Content/generative-ai/chat-models.htm) for updates):

| Family | Example Models |
|--------|----------------|
| **OpenAI**   | openai.gpt-4.1, openai.gpt-4o, openai.gpt-5 |
| **Meta (Llama)** | meta.llama-4-maverick-17b-128e-instruct, meta.llama-4-scout-17b-16e-instruct |
| **Cohere** | cohere.command-a-03-2025, cohere.command-plus-latest |
| **xAI (Grok)** | xai.grok-3, xai.grok-4 |

> **TIP:** Experiment with different models to compare results.

## Environment & Configuration

Prepare your `sandbox.yaml` (or `sandbox.json`) config with the appropriate OCI credentials and compartment settings. Place it in your working directory or adjust the code below to point to its location.

If you have trouble loading credentials or get permission errors, check the file path, format, and profile settings. For help, reach out on the Slack channels above.

In [5]:
# Load OCI + Model Configuration
import os
import sys
import time
from envyaml import EnvYAML
from dotenv import load_dotenv

SANDBOX_CONFIG_FILE = "sandbox.yaml"

# Load variables from a local .env file into the environment before reading the config
load_dotenv()

def load_config(config_path):
    """Load configuration from a YAML file."""
    try:
        return EnvYAML(config_path)
    except FileNotFoundError:
        print(f"Error: Configuration file '{config_path}' not found.")
        return None

scfg = load_config(SANDBOX_CONFIG_FILE)
if scfg is not None:
    print("Config loaded successfully! Profile:", scfg['oci']['profile'])

Config loaded successfully! Profile: INNOLAB-LEARNING
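If the config loads but later cells fail with a `KeyError`, a quick check of the keys this notebook reads (`oci.profile` and `oci.compartment`) can narrow down the problem. The sketch below works on a plain dict, such as one parsed from `sandbox.json`; the helper name `missing_oci_keys`, the constant `REQUIRED_OCI_KEYS`, and the placeholder values are illustrative assumptions, not part of the notebook's helper library.

```python
import json

# Keys that later cells in this notebook read from the loaded config.
REQUIRED_OCI_KEYS = ("profile", "compartment")

def missing_oci_keys(cfg):
    """Return the required keys that are absent or empty under cfg['oci']."""
    oci_section = cfg.get("oci") or {}
    return [key for key in REQUIRED_OCI_KEYS if not oci_section.get(key)]

# Hypothetical config contents; replace the placeholder values with your own.
example_cfg = json.loads('{"oci": {"profile": "MY-PROFILE", "compartment": ""}}')
print(missing_oci_keys(example_cfg))  # the empty compartment is reported as missing
```

An empty list means both keys are present and non-empty.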


### Set up your LLM client

- Specify your target model (`LLM_MODEL`).
- Choose the appropriate service endpoint.
- Use the config (`scfg`) loaded above for credentials.

You may need to re-instantiate the client when switching models.

In [6]:
# Import the OCI helper (assumes the directory containing the `langChain` package is already on sys.path)
from langChain.oci_openai_helper import OCIOpenAIHelper

LLM_MODEL = "openai.gpt-4.1"
# Other ideas: "meta.llama-4-maverick-17b-128e-instruct", "xai.grok-4"
llm_service_endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"

llm_client = OCIOpenAIHelper.get_client(
    model_name=LLM_MODEL,
    config=scfg
)

print(f"Ready to send prompts to model: {LLM_MODEL}")

Ready to send prompts to model: openai.gpt-4.1


## Basic Chat Completion

Send a single prompt and get a response from your selected LLM. Try different models for different styles and strengths.

In [7]:
MESSAGE = """
    why is the sky blue? explain in 2 sentences like I am 5
"""

response = llm_client.invoke(MESSAGE)
print(response)

content='The sky looks blue because sunlight gets scattered by the air, and blue light scatters more than other colors. That’s why, when you look up, you see a blue sky!' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 26, 'total_tokens': 63, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'openai.gpt-4.1', 'system_fingerprint': 'fp_d38c7f4fa7', 'id': 'chatcmpl-CZAoGCb1faSp1P1X38YKopSSwusR7', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='run--a79e3444-3150-4d83-9f30-46b8d7accc7d-0' usage_metadata={'input_tokens': 26, 'output_tokens': 37, 'total_tokens': 63, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}
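Printing the whole response object is noisy. The output above shows that the reply text lives in `content` and the token counts in `usage_metadata`, so a small helper can pull out just those fields. This is a minimal sketch: `summarize_response` is a hypothetical name, and the `SimpleNamespace` object is a stand-in that mirrors the printed fields so the example runs without a live client.

```python
from types import SimpleNamespace

def summarize_response(response):
    """Return the reply text and token counts from a LangChain-style message."""
    usage = getattr(response, "usage_metadata", None) or {}
    return {
        "text": response.content,
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }

# Stand-in object mirroring the fields printed above (not a real model reply).
fake_response = SimpleNamespace(
    content="The sky looks blue because ...",
    usage_metadata={"input_tokens": 26, "output_tokens": 37, "total_tokens": 63},
)
print(summarize_response(fake_response))
```

With a real client, pass the object returned by `llm_client.invoke(MESSAGE)` instead of `fake_response`.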


### Model Selection Demo (Loop)

Compare how different LLMs respond to the same input. This helps you understand each model's behavior and strengths.

*Try adding/removing models from the list to test others!*

In [8]:
selected_llms = [
    "openai.gpt-4.1",
    "openai.gpt-5",
    "meta.llama-4-maverick-17b-128e-instruct-fp8",
    "meta.llama-4-scout-17b-16e-instruct",
    "xai.grok-4",
    "xai.grok-4-fast-non-reasoning"
]

for llm_id in selected_llms:
    print(f"\n\n***** Chat Result for {llm_id} *****")
    llm_client.model_name = llm_id  # Point the existing client at the next model
    start_time = time.time()
    response = llm_client.invoke(MESSAGE)
    end_time = time.time()
    print(response)
    print(f"\n Time taken for {llm_id}: {end_time - start_time:.2f} seconds\n\n")




***** Chat Result for openai.gpt-4.1 *****
content='The sky looks blue because sunlight gets spread out in the air, and blue light scatters more than other colors. So, when you look up, you see more blue than any other color!' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 39, 'prompt_tokens': 26, 'total_tokens': 65, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'openai.gpt-4.1', 'system_fingerprint': 'fp_d38c7f4fa7', 'id': 'chatcmpl-CZAoK0alEjFrhddYVwhkbFpc0SUIv', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='run--a9e85b3d-9698-4bf5-95df-02d8d8036296-0' usage_metadata={'input_tokens': 26, 'output_tokens': 39, 'total_tokens': 65, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

 Tim
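To compare latencies side by side rather than reading them out of the per-model output, the timings can be collected and sorted. The sketch below is one way to do that under stated assumptions: `time_call`, `compare_latency`, and the two fake model functions are hypothetical, and `time.sleep` stands in for network latency so the code runs without a client.

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def compare_latency(callables, prompt):
    """Time each named callable on the same prompt; return rows sorted fastest first."""
    rows = []
    for name, fn in callables.items():
        _, elapsed = time_call(fn, prompt)
        rows.append((name, elapsed))
    return sorted(rows, key=lambda row: row[1])

# Hypothetical stand-ins for model calls; sleep simulates network latency.
def fake_fast(prompt):
    time.sleep(0.01)
    return "fast answer"

def fake_slow(prompt):
    time.sleep(0.05)
    return "slow answer"

for name, seconds in compare_latency({"slow-model": fake_slow, "fast-model": fake_fast}, "hi"):
    print(f"{name:<12} {seconds:.3f}s")
```

A single call per model is a rough measurement; repeating each call a few times and averaging gives a fairer comparison.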

## Batching & Output Control

Send several questions in one call with `batch`, then cap the response length with a token limit.

In [9]:
questions = ["why is sky blue?", "why is it dark at night?"]

try:
    responses = llm_client.batch(questions)
    print(responses)
except AttributeError:
    batch_responses = [llm_client.invoke(q) for q in questions]
    for q, r in zip(questions, batch_responses):
        print(f"Q: {q}\nA: {r.content}\n")

# Limit token output
try:
    llm_client.max_tokens = 10
except AttributeError:
    pass
response = llm_client.invoke(MESSAGE)
print(f"\n [with max length as 10]{response}")

[AIMessage(content="The sky appears blue during the day due to a phenomenon called Rayleigh scattering. Here's a concise explanation, backed by physics:\n\n### The Science Behind It\n- **Sunlight and Atmosphere**: Sunlight is white light, made up of all colors of the visible spectrum (red, orange, yellow, green, blue, indigo, violet). As it enters Earth's atmosphere, it interacts with air molecules (mostly nitrogen and oxygen), which are much smaller than the wavelengths of visible light.\n  \n- **Scattering Effect**: These molecules scatter the light in all directions. Shorter wavelengths (like blue and violet) scatter much more efficiently than longer ones (like red and orange) because of Rayleigh scattering—the intensity of scattered light is inversely proportional to the fourth power of the wavelength (I ∝ 1/λ⁴). Blue light (wavelength ~450 nm) scatters about 10 times more than red light (~650 nm).\n\n- **Why Blue, Not Violet?**: Violet light scatters even more than blue, but our e
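When a token limit is set, it helps to know whether a reply was cut off. The responses shown earlier carry a `finish_reason` entry in `response_metadata`; OpenAI-compatible APIs report `"length"` there when the token limit ended the reply and `"stop"` when the model finished on its own. Whether every OCI-hosted model reports it the same way is an assumption. `was_truncated` and the stand-in objects below are hypothetical names for illustration.

```python
from types import SimpleNamespace

def was_truncated(response):
    """True if the reply stopped because it hit the token limit."""
    metadata = getattr(response, "response_metadata", None) or {}
    return metadata.get("finish_reason") == "length"

# Stand-in responses so the check runs without a live client.
complete = SimpleNamespace(
    content="Full answer.",
    response_metadata={"finish_reason": "stop"},
)
clipped = SimpleNamespace(
    content="The sky looks blue because",
    response_metadata={"finish_reason": "length"},
)

print(was_truncated(complete), was_truncated(clipped))
```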

## Prompting: System and User Roles

LLMs can take structured conversation input. Use a `system` role to guide model behavior and a `user` role for the query.

In [10]:
system_message = {"role": "system", "content": "You are a poetic assistant who responds in exactly four lines."}
user_message = {"role": "user", "content": "What is the meaning of life?"}
messages = [system_message, user_message]
llm_client.max_tokens = None
response = llm_client.invoke(messages)
print(response)

content="In the vast cosmic dance, life's meaning unfolds,  \nA quest for connection, where hearts and stars entwine.  \nIt's love's quiet whisper in the soul's hidden folds,  \nAnd purpose we craft, one breath at a time." additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 48, 'prompt_tokens': 187, 'total_tokens': 235, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 160, 'text_tokens': 187, 'image_tokens': 0}, 'num_sources_used': 0}, 'model_name': 'xai.grok-4-fast-non-reasoning', 'system_fingerprint': 'fp_040427a672', 'id': '585a6fb0-f190-f2e4-3a21-0a295fa834b9', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None} id='run--56bb76f4-c580-4c07-b47b-c86380291487-0' usage_metadata={'input_tokens': 187, 'output_tokens': 48, 'total_tokens': 235, 'input_token_details': {'audio': 0, 'cache_
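Building the message list by hand gets repetitive once you start trying different system prompts. A small helper, sketched below, can assemble the role-tagged dicts and catch typos in role names before the request is sent. `build_messages`, `validate_messages`, and `VALID_ROLES` are hypothetical names, and the set of accepted roles is an assumption based on the roles used in this notebook.

```python
VALID_ROLES = {"system", "user", "assistant"}

def build_messages(user_prompt, system_prompt=None):
    """Build a role-tagged message list, with an optional system message first."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages

def validate_messages(messages):
    """Raise ValueError if a message has an unknown role or empty content."""
    for index, message in enumerate(messages):
        if message.get("role") not in VALID_ROLES:
            raise ValueError(f"message {index}: unknown role {message.get('role')!r}")
        if not message.get("content"):
            raise ValueError(f"message {index}: empty content")
    return messages

messages = validate_messages(
    build_messages(
        "What is the meaning of life?",
        "You are a poetic assistant who responds in exactly four lines.",
    )
)
print([m["role"] for m in messages])
```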

## Streaming Responses

Some models and APIs support real-time (token-by-token) streaming. Use streaming for large outputs or real-time applications.

Below is a demonstration. (You may need to adapt it for your client class.)

In [11]:
# Streaming example – requires stream support in your OciOpenAILangChainClient.
# For a complete version, see openai_oci_stream.py
try:
    for chunk in llm_client.stream(MESSAGE):
        print(chunk.content, end="", flush=True)
    print()
except AttributeError:
    print("Streaming is not supported in this client. See openai_oci_stream.py for details.")

The sky looks blue because sunlight is made up of all colors, but when it hits tiny bits of air, the blue part scatters everywhere more than the other colors. That's why we see mostly blue when we look up on a clear day!
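Streaming prints text as it arrives, but you often want the full reply afterwards as well. One way to get both is to accumulate the chunks while echoing them, as sketched below. `collect_stream` and `fake_stream` are hypothetical names; the sketch assumes each chunk exposes a string `content` attribute, as in the loop above, and uses a generator in place of `llm_client.stream(MESSAGE)` so it runs offline.

```python
from types import SimpleNamespace

def collect_stream(chunks, echo=True):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        text = chunk.content or ""
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    if echo:
        print()
    return "".join(parts)

# Stand-in for llm_client.stream(MESSAGE): a generator of objects with .content.
def fake_stream():
    for piece in ["The sky ", "looks ", "blue."]:
        yield SimpleNamespace(content=piece)

full_text = collect_stream(fake_stream())
```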


## Async LLM Calls

For power users: Invoke multiple LLMs concurrently using Python async/await. Useful for batch processing large numbers of queries.

> For advanced usage, see `openai_oci_async.py`.

In [None]:
import asyncio

# Example: asynchronously query multiple LLMs.
# Assumes the helper returns an OpenAI-style async client.
async def ask_model(model, prompt):
    client = OCIOpenAIHelper.get_async_native_client(config=load_config(SANDBOX_CONFIG_FILE))
    # Query the model that was passed in, not a hardcoded ID
    resp = await client.responses.create(model=model, input=prompt)
    return (model, resp)

async def main():
    prompts = ["Summarize the news today.", "Write a haiku about data science."]
    # One task per model; prompts are assigned round-robin
    tasks = [ask_model(m, prompts[i % len(prompts)]) for i, m in enumerate(selected_llms)]
    results = await asyncio.gather(*tasks)
    for model, resp in results:
        print(f"Model: {model}\nResponse: {resp}\n")

# Uncomment to run. Jupyter already runs an event loop, so await the coroutine directly:
# await main()
# In a standalone script, use this instead:
# asyncio.run(main())
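`asyncio.gather` starts every request at once, which can run into rate limits when the list of models or prompts grows. A common pattern is to cap the number of in-flight requests with a semaphore. The sketch below shows that pattern with a fake async call; `gather_limited`, `fake_ask`, and `demo` are hypothetical names, and `asyncio.sleep` stands in for the network round trip.

```python
import asyncio

async def gather_limited(coroutine_factories, limit=2):
    """Run the coroutines concurrently, at most `limit` at a time, keeping input order."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run_one(f) for f in coroutine_factories))

# Hypothetical stand-in for an async model call.
async def fake_ask(model, prompt):
    await asyncio.sleep(0.01)
    return (model, f"reply to {prompt!r}")

async def demo():
    models = ["model-a", "model-b", "model-c"]
    # Factories delay coroutine creation until a semaphore slot is free.
    factories = [lambda m=m: fake_ask(m, "hello") for m in models]
    return await gather_limited(factories, limit=2)

# In a standalone script; inside Jupyter use `results = await demo()` instead.
results = asyncio.run(demo())
print(results)
```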

## Conversation History

Maintaining message history allows LLMs to handle context and multi-turn conversations (see `openai_oci_history.py`).

In [12]:
past_messages = [
    {"role": "user", "content": "Tell me something about Oracle."},
    {"role": "assistant", "content": "Oracle is one of the largest vendors in enterprise IT, best known for its flagship database."}
]

current_query = {"role": "user", "content": "What is its flagship product?"}

full_history = past_messages + [current_query]

response = llm_client.invoke(full_history)
print(response.content)

Oracle's flagship product is its relational database management system, Oracle Database (often just called Oracle DB). It's a powerful, scalable platform used for storing, managing, and retrieving large volumes of data in enterprise environments.
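Because the full history is resent on every turn, long conversations grow the prompt and its token cost. A simple mitigation is to keep only the most recent messages. The class below is a minimal sketch of that idea; `ChatHistory` and its parameters are hypothetical and are not taken from `openai_oci_history.py`.

```python
class ChatHistory:
    """Keeps role-tagged messages and caps how many past messages are sent."""

    def __init__(self, system_prompt=None, max_messages=6):
        self.system_prompt = system_prompt
        self.max_messages = max_messages  # must be >= 1
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def as_prompt(self):
        """Return the system message (if any) plus the most recent messages."""
        recent = self.messages[-self.max_messages:]
        if self.system_prompt:
            return [{"role": "system", "content": self.system_prompt}] + recent
        return list(recent)

history = ChatHistory(max_messages=2)
history.add("user", "Tell me something about Oracle.")
history.add("assistant", "Oracle is best known for its flagship database.")
history.add("user", "What is its flagship product?")
print(history.as_prompt())  # only the last two messages are kept
```

With a live client, the list returned by `as_prompt()` would be passed to `llm_client.invoke(...)`, and the reply appended with `history.add("assistant", response.content)`. Trimming by message count is crude; the model loses anything said in the dropped turns.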


## Structured Output (JSON)

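Some LLMs support structured outputs, returning their answers as valid JSON. See `openai_oci_structured_output.py` for advanced schemas. The example below binds a Pydantic schema to the client with `with_structured_output`, so the reply comes back as a validated `BookInventory` object rather than free text: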

> **Note:** Not all models/clients support output schemas or enforced JSON.

In [13]:
from pydantic import BaseModel, Field
from typing import List

from langChain.openai_oci_client import OciOpenAILangGraphClient

# Pydantic class schema
class BookInventory(BaseModel):
    """Information about a book in a store or library inventory."""
    # Always include a class docstring: it describes the schema to the model.

    # Each field is declared as name: type = Field(description=...); the description helps the model fill it in.
    title: str = Field(description="Title of the book")
    author: str = Field(description="Author of the book")
    publication_year: int = Field(description="Publication year of the book")
    in_stock: bool = Field(description="True if the book is currently available in stock")
    copies_available: int = Field(ge=0, description="Number of copies available")
    price_usd: float = Field(ge=0.0, description="Price of the book in USD")

# Alternative client with the same arguments: OciOpenAILangChainClient(...)
llm_client = OciOpenAILangGraphClient(
    profile=scfg['oci']['profile'],
    compartment_id=scfg['oci']['compartment'],
    model=LLM_MODEL,
    service_endpoint=llm_service_endpoint,
)
composed_pydantic_model = llm_client.with_structured_output(BookInventory)

MESSAGE = """
  Give me the information about the current science fiction books.
"""

response = composed_pydantic_model.invoke(MESSAGE)
print(response)
print(type(response))

title='The Three-Body Problem' author='Liu Cixin' publication_year=2014 in_stock=True copies_available=5 price_usd=16.99
<class '__main__.BookInventory'>
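For models or clients that do not support enforced schemas, a fallback is to ask for JSON in the prompt and validate the reply yourself. The sketch below does this with the standard library only. `BookRecord` is a hypothetical, reduced stand-in for `BookInventory`, `parse_book` is an illustrative helper, and the sample reply string is made up for the demonstration.

```python
import json
from dataclasses import dataclass

@dataclass
class BookRecord:
    """Stdlib stand-in for the BookInventory schema (subset of fields)."""
    title: str
    author: str
    publication_year: int
    in_stock: bool

def parse_book(raw_reply):
    """Parse the outermost {...} span of a model reply and check field types.

    Raises ValueError if no JSON object is found or a field has the wrong type,
    and KeyError if a required field is missing.
    """
    start, end = raw_reply.find("{"), raw_reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(raw_reply[start:end + 1])
    fields = BookRecord.__dataclass_fields__
    record = BookRecord(**{name: data[name] for name in fields})
    for name, field in fields.items():
        if not isinstance(getattr(record, name), field.type):
            raise ValueError(f"field {name!r} should be {field.type.__name__}")
    return record

# Made-up reply: models often wrap JSON in extra prose.
sample_reply = (
    'Sure! Here it is:\n'
    '{"title": "The Three-Body Problem", "author": "Liu Cixin", '
    '"publication_year": 2014, "in_stock": true}'
)
book = parse_book(sample_reply)
print(book)
```

Pydantic performs stricter validation than this type check, so prefer `with_structured_output` where it is available.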


## Reasoning

The OpenAI Responses API can return a summary of the model's reasoning alongside its answer.

In [15]:
llm = OCIOpenAIHelper.get_client(
    model_name="openai.gpt-5",
    config=scfg,
    use_responses_api=True,
    reasoning={"effort": "low", "summary": "auto"},
)

messages = [
    (
        "system",
        "You are an expert reasoning assistant. Explain your steps clearly, then provide a brief summary."
    ),
    (
        "human",
        "If there are 21 apples and they are split equally among 3 friends, how many apples does each friend receive?"
    ),
]
res = llm.invoke(messages)

# Answer
print("\n=== Answer ===")
print(getattr(res, "content", res))

# Reasoning summary (if available)
ak = getattr(res, "additional_kwargs", {}) or {}
summary = ak.get("reasoning_summary") or ak.get("summary") or (ak.get("reasoning") or {}).get("summary")
print("\n=== Reasoning Summary ===")
print(summary if summary else "N/A")



=== Answer ===
[{'type': 'text', 'text': 'Steps:\n- Identify the operation: splitting equally among 3 friends means division.\n- Compute 21 ÷ 3 = 7.\n- Check: 7 apples per friend × 3 friends = 21 apples, which matches.\n\nSummary: Each friend receives 7 apples.', 'annotations': []}]

=== Reasoning Summary ===
[{'text': '**Explaining division steps**\n\nI need to follow the system instructions and explain the steps for a simple division problem clearly. I’ll break it down for 21 divided by 3, which equals 7 apples each. I should outline the steps: first, identify the operation, then perform the division, and finally, do a quick check. After that, I’ll provide a brief summary sentence. It’s important to keep the formatting straightforward, maybe even using a bullet list for clarity.', 'type': 'summary_text'}]
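As the output above shows, with the Responses API both the answer and the reasoning summary come back as lists of blocks (dicts with a `text` key) rather than plain strings. A small helper can flatten either form into readable text. `content_to_text` is a hypothetical name, and the block shapes it handles are inferred from the output printed above.

```python
def content_to_text(content):
    """Flatten message content into plain text.

    Accepts a plain string or a list of blocks such as
    {'type': 'text', 'text': ...} or {'type': 'summary_text', 'text': ...}.
    Blocks without a 'text' key are skipped.
    """
    if isinstance(content, str):
        return content
    pieces = []
    for block in content or []:
        if isinstance(block, str):
            pieces.append(block)
        elif isinstance(block, dict) and "text" in block:
            pieces.append(block["text"])
    return "\n".join(pieces)

answer_blocks = [
    {"type": "text", "text": "Summary: Each friend receives 7 apples.", "annotations": []}
]
print(content_to_text(answer_blocks))
print(content_to_text("plain string reply"))
```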


## Exercises

- Try different models from the table above and compare their style, accuracy, and creativity.
- Modify the prompt or system message to guide the LLM's behavior (e.g., “Reply in JSON”, “Summarize as bullet points”).
- Use batching to send several prompts at once, and token limits to control response length.
- Build a short multi-turn conversation and observe how history impacts results.
- Try outputting structured JSON (if supported). 
- Share your findings or questions in the Slack channels above!

## Future Work Ideas

- Integrate the LLM with Slack for notifications or chatbots.
- Explore OpenAI function calling and tool integrations.
- Build pipelines or agents that chain together multiple LLM invocations.
- Test performance and differences between streaming and batch modes.
- Experiment with prompt engineering challenges (few-shot, chain-of-thought, etc.).
- Explore multimodal capabilities (images, audio, documents) using OCI GenAI.

*Keep exploring and sharing!*