# OpenAI & OCI LLMs with LangChain: Educational Demo

### What this notebook does:

Demonstrates how to use LLMs served through the OCI OpenAI-compatible API (OpenAI, Meta, Cohere, xAI, and others) via LangChain. It covers basic chat, model performance comparison, batching, streaming, async calls, conversation history, structured output, and reasoning summaries.

**Documentation to reference:**

- OCI Gen AI Chat Models: https://docs.oracle.com/en-us/iaas/Content/generative-ai/chat-models.htm
- OCI OpenAI Compatible SDK: https://github.com/oracle-samples/oci-openai
- LangChain Overview: https://docs.langchain.com/oss/python/langchain/overview
- OpenAI API Reference: https://platform.openai.com/docs/api-reference

**Relevant Slack channels:**

- #generative-ai-users: *for questions on OCI Gen AI*
- #igiu-innovation-lab: *general discussions on your project*
- #igiu-ai-learning: *help with the sandbox environment or with running this code*

**Jupyter working directory:**

- Set the notebook's working directory to match your workspace so the Python imports in this notebook resolve:
    -  VS Code menu -> Settings > Extensions > Jupyter > Notebook File Root
    -  change from `${fileDirname}` to `${workspaceFolder}`


**Env setup:**

- sandbox.yaml: Contains OCI config, compartment details.
- .env: Load environment variables (e.g., API keys if needed).

**How to run in notebook:**

- Make sure your runtime environment has all dependencies and access to required config files.
- Run the notebook cells in order from top to bottom.

## Supported LLM Models

Here are model IDs you can use with the OCI OpenAI-compatible API (see [Oracle Docs](https://docs.oracle.com/en-us/iaas/Content/generative-ai/chat-models.htm) for updates):

| Family | Example Models |
|--------|----------------|
| **OpenAI**   | openai.gpt-4.1, openai.gpt-4o, openai.gpt-5 |
| **Meta (Llama)** | meta.llama-4-maverick-17b-128e-instruct-fp8, meta.llama-4-scout-17b-16e-instruct |
| **Cohere** | cohere.command-a-03-2025, cohere.command-plus-latest |
| **xAI (Grok)** | xai.grok-3, xai.grok-4 |

> **TIP:** Experiment with different models to compare results.

## Environment & Configuration

Prepare your `sandbox.yaml` (or `sandbox.json`) config with the appropriate OCI credentials and compartment settings. Place it in your working directory or adjust the code below to point to its location.

If you have trouble loading credentials or get permission errors, check the file path, format, and profile settings. For help, reach out on the Slack channels above.

In [None]:
# Load OCI + model configuration
import os
import sys
import time

from envyaml import EnvYAML
from dotenv import load_dotenv

SANDBOX_CONFIG_FILE = "sandbox.yaml"

load_dotenv()  # read variables from .env into the process environment


def load_config(config_path):
    """Load configuration from a YAML file."""
    try:
        return EnvYAML(config_path)
    except FileNotFoundError:
        print(f"Error: Configuration file '{config_path}' not found.")
        return None

scfg = load_config(SANDBOX_CONFIG_FILE)
if scfg is not None:
    print("Config loaded successfully! Profile:", scfg['oci']['profile'])
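Later cells read `scfg['oci']['profile']` and `scfg['oci']['compartment']`, so checking for those keys right after loading can surface a misconfigured file early. The sketch below is a minimal example that assumes only those two keys are required. `REQUIRED_OCI_KEYS` and `missing_oci_keys` are hypothetical names, and your `sandbox.yaml` may need more settings than this.

```python
# Minimal sketch: report which expected OCI settings are missing from a config mapping.
# Assumption: only the keys this notebook reads ('profile', 'compartment') are checked.
REQUIRED_OCI_KEYS = ("profile", "compartment")


def missing_oci_keys(cfg):
    """Return the required keys that are absent or empty under cfg['oci']."""
    try:
        oci_section = cfg["oci"]
    except (KeyError, TypeError):
        # No 'oci' section at all (or cfg is None): everything is missing.
        return list(REQUIRED_OCI_KEYS)
    missing = []
    for key in REQUIRED_OCI_KEYS:
        try:
            value = oci_section[key]
        except (KeyError, TypeError):
            value = None
        if not value:
            missing.append(key)
    return missing


# A plain dict stands in for the loaded config here.
example_cfg = {"oci": {"profile": "DEFAULT", "compartment": ""}}
print(missing_oci_keys(example_cfg))  # ['compartment']
```

With the notebook's own config, the call would be `missing_oci_keys(scfg)`; an empty list means both keys are present.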

### Set up your LLM client

- Specify your target model (`LLM_MODEL`).
- Choose the appropriate service endpoint.
- Use the config (`scfg`) loaded above for credentials.

You may need to re-instantiate the client when switching models.

In [None]:
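# Import the OCI helper from the workspace's `langChain` package.
# This resolves only when the notebook's working directory is the workspace root
# (see the Notebook File Root setting described at the top of this notebook).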
from langChain.oci_openai_helper import OCIOpenAIHelper

LLM_MODEL = "openai.gpt-4.1"
# Other ideas: "meta.llama-4-maverick-17b-128e-instruct-fp8", "xai.grok-4"


llm_client = OCIOpenAIHelper.get_langchain_openai_client(
    model_name=LLM_MODEL,
    config=scfg
)

print(f"Ready to send prompts to model: {LLM_MODEL}")

## Basic Chat Completion

Send a single prompt and get a response from your selected LLM. Try different models for different styles and strengths.

In [None]:
MESSAGE = """
    why is the sky blue? explain in 2 sentences like I am 5
"""

response = llm_client.invoke(MESSAGE)
print(response)

### Model Performance Comparison Demo (Loop)

Compare how different LLMs respond to the same input. This helps understand model behaviors and strengths.

*Try adding/removing models from the list to test others!*

In [None]:
selected_llms = [
    "openai.gpt-4.1",
    "openai.gpt-5",
    "meta.llama-4-maverick-17b-128e-instruct-fp8",
    "meta.llama-4-scout-17b-16e-instruct",
    "xai.grok-4",
    "xai.grok-4-fast-non-reasoning"
]

for llm_id in selected_llms:
    print(f"\n\n***** Chat Result for {llm_id} *****")
    llm_client.model_name = llm_id  # switch the existing client to the next model
    start_time = time.perf_counter()  # monotonic clock, suited to measuring elapsed time
    response = llm_client.invoke(MESSAGE)
    elapsed = time.perf_counter() - start_time
    print(response)
    print(f"\nTime taken for {llm_id}: {elapsed:.2f} seconds\n\n")
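The measured time is wall-clock time, so it includes network latency as well as model generation, and a single run is only a rough signal. As a minimal sketch, the timing can be factored into a small helper and the results sorted for easier comparison. `time_call` and `fake_invoke` are hypothetical names; `fake_invoke` stands in for `llm_client.invoke` so the example runs without a model endpoint.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_invoke(prompt):
    # Hypothetical stand-in for llm_client.invoke; it only simulates latency.
    time.sleep(0.01)
    return f"echo: {prompt}"


timings = {}
for label in ["model-a", "model-b"]:  # placeholder labels, not real model IDs
    reply, seconds = time_call(fake_invoke, "why is the sky blue?")
    timings[label] = seconds

# Print the fastest entry first.
for label, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{label}: {seconds:.3f}s")
```

Repeating each call a few times and comparing the median would likely give a steadier picture than one measurement per model.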


## Batching & Output Control

Send multiple questions in one call with `batch`, and cap the length of a response with a token limit.

In [None]:
questions = ["why is sky blue?", "why is it dark at night?"]

try:
    responses = llm_client.batch(questions)
    print(responses)
except AttributeError:
    batch_responses = [llm_client.invoke(q) for q in questions]
    for q, r in zip(questions, batch_responses):
        print(f"Q: {q}\nA: {r.content}\n")

# Limit token output
try:
    llm_client.max_tokens = 10
except AttributeError:
    pass
response = llm_client.invoke(MESSAGE)
print(f"\n[with max_tokens set to 10] {response}")
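Both paths in the batching cell rely on responses lining up with the questions by position. The sketch below illustrates that idea with the standard library only: `ThreadPoolExecutor.map` runs the calls concurrently and still yields results in input order. `fake_invoke` is a hypothetical stand-in for `llm_client.invoke`, so this shows the pattern rather than how the client implements `batch`.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_invoke(question):
    # Hypothetical stand-in for llm_client.invoke: shorter questions "finish" sooner.
    time.sleep(len(question) / 1000)
    return f"answer to: {question}"


questions = ["why is it dark at night?", "why is sky blue?"]

# map() yields results in input order even when a later call finishes first.
with ThreadPoolExecutor(max_workers=2) as pool:
    answers = list(pool.map(fake_invoke, questions))

for q, a in zip(questions, answers):
    print(f"Q: {q}\nA: {a}\n")
```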

## Prompting: System and User Roles

LLMs can take structured conversation input. Use a `system` role to guide model behavior and a `user` role for the query.

In [None]:
system_message = {"role": "system", "content": "You are a poetic assistant who responds in exactly four lines."}
user_message = {"role": "user", "content": "What is the meaning of life?"}
messages = [system_message, user_message]
llm_client.max_tokens = None  # remove the 10-token cap set in the previous cell
response = llm_client.invoke(messages)
print(response)

## Streaming Responses

Some models and APIs support real-time (token-by-token) streaming. Use streaming for large outputs or real-time applications.

Below is a demonstration. (You may need to adapt for your client class.)

In [None]:
# Streaming example – requires stream support in your OciOpenAILangChainClient.
# For a complete version, see openai_oci_stream.py
try:
    for chunk in llm_client.stream(MESSAGE):
        print(chunk.content, end="", flush=True)
    print()
except AttributeError:
    print("Streaming is not supported in this client. See openai_oci_stream.py for details.")
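When streaming, it is often useful to display each piece as it arrives and also keep the pieces so the full reply is available afterwards. The following is a minimal sketch of that pattern. `fake_stream` is a hypothetical generator standing in for `llm_client.stream`, and it yields plain strings rather than message chunks.

```python
def fake_stream(text, chunk_size=5):
    """Hypothetical stand-in for llm_client.stream: yield the text in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


pieces = []
for piece in fake_stream("The sky scatters blue light the most."):
    print(piece, end="", flush=True)  # show each piece as soon as it arrives
    pieces.append(piece)              # keep it so the full reply can be rebuilt
print()

full_text = "".join(pieces)
```

With the real client, each chunk is a message object, so the loop would collect `chunk.content` as in the cell above.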

## Async LLM Calls

For power users: Invoke multiple LLMs concurrently using Python async/await. Useful for batch processing large numbers of queries.

> For advanced usage, see `openai_oci_async.py`.

In [None]:
import asyncio

# Example: query several models concurrently.
async def ask_model(model, prompt):
    # Reuse the config loaded earlier instead of re-reading the YAML file on every call.
    client = OCIOpenAIHelper.get_async_native_client(config=scfg)
    resp = await client.responses.create(model=model, input=prompt)
    return (model, resp)

async def main():
    prompts = ["Summarize the news today.", "Write a haiku about data science."]
    # Alternate the two prompts across the models in `selected_llms`.
    tasks = [ask_model(m, prompts[i % len(prompts)]) for i, m in enumerate(selected_llms)]
    results = await asyncio.gather(*tasks)
    for model, resp in results:
        print(f"Model: {model}\nResponse: {resp}\n")

# Uncomment to run. Jupyter already runs an event loop, so use top-level await there:
# await main()
# In a plain Python script, use instead:
# asyncio.run(main())
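The concurrency pattern itself can be tried without any model access. In the sketch below, `fake_ask` is a hypothetical coroutine that sleeps instead of calling an API. It shows that `asyncio.gather` returns results in the order the coroutines were passed, regardless of which one finishes first.

```python
import asyncio
import time


async def fake_ask(model, prompt, delay):
    # Hypothetical stand-in for a network call to a model.
    await asyncio.sleep(delay)
    return model, f"{model} answered: {prompt}"


async def run_all():
    jobs = [
        fake_ask("model-a", "hello", 0.05),
        fake_ask("model-b", "hello", 0.01),
    ]
    # gather() runs the coroutines concurrently and returns results in the order given.
    return await asyncio.gather(*jobs)


start = time.perf_counter()
results = asyncio.run(run_all())  # in a notebook cell, use: results = await run_all()
elapsed = time.perf_counter() - start

print([model for model, _ in results])
print(f"elapsed: {elapsed:.2f}s")
```

Because the waits overlap, the total elapsed time should be close to the slowest call rather than the sum of both.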

## Conversation History

Maintaining message history allows LLMs to handle context and multi-turn conversations (see `openai_oci_history.py`).

In [None]:
past_messages = [
    {"role": "user", "content": "Tell me something about Oracle."},
    {"role": "assistant", "content": "Oracle is one of the largest vendors in enterprise IT, best known for its flagship database."}
]

current_query = {"role": "user", "content": "What is its flagship product?"}

full_history = past_messages + [current_query]

response = llm_client.invoke(full_history)
print(response.content)
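Every turn adds to the history, and a long conversation can eventually exceed a model's context window. One common way to handle this, sketched below as an assumption rather than something the notebook's helpers provide, is to keep the system message plus only the most recent messages. `add_turn` and `trim_history` are hypothetical helper names.

```python
def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": content}]


def trim_history(history, max_messages):
    """Keep a leading system message (if any) plus the most recent max_messages others."""
    system = [m for m in history[:1] if m["role"] == "system"]
    rest = history[len(system):]
    if max_messages <= 0:
        return system
    return system + rest[-max_messages:]


history = [{"role": "system", "content": "You are concise."}]
history = add_turn(history, "user", "Tell me something about Oracle.")
history = add_turn(history, "assistant", "Oracle is a large enterprise IT vendor.")
history = add_turn(history, "user", "What is its flagship product?")

trimmed = trim_history(history, max_messages=2)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user']
```

Trimming too aggressively drops the context a follow-up question depends on, so the cutoff is worth tuning per use case.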

## Structured Output (JSON)

Some LLMs support structured outputs, returning their answers as valid JSON. See `openai_oci_structured_output.py` for advanced schemas. Below is an example prompt designed to elicit a JSON response:

> **Note:** Not all models/clients support output schemas or enforced JSON.

In [None]:
from pydantic import BaseModel, Field
from typing import List

from langChain.openai_oci_client import OciOpenAILangGraphClient

# Pydantic class schema
class BookInventory(BaseModel):
    """Information about a book in a store or library inventory."""
    # A class docstring is always required: it describes the schema to the model.

    # Pattern: field_name: type = Field(description=...); the description guides what the model generates.
    title: str = Field(description="Title of the book")
    author: str = Field(description="Author of the book")
    publication_year: int = Field(description="Publication year of the book")
    in_stock: bool = Field(description="True if the book is currently available in stock")
    copies_available: int = Field(ge=0, description="Number of copies available")
    price_usd: float = Field(ge=0.0, description="Price of the book in USD")

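# NOTE: `llm_service_endpoint` is not defined earlier in this notebook.
# Set it to the OCI Generative AI inference endpoint for your region before running this cell.
# Alternative client class: OciOpenAILangChainClient
llm_client = OciOpenAILangGraphClient(
    profile=scfg['oci']['profile'],
    compartment_id=scfg['oci']['compartment'],
    model=LLM_MODEL,
    service_endpoint=llm_service_endpoint
)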
composed_pydantic_model = llm_client.with_structured_output(BookInventory)

MESSAGE = """
  Give me the information about the current science fiction books.
"""

response = composed_pydantic_model.invoke(MESSAGE)
print(response)
print(type(response))
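For models or clients without enforced schemas, a fallback is to ask for JSON in the prompt and parse the reply text yourself. The sketch below is one minimal way to do that with the standard library. `extract_json` is a hypothetical helper, and the sample reply is hand-written rather than real model output.

```python
import json


def extract_json(reply_text):
    """Parse the JSON object spanning the first '{' to the last '}' in a reply, or return None."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(reply_text[start:end + 1])
    except json.JSONDecodeError:
        # The text between the braces was not valid JSON.
        return None


# Hand-written sample: models often wrap JSON in extra prose.
reply = 'Sure! {"title": "Dune", "author": "Frank Herbert", "publication_year": 1965} Hope that helps.'
book = extract_json(reply)
print(book["title"])  # Dune
```

The parsed dict could then be passed to `BookInventory(**book)` so Pydantic checks types and constraints, provided the reply contains every required field.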

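## Reasoning

The OpenAI Responses API can return a summary of the model's reasoning alongside the answer. The cell below enables it with `use_responses_api=True` and a `reasoning` setting, then reads the summary from the response's `additional_kwargs` when one is present.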

In [None]:
llm = OCIOpenAIHelper.get_langchain_openai_client(
    model_name="openai.gpt-5",
    config=scfg,
    use_responses_api=True,
    reasoning={"effort": "low", "summary": "auto"},
)

messages = [
    (
        "system",
        "You are an expert reasoning assistant. Explain your steps clearly, then provide a brief summary."
    ),
    (
        "human",
        "If there are 21 apples and they are split equally among 3 friends, how many apples does each friend receive?"
    ),
]
res = llm.invoke(messages)

# Answer
print("\n=== Answer ===")
print(getattr(res, "content", res))

# Reasoning summary (if available)
ak = getattr(res, "additional_kwargs", {}) or {}
summary = ak.get("reasoning_summary") or ak.get("summary") or (ak.get("reasoning") or {}).get("summary")
print("\n=== Reasoning Summary ===")
print(summary if summary else "N/A")
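Depending on the client version, the summary found above may be a plain string or a list of blocks that each carry a `text` field. That shape is an assumption here, not something this notebook confirms, so inspect `res.additional_kwargs` on your own setup. Under that assumption, a small hypothetical helper such as `summary_to_text` can flatten either form into readable text.

```python
def summary_to_text(summary):
    """Flatten a reasoning summary into one string; return '' when there is none."""
    if not summary:
        return ""
    if isinstance(summary, str):
        return summary
    parts = []
    for item in summary:
        if isinstance(item, dict):
            # Assumed block shape: {"type": "summary_text", "text": "..."}
            parts.append(str(item.get("text", "")))
        else:
            parts.append(str(item))
    return "\n".join(p for p in parts if p)


# Hand-written sample in the assumed list-of-blocks shape.
sample = [
    {"type": "summary_text", "text": "Divide 21 by 3."},
    {"type": "summary_text", "text": "Each friend gets 7 apples."},
]
print(summary_to_text(sample))
```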


## Exercises

- Try different models from the table above and compare their style, accuracy, and creativity.
- Modify the prompt or system message to guide the LLM's behavior (e.g., “Reply in JSON”, “Summarize as bullet points”).
- Use batching and token limits to control response length.
- Build a short multi-turn conversation and observe how history impacts results.
- Try outputting structured JSON (if supported). 
- Share your findings or questions in the Slack channels above!

## Future Work Ideas

- Integrate the LLM with Slack for notifications or chatbots.
- Explore OpenAI function calling and tool integrations.
- Build pipelines or agents that chain together multiple LLM invocations.
- Test performance and differences between streaming and batch modes.
- Experiment with prompt engineering challenges (few-shot, chain-of-thought, etc).
- Explore multimodal capabilities (images, audio, documents) using OCI GenAI.

*Keep exploring and sharing!*