
<center>
    <p style="text-align:center">
    <img alt="arize logo" src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="300"/>
        <br>
        <a href="https://docs.arize.com/arize/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/client_python">GitHub</a>
        |
        <a href="https://arize-ai.slack.com/join/shared_invite/zt-11t1vbu4x-xkBIHmOREQnYnYDH1GDfCg">Community</a>
    </p>
</center>

# <center>Tracing Llama 3.2 with the OpenAI API </center>
This guide demonstrates how to trace open-source models such as Llama 3.2 through the OpenAI API.

Ollama has built-in compatibility with the OpenAI [Chat Completions API](https://github.com/ollama/ollama/blob/main/docs/openai.md), so a locally running open-source Llama model can be called, and therefore instrumented, with the same tooling and applications that target OpenAI.

In [3]:
!pip install -q "arize-otel>=0.7.0" "openinference-instrumentation-openai>=0.1.18" 

!pip install -q colab-xterm==0.2.0 ollama==0.4.4 openai==1.57.1 opentelemetry-sdk==1.28.2 opentelemetry-exporter-otlp==1.28.2

### Installing Ollama

Download and execute the installation script from the Ollama website. The script will handle the installation process automatically, including downloading and installing necessary dependencies.

In [None]:
!curl -fsSL https://ollama.ai/install.sh | sh

### Launching Xterm

Load the `colabxterm` extension so that an xterm terminal can be launched inside the Colab notebook.

In [None]:
%load_ext colabxterm

### Launch Terminal & Start the Ollama Server
Once Ollama is installed and the terminal is running, we can start the server using the following command. Be sure to run this in the `xterm` terminal below!

```shell
ollama serve &
```

The `&` at the end runs the command in the background, allowing you to continue using your terminal.

In [None]:
%xterm
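
Because the server is started by hand in a separate terminal, it can help to confirm that it is reachable before continuing. The following is a minimal sketch, not part of the original notebook: `wait_for_ollama` is a hypothetical helper, and it assumes the server listens on the default `http://localhost:11434` and answers a plain `GET` on its root URL with HTTP 200.

```python
import time
import urllib.error
import urllib.request


def wait_for_ollama(
    base_url: str = "http://localhost:11434",  # assumed default Ollama address
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll base_url until it answers with HTTP 200 or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet; retry after a short pause
        time.sleep(interval_s)
    return False
```

Calling `wait_for_ollama()` in a notebook cell after running `ollama serve &` returns `True` once the server responds, or `False` if it never comes up within the timeout.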

### Import Libraries

In [None]:
from getpass import getpass

import ollama
from arize.otel import register
from openai import OpenAI

# OpenInference - Instrumentation
from openinference.instrumentation.openai import OpenAIInstrumentor
from tqdm import tqdm

### Download Llama 3.2

Using the `ollama` library, we can pull the `llama3.2:1b` model so that it can run in Colab.

In [None]:
LLAMA_MODEL_NAME = "llama3.2:1b"

PROJECT_NAME = f"arize_{LLAMA_MODEL_NAME}_openai"

In [None]:
ollama.pull(LLAMA_MODEL_NAME)
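
To check that the pull succeeded, one option is to list the models the server exposes. The sketch below is an assumption-laden illustration rather than part of the original notebook: it assumes Ollama's OpenAI-compatible layer serves an OpenAI-style `GET /v1/models` endpoint at `http://localhost:11434/v1`, and `model_ids` and `fetch_model_ids` are hypothetical helper names. The sample payload is hand-written in the OpenAI "list models" shape, not captured from a real server.

```python
import json
import urllib.request


def model_ids(models_payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style "list models" payload."""
    return [entry["id"] for entry in models_payload.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:11434/v1") -> list[str]:
    """Fetch model ids from the server (requires `ollama serve` to be running)."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=10) as resp:
        return model_ids(json.load(resp))


# Illustrative payload only; a real server may return additional fields.
sample_payload = {
    "object": "list",
    "data": [{"id": "llama3.2:1b", "object": "model"}],
}
print("llama3.2:1b" in model_ids(sample_payload))  # True
```

With the server running, `LLAMA_MODEL_NAME in fetch_model_ids()` should then indicate whether the model is available.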

### Register OTEL

In [None]:
SPACE_ID = getpass("🔑 Enter your Arize Space ID: ")
API_KEY = getpass("🔑 Enter your Arize API key: ")

In [None]:
tracer_provider = register(
    space_id=SPACE_ID,  # in app space settings page
    api_key=API_KEY,  # in app space settings page
    project_name=PROJECT_NAME,  # name this to whatever you would like
)

In [None]:
# Instrument OpenAI calls in your application
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

### Create OpenAI Client

In [None]:
oai_client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required, but unused
)

### Run Queries

Run queries against `llama3.2:1b` using the OpenAI API.

In [None]:
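def ollama_query(oai_client: OpenAI, model_name: str, query: str):
    """Send a single user query to the model and return the raw response."""
    # The call goes through the instrumented OpenAI client, so it is traced.
    response = oai_client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": query},
        ],
    )

    return response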

In [None]:
lst_questions = [
    "What are Large Language Models?",
    "How do large language models work?",
    "How are LLMs trained, and what data is used?",
    "In a large language model, what is a hallucination?",
    "What are the main applications of large language models?",
]

In [None]:
for question in tqdm(lst_questions):
    llm_response = ollama_query(
        oai_client=oai_client, model_name=LLAMA_MODEL_NAME, query=question
    )
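
The loop above sends each question but does not display the answers. A minimal sketch of pulling the reply text out of a Chat Completions response follows; `answer_text` is a hypothetical helper, and the stand-in object only mimics the attribute layout of a real response (`choices[0].message.content`) so the example runs without a server.

```python
from types import SimpleNamespace


def answer_text(response) -> str:
    """Return the assistant's reply from a Chat Completions response."""
    # `content` can be None, so fall back to an empty string.
    return response.choices[0].message.content or ""


# Stand-in with the same attribute layout as a real response (illustrative only).
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="An example reply."))]
)
print(answer_text(fake_response))  # An example reply.
```

Inside the loop, `print(answer_text(llm_response))` would show each answer as it arrives.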