# <center>Tracing Llama 3.2 with the OpenAI API </center>
This guide demonstrates how to trace open-source models like Llama 3.2 using the OpenAI API.

Ollama has built-in compatibility with the OpenAI [Chat Completions API](https://github.com/ollama/ollama/blob/main/docs/openai.md), so an open-source Llama model served locally by Ollama can be called and instrumented with the same tooling and applications that target OpenAI.
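
To make the compatibility concrete, here is a minimal sketch of the kind of request an OpenAI-style client would send to a local Ollama server. It only builds the request and does not send it. The helper name `build_chat_request` is hypothetical, and the URL assumes Ollama is listening on its default local port.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port (11434).
OLLAMA_OPENAI_URL = "http://localhost:11434/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_OPENAI_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("llama3.2:1b", "Hello!")
print(request.get_method(), request.full_url)
```

The body has the same `model` and `messages` fields that the OpenAI Chat Completions API expects, which is why the `openai` client can be pointed at Ollama by changing only its base URL.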

In [None]:
!pip install arize-otel colab-xterm ollama "openai>=1.26" openinference-instrumentation-openai opentelemetry-sdk opentelemetry-exporter-otlp

### Installing Ollama

Download and execute the installation script from the Ollama website. The script will handle the installation process automatically, including downloading and installing necessary dependencies.

In [None]:
!curl https://ollama.ai/install.sh | sh

### Launching Xterm


Load the `colab-xterm` extension so that an xterm terminal can be opened within Colab.

In [None]:
%load_ext colabxterm

### Launch Terminal & Start the Ollama Server
Once Ollama is installed and the terminal is running, we can start the server using the following command. Be sure to run this in the `xterm` terminal opened by the `%xterm` cell below!

```shell
ollama serve &
```

The `&` at the end runs the command in the background, allowing you to continue using your terminal.

In [None]:
%xterm
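
Before moving on, it can help to confirm from a notebook cell that the server started in the terminal is actually reachable. The following is a minimal sketch, assuming Ollama is serving on its default address `http://localhost:11434`; the helper name `ollama_is_running` is hypothetical.

```python
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors
        # (urllib's URLError and HTTPError both subclass OSError).
        return False


print("Ollama reachable:", ollama_is_running())
```

If this prints `False`, the most likely cause is that `ollama serve &` has not been run yet in the terminal, or that the server is still starting up.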

## Import Libraries

In [None]:
import os

import ollama
import pandas as pd
from arize_otel import Endpoints, register_otel
from openai import OpenAI

# OpenInference - Instrumentation
from openinference.instrumentation.openai import OpenAIInstrumentor
from tqdm import tqdm

### Download Llama 3.2

Using the `ollama` library, we can pull the `llama3.2:1b` model so that it can run in Colab.

In [None]:
LLAMA_MODEL_NAME = "llama3.2:1b"

ARIZE_MODEL_ID = f"arize_{LLAMA_MODEL_NAME}_openai"

In [None]:
ollama.pull(LLAMA_MODEL_NAME)

### Register OTEL

In [None]:
register_otel(
    endpoints=Endpoints.ARIZE,
    space_id="U3BhY2U6NTI3MjpoVDJX",  # in app space settings page
    api_key="2b8d908de9c051dbd1f",  # in app space settings page
    model_id=ARIZE_MODEL_ID,  # name this to whatever you would like
    model_version="1",
)
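
Rather than pasting the space ID and API key directly into the notebook, one option is to read them from environment variables. This is a minimal sketch; the variable names `ARIZE_SPACE_ID` and `ARIZE_API_KEY` and the helper `require_env` are assumptions made for illustration, not names required by `arize_otel`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before registering OTEL.")
    return value


# Hypothetical variable names. In a real notebook they would be set beforehand;
# placeholder values are filled in here only so the sketch runs on its own.
os.environ.setdefault("ARIZE_SPACE_ID", "your-space-id")
os.environ.setdefault("ARIZE_API_KEY", "your-api-key")

otel_credentials = {
    "space_id": require_env("ARIZE_SPACE_ID"),
    "api_key": require_env("ARIZE_API_KEY"),
}
```

The resulting dictionary uses the same `space_id` and `api_key` keyword names as the `register_otel` call, so it could be unpacked into that call with `**otel_credentials` in place of the literal values.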

In [None]:
# Instrument OpenAI calls in your application
OpenAIInstrumentor().instrument()

### Create OpenAI Client

In [None]:
oai_client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required, but unused
)

### Run Queries

Run queries against `llama3.2:1b` using the OpenAI API.

In [None]:
def ollama_query(oai_client: OpenAI, model_name: str, query: str):
    """Send a single user query to the model and return the raw chat completion."""
    response = oai_client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": query},
        ],
    )

    return response
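
`ollama_query` returns the full chat completion object, and the generated text sits in the first choice's message. The sketch below shows one way to pull it out. The helper name `extract_answer` is hypothetical, and a stand-in object with the same attribute layout is used so the example runs without a server.

```python
from types import SimpleNamespace


def extract_answer(response) -> str:
    """Return the text of the first choice in a chat completion response."""
    content = response.choices[0].message.content
    return content or ""  # content can be None, so fall back to an empty string


# Stand-in for a real response, mirroring the choices[0].message.content layout.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="LLMs are ..."))]
)
print(extract_answer(fake_response))
```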

In [None]:
lst_questions = [
    "What are Large Language Models?",
    "How do large language models work?",
    "How are LLMs trained, and what data is used?",
    "In a large language model, what is a hallucination?",
    "What are the main applications of large language models?",
]

In [None]:
for question in tqdm(lst_questions):
    llm_response = ollama_query(
        oai_client=oai_client, model_name=LLAMA_MODEL_NAME, query=question
    )
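
The loop above sends each question and lets the instrumentation record the trace, but it discards the responses. If the answers are also wanted in the notebook, a minimal sketch like the following collects them as rows. The helper `collect_answers` and its `ask` callable are hypothetical, and a stand-in function replaces the real model call so the example runs on its own.

```python
from typing import Callable, Dict, List


def collect_answers(questions: List[str], ask: Callable[[str], str]) -> List[Dict[str, str]]:
    """Call ask() once per question and return question/answer rows."""
    rows = []
    for question in questions:
        rows.append({"question": question, "answer": ask(question)})
    return rows


# Stand-in for a real model call, used only to keep the sketch self-contained.
rows = collect_answers(["What is an LLM?"], ask=lambda q: f"(answer to: {q})")
print(rows)
```

In this notebook, `ask` could be a small function that calls `ollama_query(oai_client, LLAMA_MODEL_NAME, q)` and returns `.choices[0].message.content`, and the rows could then be passed to `pd.DataFrame(rows)` for inspection.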