# Evaluating Unreleased Models with Ollama

This notebook shows how to evaluate models that are not yet available through hosted API providers by serving them locally with Ollama and running the evaluation against the local server.

We'll use [Nemotron-3-Nano-30B-A3B](https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF) as an example.

## Install Ollama

**macOS:**

In [None]:
!curl -L -o /tmp/Ollama.zip "https://ollama.com/download/Ollama-darwin.zip" && \
    unzip -o /tmp/Ollama.zip -d /Applications/ && \
    rm /tmp/Ollama.zip

**Linux:** `curl -fsSL https://ollama.com/install.sh | sh`

## Start Ollama

In [None]:
!open /Applications/Ollama.app  # macOS only; on Linux the install script normally starts a service, otherwise run: ollama serve
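
Launching the app returns immediately, so the server may not be accepting connections yet when the next cell runs. A minimal sketch of a readiness check follows, assuming Ollama listens on its default address `http://localhost:11434`; the helper names `ollama_is_up` and `wait_for_ollama` are illustrative and not part of any Ollama tooling.

```python
import time
import urllib.error
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def wait_for_ollama(
    base_url: str = "http://localhost:11434",
    attempts: int = 10,
    delay: float = 1.0,
    timeout: float = 2.0,
) -> bool:
    """Poll the server up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if ollama_is_up(base_url, timeout=timeout):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_ollama()` before the download and registration cells gives a clear `False` instead of a confusing failure later in `ollama create`.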

## Download and Register Model

In [None]:
import os

model_dir = "/tmp/nemotron"
os.makedirs(model_dir, exist_ok=True)

# Download the IQ4_XS quantization (18.2 GB) into model_dir.
# -f makes curl exit with an error on an HTTP failure instead of saving the error page.
!curl -fL -o {model_dir}/Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf \
    "https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF/resolve/main/Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf"

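A large download can fail quietly, for example by saving an HTML error page under the `.gguf` name. As a quick sanity check before registering the model, the sketch below tests whether the file starts with the four-byte `GGUF` magic that GGUF files begin with. The helper name `looks_like_gguf` is illustrative, and the check does not validate the rest of the file.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def looks_like_gguf(path: str) -> bool:
    """Return True if the file exists and starts with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC
```

For this notebook the call would be `looks_like_gguf("/tmp/nemotron/Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf")`.
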
In [None]:
# Create Modelfile
modelfile = '''FROM /tmp/nemotron/Nemotron-3-Nano-30B-A3B-IQ4_XS.gguf

TEMPLATE """{{- if .System }}{{ .System }}
{{- end }}
{{- range .Messages }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{ .Content }}<|eot_id|>
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{ .Content }}<|eot_id|>
{{- end }}
{{- end }}
<|start_header_id|>assistant<|end_header_id|>
"""

PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
'''

with open("/tmp/nemotron/Modelfile", "w") as f:
    f.write(modelfile)

In [None]:
!ollama create nemotron-nano -f /tmp/nemotron/Modelfile

In [None]:
!ollama list
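
The same check can be done programmatically. The sketch below assumes the Ollama REST endpoint `GET /api/tags` on the default port, which returns a JSON object with a `models` list whose entries carry a `name` such as `nemotron-nano:latest`. The function names are illustrative.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def is_registered(tags_payload: dict, name: str) -> bool:
    """True if `name` is listed; a name without a tag is treated as ':latest'."""
    wanted = name if ":" in name else f"{name}:latest"
    return wanted in model_names(tags_payload)


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch and parse the list of local models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `is_registered(fetch_tags(), "nemotron-nano")` should be true once `ollama create` has finished.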

## Configure Environment

In [None]:
import os

os.environ["OLLAMA_BASE_URL"] = "http://localhost:11434/v1"
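
The `/v1` suffix points at Ollama's OpenAI-compatible API. To confirm the endpoint answers before starting a long evaluation, a request can be sent to `/chat/completions` under that base URL. This is a minimal sketch using only the standard library. The helper names are illustrative, the response is assumed to follow the OpenAI chat-completions shape, and the model name is passed without the `ollama/` prefix, which is assumed here to be a provider prefix used only on the evaluation side.

```python
import json
import os
import urllib.request
from typing import Optional


def build_chat_request(
    model: str, prompt: str, base_url: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    base = base_url or os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434/v1")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send one user message and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt), timeout=timeout) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

For example, `ask("nemotron-nano", "hi")` should return a short reply once the model is registered and the server is running. The generous timeout allows for the first request loading the model into memory.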

## Load Model

In [None]:
# Warm up the model (keeps it loaded for 60 minutes)
!ollama run nemotron-nano --keepalive 60m "hi"

## Run Evaluation

In [None]:
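MODEL = "ollama/nemotron-nano"  # provider prefix + the name passed to `ollama create`
LIMIT = 1  # samples per task; 1 is a smoke test, raise it for a real evaluation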

In [None]:
!~/.local/bin/uv run inspect eval \
    src/open_telco/teleqna/teleqna.py \
    src/open_telco/telemath/telemath.py \
    src/open_telco/telelogs/telelogs.py \
    src/open_telco/three_gpp/three_gpp.py \
    --model {MODEL} \
    --limit {LIMIT}
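
The cell above relies on IPython's `{MODEL}` and `{LIMIT}` interpolation, which is only available inside a notebook. To launch the same run from a plain Python script, the command can be assembled as an argument list and passed to `subprocess`. This is a sketch under the assumption that `uv` is installed at `~/.local/bin/uv` as in the cell above; `build_eval_command` is an illustrative helper, not part of Inspect.

```python
import os
import subprocess

TASKS = [
    "src/open_telco/teleqna/teleqna.py",
    "src/open_telco/telemath/telemath.py",
    "src/open_telco/telelogs/telelogs.py",
    "src/open_telco/three_gpp/three_gpp.py",
]


def build_eval_command(tasks, model, limit, uv="~/.local/bin/uv"):
    """Return the argv list equivalent to the notebook's `inspect eval` cell."""
    return [
        os.path.expanduser(uv),  # the shell is bypassed, so expand ~ here
        "run", "inspect", "eval",
        *tasks,
        "--model", model,
        "--limit", str(limit),
    ]


def run_eval(tasks, model, limit):
    """Run the evaluation and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_eval_command(tasks, model, limit), check=True)
```

Calling `run_eval(TASKS, "ollama/nemotron-nano", 1)` from the repository root mirrors the notebook cell.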

## View Results

In [None]:
!~/.local/bin/uv run inspect view start --log-dir logs/