# Ollama Environment Setup

This cell sets up the Python environment and connects to your local Ollama server.
It installs required packages, sets model and host variables, and prepares the Ollama client for use.

**Purpose:**
- Ensures all dependencies are installed and variables are set.
- Initializes the Ollama client so you can send prompts to the model.
- Used as a setup step in other tutorial notebooks via `%run 00_Tutorial_How-To.ipynb`.
- Run this cell before using any prompt engineering or model interaction cells.

Refer to the project [README](../readme.md) for instructions on installing the local Ollama server.
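
Once the server is running, the setup cell reads its address from the `OLLAMA_HOST` environment variable. A minimal sketch of resolving that address with a fallback, assuming Ollama's default local address `http://localhost:11434`; `resolve_host` and `DEFAULT_OLLAMA_HOST` are hypothetical names used only for illustration:

```python
import os

# Assumption: Ollama's default local address; adjust if your server listens elsewhere.
DEFAULT_OLLAMA_HOST = "http://localhost:11434"

def resolve_host(env=None):
    """Return OLLAMA_HOST from the given mapping, or the default when unset or blank.

    Hypothetical helper for illustration; not part of the Ollama library.
    """
    env = os.environ if env is None else env
    host = (env.get("OLLAMA_HOST") or "").strip()
    return host or DEFAULT_OLLAMA_HOST

print(resolve_host({}))
print(resolve_host({"OLLAMA_HOST": "http://192.168.1.20:11434"}))
```

Passing a plain dictionary instead of `os.environ` makes the fallback easy to check without touching the real environment.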

---

## Usage Notes & Tips 💡
- This tutorial uses Qwen 2.5 14B Instruct (`qwen2.5:14b-instruct`) with temperature 0. You can change `MODEL_NAME` to any model available in the [Ollama model library](https://ollama.com/search).
- Use `Shift + Enter` to execute the cell and move to the next one.

### The Ollama Python Library
We will be using the [Ollama Python Library](https://github.com/ollama/ollama-python) throughout this tutorial.
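
The library's chat interface takes a list of messages, each a dictionary with `role` and `content` keys. The sketch below builds such a list without contacting a server. `build_messages` is a hypothetical helper introduced here for illustration, and dropping an empty system prompt is a design choice of this sketch rather than something the library requires.

```python
def build_messages(prompt_or_messages, system_prompt=""):
    """Normalize a prompt string or a message list into chat messages.

    Hypothetical helper for illustration; not part of the Ollama library.
    """
    if isinstance(prompt_or_messages, str):
        messages = [{"role": "user", "content": prompt_or_messages}]
    else:
        # Copy so the caller's list is left untouched
        messages = list(prompt_or_messages)
    if system_prompt:
        # Keep exactly one system message, placed first
        messages = [m for m in messages if m.get("role") != "system"]
        messages.insert(0, {"role": "system", "content": system_prompt})
    return messages

single = build_messages("Name three primary colors.", system_prompt="Answer briefly.")
history = build_messages(
    [
        {"role": "system", "content": "Old instructions."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "What can you do?"},
    ],
    system_prompt="Answer briefly.",
)
print([m["role"] for m in single])   # ['system', 'user']
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```

A list produced this way can be passed as the `messages` argument of the client's `chat` method.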

In [None]:
# Install required dependencies
%pip install -U ollama tqdm pickleshare nbformat --quiet
%pip install python-dotenv --quiet

# Load environment variables from .env file
from dotenv import load_dotenv
import os

load_dotenv()  # This loads variables from .env into the environment

# Read OLLAMA_HOST from the environment, falling back to Ollama's default local address
OLLAMA_HOST = os.getenv('OLLAMA_HOST', 'http://localhost:11434')
print(f'Using Ollama host: {OLLAMA_HOST}')

# Set up your model name
MODEL_NAME = 'qwen2.5:14b-instruct'
# Stores the MODEL_NAME variable for use across notebooks within the IPython store
%store MODEL_NAME

# Connect to Ollama Service
from ollama import Client
if 'client' not in globals():
    client = Client(host=OLLAMA_HOST)
    # Creating the client does not contact the server; the first request does
    print(f'Ollama client configured for {OLLAMA_HOST}')
    print(f'Using model: {MODEL_NAME}')
else:
    print("Ollama client already initialized.")

# Helper function to send a prompt or conversation to Ollama and get the response
def get_completion(prompt_or_messages, system_prompt="", max_tokens=2000, temperature=0.0):
    from ollama import Options
    if isinstance(prompt_or_messages, str):
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt_or_messages}
        ]
    else:
        # If system_prompt is provided, ensure it is the first message
        messages = prompt_or_messages
        if system_prompt:
            # Remove any existing system message to avoid duplicates
            messages = [m for m in messages if m.get("role") != "system"]
            messages = [{"role": "system", "content": system_prompt}] + messages
    response = client.chat(
        model=MODEL_NAME,
        # Ollama's option for capping generated tokens is num_predict, not max_tokens
        options=Options(num_predict=max_tokens, temperature=temperature),
        messages=messages
    )
    return response

# Pretty print function for Ollama response objects
def pretty_print_response(response):
    from pprint import pprint  # For pretty-printing fallback
    from datetime import datetime, timezone  # For friendly timestamp formatting
    def ns_to_sec(ns):  # Convert nanoseconds to seconds for readability
        return f"{ns/1e9:.2f} s" if ns is not None else None
    # Print model name used for the response
    print("Model:", getattr(response, 'model', None))
    # Print timestamp when the response was created, formatted for readability
    created_at = getattr(response, 'created_at', None)
    if created_at:
        try:
            # Parse the ISO 8601 timestamp and convert it to UTC so the label is accurate
            dt = datetime.fromisoformat(created_at.replace('Z', '+00:00'))
            print("Created at:", dt.astimezone(timezone.utc).strftime('%Y-%m-%d %H:%M:%S UTC'))
        except Exception:
            # Fall back to the raw value if it cannot be parsed
            print("Created at:", created_at)
    else:
        print("Created at:", created_at)
    # Print whether the response is finished
    print("Done:", getattr(response, 'done', None))
    # Print the reason why the response finished
    print("Done reason:", getattr(response, 'done_reason', None))
    # Print total duration in seconds
    print("Total duration:", ns_to_sec(getattr(response, 'total_duration', None)))
    # Print model load duration in seconds
    print("Load duration:", ns_to_sec(getattr(response, 'load_duration', None)))
    # Print number of prompt tokens evaluated
    print("Prompt eval count:", getattr(response, 'prompt_eval_count', None))
    # Print prompt evaluation duration in seconds
    print("Prompt eval duration:", ns_to_sec(getattr(response, 'prompt_eval_duration', None)))
    # Print number of tokens generated in the response
    print("Eval count:", getattr(response, 'eval_count', None))
    # Print evaluation duration in seconds
    # 'Evaluation' here means the process of generating tokens for the model's output (the response)
    print("Eval duration:", ns_to_sec(getattr(response, 'eval_duration', None)))
    # Print the message content from the assistant
    print("Message:")
    msg = getattr(response, 'message', None)
    if msg:
        # Print the role of the message (e.g., 'assistant')
        print("  Role:", getattr(msg, 'role', None))
        # Print the actual content of the message
        print("  Content:", getattr(msg, 'content', None))
    else:
        # Fallback: pretty-print the whole response object if no message found
        pprint(response)
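
The eval count and eval duration printed above can be combined into a generation throughput figure. A minimal sketch, assuming durations are reported in nanoseconds as the `ns_to_sec` helper treats them; `tokens_per_second` is a hypothetical helper, and the numbers are made-up sample values rather than measured results.

```python
def tokens_per_second(eval_count, eval_duration_ns):
    """Return generated tokens per second, or None if inputs are missing or zero."""
    if not eval_count or not eval_duration_ns:
        return None
    return eval_count / (eval_duration_ns / 1e9)

# Made-up sample values: 120 tokens generated in 3 seconds (3e9 ns)
rate = tokens_per_second(120, 3_000_000_000)
print(f"{rate:.1f} tokens/s")  # 40.0 tokens/s
```

Returning `None` for missing or zero inputs mirrors how the pretty printer tolerates absent fields.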