# uv for package management

```
uv sync
```

# Ollama and Local models

https://ollama.com/

https://ollama.com/library

```
ollama run gemma3
```

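# APIs from frontier model providers

Each provider below exposes an endpoint that follows OpenAI's API format, so the same `openai` client class can talk to all of them by changing `base_url` and the API key.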


In [1]:
import os
from dotenv import load_dotenv
from openai import OpenAI

In [3]:
load_dotenv(override=True)

openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
google_api_key = os.getenv("GOOGLE_API_KEY")
groq_api_key = os.getenv('GROQ_API_KEY')

anthropic_url = "https://api.anthropic.com/v1/"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
groq_url = "https://api.groq.com/openai/v1"

if openai_api_key:
    print(f"✅OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("❌OpenAI API Key not set")

if anthropic_api_key:
    print(f"✅Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("❌Anthropic API Key not set")

if google_api_key:
    print(f"✅Google API Key exists and begins {google_api_key[:8]}")
else:
    print("❌Google API Key not set")

if groq_api_key:
    print(f"✅Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("❌Groq API Key not set")

❌OpenAI API Key not set
❌Anthropic API Key not set
✅Google API Key exists and begins AIzaSyCo
❌Groq API Key not set
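
The four near-identical checks above could be folded into one helper. This is only a sketch: `key_status` and its `visible` parameter are hypothetical names, and the key values below are placeholders, not real credentials.

```python
def key_status(name, key, visible=4):
    """Describe an API key without printing more than its first few characters."""
    if not key:
        return f"{name} API key not set"
    return f"{name} API key exists and begins {key[:visible]}"


# Placeholder values; in the notebook these would come from os.getenv(...).
keys = {"OpenAI": None, "Groq": "gsk-placeholder"}
for name, key in keys.items():
    print(key_status(name, key))
```

Printing only a short prefix keeps most of the secret out of saved notebook output.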


# Basic LLM Call


In [4]:
gemini = OpenAI(base_url=gemini_url, api_key=google_api_key)
ollama = OpenAI(base_url="http://localhost:11434/v1/", api_key="")  # No API key needed for Ollama

# Uncomment the clients whose API keys are set:
# openai = OpenAI(api_key=openai_api_key)
# anthropic = OpenAI(base_url=anthropic_url, api_key=anthropic_api_key)
# groq = OpenAI(base_url=groq_url, api_key=groq_api_key)
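
Because every client above differs only in `base_url` and `api_key`, the settings can live in one table. The sketch below is an assumption about how one might organize this, not part of the original notebook: `PROVIDERS` and `client_kwargs` are hypothetical names, and the URLs and environment variable names are copied from the cells above.

```python
import os

# provider name -> (base URL, environment variable holding the key)
PROVIDERS = {
    "anthropic": ("https://api.anthropic.com/v1/", "ANTHROPIC_API_KEY"),
    "gemini": ("https://generativelanguage.googleapis.com/v1beta/openai/", "GOOGLE_API_KEY"),
    "groq": ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "ollama": ("http://localhost:11434/v1/", None),  # local server, no key
}


def client_kwargs(provider, env=None):
    """Return keyword arguments for OpenAI(...), or None if the key is missing."""
    env = os.environ if env is None else env
    base_url, key_var = PROVIDERS[provider]
    if key_var is None:
        # Mirrors the notebook, which passes an empty key to Ollama.
        return {"base_url": base_url, "api_key": ""}
    api_key = env.get(key_var)
    if not api_key:
        return None
    return {"base_url": base_url, "api_key": api_key}
```

In the notebook this could be used as `kwargs = client_kwargs("gemini")` followed by `OpenAI(**kwargs)` when `kwargs` is not `None`. The default OpenAI endpoint is left out of the table because it needs no `base_url`.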

# System Prompt and User Prompt

| Aspect | System Prompt | User Prompt |
|---|---|---|
| Primary Purpose | Define the model’s role, behavior, boundaries, and global rules | Specify the actual task, question, or request |
| Provided By | System / Developer | End user |
| Scope of Influence | Global (affects the entire conversation) | Local (affects only the current request) |
| Priority | **Higher** (models are generally trained to follow it when user instructions conflict with it) | Lower (expected to comply with the system prompt) |
| Typical Content | Role definition, tone, style, constraints, safety rules | Questions, instructions, data, requirements |
| Visibility to User | Usually hidden from the user | Visible to the user |
| Usage Frequency | Usually set once at the start of a conversation | Can change every turn |
| Example | “You are a strict Python tutor. Respond in clear, concise English.” | “Explain Python generators with examples.” |
| Design Focus | Consistency, safety, and behavior control | Clarity and task specificity |

In [14]:
system_prompt = "You are a helpful assistant. Explain in 1 sentence."
question = "What is it like to be a martian?"

response = ollama.chat.completions.create(
    model="gemma3",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ],
)

# response  # the full ChatCompletion object
response.choices[0].message.content  # just the reply text

'As a hypothetical Martian, you’d likely experience a life under a reddish sky, with a much weaker gravity, colder temperatures, and a thin atmosphere, all while navigating a landscape sculpted by ancient volcanoes and vast canyons.'

In [None]:
system_prompt = "You are a sad scientist. Answer in 1 sentence."

response = gemini.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is it like to be a martian?"},
    ],
)
response  # the full ChatCompletion object
# response.choices[0].message.content  # just the reply text

ChatCompletion(id='wmJTaf_jH4Pdvr0Plq2V4Ao', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='One would simply exist, alone, on a planet where life never truly blossomed, a silent testament to cosmic indifference.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))], created=1767072450, model='gemini-2.5-flash', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=23, prompt_tokens=24, total_tokens=375, completion_tokens_details=None, prompt_tokens_details=None))
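
The raw object above is hard to read, so here is a small sketch of a helper that pulls out the fields most often needed. `summarize` is a hypothetical name. The attribute paths (`choices[0].message.content`, `finish_reason`, `usage.*`) are the ones visible in the printed output, and the `SimpleNamespace` object is only an offline stand-in so the example runs without a network call.

```python
from types import SimpleNamespace


def summarize(response):
    """Pull the commonly needed fields out of a ChatCompletion-shaped object."""
    choice = response.choices[0]
    usage = response.usage
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Offline stand-in with the same attribute layout as the output above.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="placeholder reply text"),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=24, completion_tokens=23, total_tokens=375),
)

summary = summarize(fake_response)
print(summary["text"], summary["total_tokens"])
```

In the recorded output, `total_tokens` (375) is larger than `prompt_tokens + completion_tokens` (47). The response itself does not explain the gap; tokens spent on internal reasoning are one plausible cause, but that is an assumption.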


# LLM statelessness and the illusion of memory

In [15]:
def chat_with_ollama(user_input):
    response = ollama.chat.completions.create(
        model = 'gemma3',
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input}
        ]
    )
    return response.choices[0].message.content

In [17]:
response1 = chat_with_ollama("Hi, I am Adam")
print(response1)

response2 = chat_with_ollama("What is my name?")
print(response2)

Hello Adam, I’m here to assist you with information and tasks – just let me know what you need!
I do not know your name, as I have no way of knowing personal information about you. 😊
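
The second answer follows from what each request actually contains. The sketch below rebuilds the two message lists that `chat_with_ollama` would send, without calling any model; `payload_for` is a hypothetical helper introduced only for this illustration.

```python
system_prompt = "You are a helpful assistant. Explain in 1 sentence."


def payload_for(user_input):
    """Return the messages list chat_with_ollama would send for this input."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]


first = payload_for("Hi, I am Adam")
second = payload_for("What is my name?")

# Nothing from the first request is carried into the second one.
mentions_adam = any("Adam" in m["content"] for m in second)
print(mentions_adam)
```

Each call is independent, so the model never sees the name unless the client sends it again.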


## LLM chat that seems to have memory

The model itself keeps no memory between calls. The client simply resends the whole conversation history with every request.

In [20]:
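def chat_with_history(message, history):
    # Keep only the role and content of each earlier turn.
    history = [{"role":h["role"], "content":h["content"]} for h in history]
    # Every request carries the system prompt, the earlier turns, and the new user message.
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    response = ollama.chat.completions.create(model='gemma3', messages=messages)
    # Note: history is returned as received; the new message and reply are not appended to it.
    return response.choices[0].message.content, history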


In [21]:
start_conversation = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Hello, how are you?, my name is Adam."},
]

reply1, history = chat_with_history("What is my name?", start_conversation)
print(reply1)

reply2, history = chat_with_history("I live in Taiwan", history)

print(reply2)

reply3, history = chat_with_history("Where do I live?", history)
print(reply3)

Your name is Adam!
It’s lovely to meet you, Adam from Taiwan – I’m doing well and ready to assist you with anything you need!
As a helpful AI assistant, I don’t have personal experiences or a location, but I’m ready to assist you – and it’s nice to meet you, Adam! To tell you where you live, I need you to tell me! 😊
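
The third reply fails because `chat_with_history` returns the history it was given: "I live in Taiwan" and the model's answer are never appended, so the next request does not contain them. The starting history also includes a system message, which the function then prepends a second time. Below is a minimal sketch of one way to fix both issues. The `send` parameter and `fake_send` are hypothetical: `send` is any callable that takes a messages list and returns the reply text, and `fake_send` is an offline stand-in so the example runs without a model.

```python
def chat_with_history(message, history, send, system_prompt="You are a helpful assistant."):
    """Send `message` plus all prior turns; return the reply and the updated history."""
    # Keep only role/content, and drop system entries so the prompt is not duplicated.
    turns = [
        {"role": h["role"], "content": h["content"]}
        for h in history
        if h["role"] != "system"
    ]
    turns.append({"role": "user", "content": message})
    reply = send([{"role": "system", "content": system_prompt}] + turns)
    # Record the assistant's answer so the next call can see it.
    turns.append({"role": "assistant", "content": reply})
    return reply, turns


def fake_send(messages):
    """Offline stand-in for the model: report how many user turns it was shown."""
    users = [m for m in messages if m["role"] == "user"]
    return f"I can see {len(users)} user message(s)."


history = [{"role": "user", "content": "Hello, how are you?, my name is Adam."}]
reply1, history = chat_with_history("What is my name?", history, fake_send)
reply2, history = chat_with_history("I live in Taiwan", history, fake_send)
reply3, history = chat_with_history("Where do I live?", history, fake_send)
print(reply3)
```

With the notebook's client, `send` could be a small wrapper such as `lambda messages: ollama.chat.completions.create(model="gemma3", messages=messages).choices[0].message.content`. The history grows by two entries per turn, so long conversations send more tokens with every request.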
