# uv for package management

```
uv sync
```

# Ollama and Local models

https://ollama.com/

https://ollama.com/library

```
ollama run gemma3
```

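# APIs from frontier model providers

Each provider below exposes an endpoint that follows OpenAI's API format, so the same `openai` client works for all of them; only the base URL and API key change.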


In [None]:
import os
from dotenv import load_dotenv
from openai import OpenAI

In [None]:
load_dotenv(override=True)

openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
google_api_key = os.getenv("GOOGLE_API_KEY")
groq_api_key = os.getenv('GROQ_API_KEY')

anthropic_url = "https://api.anthropic.com/v1/"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
groq_url = "https://api.groq.com/openai/v1"

if openai_api_key:
    print(f"✅OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("❌OpenAI API Key not set")

if anthropic_api_key:
    print(f"✅Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("❌Anthropic API Key not set")

if google_api_key:
    print(f"✅Google API Key exists and begins {google_api_key[:8]}")
else:
    print("❌Google API Key not set")

if groq_api_key:
    print(f"✅Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("❌Groq API Key not set")

✅OpenAI API Key exists and begins sk-proj-
✅Anthropic API Key exists and begins sk-ant-
✅Google API Key exists and begins AIzaSyAF
❌Groq API Key not set (and this is optional)
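
The four checks above share one pattern, so they can also be written as a loop over a small table. This is a minimal sketch, not part of the original notebook: `KEY_SPECS` and `describe_keys` are hypothetical names, and the optional `env` argument exists only so the function can be tried without real keys.

```python
import os

# Provider label -> (environment variable name, number of leading characters to show).
# The variable names and prefix lengths mirror the cell above.
KEY_SPECS = {
    "OpenAI": ("OPENAI_API_KEY", 8),
    "Anthropic": ("ANTHROPIC_API_KEY", 7),
    "Google": ("GOOGLE_API_KEY", 8),
    "Groq": ("GROQ_API_KEY", 4),
}


def describe_keys(env=None):
    """Return one status line per provider; `env` defaults to os.environ."""
    env = os.environ if env is None else env
    lines = []
    for provider, (var_name, prefix_len) in KEY_SPECS.items():
        key = env.get(var_name)
        if key:
            # Show only a short prefix so the full secret never reaches the output
            lines.append(f"✅{provider} API Key exists and begins {key[:prefix_len]}")
        else:
            lines.append(f"❌{provider} API Key not set")
    return lines


for line in describe_keys():
    print(line)
```

Adding a provider then means adding one row to the table instead of another `if`/`else` block.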


# Basic LLM Call


In [None]:
gemini = OpenAI(base_url=gemini_url, api_key=google_api_key)
openai = OpenAI(api_key=openai_api_key)
anthropic = OpenAI(base_url=anthropic_url, api_key=anthropic_api_key)
groq = OpenAI(base_url=groq_url, api_key=groq_api_key)  
ollama = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")  # Ollama ignores the key; its docs pass a placeholder such as "ollama"

# System Prompt and User Prompt

| Aspect | System Prompt | User Prompt |
|---|---|---|
| Primary Purpose | Define the model’s role, behavior, boundaries, and global rules | Specify the actual task, question, or request |
| Provided By | System / Developer | End user |
| Scope of Influence | Global (affects the entire conversation) | Local (affects only the current request) |
| Priority | **Higher** (models are generally trained to favor it when instructions conflict) | Lower (expected to stay within the system prompt's rules) |
| Typical Content | Role definition, tone, style, constraints, safety rules | Questions, instructions, data, requirements |
| Visibility to User | Usually hidden from the user | Visible to the user |
| Usage Frequency | Usually set once at the start of a conversation | Can change every turn |
| Example | “You are a strict Python tutor. Respond in clear, concise English.” | “Explain Python generators with examples.” |
| Design Focus | Consistency, safety, and behavior control | Clarity and task specificity |
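
In the OpenAI chat format, the two prompt types in the table are simply entries in the `messages` list, distinguished by their `role`. A minimal sketch using the example prompts from the table (`build_messages` is a hypothetical helper, not part of any library):

```python
def build_messages(system_prompt, user_prompt):
    """Return the message list for a single-turn chat completion request."""
    return [
        {"role": "system", "content": system_prompt},  # global rules, usually set once
        {"role": "user", "content": user_prompt},      # the actual request for this turn
    ]


messages = build_messages(
    "You are a strict Python tutor. Respond in clear, concise English.",
    "Explain Python generators with examples.",
)
for m in messages:
    print(f"{m['role']:>6}: {m['content']}")
```

The resulting list is what gets passed as the `messages` argument of `chat.completions.create`.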

In [None]:
system_prompt = "You are a helpful assistant."

response = ollama.chat.completions.create(
        model = 'gemma3',
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "What is it like to be a martian?"}
        ]
)

response # Output
#response.choices[0].message.content  # What we want

ChatCompletion(id='chatcmpl-807', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Okay, let’s imagine what it might be like to be a Martian! This is, of course, entirely speculative, as we haven't actually encountered any Martians yet. We’re basing this on what we know about Mars – its environment, geology, and what scientists theorize about potential evolution – and trying to build a plausible picture. \n\nHere’s a breakdown, broken down into different aspects of a Martian life:\n\n**1. The Physical Reality - A Harsh, Beautiful World:**\n\n* **Low Gravity:** You’d feel incredibly light. Jumping would be ridiculously high, and movement would be bouncy and floaty. It would take time to get used to, and probably impact bone density over generations.\n* **Thin Atmosphere:** The air would be incredibly thin – around 1% of Earth’s.  You’d need a pressurized suit *constantly* outside. The pressure is too low for our bodie
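
The raw `ChatCompletion` object is verbose; the reply text sits at `choices[0].message.content`, next to a `finish_reason`. The sketch below pulls out both. To stay runnable without a model, it uses a hand-built stand-in with the same attribute layout as the output above; `extract_reply` and `fake_response` are hypothetical names for illustration only.

```python
from types import SimpleNamespace


def extract_reply(response):
    """Return (text, finish_reason) from the first choice of a chat completion."""
    choice = response.choices[0]
    return choice.message.content, choice.finish_reason


# Stand-in object mimicking the attributes of the ChatCompletion printed above
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            index=0,
            message=SimpleNamespace(role="assistant", content="Hello from Mars!"),
        )
    ]
)

text, reason = extract_reply(fake_response)
print(text)
print(reason)
```

With a real response from the cell above, `extract_reply(response)` should work the same way, since it only relies on those attributes.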

# LLM statelessness and the illusion of memory

In [None]:
def chat_with_ollama(user_input):
    response = ollama.chat.completions.create(
        model = 'gemma3',
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input}
        ]
    )
    return response.choices[0].message.content

In [None]:
response1 = chat_with_ollama("Hi, I am Adam")
print(response1)

response2 = chat_with_ollama("What is my name?")
print(response2)

Hi Adam! It's nice to meet you. 😊 

What can I help you with today? Do you have a question, need some information, or just want to chat?
As an AI, I have no way of knowing your name! I don't have access to personal information. 😊 

You'll have to tell me your name! 😄 

What would you like to talk about?


## LLM chat that seems to have memory

The model itself still keeps no state. On every turn we resend the entire conversation history together with the new message.

In [None]:
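def chat(message, history):
    # Keep only the fields the API expects
    history = [{"role": h["role"], "content": h["content"]} for h in history]
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    response = ollama.chat.completions.create(model='gemma3', messages=messages)
    reply = response.choices[0].message.content
    # Record this turn so the next call can resend it; otherwise it would be forgotten
    history = history + [{"role": "user", "content": message}, {"role": "assistant", "content": reply}]
    return reply, history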


In [None]:
# chat() already prepends the system prompt, so the starting history holds only the user turn
start_conversation = [
    {"role": "user", "content": "Hello, how are you? My name is Adam."},
]

reply1, history = chat("What is my name?", start_conversation)
print(reply1)

reply2, history = chat("I live in Taiwan", history)

print(reply2)

reply3, history = chat("Where do I live?", history)
print(reply3)

Hello Adam! I’m doing well, thank you for asking. 

Your name is Adam! 😊 

It's nice to meet you.
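
The history-resending pattern can also be tried without a running model. In this sketch, `fake_model` is a hypothetical stand-in for the LLM that only reports how many user messages it was shown, which makes it visible that "memory" is just the history we pass back in. `make_chat` and `stub_chat` are likewise illustrative names.

```python
def make_chat(send, system_prompt="You are a helpful assistant."):
    """Build a chat(message, history) function that resends the full history each call.

    `send` is any callable that takes a list of messages and returns the reply text.
    """
    def chat(message, history):
        messages = (
            [{"role": "system", "content": system_prompt}]
            + history
            + [{"role": "user", "content": message}]
        )
        reply = send(messages)
        # Return a new history that includes this turn; the caller must pass it back in
        new_history = history + [
            {"role": "user", "content": message},
            {"role": "assistant", "content": reply},
        ]
        return reply, new_history

    return chat


def fake_model(messages):
    """Hypothetical stand-in for an LLM: counts the user messages it receives."""
    user_turns = [m for m in messages if m["role"] == "user"]
    return f"I can see {len(user_turns)} user message(s)."


stub_chat = make_chat(fake_model)
reply1, history = stub_chat("Hi, I am Adam", [])
reply2, history = stub_chat("What is my name?", history)
print(reply1)
print(reply2)
print(len(history))
```

Passing `[]` instead of `history` on the second call would make the stub see a single user message again, which is the stateless behavior shown in the previous section.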


# Build a simple AI app

Gradio serves as the frontend.
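
A minimal sketch of what such an app could look like, under several assumptions: `gradio` and `openai` are installed, Ollama is serving `gemma3` locally, and the installed Gradio version accepts `type="messages"` on `gr.ChatInterface` so that history arrives as role/content dictionaries. `messages_for_turn` and `launch_app` are hypothetical names.

```python
SYSTEM_PROMPT = "You are a helpful assistant."


def messages_for_turn(message, history):
    """Combine the system prompt, Gradio's chat history, and the new user message."""
    # Gradio may attach extra fields to each entry; keep only what the API expects
    past = [{"role": h["role"], "content": h["content"]} for h in history]
    return (
        [{"role": "system", "content": SYSTEM_PROMPT}]
        + past
        + [{"role": "user", "content": message}]
    )


def launch_app():
    """Start a local chat UI backed by Ollama (assumes gradio and openai are installed)."""
    # Imported here so the helper above stays usable without these packages
    import gradio as gr
    from openai import OpenAI

    # Placeholder key: Ollama does not check it
    ollama = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

    def respond(message, history):
        response = ollama.chat.completions.create(
            model="gemma3", messages=messages_for_turn(message, history)
        )
        return response.choices[0].message.content

    # Assumption: this Gradio version supports type="messages"
    gr.ChatInterface(fn=respond, type="messages").launch()
```

Calling `launch_app()` should open a local web UI. Gradio keeps the history and passes it to `respond` on every turn, so the function returns only the reply text.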