# End of Week 1 Exercise

A technical Q&A tool built on top of OpenAI's Chat Completions API.

**What this notebook demonstrates:**

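- **Stateful conversation**: a `Chat` class that accumulates message history, so the model has full context on every turn
- **Streaming responses**: output printed chunk by chunk as it arrives, then rendered as Markdown
- **Multi-model comparison**: the same question sent to GPT-4o-mini, Llama 3.2 (via Ollama), and Gemini (via OpenRouter)
- **Conversation utilities**: history inspection and a reset to start fresh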

In [None]:
# imports

import os
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI

In [None]:
# constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [None]:
# set up environment

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key. Please visit the troubleshooting notebook!")

openai = OpenAI()

## Helper

In [None]:
def display_md(text):
    """Render text as Markdown in the notebook."""
    display(Markdown(text))

## Chat Class — Stateful Conversation Wrapper

The OpenAI Chat Completions API is stateless — it has no memory of previous calls.  
This `Chat` class simulates a multi-turn conversation by maintaining a `history` list  
that grows with each exchange. Every API call sends the full history so the model  
can reference earlier context, just like a real conversation.
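
As a minimal sketch of what "sending the full history" means, the snippet below assembles the message list for two consecutive turns without calling any API. The helper name `build_messages` and the placeholder assistant reply are illustrative assumptions, not part of the class that follows.

```python
def build_messages(system, history, question):
    """Return the full message list for the next request."""
    return (
        [{"role": "system", "content": system}]
        + history
        + [{"role": "user", "content": question}]
    )

system = "You are a helpful technical assistant."
history = []

# Turn 1: only the system prompt and the new question are sent.
first = build_messages(system, history, "What is a generator?")
history.append(first[-1])
history.append({"role": "assistant", "content": "(reply to turn 1)"})  # placeholder reply

# Turn 2: the earlier question and reply travel with the new question.
second = build_messages(system, history, "Show me an example.")

print(len(first), len(second))  # prints: 2 4
```

The request grows by two messages per completed turn, which is why long conversations cost more tokens on every call.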

In [None]:
class Chat:
    """Stateful chat wrapper around OpenAI's Chat Completions API."""

    def __init__(self, client, system="You are a helpful technical assistant.", model=MODEL_GPT):
        self.client = client
        self.model = model
        self.system = system
        self.history = []  # stores user + assistant messages

    def ask(self, question, show=True):
        """Send a question and get a complete response."""
        # Build the request without mutating history, so a failed call
        # does not leave a dangling user message behind.
        user_msg = {"role": "user", "content": question}
        messages = [{"role": "system", "content": self.system}] + self.history + [user_msg]
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages
        )
        reply = response.choices[0].message.content
        self.history.append(user_msg)
        self.history.append({"role": "assistant", "content": reply})
        if show:
            display_md(reply)
        return reply

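    def ask_stream(self, question):
        """Send a question, stream the response chunk by chunk, then render it as Markdown."""
        # Build the request without mutating history, so a failed call
        # does not leave a dangling user message behind.
        user_msg = {"role": "user", "content": question}
        messages = [{"role": "system", "content": self.system}] + self.history + [user_msg]
        stream = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True
        )
        chunks = []
        for chunk in stream:
            if not chunk.choices:  # some chunks carry no choices (e.g. usage-only chunks)
                continue
            delta = chunk.choices[0].delta.content
            if delta:
                print(delta, end='', flush=True)
                chunks.append(delta)
        print()
        reply = ''.join(chunks)
        self.history.append(user_msg)
        self.history.append({"role": "assistant", "content": reply})
        display_md(reply)  # re-render the raw streamed text as formatted Markdown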

    def show_history(self):
        """Display the full conversation history."""
        display_md(f'**[SYSTEM]**\n\n{self.system}\n\n---')
        for msg in self.history:
            role = msg["role"].upper()
            display_md(f'**[{role}]**\n\n{msg["content"]}\n\n---')

    def reset(self):
        """Clear conversation history to start fresh."""
        self.history = []

## Ask GPT-4o-mini (Streaming)
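
The streaming call in this section relies on each chunk exposing its text at `chunk.choices[0].delta.content`. The sketch below imitates that shape with hand-built objects so the accumulation loop can be tried offline; `fake_chunk` and `collect_stream` are hypothetical helpers written for this illustration, not part of the OpenAI SDK.

```python
from types import SimpleNamespace

def fake_chunk(text):
    """Build an object shaped like a streamed chunk: chunk.choices[0].delta.content."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def collect_stream(stream):
    """Print each text fragment as it arrives and return the joined reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        text = chunk.choices[0].delta.content
        if text:  # content can be None or empty, e.g. on the final chunk
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)

# A stand-in for the iterator returned by create(..., stream=True)
fake_stream = [
    fake_chunk("Hello"),
    fake_chunk(", "),
    fake_chunk("world"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
]
reply = collect_stream(fake_stream)
```

Collecting the fragments in a list and joining once at the end keeps the full reply available for the conversation history.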

In [None]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [None]:
gpt_chat = Chat(openai, model=MODEL_GPT)
gpt_chat.ask_stream(question)
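
For reference when judging the models' answers, the line in the question can be run inside a small generator function. This is only a sketch: the function name `unique_authors` and the sample `books` data are made up for illustration.

```python
def unique_authors(books):
    # The set comprehension drops duplicates and skips books whose
    # "author" is missing or falsy; `yield from` then yields each element.
    yield from {book.get("author") for book in books if book.get("author")}

# Hypothetical sample data
books = [
    {"title": "Emma", "author": "Jane Austen"},
    {"title": "Persuasion", "author": "Jane Austen"},
    {"title": "Untitled"},
    {"title": "Dracula", "author": "Bram Stoker"},
    {"title": "Anonymous", "author": ""},
]

# Sets are unordered, so sort for a stable result.
authors = sorted(unique_authors(books))
print(authors)  # prints: ['Bram Stoker', 'Jane Austen']
```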

## Ask Llama 3.2 (via Ollama)

Ollama exposes an OpenAI-compatible API locally, so we can reuse the same `Chat` class  
with a different client and model — no code changes needed.
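
The reuse works because `Chat` only touches `client.chat.completions.create(...)` and reads `response.choices[0].message.content`. As a rough illustration, any object with that shape will do, including the offline stand-in below. `EchoClient`, `EchoCompletions`, and `ask_once` are hypothetical names for this sketch and do not exist in the OpenAI SDK or Ollama.

```python
from types import SimpleNamespace

class EchoCompletions:
    """Hypothetical stand-in for `client.chat.completions`."""

    def create(self, model, messages, **kwargs):
        # Echo the last message back, wrapped in the same shape a real response has.
        last = messages[-1]["content"]
        message = SimpleNamespace(content=f"[{model}] you said: {last}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

class EchoClient:
    """Exposes the one attribute path the notebook's Chat class relies on."""

    def __init__(self):
        self.chat = SimpleNamespace(completions=EchoCompletions())

def ask_once(client, model, question):
    """Single-turn request written against the shared client interface."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

reply = ask_once(EchoClient(), "fake-model", "hello")
print(reply)  # prints: [fake-model] you said: hello
```

A stand-in like this can also be handy for exercising the history logic without spending tokens or running a local model.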

In [None]:
ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
llama_chat = Chat(ollama_client, model=MODEL_LLAMA)
_ = llama_chat.ask(question)  # ask() already renders the reply; assigning avoids echoing the raw string too

## View History

In [None]:
gpt_chat.show_history()

## Reset

In [None]:
gpt_chat.reset()

## Ask Gemini (via OpenRouter)

OpenRouter also provides an OpenAI-compatible endpoint, so the same `Chat` class  
works seamlessly with Gemini — just swap the client and model string.
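
Since only the base URL and the API key differ between the three backends used here, one way to organise them is a small lookup table. The sketch below builds the keyword arguments for `OpenAI(...)` without creating a client; `PROVIDERS` and `client_kwargs` are hypothetical names, and the URLs and environment variable names are the ones used in this notebook.

```python
import os

# Per-provider settings; a base_url of None means the SDK default (OpenAI itself).
PROVIDERS = {
    "openai": {"base_url": None, "key_env": "OPENAI_API_KEY"},
    "ollama": {"base_url": "http://localhost:11434/v1", "key_env": None},
    "openrouter": {"base_url": "https://openrouter.ai/api/v1", "key_env": "OPENROUTER_API_KEY"},
}

def client_kwargs(provider, env=os.environ):
    """Return keyword arguments suitable for OpenAI(**kwargs)."""
    cfg = PROVIDERS[provider]
    if cfg["key_env"] is None:
        api_key = "ollama"  # placeholder; a local Ollama server does not check it
    else:
        api_key = env.get(cfg["key_env"])
        if not api_key:
            raise ValueError(f"{cfg['key_env']} is not set")
    kwargs = {"api_key": api_key}
    if cfg["base_url"]:
        kwargs["base_url"] = cfg["base_url"]
    return kwargs

print(client_kwargs("ollama"))
```

With this in place, a client could be created as `OpenAI(**client_kwargs("openrouter"))`, and a missing key fails early with a clear message instead of at the first request.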

In [None]:
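openrouter_api_key = os.getenv('OPENROUTER_API_KEY')
GEMINI_MODEL = 'google/gemini-3-flash-preview'

if not openrouter_api_key:
    print("OPENROUTER_API_KEY is not set; requests to OpenRouter will likely fail.")

gemini_client = OpenAI(
    api_key=openrouter_api_key,
    base_url="https://openrouter.ai/api/v1/"
)
# Creating the client makes no network call, so the key is not validated yet.
print("Gemini client created.")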

gemini_chat = Chat(gemini_client, model=GEMINI_MODEL)

In [None]:
_ = gemini_chat.ask(question)  # ask() already renders the reply; assigning avoids echoing the raw string too