# Part 1: Building the MVP AI System  

In this first part, we’ll build a **zero-shot classification system** using a local **LLM-powered approach**. Instead of training a model from scratch, we’ll leverage **Gemma 2B**’s in-context learning to classify sports teams as **from the US or Australia**—without any fine-tuning.  

But an AI system is **more than just a model**. We'll also integrate:  
- **Gradio** to build an interactive front end.  
- **SQLite** to store data and results.  
- **Datasette** for observability, allowing us to inspect predictions and iterate effectively.  

By the end of this section, you’ll have a **working MVP AI system**—a functional app with a front end, database, and structured observability to track and refine performance.  


## Getting Started

## Why Gemma 2B and Ollama?

### 🔹 Gemma 2B: A Small but Mighty LLM  

![Gemma 2 2B](img/gemma2-2B.png)

[Gemma](https://ai.google.dev/gemma) is a family of open-weight models from Google DeepMind, designed for efficiency and strong reasoning capabilities.  
- **2B parameters**: Small enough to run locally but powerful enough for real tasks.  
- **Supports in-context learning**: Gemma can follow instructions and examples supplied in the prompt, with no additional training.  
- **Works well for zero-shot classification**: Given only a task description in the prompt (no labelled examples), it can assign inputs to classes without fine-tuning.  
- **Fast and cost-effective**: Unlike larger models, Gemma 2B runs efficiently on consumer hardware.  

### 🚀 Ollama: A Game Changer for Local LLMs  

![Ollama](img/ollama.svg)

[Ollama](https://ollama.com/) makes running **LLMs locally** seamless, without complex setup.  
- **Pre-configured model execution**: No need to manually set up dependencies.  
- **Efficient GPU/CPU inference**: Optimized for running on local machines.  
- **Fast iteration loop**: Load a model once, then run queries without excessive overhead.  

By combining **Gemma 2B** with **Ollama**, we get a **lightweight, fast, and cost-free AI system** that can perform real-world classification tasks directly on our machines.  

```python
from ollama import ChatResponse, chat

model = 'gemma2:2b'

def single_turn(prompt):
    """Send a single user message to the model and return the reply text."""
    response: ChatResponse = chat(model=model, messages=[
        {
            'role': 'user',
            'content': prompt,
        },
    ])
    return response['message']['content']

prompt = "Say hello to the class"
single_turn(prompt)
```

```
'Hello everyone! 👋  😊 \n\nIs there anything I can help you with today? \n'
```

Let's try our zero-shot classification task!

```python
afl_team = "Carlton Blues"
american_team = "Tennessee Titans"
```

```python
prompt = f"Output if this is an australian or american team, only print australian or american no other output: {afl_team}"
single_turn(prompt)
```

```
'Australian \n'
```

```python
prompt = f"Output if this is an australian or american team, only print australian or american no other output: {american_team}"
single_turn(prompt)
```

```
'American \n'
```
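The raw replies above carry extra capitalisation and whitespace (`'Australian \n'`), so it can help to normalise them before storing or comparing labels. The following is a minimal sketch, not part of the original notebook: `PROMPT_TEMPLATE`, `build_prompt`, and `normalize_label` are hypothetical helper names, and the fallback label `"unknown"` is an assumption about how unexpected replies might be handled.

```python
# Hypothetical helpers (not from the original notebook).
PROMPT_TEMPLATE = (
    "Output if this is an australian or american team, "
    "only print australian or american no other output: {team}"
)

VALID_LABELS = ("australian", "american")


def build_prompt(team: str) -> str:
    """Fill the team name into the zero-shot prompt used above."""
    return PROMPT_TEMPLATE.format(team=team)


def normalize_label(raw: str) -> str:
    """Strip whitespace and case from a raw reply; anything else is 'unknown'."""
    cleaned = raw.strip().lower().rstrip(".")
    return cleaned if cleaned in VALID_LABELS else "unknown"
```

With these in place, a call such as `normalize_label(single_turn(build_prompt(afl_team)))` would yield a clean label, assuming the model keeps to the one-word answer the prompt asks for.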

## Creating our app

## 🏗️ Creating Our Gradio App  

![Gradio](img/gradio.png)

Before we dive into the code, let's talk about **Gradio**—one of the easiest ways to spin up interactive front-ends for AI applications.  

🚀 **Why Gradio?**  
- **Super fast MVP development**: Build an interactive AI demo in just a few lines of code.  
- **No frontend experience required**: Just define a Python function, and Gradio handles the UI.  
- **Part of the 🤗 Hugging Face ecosystem**: Seamlessly integrates with **models, Spaces, and APIs**.  
- **Great for rapid prototyping**: Test AI models with real users before scaling up.  

We'll use **Gradio** to build an interactive app that lets users test **Gemma 2B** for zero-shot classification—without needing a separate web framework. Let's get started! 🚀  


For instructional purposes, we've included the code below, but we'll be running our apps from the command line:

```python
import gradio as gr
import ollama

# Same Ollama model tag as in the notebook cells above
model = 'gemma2:2b'

def chat_with_model(prompt):
    """Send the user's message to the model via Ollama and return the reply text."""
    response = ollama.chat(model=model, messages=[{'role': 'user', 'content': prompt}])
    return response['message']['content']

# A single text box in, plain text out
iface = gr.Interface(
    fn=chat_with_model,
    inputs=gr.Textbox(lines=2, placeholder="Type your message here..."),
    outputs="text",
    title="Chat with Gemma",
    description="Enter a message and get a response from the Gemma 2B model.",
)

iface.launch()
```

### 📝 What's Happening in This Code?  

- 🔄 **Imports Gradio & Ollama** – We bring in the tools we need to build the UI and interact with the model.  
- 🧠 **Defines the model** – We're using **Gemma 2B** (`gemma2:2b`) to power the chatbot.  
- 💬 **Creates a function (`chat_with_model`)** – Sends user input to the model via **Ollama** and returns a response.  
- 🎨 **Builds the Gradio UI (`iface`)** –  
  - **📩 Input**: A text box for user messages.  
  - **🖥️ Output**: The model's response.  
  - **🎭 Title & Description**: A simple interface for chatting with Gemma.  
- 🚀 **Launches the app!** – Runs the interactive chatbot in your browser.  

Now, let’s fire it up and start chatting! 🔥  

## Adding observability with SQLite and Datasette

## 📊 Why Tracing & Observability Matter

Building an AI system isn’t just about **getting a response**—it’s about **understanding and improving how your model behaves over time**.  
- **👀 Observability** helps us track inputs, outputs, and model decisions, making debugging and iteration easier.  
- **🐛 Tracing conversations** lets us spot patterns, catch failure cases, and fine-tune our system for better performance.  
- **📈 Data-Driven Decisions**: Instead of guessing if the model is working well, we can use **real logged interactions** to refine prompts, improve accuracy, and compare models.  

## 🗄️ Why SQLite? A No-Brainer for MVPs  

![SQLite](img/sqlite.png)

For **early-stage apps**, **SQLite** is an **ideal** choice for logging and observability:  
- **🛠️ No Setup Hassle** – It’s a self-contained, file-based database. No server required.  
- **⚡ Fast & Lightweight** – Handles reads and writes efficiently without extra overhead.  
- **📦 Portable & Easy to Share** – Just a single file (`.db`) that works across different environments.  
- **🔗 Overwhelmingly Popular** – Used in everything from **mobile apps (iOS, Android)** to **browsers (Chrome, Firefox)** and even **aircraft flight software**!  

### 🚀 Future Scaling  
Right now, **SQLite is perfect** for logging and inspecting model interactions. Later, if we move to **multi-user or production-scale apps**, we can switch to **PostgreSQL, MySQL, or cloud-based solutions**—but for now, SQLite keeps things simple and effective.  

---

Next, we’ll **log our prompts and responses** so we can start analyzing how our system is performing! 🔍

As above, we've included the code below for reference, but we'll be running our apps from the command line:

```python
import gradio as gr
import ollama
import sqlite3
import datetime

# SQLite Database Setup
DB_PATH = "chat_log.db"

def setup_database():
    """Create a simple SQLite table if it doesn't exist."""
    conn = sqlite3.connect(DB_PATH)
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS chat_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            prompt TEXT,
            response TEXT,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    conn.close()

setup_database()  # Ensure the DB is set up before running the app

def chat_with_model(prompt):
    """Send user input to Ollama, get response, and log to SQLite."""
    response = ollama.chat(
        model="gemma2:2b",
        messages=[{"role": "user", "content": prompt}],
    )["message"]["content"]
    
    # Log the interaction to SQLite
    conn = sqlite3.connect(DB_PATH)
    cursor = conn.cursor()
    cursor.execute("INSERT INTO chat_history (prompt, response) VALUES (?, ?)", (prompt, response))
    conn.commit()
    conn.close()

    return response

# Gradio UI
iface = gr.Interface(
    fn=chat_with_model,
    inputs=gr.Textbox(lines=2, placeholder="Type your message here..."),
    outputs="text",
    title="Chat with Gemma",
    description="Enter a message and get a response from the Gemma 2B model. Your chats are logged in SQLite.",
)

iface.launch()
```

### 📝 What's Happening in This Code?  

- 📦 **Imports required libraries** –  
  - `gradio` for the UI  
  - `ollama` for running **Gemma 2B**  
  - `sqlite3` for logging interactions  
  - `datetime` (imported but not used here; timestamps come from SQLite's `CURRENT_TIMESTAMP` default)  

- 🗄️ **Sets up a SQLite database (`chat_log.db`)** –  
  - Creates a **`chat_history`** table (if it doesn’t exist)  
  - Stores **`prompt`**, **`response`**, and **timestamp** for each chat  

- 💬 **Defines `chat_with_model(prompt)`** –  
  - Sends user input to **Ollama (Gemma 2B)**  
  - Logs the chat to **SQLite**  
  - Returns the model’s response  

- 🎨 **Creates a Gradio UI (`iface`)** –  
  - **📝 Input:** A text box for user queries  
  - **🖥️ Output:** The model’s response  
  - **📜 Description:** Informs users that chats are logged  

- 🚀 **Launches the app!** – Runs a browser-based chatbot with full logging  

This setup ensures we can **track every interaction**, making debugging, evaluation, and iteration much easier. Next, let's test it out! 🔍  
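
Once a few chats have been logged, a quick sanity check is to read the table back with `sqlite3`. The following is a minimal sketch, assuming the `chat_log.db` file and the `chat_history` schema from the code above; the `recent_chats` helper is a hypothetical name introduced here for illustration.

```python
import sqlite3


def recent_chats(db_path="chat_log.db", limit=5):
    """Return the newest rows as (id, prompt, response, timestamp) tuples."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT id, prompt, response, timestamp "
            "FROM chat_history ORDER BY id DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        # Always release the file handle, even if the query fails
        conn.close()
    return rows


# Example usage (after the app has logged at least one chat):
# for row in recent_chats(limit=3):
#     print(row)
```

Ordering by `id` rather than `timestamp` is a design choice in this sketch: the autoincrementing key gives a stable insertion order even when several rows share the same second.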

## 🔍 Exploring Your Data with Datasette  

![Datasette](img/datasette.png)

Once we’ve logged conversations in **SQLite**, we need an easy way to inspect and analyze them.  
That’s where **[Datasette](https://datasette.io/)** comes in—a powerful tool for **browsing, querying, and exporting SQLite databases** effortlessly.  

### 🚀 Why Datasette?  
- **Instant Database UI** – No SQL knowledge required; just open a browser and explore!  
- **Lightning Fast** – Designed for large-scale data publishing but perfect for small logs too.  
- **Built-in Querying** – Filter, sort, and search directly in a web-based UI.  
- **Easy Exporting** – Convert your database into **CSV**, **JSON**, or other formats with a click.  
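
To make the browsing and exporting features above concrete: running `datasette chat_log.db` from the command line serves the database locally, and each table gets its own page plus a CSV endpoint. The sketch below builds that CSV URL. It assumes Datasette's default address (`http://127.0.0.1:8001`) and that the database is exposed under its file name without the extension; `datasette_csv_url` is a hypothetical helper, not part of Datasette itself.

```python
from pathlib import Path
from urllib.parse import quote


def datasette_csv_url(db_path, table, base_url="http://127.0.0.1:8001"):
    """Build the CSV export URL for one table served by a local Datasette."""
    # Datasette names the database after the file stem, e.g. chat_log.db -> chat_log
    db_name = Path(db_path).stem
    # _stream=on asks Datasette to stream every row instead of a single page
    return f"{base_url}/{quote(db_name)}/{quote(table)}.csv?_stream=on"


# Example usage:
# print(datasette_csv_url("chat_log.db", "chat_history"))
```

If Datasette is started on a different host or port, pass the matching `base_url`.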

### 📤 Exporting Traces to CSV  

We’ll use **Datasette** to **export chat logs to a CSV file**, making it easier to analyze failure cases and refine our AI system.  
This CSV can be used for:  
- **📊 Failure Mode Analysis** – Identify common mistakes by reviewing responses.  
- **👥 Sharing with Subject Matter Experts** – Non-technical teammates can review and give feedback.  
- **✅ Manual Evaluation** – Open in a spreadsheet and score outputs with 👍/👎 + comments.  
- **📈 Starting Systematic Evaluations** – Lay the groundwork for automated performance tracking.  
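
As one way to move from manual review towards systematic evaluation, reviewed rows can be tallied once a spreadsheet has been scored. This is a rough sketch under stated assumptions: the reviewer is assumed to have added a `rating` column containing `up` or `down` to the exported CSV. That column name, its values, and the `summarize_ratings` helper are all hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_ratings(csv_file):
    """Count 'up', 'down', and unrated rows in a reviewed CSV export."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Treat missing, blank, or unexpected values as unrated
        rating = (row.get("rating") or "").strip().lower()
        counts[rating if rating in ("up", "down") else "unrated"] += 1
    return counts


# Example usage with an in-memory CSV standing in for a reviewed export:
sample = io.StringIO(
    "prompt,response,rating\n"
    "Carlton Blues,Australian,up\n"
    "Tennessee Titans,American,up\n"
    "Some other team,Not sure,down\n"
)
summary = summarize_ratings(sample)
```

For a real file, the same function can be called with `open("reviewed.csv", newline="", encoding="utf-8")`, where `reviewed.csv` is a placeholder file name.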

---

Next, let’s load up **Datasette**, explore our logged chats, and **export them for further analysis!** 🧐📊  

## 🎯 Recap: What We Learned  

In this notebook, we focused on building the **first version of an LLM-powered classification system** that runs **entirely locally**. Here’s what we covered:  

### 🏗️ **Building an MVP AI System**  
- Used **Gemma 2B** + **Ollama** to create a **zero-shot classification model**.  
- Built an interactive **Gradio** UI to test our system.  

### 🔍 **Logging & Observability**  
- Stored model interactions in an **SQLite** database for **tracing and debugging**.  
- Used **Datasette** to **browse logged conversations and export data**.  

### 📤 **Exporting for Further Exploration**  
- Learned how to **export chat logs to CSV** for later analysis.  
- Discussed how **structured logs** help track model responses over time.  

### 🚀 **Why This Matters**  
- **AI systems are more than just models—they need observability and traceability.**  
- **Logging interactions** makes debugging, iteration, and improvement possible.  
- This lays the **foundation** for deeper **evaluation techniques** in upcoming sections.  

In the next notebook, we’ll take things further by **evaluating our model using vibes**—testing different prompts and seeing how well they work. 