## 1. Welcome: Setting the Scene  

![LLM SDLC Overview](img/LLM-SDLC_Fig1_edit3-1.png)

In this session, we're building the **first iteration** of an app that queries text files. We'll start with **LlamaIndex** to get something working quickly, but then we will **peel back the layers** to see exactly how it works.

**The Catch:** While LlamaIndex makes building fast, it hides the complexity. You won't see the prompts or retrieval logic. Being stuck with a demo that works but that you can't inspect or debug is "Proof-of-Concept Purgatory."

**What We'll Build:**
- **App 1 & 2:** A query app using LlamaIndex abstractions.
- **App 3 & 4:** A **Vanilla Python** version where we control every line of code.
- **App 5:** Adding **Logging and Observability** with SQLite.
- **App 6:** A **Multimodal** app that can read PDFs directly (images and all).

**By the end:** You'll have built an AI app three different ways: using a framework, using raw Python, and using multimodal capabilities.

👉 **By the end of this session, you will have:**  
- A working LLM-powered app.
- A **Gradio frontend** for uploading PDFs.
- **Logging** to track every query and response.

## 2. Setup: API Keys  

![AI Studio](img/ai-studio.png)

### Set Up API Keys  
Before running the apps, ensure your **API keys** are configured:

1. **Get your API keys:**
   - Google Gemini: [AI Studio](https://aistudio.google.com/)
   - OpenAI: [Platform](https://platform.openai.com/) (optional)
   - Anthropic: [Console](https://console.anthropic.com/) (optional)

2. **Configure your environment:**
   - Copy `.env.example` to `.env`: `cp .env.example .env`
   - Add your API keys to the `.env` file
   - Source the file: `source .env`

See the main README.md for detailed setup instructions.
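
One thing to watch for: if the entries in `.env` are plain `KEY=value` lines without `export`, `source .env` sets them in the shell but does not pass them on to `python`. A quick check like the sketch below shows which keys the Python process can actually see. The variable names here are assumptions; use whatever names `.env.example` defines.

```python
import os

# Assumed variable names; check .env.example for the names this repo actually uses.
REQUIRED_KEYS = ["GOOGLE_API_KEY"]
OPTIONAL_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]


def missing_keys(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


for name in missing_keys(REQUIRED_KEYS):
    print(f"Missing required key: {name}")
for name in missing_keys(OPTIONAL_KEYS):
    print(f"Optional key not set: {name}")
```

If a key you added is reported as missing, exporting it explicitly (for example `export GOOGLE_API_KEY=...`) is one way to make it visible to the apps.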

## 3. Exploring the First App: Querying Text Files  

![LlamaIndex](img/LlamaIndex.png)

### What You'll Do  
- Start with `1-app-query.py` to build the **core query engine** for LinkedIn text profiles.
- The app will load `.txt` files from the `/data` directory and allow simple querying.  

### How It Works  
We're using **LlamaIndex** to handle the complexity for us. This is the "Magic Box" approach:
1. **Load documents:** `SimpleDirectoryReader` loads text files from the `/data` directory.
2. **Process and index:** LlamaIndex creates embeddings and builds a vector index (we don't see how this works).
3. **Query:** When you ask a question, LlamaIndex retrieves relevant context and sends it to the LLM.
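
To make the three steps above less abstract, here is a toy version in plain Python. It is not how LlamaIndex is implemented: it swaps learned embeddings for simple word counts and uses two made-up text chunks, but the shape (embed, index, retrieve, build a prompt) is the same idea.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (real systems use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Step 2: embed every chunk once and keep (vector, text) pairs."""
    return [(embed(chunk), chunk) for chunk in chunks]


def retrieve(index, question, top_k=1):
    """Step 3a: rank chunks by similarity to the question and keep the best ones."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]


def build_prompt(context_chunks, question):
    """Step 3b: put the retrieved context and the question into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical sample data standing in for the files in the data directory.
chunks = [
    "Ada is a data engineer who works with Python and SQL.",
    "Grace is a product designer focused on mobile apps.",
]
index = build_index(chunks)
question = "Who works with Python?"
print(build_prompt(retrieve(index, question), question))
```

The printed string is the kind of thing a framework assembles and sends to the LLM on your behalf, which is exactly the part App 1 does not show you.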

**What We Configure:**
- By default, this uses **OpenAI** (`gpt-3.5-turbo` or `gpt-4o` depending on your key/defaults) for both embeddings and generation.

**What We Can't See:**
- How LlamaIndex constructs the final prompt.
- What context gets retrieved from your files.
- The exact instructions sent to the model.

👉 **Run the App:**
`python apps/1-app-query.py`

**Alternative Models:**
If you prefer to use other providers, we have set up alternative scripts:
- `1a-app-query-gemini.py` (uses Google Gemini)
- `1b-app-query-claude.py` (uses Anthropic Claude)

**Reflection:**  
You got an answer, but do you know *why*? In the next steps, we will peel back these layers to understand exactly what is happening under the hood.

## 4. Adding Interactivity: Gradio Frontend (App 2)  

![Gradio App](img/gradio-app.png)

Now that the query engine works, let's wrap it in a user interface.  
- We'll use `2-app-front-end.py` to create a **Gradio frontend**.  
- This allows users to upload PDF files and ask questions via a web browser.  

### How It Works  
1. **PDF Upload:** Users upload PDF files through the Gradio interface.
2. **Text Extraction:** The app uses `PyMuPDF` to extract text from the uploaded PDF.
3. **Dynamic Indexing:** A new LlamaIndex vector index is created on the fly for the uploaded file.
4. **Query Interface:** Users type questions and get responses.
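
A minimal sketch of how such an interface can be wired up with Gradio is shown below. It is not the code in `2-app-front-end.py`: the function names are hypothetical and the callback is a stub where the real app would extract text, build the index, and run the query.

```python
def uploaded_path(pdf_file):
    """Return a filesystem path for an upload.

    Depending on the Gradio version, gr.File hands the callback either a path
    string or a temp-file object with a .name attribute; handle both.
    """
    if pdf_file is None:
        return None
    return pdf_file if isinstance(pdf_file, str) else pdf_file.name


def answer_question(pdf_file, question):
    """Callback stub: a real app would extract text, build an index, and query it here."""
    path = uploaded_path(pdf_file)
    if path is None:
        return "Please upload a PDF first."
    if not question.strip():
        return "Please enter a question."
    return f"(stub) Would answer {question!r} using {path}"


def build_ui():
    """Wire the callback to a file upload and a text box."""
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    return gr.Interface(
        fn=answer_question,
        inputs=[gr.File(label="PDF"), gr.Textbox(label="Question")],
        outputs=gr.Textbox(label="Answer"),
    )


# To serve the UI locally: build_ui().launch()
```

Keeping the callback as an ordinary function means it can be called and tested without starting the web server.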

**Key Difference:**
Unlike App 1, which pre-loaded files from a folder, this version builds the index *dynamically* when you upload a file.

**Run the App:**

```bash
python apps/2-app-front-end.py
```

*Click the local URL (e.g., http://127.0.0.1:7860) to open the interface.*

**The Problem:**  
While this looks like a complete app, we are still in "Proof-of-Concept Purgatory." We have no logs, we don't know what the model is actually seeing, and debugging is difficult.

## 5. Peeling Back the Layers: Vanilla Python + LLM API (App 3)

Now we strip away LlamaIndex. We want to see **exactly** what we are sending to the model.

In `3-vanilla-python-query.py`, we implement the "RAG" pattern manually:
1.  **Read the file:** We manually open and read `data/o1.txt`.
2.  **Construct the Prompt:** We use a Python f-string to insert the text directly into a prompt template.
3.  **Call the API:** We send that specific string to Gemini directly.
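
The three steps can be sketched roughly as follows. This is an illustration, not the contents of `3-vanilla-python-query.py`: the prompt wording, the function names, the model id, and the choice of the `google-genai` SDK are all assumptions.

```python
from pathlib import Path


def build_prompt(profile_text, question):
    """Step 2: insert the file contents and the question into a fixed template."""
    return (
        "You are answering questions about the profile below.\n\n"
        f"Profile:\n{profile_text}\n\n"
        f"Question: {question}\n"
        "Answer using only the profile."
    )


def ask_gemini(prompt, model="gemini-2.5-flash"):
    """Step 3: send the exact prompt string to Gemini (assumes the google-genai SDK)."""
    from google import genai  # lazy import: only needed when calling the API

    client = genai.Client()  # reads the API key from the environment
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text


def answer_from_file(path, question):
    """Steps 1-3 in order: read the file, build the prompt, call the API."""
    profile_text = Path(path).read_text(encoding="utf-8")
    prompt = build_prompt(profile_text, question)
    print(prompt)  # nothing is hidden: this is exactly what the model receives
    return ask_gemini(prompt)
```

Calling `answer_from_file("data/o1.txt", "What is this person's current role?")` would print the full prompt before returning the model's answer.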

### Why do this?
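This gives us **determinism** and **control** over the input: the prompt is built the same way every time, so we know exactly what context the model has. (The model's output can still vary between runs.)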

**Run the App:**
`python apps/3-vanilla-python-query.py`

**Look at the code:** Open `apps/3-vanilla-python-query.py`. Notice how the `prompt` variable is constructed.

## 6. Rebuilding the Frontend: Vanilla Python + LLM API + Gradio (App 4)

Now let's put our "Vanilla" logic back behind a web interface.

In `4-vanilla-gradio-query.py`, we combine:
- **Gradio** (for the UI)
- **PyMuPDF** (for explicit text extraction)
- **Gemini API** (direct calls)

### How It Works
1. **Upload:** User uploads a PDF.
2. **Extract:** We explicitly loop through pages and extract text using `page.get_text()`.
3. **Prompt:** We stuff that text into our f-string prompt.
4. **Generate:** We get the answer.
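
The extract and prompt steps might look something like the sketch below. The function names and prompt wording are hypothetical; only `fitz.open` and `page.get_text()` come from PyMuPDF itself.

```python
def pages_to_text(pages):
    """Join the text of every page; works with any iterable of objects exposing get_text()."""
    return "\n".join(page.get_text() for page in pages)


def extract_pdf_text(pdf_path):
    """Open a PDF with PyMuPDF and extract its text page by page."""
    import fitz  # PyMuPDF; imported lazily so pages_to_text can be used without it

    with fitz.open(pdf_path) as doc:
        return pages_to_text(doc)


def build_prompt(pdf_text, question):
    """Stuff the extracted text and the user's question into one prompt string."""
    return f"Document:\n{pdf_text}\n\nQuestion: {question}\nAnswer based on the document."
```

Because the whole document goes into the prompt, very long PDFs may exceed the model's context window; this sketch does not handle that case.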

**Run the App:**
`python apps/4-vanilla-gradio-query.py`

## 7. Logging and Observability (App 5)

In `5-vanilla-gradio-sqlite.py`, we extend our vanilla Gradio app by adding SQLite logging to track queries and responses.
- This adds observability to the existing web interface.
- Allows iterative improvements based on real usage data.

### How It Works
1. **Database Setup:** SQLite database (`data/interactions.db`) stores each interaction.
2. **Log Each Query:** Every PDF upload, query, and response is logged with metadata.
3. **Rich Metadata:** Logs include PDF name, timestamp, unique interaction ID, query, and response.

**Key Additions:**
- **Interaction tracking:** Each query-response pair gets a unique ID and timestamp.
- **PDF context:** Logs which PDF was queried, helping identify document-specific patterns.
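
As a rough idea of what this logging involves, here is a sketch using only the standard library. The table and column names are assumptions chosen to match the metadata listed above, not necessarily the schema in `5-vanilla-gradio-sqlite.py`.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def init_db(path):
    """Open the database and create the log table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS interactions (
            id TEXT PRIMARY KEY,
            timestamp TEXT NOT NULL,
            pdf_name TEXT,
            query TEXT NOT NULL,
            response TEXT NOT NULL
        )
        """
    )
    conn.commit()
    return conn


def log_interaction(conn, pdf_name, query, response):
    """Store one query-response pair with a unique ID and a UTC timestamp."""
    interaction_id = str(uuid.uuid4())
    timestamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO interactions (id, timestamp, pdf_name, query, response) "
        "VALUES (?, ?, ?, ?, ?)",
        (interaction_id, timestamp, pdf_name, query, response),
    )
    conn.commit()
    return interaction_id


conn = init_db(":memory:")  # the app would pass "data/interactions.db" instead
log_interaction(conn, "profile.pdf", "Where does she work?", "(model response)")
for row in conn.execute("SELECT pdf_name, query FROM interactions"):
    print(row)
```

Using `?` placeholders rather than string formatting keeps user-supplied queries from being interpreted as SQL.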

**Run the App:**
`python apps/5-vanilla-gradio-sqlite.py`


## 8. Visualizing with Datasette  

![Datasette](img/datasette.png)

To analyze logs, use **Datasette** to query and filter results. Datasette is a tool for exploring and publishing data that makes SQLite databases easy to browse and query through a web interface.

```bash
datasette data/interactions.db
```  
- This launches a web interface where you can view all logged interactions
- Filter and search queries by PDF name, timestamp, or content
- Export data for further analysis if needed


## 9. Going Multimodal (App 6)

Finally, in `6-multimodal-pdf-extraction.py`, we ditch the manual text extraction entirely.

**The Old Way (Apps 2-5):**
`PDF -> Python Library (PyMuPDF) -> Text -> LLM`
*Problem: We lose charts, images, and layout information.*

**The New Way (App 6):**
`PDF -> LLM (Gemini 2.5)`
*Gemini is **multimodal**, meaning it can "see" the PDF file directly, including images, charts, and layouts, without us needing to turn it into text first.*
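
A minimal sketch of the new flow, assuming the `google-genai` SDK and a `gemini-2.5-flash` model id (both assumptions; the app itself may use a different SDK or model). The helper names are hypothetical.

```python
from pathlib import Path


def load_pdf_bytes(path):
    """Read a file and check it starts with the PDF magic bytes before uploading."""
    data = Path(path).read_bytes()
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{path} does not look like a PDF")
    return data


def ask_about_pdf(path, question, model="gemini-2.5-flash"):
    """Send the raw PDF bytes plus a question to Gemini; no text extraction step."""
    from google import genai  # lazy imports: only needed when calling the API
    from google.genai import types

    client = genai.Client()  # reads the API key from the environment
    pdf_part = types.Part.from_bytes(
        data=load_pdf_bytes(path), mime_type="application/pdf"
    )
    response = client.models.generate_content(model=model, contents=[pdf_part, question])
    return response.text
```

Note that there is no PyMuPDF call anywhere: the model receives the file itself rather than text we pulled out of it.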

**Run the App:**
```bash
python apps/6-multimodal-pdf-extraction.py
```

## 10. Key Takeaways  

- **Beyond the Magic Box:** We started with LlamaIndex ("Proof-of-Concept Purgatory") but then peeled back the layers to see exactly how prompts are constructed.
- **Vanilla Control:** We learned that manual implementation (Vanilla Python) makes prompt construction deterministic and debugging easier compared to black-box abstractions.
- **Observability:** We built a logging system with SQLite and Datasette to see exactly what users are asking and what the model is returning.
- **Multimodal capabilities:** We saw how modern models (like Gemini 1.5/2.5) can natively read PDFs, preserving context like charts and images that simple text extractors miss.