## 1. Welcome – Setting the Scene

![LLM SDLC Overview](img/LLM-SDLC_Fig1_edit3-1.png)

In this session, we're building the **first iteration** of an app that queries LinkedIn profile text files. We'll use **LlamaIndex** to get something working quickly: it handles embeddings, indexing, and retrieval for us, so we can focus on building the app.

**The Catch:** While LlamaIndex makes building fast, it hides what's happening behind the scenes. You won't see the prompts it constructs or how it retrieves context: you just get answers. This is what we'll call "proof-of-concept purgatory": you have a working app, but you can't see or control the core logic.

**What We'll Build:**
- A working LLM-powered query app
- An interactive **Gradio frontend** for uploading PDFs
- **Logging and observability** with SQLite and Datasette
- **Deployment to Modal** (optional): ship your app so it's available even when your laptop is closed


**By the end:** You'll have all the moving parts of an AI app (frontend, backend, logging, visualization), but you'll also experience the frustration of working with abstractions you can't inspect.

**What's Next:** The optional homework will have you rebuild without LlamaIndex to see the actual prompts. In the next session, we'll dive into prompt engineering so you can control what the model sees.

👉 **By the end of this session, you will have:**  
- A working LLM-powered app that queries text files.  
- A simple **Gradio frontend** for uploading PDFs.  
- Basic **logging** and **observability** using SQLite and Datasette.

## 2. Setup – API Keys

![AI Studio](img/ai-studio.png)

### Set Up API Keys  
Before running the apps, ensure your **API keys** are configured:

1. **Get your API keys:**
   - Google Gemini: [AI Studio](https://aistudio.google.com/)
   - OpenAI: [Platform](https://platform.openai.com/) (optional)
   - Anthropic: [Console](https://console.anthropic.com/) (optional)

2. **Configure your environment:**
   - Copy `.env.example` to `.env`: `cp .env.example .env`
   - Add your API keys to the `.env` file
   - Source the file: `source .env`

See the main README.md for detailed setup instructions.
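
Before launching any of the apps, it can help to confirm that Python actually sees your keys. The snippet below is a minimal sketch, not part of the workshop code: `GOOGLE_API_KEY` is the variable name used later in this guide for the Modal secret, while `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are assumed here because they are the providers' conventional names. Check `.env.example` for the names your setup expects.

```python
import os

# GOOGLE_API_KEY appears elsewhere in this guide; the two optional names
# are assumptions (the providers' conventional variable names).
REQUIRED_KEYS = ("GOOGLE_API_KEY",)
OPTIONAL_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def key_status(env=None):
    """Map each expected variable name to True if it is set and non-empty."""
    env = os.environ if env is None else env
    return {
        name: bool(env.get(name, "").strip())
        for name in REQUIRED_KEYS + OPTIONAL_KEYS
    }


def missing_required(env=None):
    """Return the required variable names that are unset or blank."""
    status = key_status(env)
    return [name for name in REQUIRED_KEYS if not status[name]]


def report(env=None):
    """Print one line per key without ever printing the key itself."""
    for name, ok in key_status(env).items():
        kind = "required" if name in REQUIRED_KEYS else "optional"
        print(f"{name:<20} {kind:<9} {'set' if ok else 'MISSING'}")


# Usage: call report() in a Python shell after `source .env`.
```

If a key shows as `MISSING` right after `source .env`, one likely cause is that the file contains plain `NAME=value` lines without `export`: those set shell variables but are not passed on to the Python process.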

## 3. Exploring the First App – Querying Text Files

![LlamaIndex](img/LlamaIndex.png)

### What You'll Do  
- Start with `1a-app-query-gemini.py` to build the **core query engine** for LinkedIn text profiles using Google Gemini.  
- The app will load `.txt` files from the `data/` directory and allow simple querying.  

### How It Works  
LlamaIndex handles the complexity for us:
1. **Load documents:** `SimpleDirectoryReader` loads text files from the `data/` directory
2. **Process and index:** LlamaIndex creates embeddings and builds a vector index (we don't see how this works)
3. **Query:** When you ask a question, LlamaIndex retrieves relevant context and sends it to the LLM (we don't see the prompt it constructs)

**What We Configure:**
- **Embedding model:** `text-embedding-004` (Gemini) for converting text to vectors
- **LLM:** `gemini-2.5-flash` for generating answers

**What We Can't See:**
- How LlamaIndex constructs the final prompt from your query and retrieved context
- What context gets retrieved and how
- The exact instructions sent to the model
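
To make the hidden steps more concrete, here is a toy, standard-library sketch of the general retrieve-then-prompt pattern. It is **not** LlamaIndex's implementation: word overlap stands in for embedding similarity, the prompt wording is invented for illustration, and every function name is hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word counts; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks ranked by word overlap with the query."""
    query_tokens = tokenize(query)
    scored = [
        (sum((tokenize(chunk) & query_tokens).values()), index, chunk)
        for index, chunk in enumerate(chunks)
    ]
    # Highest score first; ties keep the original chunk order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [chunk for score, _, chunk in scored[:top_k] if score > 0]


def build_prompt(query, context_chunks):
    """Combine retrieved context and the user's query into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Usage (made-up profile text):
# chunks = ["Ada Example is a data scientist based in London.",
#           "Sam Sample manages the sales team in Berlin."]
# question = "Who works as a data scientist?"
# print(build_prompt(question, retrieve(question, chunks)))
```

The string returned by `build_prompt` is the kind of artifact the framework never shows you: the model receives the whole prompt, not just the question the user typed.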

👉 See `1a-app-query-gemini.py` for the full implementation.

**Alternative Models:**
- Use `1-app-query.py` for OpenAI (default LlamaIndex settings)
- Use `1b-app-query-claude.py` for Anthropic Claude
- **Note:** To use Anthropic, you'll also need an OpenAI API key because Anthropic doesn't provide embeddings. LlamaIndex will use OpenAI embeddings with Claude as the LLM.

**Reflection Question:** What prompt was actually sent to the LLM? Check the code and think about how LlamaIndex constructs the final prompt from your query and the retrieved context.

## 4. Adding Interactivity – Gradio Frontend (App 2)

![Gradio App](img/gradio-app.png)

Once the query engine works, you'll expand by adding a **Gradio frontend** in `2-app-front-end.py`.  
- Gradio provides a web interface where users can upload PDFs and type queries directly.  
- This makes the app accessible without requiring command-line interaction.  

### How It Works  
1. **PDF Upload:** Users upload PDF files through the Gradio interface
2. **Text Extraction:** PyMuPDF extracts text from the uploaded PDF
3. **Dynamic Indexing:** A new vector index is created for each uploaded PDF (unlike App 1 which pre-loads files)
4. **Query Interface:** Users type questions and get responses through the web UI

**Key Differences from App 1:**
- **File upload handling:** Accepts PDFs directly via web interface
- **Per-document indexing:** Creates a fresh index for each upload rather than loading pre-indexed files
- **Web-based UI:** Uses `gr.Blocks()` for a more structured interface with file upload, query input, and response display
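
The upload, extract, and answer flow described above can be sketched roughly as follows. This is an illustrative outline rather than the contents of `2-app-front-end.py`: `handle_request`, `build_demo`, and the `answer` callback are hypothetical names, and `answer(text, query)` stands in for whatever builds the per-upload index and queries it. Gradio and PyMuPDF are imported lazily so the request-handling logic can be read and tested on its own.

```python
def extract_pdf_text(pdf_path):
    """Extract the text of every page with PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def handle_request(uploaded_file, query, answer, extract=extract_pdf_text):
    """Validate the inputs, extract text from the upload, and answer the query.

    `answer(text, query)` is a placeholder for the indexing and LLM step.
    """
    if uploaded_file is None:
        return "Please upload a PDF first."
    if not query or not query.strip():
        return "Please enter a question."
    # Depending on the Gradio version, the upload arrives as a file path
    # or as an object exposing the path in `.name`.
    path = getattr(uploaded_file, "name", uploaded_file)
    text = extract(path)
    if not text.strip():
        return "No extractable text was found in this PDF."
    return answer(text, query.strip())


def build_demo(answer):
    """Wire file upload, query box, and response box together with gr.Blocks()."""
    import gradio as gr  # imported lazily

    with gr.Blocks() as demo:
        gr.Markdown("# PDF Q&A")
        pdf = gr.File(label="Upload a PDF", file_types=[".pdf"])
        question = gr.Textbox(label="Question")
        output = gr.Textbox(label="Response")
        ask = gr.Button("Ask")
        ask.click(
            fn=lambda f, q: handle_request(f, q, answer),
            inputs=[pdf, question],
            outputs=output,
        )
    return demo


# Usage: build_demo(my_answer_function).launch()
```

Keeping `handle_request` separate from the UI wiring is a design choice made for this sketch: it lets you exercise the logic with stub functions before any model or browser is involved.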

👉 See `2-app-front-end.py` for the full implementation.


## 5. Logging and Observability – App 3

![SQLite](img/SQLITE.jpeg)

In `3-app-log.py`, you'll extend the Gradio app from App 2 by adding SQLite logging to track queries and responses.  
- This adds observability to the existing web interface.  
- Allows iterative improvements based on real usage data.  

### How It Works  
1. **Database Setup:** SQLite database (`pdf_qa_logs.db`) stores each interaction
2. **Log Each Query:** Every PDF upload, query, and response is logged with metadata
3. **Rich Metadata:** Logs include PDF name, timestamp, unique interaction ID, query, and response
4. **Analysis:** Use Datasette to visualize and analyze patterns in the logged data

**Key Additions:**
- **Interaction tracking:** Each query-response pair gets a unique ID and timestamp
- **PDF context:** Logs which PDF was queried, helping identify document-specific patterns
- **Query logging:** Stores both the query and the LLM's response for analysis
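
As a rough illustration of the logging described above, the sketch below writes one row per interaction using only the standard library. The database file name comes from this guide; the table name `interactions`, the column names, and both function names are assumptions made for the example and may differ from what `3-app-log.py` uses.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

DB_PATH = "pdf_qa_logs.db"  # file name used in this workshop


def init_db(db_path=DB_PATH):
    """Open the database and create the log table if it does not exist."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS interactions (
            id        TEXT PRIMARY KEY,   -- unique interaction ID
            timestamp TEXT NOT NULL,      -- ISO 8601, UTC
            pdf_name  TEXT,
            query     TEXT,
            response  TEXT
        )
        """
    )
    conn.commit()
    return conn


def log_interaction(conn, pdf_name, query, response):
    """Insert one query/response pair and return its generated ID."""
    interaction_id = str(uuid.uuid4())
    timestamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO interactions (id, timestamp, pdf_name, query, response) "
        "VALUES (?, ?, ?, ?, ?)",
        (interaction_id, timestamp, pdf_name, query, response),
    )
    conn.commit()
    return interaction_id


# Usage:
# conn = init_db()
# log_interaction(conn, "profile.pdf", "Who is this?", "A data scientist.")
```

Parameterized `?` placeholders keep user-typed queries from being interpreted as SQL, which matters once real users are typing into the app.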

👉 See `3-app-log.py` for the full implementation.





## 6. Visualizing with Datasette  

![Datasette](img/datasette.png)

To analyze logs, use **Datasette** to query and filter results. Datasette is a tool for exploring and publishing data that makes SQLite databases easy to browse and query through a web interface.

```bash
datasette pdf_qa_logs.db
```  
- This launches a web interface where you can view all logged interactions
- Filter and search queries by PDF name, timestamp, or content
- Export data for further analysis if needed
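
Datasette also lets you run custom SQL against the database. The sketch below shows two queries of the kind you might try, run here through Python's `sqlite3` so they can be checked locally. The table name `interactions` and its columns are assumed for illustration; adjust them to the schema you see in Datasette.

```python
import sqlite3

# How many questions were asked per PDF, and when was each last queried?
QUERIES_PER_PDF = """
SELECT pdf_name, COUNT(*) AS n_queries, MAX(timestamp) AS last_seen
FROM interactions
GROUP BY pdf_name
ORDER BY n_queries DESC, pdf_name
"""

# Find logged queries containing a search term (named parameter :term).
SEARCH_QUERIES = """
SELECT pdf_name, query
FROM interactions
WHERE query LIKE '%' || :term || '%'
ORDER BY timestamp
"""


def queries_per_pdf(conn):
    """Return (pdf_name, n_queries, last_seen) rows, busiest PDF first."""
    return conn.execute(QUERIES_PER_PDF).fetchall()


def search_queries(conn, term):
    """Return (pdf_name, query) rows whose query text contains `term`."""
    return conn.execute(SEARCH_QUERIES, {"term": term}).fetchall()


# Usage:
# conn = sqlite3.connect("pdf_qa_logs.db")
# print(queries_per_pdf(conn))
```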


## 7. Key Takeaways  

- **Built the MVP:** A simple LLM-powered app that queries LinkedIn text profiles.  
- **Added Interactivity:** Introduced Gradio to create a user-friendly interface for querying PDFs.  
- **Started Logging:** Integrated SQLite to track queries and responses, preparing for observability and testing.  
- **Visualized Logs:** Used Datasette to explore logged data, reinforcing observability.

## Optional: Deploying the Gradio App on Modal  

![Modal](img/modal.jpg)

Now that we've built the Gradio app for querying PDFs, let's deploy it to **Modal**. This allows you to run the app from anywhere without managing local infrastructure.  

### Why Deploy to Modal?  
- **Fast Deployment:** Modal simplifies deploying Python apps with minimal configuration.  
- **Scalable:** Deploy ephemeral development servers or durable production apps.  
- **$500 in Modal Credits:** As part of this course, you'll receive **$500 in Modal credits**.  

### Prerequisites:  
- **Modal Account:** Sign up at [modal.com](https://modal.com) and redeem your credits using the provided form.  
- **Modal SDK:** Pre-installed in the Codespace.  

### Deployment Steps:  

0. **Authenticate with Modal (first time only):**  
   ```bash
   modal token new
   ```  
   This will open a browser for authentication.

1. **Set Up Gemini API Key:**  
   Make sure you're logged in to Modal, then go to [https://modal.com/secrets/](https://modal.com/secrets/) and create a new secret:
   - Secret name: `gemini`
   - Add environment variable: `GOOGLE_API_KEY` = your Gemini API key

2. **Navigate to the Deployment Directory:**  
   Go to the `workshops/workshop-1/apps/deploy/` directory where the deployment script is located.  

3. **Run the Gradio App:**  
   For an ephemeral development server:  
   ```bash
   modal serve modal_wrapper
   ```  
   
   For durable deployment:  
   ```bash
   modal deploy modal_wrapper
   ```  

### Testing the Deployment:  
- Open the URL provided by Modal and test the app by uploading PDFs and running queries.
- You can also view your running app and logs at [https://modal.com/apps](https://modal.com/apps) (make sure you're logged in).  


## Homework: Direct API Calls Without Frameworks

As we noted above, frameworks like LlamaIndex obscure the actual prompt sent to the LLM. This exercise will help you see what's really happening and give you direct control over the prompt.

**The Goal:** Build a version of your Gradio app that sends prompts directly to the LLM API without using frameworks. You'll manually construct the prompt using the PDF text and user query, send it to the model, and log the full prompt to your database.

**Steps:**

1. **Set up your API key:** Configure your environment with the API key for your chosen LLM provider (Gemini, OpenAI, or Claude).

2. **Test the Direct API Call:** Before building the full app, verify you can call the LLM API directly. Send a simple query to make sure your API key works and you get a response.
   - For Gemini: [Google AI Python SDK docs](https://ai.google.dev/gemini-api/docs/quickstart?lang=python)
   - For OpenAI: [OpenAI Python SDK docs](https://platform.openai.com/docs/quickstart)
   - For Anthropic Claude: [Anthropic Python SDK docs](https://docs.anthropic.com/en/api/getting-started)

3. **Read in local text data:** Load a text file (like `apps/data/hbaLI.txt` from the workshop) to use as test document content. Print it to see what you're working with.

4. **Combine the document and query in a prompt:** Construct a prompt that includes both your test document text and a user query. Send this to the LLM and see how it responds with context.

5. **Build the Gradio interface:** Create a simple web UI that accepts PDF uploads and user queries. Extract text from the uploaded PDF using PyMuPDF.

6. **Add full prompt logging:** Extend your Gradio app to log every interaction to SQLite, storing the **complete prompt** (not just the user query), the response, and metadata like PDF name and timestamp.

7. **Use Datasette to explore the logs:** View your logged interactions and compare the `query` column (what the user typed) with the `full_prompt` column (what the model actually received). This is the key insight.

This is the moment you stop relying on hidden behavior and start making your own decisions about what the model sees.
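
For step 2, one way to check that your key works without any SDK is a plain HTTP request. The sketch below uses only the standard library and is based on my understanding of the public Gemini REST API (the `generateContent` endpoint, the `x-goog-api-key` header, and the `candidates` response shape); treat those details as assumptions and confirm them against the quickstart linked above. The function names are hypothetical.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt, api_key, model="gemini-2.5-flash"):
    """Build the POST request so you can see exactly what is sent."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response_json):
    """Join the text parts of the first candidate in the response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


def call_gemini(prompt, api_key=None, timeout=60):
    """Send the prompt and return the model's text (makes a network call)."""
    api_key = api_key or os.environ["GOOGLE_API_KEY"]
    request = build_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(json.load(response))


# Usage (requires a valid key and network access):
# print(call_gemini("Reply with the single word: ready"))
```

Because the request body is built in `build_request`, you can print it or log it before sending, which is the habit this homework is trying to build.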

**Query vs. Prompt:**
- **Query:** What the user types into your app (e.g., "What is this document about?")
- **Prompt:** What you send to the model (instructions + document + query)

The query is user input. The prompt is model input. When building with LLMs, **the prompt is the logic**. You need to see it, own it, and iterate on it.
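
The distinction can be sketched in a few lines. In this minimal example the prompt template is invented for illustration, `call_llm` is a placeholder for whichever provider call you choose, and the table layout is an assumption apart from the `query` and `full_prompt` columns named in step 7.

```python
import sqlite3
from datetime import datetime, timezone

# Illustrative template; writing your own version is the point of the homework.
PROMPT_TEMPLATE = """You are answering questions about a document.
Use only the document below. If the answer is not in it, say so.

<document>
{document}
</document>

Question: {query}"""


def build_prompt(document_text, query):
    """Turn the user's query plus the document into the full model input."""
    return PROMPT_TEMPLATE.format(document=document_text.strip(), query=query.strip())


def init_db(db_path="prompt_logs.db"):
    """Create a log table that stores the query and the full prompt side by side."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS interactions (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp   TEXT NOT NULL,
            pdf_name    TEXT,
            query       TEXT,
            full_prompt TEXT,
            response    TEXT
        )
        """
    )
    conn.commit()
    return conn


def answer_and_log(conn, pdf_name, document_text, query, call_llm):
    """Build the prompt, call the model, and log exactly what was sent."""
    full_prompt = build_prompt(document_text, query)
    response = call_llm(full_prompt)  # placeholder for your provider call
    conn.execute(
        "INSERT INTO interactions (timestamp, pdf_name, query, full_prompt, response) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), pdf_name, query, full_prompt, response),
    )
    conn.commit()
    return response
```

With rows like these in place, the `query` column holds what the user typed and the `full_prompt` column holds what the model actually received.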

**For Reference:**
["F**k you, show me the prompt" by Hamel Husain (Parlance Labs)](https://hamel.dev/blog/posts/prompt/): This blog post captures the mindset shift this homework is about.


Please spend a few minutes providing feedback on this session in the survey shared in Discord.