# LLM Question Answering Tool

This notebook builds a small tool that takes a technical question and generates an explanation using either an OpenAI GPT model or a Llama model hosted locally through Ollama. It covers building the prompt, calling each model's API, and streaming the output into the notebook as it arrives.

### Import Required Libraries

In [1]:
# imports

import os
import json
import requests
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

### Define Model Names

In [2]:
# Models as constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

### Load Environment Variables

In [3]:
# Set up environment

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')
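
The key is loaded here but never inspected, so a missing or mistyped value would only surface later as an authentication error. A minimal sanity check could look like the sketch below. The helper name `describe_api_key` is hypothetical, and the two checks are only the most common failure modes, not a full validation.

```python
def describe_api_key(key):
    """Return a short status message for an API key value (hypothetical helper)."""
    if not key:
        return "No API key found; check that .env defines OPENAI_API_KEY"
    if key.strip() != key:
        return "API key has leading or trailing whitespace; remove it from .env"
    return "API key found"

# In the notebook this would be called as: print(describe_api_key(api_key))
# A made-up value is used here so the example runs on its own.
print(describe_api_key("  example-key "))
```

Running this check right after `load_dotenv` keeps configuration problems separate from model or network errors.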

### Function to Stream Model Responses
This function picks the backend from the model name: names containing "gpt" go to the OpenAI API, and everything else goes to a local Ollama server. In both cases the output is streamed into the notebook as Markdown.

In [4]:
def stream_response(model, question):
    """Stream an answer to `question` from `model` and render it as Markdown."""
    response = ""
    display_handle = display(Markdown(""), display_id=True)

    if "gpt" in model.lower():
        # Use the OpenAI client for GPT models; it reads OPENAI_API_KEY from the environment
        openai = OpenAI()
        stream = openai.chat.completions.create(
            model=model,
            # The question is user input, so it is sent with the "user" role
            messages=[{"role": "user", "content": question}],
            stream=True
        )

        for chunk in stream:
            # Skip chunks that carry no choices; a delta without text adds nothing
            if not chunk.choices:
                continue
            response += chunk.choices[0].delta.content or ''
            # Code fences are left intact so code blocks in the answer render correctly
            update_display(Markdown(response), display_id=display_handle.display_id)
    else:
        # For Ollama models, call the local REST API directly
        stream = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": question, "stream": True},
            stream=True
        )
        stream.raise_for_status()  # fail early if the server or model is unavailable

        # Ollama streams one JSON object per line
        for line in stream.iter_lines():
            if line:
                json_line = json.loads(line.decode('utf-8'))
                if "response" in json_line:
                    response += json_line["response"]
                    update_display(Markdown(response), display_id=display_handle.display_id)
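
The Ollama branch reads a stream in which every line is a standalone JSON object with a `response` fragment, and the final object is flagged `done`. The accumulation step can be tried without a running server, as in the sketch below. The function name `collect_ollama_text` and the sample lines are illustrative assumptions, hand-written rather than captured from a real session.

```python
import json

def collect_ollama_text(lines):
    """Accumulate the "response" fields from an iterable of JSON byte lines."""
    text = ""
    for line in lines:
        if not line:             # iter_lines() can yield empty keep-alive lines
            continue
        payload = json.loads(line.decode("utf-8"))
        text += payload.get("response", "")
        if payload.get("done"):  # the last object in the stream is marked done
            break
    return text

# Hand-written sample lines (an assumption about typical output, not a real log)
sample_lines = [
    b'{"model": "llama3.2", "response": "Hello", "done": false}',
    b'',
    b'{"model": "llama3.2", "response": ", world", "done": false}',
    b'{"model": "llama3.2", "response": "", "done": true}',
]
print(collect_ollama_text(sample_lines))
```

Separating the parsing from the HTTP call in this way makes the streaming logic easy to test on its own.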

In [5]:
# Get gpt-4o-mini to answer, with streaming
# Here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

stream_response(MODEL_GPT, question)

The code snippet you've provided utilizes Python's `yield from` statement in conjunction with a set comprehension. Let's break it down step-by-step to understand what it does and why.

### Breakdown of the Code

1. **Set Comprehension**: 
   ```python
   {book.get("author") for book in books if book.get("author")}
   ```
   - This part creates a set of authors from a collection of `books`. 
   - It iterates over each `book` in the `books` iterable.
   - It uses `book.get("author")` to retrieve the "author" key's value from each book.
   - The condition `if book.get("author")` ensures that only books with a valid (non-None, non-empty) author are included in the resulting set.
   - The result is a set containing unique author names, as sets automatically handle duplicates.

2. **Yield from**: 
   ```python
   yield from ...
   ```
   - The `yield from` expression is used within a generator function. It allows one generator to yield values from another iterable (in this case, the set of authors).
   - This means that when the generator containing this code is called, it will yield each author in the set one at a time.

### Why This Code is Used

- **Production of Unique Results**: The use of a set ensures that the authors yielded are unique; if multiple books have the same author, that author will only be yielded once.
- **Efficiency**: Using `yield from` allows for lazily generating values one at a time as they are needed, rather than generating all values at once. This can be more memory-efficient, especially when dealing with large lists of books.
- **Filtering**: The `if book.get("author")` condition serves to filter out any books that do not have an author listed, thus only valid author names are processed and yielded.

### Example Usage

If you had a list of dictionaries (`books`) like this:

```python
books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": "Author B"},
    {"title": "Book Three", "author": None},
    {"title": "Book Four", "author": "Author A"},
]
```

The code would produce the following unique authors:
- "Author A"
- "Author B"

In a generator function, this could be utilized to provide a stream of authors dynamically, which is useful in various applications such as rendering authors in a user interface, processing or filtering data, or simply iterating through them in a loop.
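
The statement in the question is only valid inside a function body, so a runnable check needs a wrapper. The sketch below uses a hypothetical generator named `unique_authors` together with the sample `books` list from the answer. One detail matters here: the set comprehension is evaluated in full before the first value is yielded, so only the hand-off of values is lazy, not the scan over `books`.

```python
def unique_authors(books):
    """Yield each distinct, truthy author found in `books` (hypothetical wrapper)."""
    # The set is built in full first; `yield from` then hands out its members one by one
    yield from {book.get("author") for book in books if book.get("author")}

books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": "Author B"},
    {"title": "Book Three", "author": None},
    {"title": "Book Four", "author": "Author A"},
]

# Sets are unordered, so sort before printing to get a stable result
print(sorted(unique_authors(books)))
```

Because a set has no defined order, the authors may be yielded in any order. Code that needs first-seen order would have to use a different structure.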

In [6]:
# Get Llama 3.2 to answer
stream_response(MODEL_LLAMA, question)

This line of code uses a combination of Python's yield from statement, dictionary get method, list comprehension, and generator expressions.

Here's a breakdown of what it does:

- `books` is assumed to be an iterable (e.g., a list or dictionary) containing objects with certain attributes.
- The expression `{book.get("author") for book in books if book.get("author")}` generates a new dictionary where each key-value pair corresponds to the author of a book. The key is the string "author" and the value is the actual author name.

Here's why it does that:

- `for book in books` iterates over each item (`book`) in the list.
- `if book.get("author")` filters out items where the key `"author"` doesn't exist. In Python, dictionary keys are case-sensitive (e.g., "Author" would not be matched by `.get("author")`). However, if you want to make it case-insensitive, consider using `.get("author", None)` instead.
- `book.get("author")` returns the author name of a book. This could raise an exception if the key doesn't exist.

Now, here's where `yield from` comes in:

- `yield from { ... }` takes a sequence (like the dictionary comprehension above) and "unpacks" it as if each item was individually yielded by the function itself.
- The outer function then yields each value one by one, producing an iterator over all values in the dictionary.

So, putting it all together, this line of code generates a generator that produces an author's name for each book where the key `"author"` exists. This allows you to iterate over these authors without having to load all books into memory at once.

Example:

```python
books = [
    {"id": 1, "title": "Book A", "author": "John Smith"},
    {"id": 2, "title": "Book B", "author": "Jane Doe"}
]

for author in yield from {book.get("author") for book in books if book.get("author")}:
    print(author)
```

Output:

```python
John Smith
Jane Doe
```

This code snippet is equivalent to the following, but without using dictionary comprehension and `yield from`:

```python
def get_authors(books):
    authors = []
    for book in books:
        if "author" in book:
            author = book["author"]
            # You could also raise an exception here instead of silently dropping books
            authors.append(author)
    return authors

# Usage
books = [
    {"id": 1, "title": "Book A", "author": "John Smith"},
    {"id": 2, "title": "Book B", "author": "Jane Doe"}
]

for author in get_authors(books):
    print(author)
```
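
Running both versions side by side is a quick way to test the claim of equivalence. In the sketch below, `unique_authors` is a hypothetical wrapper around the original line, `get_authors` follows the function in the answer above, and the sample data is made up to include a duplicate, a `None` author, and a missing key. Note also that a loop written as `for author in yield from {...}:` is not valid Python, which is why the wrapper function is needed.

```python
def unique_authors(books):
    """Hypothetical wrapper around the line from the question."""
    yield from {book.get("author") for book in books if book.get("author")}

def get_authors(books):
    """The list-based version from the answer above."""
    authors = []
    for book in books:
        if "author" in book:
            authors.append(book["author"])
    return authors

# Made-up sample data covering a duplicate, a None author, and a missing key
books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": None},
    {"title": "Book Three", "author": "Author A"},
    {"title": "Book Four"},
]

print(sorted(unique_authors(books)))  # duplicates and empty authors are dropped
print(get_authors(books))             # the duplicate and the None are kept
```

On this data the two functions disagree: the set-based version removes duplicates and falsy authors, while the list-based version only skips books that lack the key altogether.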