# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question
and responds with an explanation. You will be able to use this tool yourself throughout the course!

In [None]:
# imports
import os
from openai import OpenAI
from dotenv import load_dotenv
from IPython.display import display, Markdown, update_display

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')


In [5]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [22]:
system_prompt = """
You are a helpful tutor that explains code. Provide an answer structured in markdown, without code blocks, in the following parts:
- Identify the topic of the question (so the user can look for more info)
- Give an ELI5 explanation of the question.
- Give a step by step explanation of the code.
- Ask the user a follow up question or variation of the question to see if they understand the concept.
- Give the answer to the followup question as a spoiler
"""

In [None]:
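# Ask the selected model (gpt-4o-mini, or llama3.2:1b served by Ollama), with streaming

from enum import StrEnum  # requires Python 3.11+

class Enum_Model(StrEnum):
    GPT = 'gpt-4o-mini'
    LLAMA = 'llama3.2:1b'

def ask_tutor(question, model: str | Enum_Model):
    # StrEnum members are also str instances, so check for the enum type first;
    # plain strings are treated as member names such as 'GPT' or 'LLAMA'
    if not isinstance(model, Enum_Model):
        try:
            model = Enum_Model[model.upper()]
        except KeyError:
            raise ValueError(f"Unknown model: {model}") from None
    if model == Enum_Model.LLAMA:
        # Ollama serves an OpenAI-compatible endpoint; the client requires
        # an api_key value, but Ollama does not check it
        LLM_ENDPOINT = "http://localhost:11434/v1"
        client = OpenAI(base_url=LLM_ENDPOINT, api_key="ollama")
    elif model == Enum_Model.GPT:
        # reads OPENAI_API_KEY from the environment loaded above
        client = OpenAI()
    else:
        raise ValueError(f"Unknown model: {model}")
    stream = client.chat.completions.create(
        model=model.value,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}],
        stream=True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        # delta.content can be None (for example on the final chunk), hence the fallback
        response += chunk.choices[0].delta.content or ''
        update_display(Markdown(response), display_id=display_handle.display_id)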
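
The accumulation loop at the end of `ask_tutor` can be tried without a network call. The sketch below is only an illustration: `fake_chunk` and `fake_stream` are hypothetical stand-ins that mimic the `chunk.choices[0].delta.content` shape used above. They are not part of the OpenAI library.

```python
from types import SimpleNamespace

def fake_chunk(text):
    # Hypothetical stand-in for a streamed chunk; only the attributes
    # that the loop in ask_tutor reads are modelled here.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# Assumed example stream: the last chunk carries no text (content is None).
fake_stream = [fake_chunk("Hello"), fake_chunk(", world"), fake_chunk(None)]

response = ""
for chunk in fake_stream:
    # Without the `or ""` fallback, adding None to a str raises TypeError.
    response += chunk.choices[0].delta.content or ""

print(response)
```

In the notebook, each pass through the loop also calls `update_display`, which is why the answer appears to grow in place.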

In [31]:
ask_tutor(question, 'LLAMA')


## Explanation of the Code

This piece of code is using a concept called "generators" in Python. The function `yield from` is used to generate a sequence of values on-the-fly, without needing to store them all in memory at once.

### ELI5 Explanation:

Think of it like this: Imagine you're trying to create a long list of authors for books. If you were to write the entire list out and then return it, you might run into a problem because Python doesn't know where to put it until you actually read it. That process can be slow.

This type of code is like writing the list once and giving it to your friends who can then use it wherever they need. Instead of having all that data inside one giant list in its entirety, this generator creates a small amount of data each time it's needed (like when someone calls `get("author")` on an author object). It's called a "generator" because it generates values rather than computing them and storing them all at once.

### Code Explanation:

This code is defining a function that takes in another function as an argument (`books`) but doesn't use its return value directly. Instead, it uses Python's `yield from` feature to call the inner function (`book.get("author") for book in books`) repeatedly until it encounters a stop signal.

Here's what the code does step by step:
1. It starts with defining an empty sequence variable `result` that will hold our answer (because this is going to be another generator).
2. It opens the `books` collection, which presumably contains a list of book objects.
3. It then uses an expression to iterate over each book in the `books` collection and call the anonymous function (`book.get("author")`) on it.
4. The result of this iteration is what's assigned to our sequence variable `result`.
5. `result` is an iterator, and when we do `yield from`, Python switches control to the generator inside `books`.
6. This happens until someone calls the `stop()` function on the outer generator (perhaps as a way to explicitly exit the generation process) or if there's no more data left in the collection.
7. At that point, Python returns control back to our inner function (`book.get("author")`), which now has its own sequence variable and can keep generating values until you call `stop()` on it.

### Follow-up Question:

How would this code be used if there were already some data stored elsewhere on the computer?

SPOILER ALERT:

The answer is that this type of code doesn't actually store anything in memory. Instead, it's just using a trick to tell Python (and JavaScript) which values we'd like it to generate based on an iterable (`books`), without having to store those ones in its actual database or cache beforehand.

In [32]:
ask_tutor(question, 'GPT')


### Topic: Python Generators and Set Comprehensions

### ELI5 Explanation
Imagine you have a big box of books, and each book has different information like the title and the author. You want to make a special list of all the authors of the books you have, but you only want to write down the names once, even if the same author wrote multiple books. This code snippet is doing just that! It's like saying, "Hey! For all the books in this box, find out who the authors are and tell me their names, but don’t tell me any duplicates."

### Step by Step Explanation of the Code
1. **Book Collection**: The variable `books` is expected to be a collection (like a list) of dictionaries. Each dictionary represents a book with various attributes like "author," "title," etc.
   
2. **List Comprehension**: The code uses a set comprehension `{...}` to create a set of authors.
   - Inside the braces `{}`, it iterates over each `book` in the collection `books`.
   - For each `book`, it retrieves the author's name using `book.get("author")`. The `get` method is safe—it won’t cause an error if the key "author" doesn’t exist; it just returns `None`.

3. **Filtering**: The `if book.get("author")` part ensures that only books with a valid author are considered. It filters out any books that don’t have an author specified.

4. **Yielding Results**: The `yield from` statement is used to yield each element from the set of authors (which automatically handles duplicates by using the properties of a set) back to wherever this generator function is called. This means it will return each author's name one by one when called upon.

### Follow-up Question
What would happen if you used a list comprehension instead of a set comprehension in this code? Would you still get unique authors?

### Answer to the Follow-up Question: 
<details>
<summary>Click to see the answer</summary>
No, using a list comprehension would collect all authors, including duplicates. You would get a list of authors that may have repeated names if multiple books were written by the same author.
</details>
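
One way to check the explanations above is to run the line from the question on a small sample. This is a minimal sketch: the wrapper function `unique_authors` and the `books` data are made up for illustration, since the original line only makes sense inside some generator function.

```python
def unique_authors(books):
    # Hypothetical wrapper: `yield from` is only valid inside a function,
    # and using it turns that function into a generator.
    yield from {book.get("author") for book in books if book.get("author")}

# Assumed sample data, not from the original exercise.
books = [
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Children of Dune", "author": "Frank Herbert"},  # duplicate author
    {"title": "Anonymous pamphlet"},                           # no "author" key
    {"title": "Untitled", "author": ""},                       # empty author is falsy
    {"title": "Emma", "author": "Jane Austen"},
]

# A set has no guaranteed order, so sort before printing or comparing.
authors = sorted(unique_authors(books))
print(authors)  # ['Frank Herbert', 'Jane Austen']
```

The set comprehension is built in full before the first value is yielded, so the generator saves no memory here; its effect is to remove duplicates and skip missing or empty authors.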