# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [26]:
# imports
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display, update_display
import requests
import json

In [27]:
# constants
MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [14]:
# set up environment
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")

API key looks good so far


In [28]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [19]:
# Get gpt-4o-mini to answer, with streaming
openai = OpenAI()

system_prompt = "You are an assistant that takes a technical question, and responds with an explanation."
user_prompt = "I have a technical question:"
user_prompt += question

def stream_answer():
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
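    for chunk in stream:
        # Keep the raw text; clean only the copy that is displayed,
        # so the bare word "markdown" is not deleted from the answer
        response += chunk.choices[0].delta.content or ''
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)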

stream_answer()

This line of code is using a combination of Python's generator feature and set comprehension to yield authors from a collection of book dictionaries.

Here's a breakdown of each part:

1. **Set Comprehension**: The `{book.get("author") for book in books if book.get("author")}` part creates a set of authors from the list (or iterable) named `books`. A set is used here to ensure that each author is unique — meaning if there are multiple books by the same author, that author's name will only appear once in the resulting set.

   - `book.get("author")`: This method retrieves the value associated with the key `"author"` from each `book` dictionary. If a book does not have an `"author"` key, `get()` will return `None` (or a specified default value if provided).
   - `for book in books`: This iterates over each `book` in the `books` collection.
   - `if book.get("author")`: This condition filters out any books that do not have an author specified (i.e., those that return `None` when calling `get("author")`).

2. **`yield from`**: The `yield from` syntax is used to yield all values from the given iterable. In this case, it will yield each unique author found in the set created by the comprehension. By using `yield from` instead of just `yield`, the function in which this line is written will act as a generator, allowing it to produce values one at a time and efficiently.

In summary, the entire line of code is a generator expression that will yield each unique author from a collection of book dictionaries, skipping any entries where the author is not specified. The structure helps manage memory efficiently by producing authors one at a time rather than generating a complete list at once.
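
The behaviour described above can be checked by placing the line inside a generator function. This is a minimal sketch: the function name `unique_authors` and the sample `books` list are made up for illustration. Note that the set comprehension is evaluated in full before the first value is yielded, so the laziness applies to how values are handed out, not to how the set is built.

```python
def unique_authors(books):
    # The set comprehension runs to completion first, dropping
    # missing or empty authors and collapsing duplicates.
    yield from {book.get("author") for book in books if book.get("author")}

# Hypothetical sample data
books = [
    {"title": "Book1", "author": "AuthorA"},
    {"title": "Book2", "author": "AuthorB"},
    {"title": "Book3", "author": "AuthorA"},   # duplicate author
    {"title": "Book4"},                        # no "author" key
    {"title": "Book5", "author": ""},          # empty string is falsy, so skipped
]

authors = unique_authors(books)   # a generator object; the body has not run yet
result = sorted(authors)          # set order is arbitrary, so sort for display
print(result)
```

Sorting is only there to make the printed order stable; iterating the generator directly returns the same authors in whatever order the set happens to store them.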

In [29]:
# Get Llama 3.2 to answer
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
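
With `"stream": True`, Ollama's chat endpoint replies with newline-delimited JSON: one object per line, each carrying a fragment of the reply under `message.content`, and a final object marked `"done": true`. The sketch below isolates that parsing step so it can be tried without a running server. The helper `collect_stream` and the sample lines are illustrative assumptions, not part of Ollama's API.

```python
import json

def collect_stream(lines):
    """Join the message fragments from an iterable of NDJSON byte lines."""
    parts = []
    for line in lines:
        if not line:              # skip empty lines between objects
            continue
        chunk = json.loads(line.decode("utf-8"))
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):     # the last object signals the end of the reply
            break
    return "".join(parts)

# Hand-written sample lines shaped like a streamed reply (assumed shape)
sample = [
    b'{"message": {"role": "assistant", "content": "Hello"}, "done": false}',
    b'',
    b'{"message": {"role": "assistant", "content": ", world"}, "done": false}',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}',
]
text = collect_stream(sample)
print(text)
```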

In [31]:
system_prompt = "You are an assistant that takes a technical question, and responds with an explanation."
user_prompt = "I have a technical question:"
user_prompt += question

def stream_answer():
    payload = {
        "model": MODEL_LLAMA,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        "stream": True
    }

    # Let requests serialize the payload; fail early on an HTTP error
    response = requests.post(OLLAMA_API, headers=HEADERS, json=payload, stream=True)
    response.raise_for_status()

    full_response = ""
    display_handle = display(Markdown(""), display_id=True)

    # Ollama streams one JSON object per line
    for line in response.iter_lines():
        if not line:
            continue
        try:
            chunk = json.loads(line.decode("utf-8"))
        except json.JSONDecodeError as e:
            print("Error decoding line:", line, e)
            continue
        delta = chunk.get("message", {}).get("content", "")
        full_response += delta
        # Clean only the displayed copy so the raw answer is preserved
        cleaned = full_response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)

stream_answer()

**Explanation of the Code**

The given code snippet is written in Python and utilizes several advanced features, including generators and dictionary iteration.

Here's a breakdown of what the code does:

1. `{book.get("author") for book in books if book.get("author")}`: This part is called a **dictionary comprehension** or more accurately, an iterator expression. It iterates over each item in the `books` collection (which can be a list, dictionary, or any other iterable) that contains a key named `"author"`.

2. `yield from`: The `yield from` statement is used to delegate part of the execution of the generator function to another generator. When you use `yield from`, it "yields" each item from the delegated generator, rather than trying to yield them one by one itself. This can significantly improve performance when dealing with large datasets.

**Putting it all Together**

When combined, these two elements create a **generator expression**:

`yield from {book.get("author") for book in books if book.get("author")}`

This generator expression takes each item `book` that contains the key `"author"`, yields its value (`book.get("author")`), and then uses `yield from` to delegate these yielded values to another generator.

However, there is an issue with the given code. The dictionary comprehension `{...}` is not a valid Python syntax. It should be replaced with `({book.get("author")} for book in books if book.get("author"))`.

Here's how you can use this expression correctly:

python
def get_authors(books):
    authors = ({book.get("author")} for book in books if book.get("author"))
    return authors

# Usage:
authors = get_authors([{"title": "Book1", "author": "AuthorA"}, {"title": "Book2", "author": "AuthorB"}])
for author in authors:
    print(author)


In this corrected example, the generator expression correctly yields each unique author's name.

**Why Use `yield from`?**

The use of `yield from` here allows you to avoid creating an intermediate list of all book authors and then using that list. Instead, it directly iterates over each unique author in a lazy manner, which can be more memory-efficient when working with large datasets.
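
The syntax claim in the answer above can be tested directly. In the sketch below (sample data and function names are illustrative), braces around a single expression followed by `for` form a valid set comprehension, while the suggested replacement `({book.get("author")} for ...)` is a generator expression that produces one single-element set per book and does not remove duplicates.

```python
# Hypothetical sample data with a repeated author
books = [
    {"title": "Book1", "author": "AuthorA"},
    {"title": "Book2", "author": "AuthorA"},
    {"title": "Book3", "author": "AuthorB"},
]

def original(books):
    # Set comprehension: valid syntax, yields each distinct author once
    yield from {book.get("author") for book in books if book.get("author")}

def suggested(books):
    # Generator expression of one-element sets; duplicates are kept
    return ({book.get("author")} for book in books if book.get("author"))

from_original = sorted(original(books))
from_suggested = list(suggested(books))
print(from_original)
print(from_suggested)
```

The first result is a list of author strings with duplicates collapsed; the second is a list of three sets, one per book, so the two forms are not interchangeable.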