# End of week 1 exercise

To demonstrate your familiarity with OpenAI API, and also Ollama, build a tool that takes a technical question,  
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [17]:
# imports
import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [18]:
# constants
MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [19]:
# set up environment
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API Key?")

openai = OpenAI()
openai_ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

API key looks good so far


In [20]:

# here is the question; type over this to ask something new
system_prompt="You are a programmer and a technical support engineer. \
You are able to explain complex concepts into easy, digestible explanations \
using layman's terms and, when needed, visual examples"

default_question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

user_question = input("Enter your question: ") or default_question

Enter your question:  


In [21]:
print(user_question)


Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}



In [22]:
# Get gpt-4o-mini to answer, with streaming
stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_question}
          ],
        stream=True
    )

response = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    response += chunk.choices[0].delta.content or ''
    response = response.replace("```","").replace("markdown", "")
    update_display(Markdown(response), display_id=display_handle.display_id)


Let's break down the code snippet you provided:

python
yield from {book.get("author") for book in books if book.get("author")}


### Explanation of Each Component

1. **`books`**: This is assumed to be a collection (like a list) of dictionaries, where each dictionary represents a book and has various attributes, one of which is "author".

2. **`book.get("author")`**: This method retrieves the value associated with the key "author" from each `book` dictionary. If the key does not exist, it returns `None` instead of raising an error.

3. **Set Comprehension**: The expression `{book.get("author") for book in books if book.get("author")}` is a set comprehension. 
   - It iterates over each `book` in `books`.
   - For each `book`, it checks if `book.get("author")` is truthy (i.e., it exists and is not `None` or an empty string).
   - If it is truthy, it adds the author's name to the set.
   - The use of a set ensures that each author's name is unique; duplicates will be automatically removed.

4. **`yield from`**: This is a construct used in a generator function. It allows the generator to yield all values from an iterable (in this case, a set of authors). Essentially, it lets you "flatten" the values so that they can be yielded one by one.

### What the Code Does

- The code snippet collects unique author names from a list of books, filtering out any entries where the author is not specified.
- It then yields each of these unique author names one at a time, allowing them to be processed or used elsewhere in a way that only retrieves the unique values.

### Why Use This Code?

1. **Efficiency**: Using a set automatically removes any duplicate authors, avoiding unnecessary processing or extra work later on.

2. **Readability**: The use of `yield from` makes the generator's intention clear: it will produce a sequence of authors without creating a separate list that holds all author names before yielding them.

3. **Memory Management**: Since it's a generator, this approach allows for potentially lazy evaluation, which means that you don’t need to store all the authors in memory at once. Instead, you only hold one author at a time, which can significantly save memory when dealing with large lists of books.

### Visual Example

Imagine you have the following `books` list:

python
books = [
    {"title": "Book A", "author": "Alice"},
    {"title": "Book B", "author": "Bob"},
    {"title": "Book C", "author": "Alice"},
    {"title": "Book D", "author": None},
    {"title": "Book E", "author": "Charlie"},
    {"title": "Book F", "author": ""},
]


Running the provided code snippet would yield the following unique authors:

- Alice
- Bob
- Charlie

The author "Alice" appears twice, but in the set, it will only be stored once due to its unique nature. Authors with `None` or empty string as their values will be ignored.

In summary, this snippet is a compact and efficient way to extract a unique list of authors from a collection of books while dealing with possible missing or empty fields.

In [23]:
# Get Llama 3.2 to answer
stream = openai_ollama.chat.completions.create(
        model=MODEL_LLAMA,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_question}
          ],
        stream=True
    )

response = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    response += chunk.choices[0].delta.content or ''
    response = response.replace("```","").replace("markdown", "")
    update_display(Markdown(response), display_id=display_handle.display_id)


Let me break it down for you in simple terms.

This line of code is using a concept called "yield from" to extract data from multiple sources. Here's the code again, with some extra context:

python
from collections.abc import Iterable

def get_books_with_author():
    # Some list or generator containing books
    books = [
        {"title": "Book 1", "author": "Author A"},
        {"title": "Book 2", "author": "Author B"},
        {"title": "Book 3", "author": "Author C"}
    ]
    
    for book in books:
        yield from {book.get("author")}  # This is the mysterious part

get_books_with_author()


In this example, `yield from` tells Python to "yields these values" instead of using its own default way of yielding. Think of it like a super-efficient bucket feeder.

Let me explain what's happening inside that inner `{}` part:

python
from collections.abc import Iterable

def get_book_author():
    # A dictionary containing books with their authors
    book Authors = {
        "Book 1": {"author": "Author A"},
        "Book 2": {"author": "Author B"},
        "Book 3": {"author": "Author C"}
    }

    for title, details in book_authors.items():
        yield from details.get("author")  # Yield just the author name if present

get_book_author()


Here's what each part of this line does:

- `for book in books` iterates over a series of **books** (represented as dictionaries).

- `{book.get("author")}` extracts the "author" value from each book.

Now we put them together to create an infinite iterator for all authors:
`.yield from {book.get("author") for book in books if book.get("author")}``

When the function is iterated over, Python will yield each **author's name** one by one. If an author doesn't exist (i.e., some `book.get("author");` returns `None`), then that value won't be included.

Here's how we can do this:
python
from collections.abc import Iterable

def get_books_with_author():
    # Some list or generator containing books
    books = [
        {"title": "Book 1", "author": "Author A"},
        {"title": "Book 2", "author": "Author B"},
        {"title": "Book 3",  # this one is missing the author
          }
    ]
    
    for book in books:
        yield from {book.get("author")} if book.get("author") else ""  # Include empty author name to fix the condition

get_books_with_author()
for value in get_books_with_author():
    print(value)


When you run this, you'll see: "Author A", "Author B"