# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API, and also Ollama, build a tool that takes a technical question  
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports

from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# constants

MODEL_GPT = 'gpt-5-nano'   # cheaper than gpt-4o-mini; same as day1/day5
# MODEL_LLAMA = 'llama3.2'  # not using Ollama

In [3]:
# set up environment

load_dotenv(override=True)
openai = OpenAI()

In [4]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [5]:
# prompts

system_prompt = "You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs."
user_prompt = "Please give a detailed explanation to the following question: " + question

# messages

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

In [6]:
# Get the GPT model (MODEL_GPT) to answer, with streaming

stream = openai.chat.completions.create(model=MODEL_GPT, messages=messages, stream=True)

response = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    response += chunk.choices[0].delta.content or ''
    update_display(Markdown(response), display_id=display_handle.display_id)

Short answer
- It collects all non-empty author values from a sequence of dicts (books) into a set (which removes duplicates), and then yields each author from that set one by one to the caller.

What each part does
- books is assumed to be an iterable of dictionaries, each representing a book.
- book.get("author"):
  - Retrieves the value for the key "author" from a book dict.
  - If the key is missing, get returns None.
- if book.get("author"):
  - This filters out any books whose author is missing or evaluates to a falsy value (None, empty string, etc.).
- {book.get("author") for book in books if book.get("author")}:
  - This is a set comprehension. It builds a set of all author values (one per book) that pass the filter.
  - Using a set automatically deduplicates authors, so duplicates are removed.
  - Sets are unordered, so the order of authors in the resulting set is not defined.
- yield from { ... }:
  - yield from delegates to the iterable (here, the set of authors) and yields each element to the caller.
  - In effect, the code emits each unique author exactly once.
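
As a rough illustration of the `yield from` step above, the sketch below wraps the expression in a generator function and compares it with an explicit loop. The function names `unique_authors` and `unique_authors_loop` and the sample data are made up for this example; they are not part of the original question.

```python
def unique_authors(books):
    # Original expression: build the set first, then delegate iteration to it.
    yield from {book.get("author") for book in books if book.get("author")}


def unique_authors_loop(books):
    # Spelled-out form that behaves the same for a plain iterable such as a set.
    authors = {book.get("author") for book in books if book.get("author")}
    for author in authors:
        yield author


# Hypothetical sample data: one duplicate author and one book with no author.
sample = [{"author": "Alice"}, {"author": "Bob"}, {"author": "Alice"}, {}]

# Compare as sets, because the iteration order of a set is not guaranteed.
same = set(unique_authors(sample)) == set(unique_authors_loop(sample))
print(same)  # True
```

For a simple iterable like a set, `yield from` is mainly a shorter way of writing the loop; its extra delegation features only matter when the inner iterable is itself a generator.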

Why you might use this
- You want a generator that produces each unique author from a collection of books.
- The use of a set ensures each author is emitted only once (no duplicates).
- yield from turns the function into a generator that hands authors to the caller one at a time, rather than returning a list. Note that the set itself is built in full before the first author is yielded, so only the hand-off is lazy, not the scan over books.

Important nuances and potential improvements
- Order is not guaranteed because sets are unordered. If you care about the order of first appearance, this approach won't preserve it.
  Example of preserving order while deduplicating:
  - def iter_authors_ordered(books):
      seen = set()
      for book in books:
          author = book.get("author")
          if author and author not in seen:
              seen.add(author)
              yield author
- If you want a one-liner that deduplicates while preserving insertion order, you could collect the authors as dict keys (dicts preserve insertion order in Python 3.7+):
  - yield from dict.fromkeys(book.get("author") for book in books if book.get("author"))
  - This yields authors in the order they first appear, without duplicates.
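
The two order-preserving variants above can be sketched side by side as follows. `iter_authors_fromkeys` is a hypothetical name chosen for this example, and the sample data is invented to make the ordering visible.

```python
def iter_authors_ordered(books):
    # Track authors already emitted; yield each one the first time it appears.
    seen = set()
    for book in books:
        author = book.get("author")
        if author and author not in seen:
            seen.add(author)
            yield author


def iter_authors_fromkeys(books):
    # dict keys are unique and keep insertion order (Python 3.7+).
    yield from dict.fromkeys(
        book.get("author") for book in books if book.get("author")
    )


# Hypothetical sample data: "Bob" appears first and is repeated later.
sample = [{"author": "Bob"}, {"author": "Alice"}, {"author": "Bob"}, {"title": "D"}]

ordered = list(iter_authors_ordered(sample))
from_keys = list(iter_authors_fromkeys(sample))
print(ordered)    # ['Bob', 'Alice']
print(from_keys)  # ['Bob', 'Alice']
```

One design difference: the loop version yields each author as soon as it is found, whereas the `dict.fromkeys` version consumes all of `books` before yielding anything.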

Example
- Given:
  books = [
    {"title": "A", "author": "Alice"},
    {"title": "B", "author": "Bob"},
    {"title": "C", "author": "Alice"},
    {"title": "D"}  # no author
  ]
- The code yields: "Alice" and "Bob" (order is not guaranteed due to set), one time each.
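
To check the example above, here is a minimal runnable sketch. The wrapper function name `unique_authors` is an assumption, since the original line must live inside some generator function that the question does not show.

```python
def unique_authors(books):
    # The line from the question, placed inside a generator function.
    yield from {book.get("author") for book in books if book.get("author")}


books = [
    {"title": "A", "author": "Alice"},
    {"title": "B", "author": "Bob"},
    {"title": "C", "author": "Alice"},
    {"title": "D"},  # no author
]

# Sort the results so the printed output does not depend on set ordering.
authors = sorted(unique_authors(books))
print(authors)  # ['Alice', 'Bob']
```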

In [None]:
# Get Llama 3.2 to answer