# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question
and responds with an explanation. You will be able to use this tool yourself during the course!

## Approach: The "3 Whats" technique

Inspired by the [5 Whys](https://en.wikipedia.org/wiki/Five_whys) root-cause analysis method from Toyota,
this notebook uses an iterative **3 Whats** variant to deepen technical explanations.
Each follow-up step picks the hardest concept from the previous answer and explains it:

1. **What #1** -- General explanation of the code / concept
2. **What #2** -- Identify the hardest concept from What #1 and explain it
3. **What #3** -- Identify the hardest concept from What #2 and explain it (may reference but not repeat What #2)

We run the same chain on both **GPT** (via OpenRouter) and **Llama 3.2** (via Ollama) to compare how each model deepens its reasoning.
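
The chain described above can be sketched offline before any API is involved. This is a minimal illustration, not the notebook's implementation: `run_chain`, `fake_model`, and `FOLLOWUP_PROMPT` are hypothetical names, and the stand-in model only reports how many messages it was given. It shows how the conversation history grows by one answer and one follow-up per step.

```python
from typing import Callable

Message = dict[str, str]

# Hypothetical follow-up prompt, shortened for the sketch.
FOLLOWUP_PROMPT = "What is the single hardest concept in your last answer? Explain it."


def run_chain(ask: Callable[[list[Message]], str], question: str, depth: int = 3) -> list[Message]:
    """Ask `question`, then request `depth - 1` deep-dives on the same history."""
    messages = [{"role": "user", "content": question}]
    for step in range(depth):
        answer = ask(messages)  # the model sees the full history so far
        messages.append({"role": "assistant", "content": answer})
        if step < depth - 1:
            messages.append({"role": "user", "content": FOLLOWUP_PROMPT})
    return messages


def fake_model(messages: list[Message]) -> str:
    """Stand-in for a real model call (assumption: no network access needed)."""
    return f"answer based on {len(messages)} prior message(s)"


history = run_chain(fake_model, "What does yield from do?")
for message in history:
    print(message["role"], "->", message["content"])
```

Because every answer is appended before the next follow-up is sent, each deep-dive can refer to earlier answers without repeating them.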

In [15]:
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI

In [16]:
# constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [17]:
load_dotenv(override=True)

openrouter_key = os.getenv("OPENROUTER_API_KEY")
if openrouter_key and len(openrouter_key) > 10:
    print("OpenRouter API key looks good")
else:
    print("Set OPENROUTER_API_KEY in your .env file!")

gpt_client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=openrouter_key,
)

ollama_client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

OpenRouter API key looks good


In [18]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [19]:
THREE_WHATS_SYSTEM = (
    "You are a technical educator. Keep every answer to ONE concise paragraph "
    "(no bullet points, no headings, no code blocks unless essential). "
    "When asked to go deeper, name the single hardest concept from your "
    "previous answer and explain only that concept in one paragraph of new "
    "information. You may reference or allude to earlier explanations but "
    "never repeat or rephrase them."
)

FOLLOWUP = (
    "What is the single hardest concept in your last answer? "
    "Explain it in one concise paragraph. You may reference what you said "
    "before but do not repeat it."
)


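def three_whats(question: str, model: str, client: OpenAI, depth: int = 3) -> None:
    """
    Run the 3 Whats chain: ask a question, then iteratively pick the
    hardest concept from the previous answer and explain it.

    `depth` is the total number of answers, including the initial one.
    Each answer is streamed into the notebook; nothing is returned.
    """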
    messages = [
        {"role": "system", "content": THREE_WHATS_SYSTEM},
        {"role": "user", "content": question},
    ]

    for i in range(depth):
        if i == 0:
            label = f"## Initial explanation ({model})"
        else:
            label = f"## Deep-dive #{i} -- hardest concept from above"

        display(Markdown(label))

        stream = client.chat.completions.create(
            model=model,
            messages=messages,
            stream=True,
        )

        handle = display(Markdown(""), display_id=True)
        answer = ""
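        for chunk in stream:
            # Some providers may send a chunk with no choices (for example a
            # trailing usage chunk); skip it instead of raising IndexError.
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta.content or ""
            answer += delta
            handle.update(Markdown(answer))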

        messages.append({"role": "assistant", "content": answer})

        if i < depth - 1:
            messages.append({"role": "user", "content": FOLLOWUP})

In [20]:
three_whats(question, model=MODEL_GPT, client=gpt_client)

## Initial explanation (gpt-4o-mini)

This code utilizes a generator expression to yield unique authors from a collection of book dictionaries stored in `books`. It specifically uses the `yield from` statement to yield values from the resulting set, which is created by iterating over each `book` in the `books` list, extracting the "author" value when it exists (as checked by `book.get("author")`). The use of a set ensures that any duplicate author names are eliminated, resulting in a clean iteration of unique authors.

## Deep-dive #1 -- hardest concept from above

The hardest concept in the previous explanation is the use of sets for data deduplication. Sets in Python are unordered collections of unique elements, which means that when you insert items into a set, any duplicate entries are automatically removed, guaranteeing that only distinct items remain. This property makes them ideal for tasks where you need to filter out redundancy, such as extracting unique authors from a list of books. Furthermore, set operations are generally more efficient than list operations in terms of membership testing and adding elements, making them a preferable choice for scenarios where uniqueness is crucial.

## Deep-dive #2 -- hardest concept from above

The hardest concept in the last explanation is the efficiency of set operations compared to list operations. In Python, checking for the existence of an item in a set is, on average, O(1) due to its implementation using a hash table, whereas checking membership in a list is O(n) since it requires scanning through the elements sequentially. This efficiency in sets not only speeds up operations such as membership testing but also enhances the performance of operations like adding or removing elements. As a result, sets are preferred when working with collections that require frequent checks for uniqueness and performance is a consideration.
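
The O(1) versus O(n) membership claim in the answer above can be checked empirically. The following is a rough sketch using `timeit` from the standard library; the collection size and repeat count are arbitrary assumptions, and absolute timings will differ between machines.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: the worst case for a sequential list scan

# Time 200 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.4f}s   set: {set_time:.6f}s")
```

On a typical machine the set lookup should be faster by several orders of magnitude, since the list test has to walk all 100,000 elements each time.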

In [21]:
three_whats(question, model=MODEL_LLAMA, client=ollama_client)

## Initial explanation (llama3.2)

This line of code is used within a generator function to yield each author name, one at a time, as the function iterates over a list of dictionaries (`books`) that meet a specified condition. The dictionary search operation (`book.get("author")` and `for book in books if book.get("author")`) ensures that each value being yielded is only from those books where an "author" key exists. It acts like a filter, to some extent - it won't yield from any dictionaries without the 'author' key, but unlike normal for loops or if statements, you get a generator of individual items instead of just iterating over the remaining boolean outcomes. This can be particularly useful in building lazy iterable data pipelines that require filtering before processing.

## Deep-dive #1 -- hardest concept from above

The hardest concept here is likely to be "yield from". In this context, "yield from" is a syntax for yielding elements from another iterable as if they came directly from the current generator iteration. Essentially, when used with an expression like `{book.get("author") for book in books if book.get("author")}`, it bypasses the usual generator behavior and allows us to directly iterate over the individual "authors", creating an iterable that's built by combining multiple iterables, rather than using lazy evaluation or iteration chaining which might have been implied initially.

## Deep-dive #2 -- hardest concept from above

The hardest concept to grasp here is "bypassing generator behavior". Essentially, when using "yield from", the outer generator (created by `{book.get("author") for book in books if book.get("author")}`) takes care of its own iteration and yields values directly, rather than storing them all-up until its completion. This means that "yield from" doesn't modify an existing iterable but starts a new one from scratch each time, which can be confusing to understand correctly if you're already familiar with the more standard generator syntax.
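
To compare the explanations above with what Python actually does, the line from the question can be run directly. This is a small sketch under stated assumptions: the wrapper function `unique_authors` and the sample data are made up for illustration. The braces form a set comprehension, so the full set of authors is built eagerly and deduplicated first; `yield from` then hands out its elements one at a time. Books with a missing or empty author are filtered out.

```python
def unique_authors(books):
    # The set comprehension is evaluated completely before anything is yielded;
    # `yield from` then delegates to the set's iterator.
    yield from {book.get("author") for book in books if book.get("author")}


# Hypothetical sample data, for illustration only.
sample_books = [
    {"title": "Emma", "author": "Jane Austen"},
    {"title": "Persuasion", "author": "Jane Austen"},  # duplicate author
    {"title": "Untitled draft"},                       # no "author" key
    {"title": "Anonymous", "author": ""},              # falsy author, skipped
    {"title": "Dune", "author": "Frank Herbert"},
]

authors = unique_authors(sample_books)  # generator object; body has not run yet
result = sorted(authors)                # sorted because set order is not guaranteed
print(result)  # ['Frank Herbert', 'Jane Austen']
```

The generator itself is lazy, so the function body only starts when the first value is requested, but the set inside it is never built incrementally.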