# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question  
and responds with an explanation. You will be able to use this tool yourself during the course!

In [30]:
# imports
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display, update_display

In [38]:
# constants
MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = "llama3.2:1b"

In [20]:
# set up environment
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')
if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
openai = OpenAI()

API key looks good so far


In [None]:
# pull the model - we use llama3.2:1b because llama3.2 runs very slowly on this laptop
!ollama pull llama3.2:1b

In [21]:
# system prompt
system_prompt = "You are a computer science engineer with excellent programming knowledge and communication skills. \
You receive a technical question and are capable of answering it in short and simple terms to someone with a less technical background."

In [35]:
# user prompt: Question
question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [36]:
# Get gpt-4o-mini to answer, with streaming
def stream_answer_gpt(question):
    stream = openai.chat.completions.create(
        model = MODEL_GPT,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}
        ],
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)

    # # print without markdown
    # for chunk in stream:
    #     print(chunk.choices[0].delta.content or '', end='')

    # print with markdown
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

In [43]:
# Get Llama 3.2 to answer

# Call Ollama through the OpenAI client library, using Ollama's OpenAI-compatible endpoint
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

def stream_answer_ollama(question, modelname = MODEL_LLAMA):
    print("The answer below is generated by model: " + modelname)
    stream = ollama_via_openai.chat.completions.create(
        model = modelname,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question}
        ],
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    
    # print with markdown
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

In [34]:
# RUN
stream_answer_gpt(question)

Sure! Let's break down what this code does.

1. **`{book.get("author") for book in books if book.get("author")}`**: This part of the code is a set comprehension. It creates a set (which is a collection of unique items) containing the authors of books. Here's how it works:
   - It goes through each `book` in the `books` collection.
   - For each `book`, it tries to get the value associated with the key `"author"`.
   - If there's an author (i.e., `book.get("author")` is not `None` or empty), it adds that author's name to the set.

2. **`yield from ...`**: The `yield from` statement is used inside generators in Python. It allows you to yield all values from an iterable (like our set of authors) without having to loop through them explicitly. This means it will return each author one by one to whatever function or context is calling this generator.

So, in summary, this code collects and yields unique authors from a list of books, skipping any books that don't have an author specified. It’s a concise way to get the authors while ensuring there are no duplicates.
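
To check the explanation above, here is a minimal sketch that wraps the line from the question in a generator function and runs it. The function name `unique_authors` and the sample `books` list are hypothetical and were made up for illustration; they are not part of the original exercise.

```python
def unique_authors(books):
    """Yield each distinct, non-empty author found in a list of book dicts."""
    # The set comprehension is built in full first, which removes duplicates;
    # `yield from` then hands its items to the caller one at a time.
    yield from {book.get("author") for book in books if book.get("author")}


# Hypothetical sample data, not from the original notebook
books = [
    {"title": "Emma", "author": "Jane Austen"},
    {"title": "Persuasion", "author": "Jane Austen"},  # duplicate author
    {"title": "Beowulf"},                               # no "author" key
    {"title": "Untitled", "author": ""},                # empty string is falsy
    {"title": "Dracula", "author": "Bram Stoker"},
]

# Sets are unordered, so sort before printing to get a stable result
print(sorted(unique_authors(books)))  # → ['Bram Stoker', 'Jane Austen']
```

The duplicate author appears once, and the books with a missing or empty author are skipped, which matches the summary above.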

In [46]:
# RUN (the model must be available locally first: ollama pull deepseek-r1:1.5b)
stream_answer_ollama(question, modelname = "deepseek-r1:1.5b")

The answer below is generated by model: deepseek-r1:1.5b


<think>
Alright, I'm looking at this code: `yield from {book.get("author") for book in books if book.get("author")}`.

First, I see that it's using Python. It runs three things:

1. It starts a yield-withfrom loop to get the name of an object.
2. Inside the loop, it creates a new generator expression by reading each item from "books". For each book, it gets its author if available, otherwise skips it.
3. This generator is passed directly into the yield-withfrom loop.

I'm not sure what exact object this code is looping over unless there's more context or defined variables elsewhere in the program where it refers to this loop. 

Looking at how Python evaluates loops, especially `yield from`, it iterates over each item from the iterable and executes all statements in order before returning control, stopping the iteration. So if there's a variable named "books" that contains books objects (like dictionaries or data structures), this code would collect the authors of these books into a generator expression and yield them one by one.

But without more context about where "books" is defined, I can only guess. If "books" were a list containing objects with an "author" attribute, this code loops over each object, tries to get the author for each, collects the ones that exist (or perhaps all?), and yields their authors immediately one after another.

I'm wondering if there's any potential issue here. One possible point of confusion is how Python handles dictionary literals in generator expressions. If "book.get("author")" isn't valid syntax, it should raise an AttributeError, right? But usually, `get` on string keys like `"author"` expects something that returns a dictionary or raises a KeyError. So this line seems correct if the books are dictionaries with "author" as a key.

Another thing is parallel execution and thread safety, but since each book in the generator expression contributes only one value (if any), I don't think concurrency issues apply here unless the code would yield early incorrectly if partial data were returned prematurely by generators or coroutines. But that's not directly relevant to this code because it uses `yield from` without considering such risks.

In summary, the code is fetching the first author from each book in a list of books into a generator expression and yielding those authors one after another.
</think>

This code retrieves the "author" value for each book listed in the variable **books**. Each time it evaluates the loop body, it examines an item, checks if the key "author" is present (or is a dictionary returning such data), and returns that information immediately.

In simpler terms:
1. The loop starts without knowing where to begin.
2. It loops through each book in the list **books**.
3. For each book, it looks for an author field or creates a structure if it's not present.
4. Only the authors are collected into a generator expression.
5. This output is then returned one object at a time.

So, the code retrieves a sequence of "author" fields from the list **books** and yields them sequentially.
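
The answer above calls the braces a generator expression, but in Python curly braces make a set comprehension, while parentheses make a generator expression. The sketch below contrasts the two; the function names and the `sample` data are hypothetical and only serve the comparison.

```python
def authors_via_set(books):
    # Curly braces: a set comprehension, so duplicates are dropped
    yield from {b.get("author") for b in books if b.get("author")}


def authors_via_genexp(books):
    # Parentheses: a generator expression, so duplicates and order are kept
    yield from (b.get("author") for b in books if b.get("author"))


# Hypothetical sample data: one repeated author and one book without an author
sample = [{"author": "Ann"}, {"author": "Bob"}, {"author": "Ann"}, {}]

set_result = sorted(authors_via_set(sample))      # sorted, since sets are unordered
genexp_result = list(authors_via_genexp(sample))

print(set_result)     # → ['Ann', 'Bob']
print(genexp_result)  # → ['Ann', 'Bob', 'Ann']
```

So the original line yields every distinct author once, in no guaranteed order, not "the first author from each book".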