# End of week 1 exercise

To demonstrate your familiarity with OpenAI API, and also Ollama, build a tool that takes a technical question,  
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports
import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [3]:
# constants
MODEL_GEMINI = 'gemini-1.5-flash'
MODEL_LLAMA = 'llama3.2'

In [5]:
# set up environment
load_dotenv()
api_key = os.getenv('GEMINI_API_KEY')   
openai_gemini = OpenAI(api_key=api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
openai_llama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


In [8]:
# here is the question; type over this to ask something new
system_prompt = "You are an assistant which has experience of more than 10 years of programming experience"
user_prompt = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [16]:
# Get gpt-4o-mini to answer, with streaming

def messages_for():
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

gemini_response = openai_gemini.chat.completions.create(
    model = MODEL_GEMINI,
    messages = messages_for(),
    stream = True
)

response = ""
display_handle = display(Markdown(""), display_id=True)

for chunk in gemini_response:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

This code snippet is a Python generator expression that efficiently yields authors' names from a list of dictionaries, `books`. Let's break it down piece by piece:

1. **`books`:** This is assumed to be a list (or other iterable) where each element is a dictionary representing a book.  Each book dictionary is expected to have at least a key `"author"`, though its value might be `None` or missing.

2. **`if book.get("author")`:** This is a conditional check within a list comprehension. `book.get("author")` safely retrieves the value associated with the key `"author"` from the `book` dictionary. If the key `"author"` is not present, it returns `None`; otherwise, it returns the author's name (presumably a string).  The `if` condition ensures that only books with a defined `"author"` are processed further.  Crucially, this avoids `KeyError` exceptions that would occur if `book["author"]` (without `.get()`) were used and a key is missing.

3. **`{book.get("author") for book in books}`:** This is a set comprehension. It iterates through each `book` in the `books` list. For each book with a non-`None` `"author"`, it adds the author's name to a set.  Using a set automatically removes duplicate author names.

4. **`yield from`:** This is the key to understanding why this is a *generator*.  `yield from` takes an iterable (in this case, the set comprehension) and yields each item from that iterable one at a time. This is significantly more memory-efficient than creating the entire set in memory first and then iterating over it, especially if `books` is a large list.  The generator produces the author names lazilyâ€” only when requested by the consumer of the generator.

**In summary:**

The code efficiently extracts unique authors' names from a list of book dictionaries. It uses a combination of `book.get()` for safe key access, a set comprehension to eliminate duplicates, and `yield from` to create a memory-friendly generator that yields each unique author's name one by one as they are requested,  preventing unnecessary memory allocation.


**Example:**

Let's say:

python
books = [
    {"title": "Book A", "author": "Jane Doe"},
    {"title": "Book B", "author": "John Smith"},
    {"title": "Book C", "author": "Jane Doe"},
    {"title": "Book D"},  # Missing author
    {"title": "Book E", "author": "Peter Jones"}
]

author_generator = (yield from {book.get("author") for book in books if book.get("author")})

for author in author_generator:
    print(author)


This will output:


Jane Doe
John Smith
Peter Jones


Notice that "Jane Doe" appears only once, demonstrating the set's deduplication.  The output order might vary slightly because sets are unordered, but the unique authors are correctly produced.  If the `yield from` was replaced with simply creating and returning the set,  the memory usage would scale linearly with the number of books and authors, while this generator approach has constant memory usage regardless of the size of `books`.


In [20]:
# Get Llama 3.2 to answer

llama_response = openai_llama.chat.completions.create(
    model = MODEL_LLAMA,
    messages = messages_for()
)
llama_response.choices[0].message.content

'This is a Python snippet that uses the `yield from` keyword, which was introduced in Python 3.3. I\'ll break it down for you.\n\n**What is `yield from`?**\n\nIn Python, a generator function is a special type of function that can be used to generate a sequence of values on-the-fly, rather than computing them all at once and returning them in a list or other collection. The `yield from` keyword allows you to define a nested generator, which means you can yield values from another generator.\n\n**The code**\n\nIn this specific case:\n```python\nyield from {book.get("author") for book in books if book.get("author")}\n```\nThis code is using the `{}` syntax to create a new dictionary comprehension. However, it\'s not creating an actual dictionary; instead, it\'s generating values on-the-fly and yielding them.\n\nHere\'s what happens:\n\n1. `books` is assumed to be an iterable (e.g., a list or tuple) containing dictionaries.\n2. For each dictionary `book` in the `books` iterable:\n\t* The e

In [21]:
Markdown(llama_response.choices[0].message.content)

This is a Python snippet that uses the `yield from` keyword, which was introduced in Python 3.3. I'll break it down for you.

**What is `yield from`?**

In Python, a generator function is a special type of function that can be used to generate a sequence of values on-the-fly, rather than computing them all at once and returning them in a list or other collection. The `yield from` keyword allows you to define a nested generator, which means you can yield values from another generator.

**The code**

In this specific case:
```python
yield from {book.get("author") for book in books if book.get("author")}
```
This code is using the `{}` syntax to create a new dictionary comprehension. However, it's not creating an actual dictionary; instead, it's generating values on-the-fly and yielding them.

Here's what happens:

1. `books` is assumed to be an iterable (e.g., a list or tuple) containing dictionaries.
2. For each dictionary `book` in the `books` iterable:
	* The expression `{book.get("author") for book in books if book.get("author")}`}
	* This inner comprehension iterates over the same `books` iterable and selects dictionaries where `book.get("author")` returns a truthy value (i.e., not `None` or an empty string).
3. For each `book` that passes this condition, its `"author"` value is extracted from the dictionary using `book.get("author")`.
4. The resulting sequence of authors is yielded by the outer generator function with `yield from`.

**Why use `yield from`?**

By using `yield from`, we're creating a new generator function that lazily generates the values instead of computing them all at once and storing them in memory. This has several benefits:

* **Memory efficiency**: We don't need to store the entire sequence of authors in memory, which can help with large datasets.
* **Flexibility**: The code can be used in a loop or as part of other iteration constructs, without having to worry about the overhead of storing and processing the entire dataset at once.

**Example usage**

To use this code effectively, you might pass it to a function like `group_by` that takes an iterable of dictionaries and groups them based on a common key (in this case, `"author"`). The resulting grouped data structure could be used for further analysis or processing. For example:
```python
def group_by(iterable, key_fn):
    groups = {}
    for item in iterable:
        k = key_fn(item)
        if k not in groups:
            groups[k] = []
        groups[k].append(item)
    return groups

authors = yield from {book.get("author") for book in books if book.get("author")}
groups = group_by(books, lambda b: authors)

for author, list of groups.items():
    print(f"{author}: {[''.join(item['title'] for item in list)]}")
```
This code would take the `books` iterable and generate an iterator over the unique authors, then group the books by their corresponding authors. The resulting output would display each book title grouped by author.