# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI
from google import genai
from google.genai import types
import ollama

In [2]:
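# constants

MODEL_GPT = 'gemini-2.5-flash'  # hosted model; this notebook uses Gemini in place of gpt-4o-mini
MODEL_LLAMA = 'gemma3n:latest'  # local Ollama model; replaces the earlier 'llama3.2' setting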

In [3]:
# set up environment
load_dotenv(override=True)  # load GEMINI_API_KEY from a local .env file, if one exists
API_KEY = os.getenv('GEMINI_API_KEY')
client = genai.Client(api_key=API_KEY)

In [4]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [5]:
# prompts

system_prompt = "You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs"
user_prompt = "Please give a detailed explanation to the following question with complete code example: " + question

In [6]:
# messages

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

print(messages)

[{'role': 'system', 'content': 'You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs'}, {'role': 'user', 'content': 'Please give a detailed explanation to the following question with complete code example: \nPlease explain what this code does and why:\nyield from {book.get("author") for book in books if book.get("author")}\n'}]


In [7]:
# Get the hosted model (MODEL_GPT, here Gemini instead of gpt-4o-mini) to answer, with streaming
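
This cell has no code yet. A minimal sketch of one way to fill it, assuming the `google-genai` client created above and its `client.models.generate_content_stream` method, with the system prompt passed through a dictionary `config`. The helper name `stream_answer` is hypothetical and not part of the original notebook.

```python
def stream_answer(client, model, system_prompt, user_prompt):
    """Yield the reply text accumulated so far, once per non-empty streamed chunk.

    Assumes `client` exposes `models.generate_content_stream`, as the
    google-genai client does; any object with the same shape works.
    """
    stream = client.models.generate_content_stream(
        model=model,
        contents=user_prompt,
        config={"system_instruction": system_prompt},
    )
    reply = ""
    for chunk in stream:
        # Some chunks may carry no text, so skip those.
        if chunk.text:
            reply += chunk.text
            yield reply

# Possible use in the notebook, with client, MODEL_GPT, system_prompt and
# user_prompt defined in the cells above:
#
# handle = display(Markdown(""), display_id=True)
# for text in stream_answer(client, MODEL_GPT, system_prompt, user_prompt):
#     update_display(Markdown(text), display_id=handle.display_id)
```

Taking the client as a parameter keeps the helper independent of any one provider object, so it can be exercised with a stand-in client and no network access.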


In [8]:
# Get the local Ollama model (MODEL_LLAMA, here gemma3n) to answer

response = ollama.chat(model=MODEL_LLAMA, messages=messages)
reply = response['message']['content']
display(Markdown(reply))

Okay, let's break down the Python code snippet:

```python
yield from {book.get("author") for book in books if book.get("author")}
```

**What it does:**

This code uses a set comprehension combined with the `yield from` statement to extract the unique author names from a list of book dictionaries.  It's a concise way to iterate through the books and yield each distinct author name that actually exists (i.e., from books that have an "author" key with a non-empty value).

**Explanation, step by step:**

1. **`books`:**  We assume `books` is a list of dictionaries. Each dictionary represents a book, and it's expected to have a key called "author" that might contain the author's name.

2. **`for book in books`:** This is the `for` clause of the comprehension; it iterates through each dictionary (`book`) in the `books` list.

3. **`book.get("author")`:**  The `get()` method is used to safely access the value associated with the key "author" in the current `book` dictionary.  `get()` is preferred over direct dictionary access (e.g., `book["author"]`) because `get()` returns `None` if the key doesn't exist, preventing a `KeyError`.  This is important for robustness.

4. **`if book.get("author")`:** This is a conditional check. It verifies that the value returned by `book.get("author")` is "truthy".  In Python, an empty string (`""`) and `None` are both considered "falsy", while a non-empty string is "truthy".  Therefore, this condition filters out books whose "author" value is empty or `None`, as well as books with no "author" key at all (for which `get()` returns `None`).

5. **`{... for ... if ...}` (Set Comprehension):** The curly braces make this a set comprehension, not a generator expression (which would use parentheses).  It builds the complete set in memory before anything is yielded, and because a set holds each value only once, duplicate author names are removed.  The set comprehension does the following:
   - It iterates through the `books` list.
   - For each `book`, it retrieves the author name using `book.get("author")`.
   - It adds the author name to the set only if it is truthy (due to the `if` condition), so each author appears at most once.

6. **`yield from ...`:** This is the part that makes the enclosing function a generator function.  The `yield from` statement yields each element of the set built by the comprehension.  For a plain iterable such as a set, it's equivalent to writing a `for` loop over the set and `yield`ing each value.  Because sets are unordered, the order in which the authors are yielded is not guaranteed.

**Why use a set comprehension with `yield from`?**

* **Deduplication:**  The set keeps each author name only once, so authors with several books are yielded a single time.
* **Readability:**  The code is concise and readable, expressing the intent clearly.
* **Iterator Interface:** Callers receive a generator they can loop over or pass to `list()`.  Note that the whole set is built when the first value is requested, so this pattern does not save memory compared with returning the set directly.
* **Handles Missing Keys Gracefully:** The use of `book.get("author")` prevents `KeyError` exceptions if a book dictionary doesn't have an "author" key.

**Complete Code Example:**

```python
books = [
    {"title": "Pride and Prejudice", "author": "Jane Austen"},
    {"title": "To Kill a Mockingbird", "author": "Harper Lee"},
    {"title": "1984", "author": ""},  # Empty author
    {"title": "The Great Gatsby"},  # Missing author key
    {"title": "Moby Dick", "author": "Herman Melville"},
    {"title": "Animal Farm", "author": None} #None author
]

def get_authors(books):
  """
  Extracts author names from a list of book dictionaries, handling missing or empty author values.

  Args:
    books: A list of dictionaries, where each dictionary represents a book and 
           may contain an "author" key.

  Yields:
    The author names of the books, filtered to exclude empty or missing author values.
  """
  yield from {book.get("author") for book in books if book.get("author")}

# Use the generator to get the author names
for author in get_authors(books):
  print(author)
```

**Output of the example (the order may vary, because sets are unordered):**

```
Jane Austen
Harper Lee
Herman Melville
```

**Key improvements in the example:**

* **Function encapsulation:** The code is wrapped in a function `get_authors` for better organization and reusability.
* **Docstring:**  A docstring is included to explain what the function does, its arguments, and what it yields.
* **Clearer variable names:**  More descriptive variable names are used.
* **Handles `None` author values:** The code correctly handles cases where the "author" value is `None`.  `None` is also considered falsy, so it will be excluded.

**In summary:** The code provides a concise and robust way to extract the unique author names from a list of book dictionaries, while gracefully handling missing or empty author values.  The set comprehension removes duplicates, and `yield from` exposes the result through a generator.  The trade-offs are that the whole set is built in memory and that the output order is not guaranteed.
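
To make the difference between curly braces and parentheses concrete, here is a small sketch. The sample data and the helper names `authors_with_yield_from` and `authors_with_loop` are illustrative assumptions, not part of the original question.

```python
books = [
    {"title": "Foundation", "author": "Isaac Asimov"},
    {"title": "I, Robot", "author": "Isaac Asimov"},
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Untitled", "author": ""},
    {"title": "No author key"},
]

# Curly braces build a set: duplicates collapse and the whole set exists up front.
as_set = {book.get("author") for book in books if book.get("author")}

# Parentheses build a generator expression: lazy, and duplicates are kept.
as_generator = (book.get("author") for book in books if book.get("author"))

def authors_with_yield_from(books):
    yield from {book.get("author") for book in books if book.get("author")}

def authors_with_loop(books):
    # What `yield from` does for a plain iterable: yield each item in turn.
    for author in {book.get("author") for book in books if book.get("author")}:
        yield author

print(type(as_set).__name__, sorted(as_set))
# set ['Frank Herbert', 'Isaac Asimov']
print(type(as_generator).__name__, list(as_generator))
# generator ['Isaac Asimov', 'Isaac Asimov', 'Frank Herbert']
```

The two generator functions yield the same authors; only the spelling differs.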

In [9]:
books = [
    {"title": "The Lord of the Rings", "author": "J.R.R. Tolkien"},
    {"title": "Pride and Prejudice", "author": "Jane Austen"},
    {"title": "1984", "author": "George Orwell"},
    {"title": "To Kill a Mockingbird"},  # Missing "author" key
    {"title": "The Great Gatsby", "author": None}, #Author is None
    {"title": "Another Book", "author": "Someone"}
]

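def unique_authors(books):
    # The set comprehension removes duplicates; yield from emits each element of the set
    yield from {book.get("author") for book in books if book.get("author")}

authors = unique_authors(books)  # a generator object, not a list

for author in authors:
    print(author)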

George Orwell
J.R.R. Tolkien
Someone
Jane Austen
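
The object returned by a generator function like the one in this cell is a one-shot iterator rather than a list. A short sketch of what that means in practice, using a hypothetical function name and a trimmed-down book list:

```python
def unique_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}

books = [
    {"title": "Pride and Prejudice", "author": "Jane Austen"},
    {"title": "To Kill a Mockingbird"},  # missing "author" key
    {"title": "Another Book", "author": "Someone"},
]

authors = unique_authors(books)
first_pass = sorted(authors)   # consumes the generator
second_pass = sorted(authors)  # nothing left to yield

print(first_pass)   # ['Jane Austen', 'Someone']
print(second_pass)  # []
```

To iterate again, call the function again or store the results in a list.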


In [10]:
# The input data
books = [
    {"title": "The Hitchhiker's Guide to the Galaxy", "author": "Douglas Adams"},
    {"title": "The Restaurant at the End of the Universe", "author": "Douglas Adams"},
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": "Foundation", "author": "Isaac Asimov"},
    {"title": "I, Robot", "author": "Isaac Asimov"},
    {"title": "Neuromancer", "author": "William Gibson"},
    {"title": None}, # A book with no data
    {"title": "The Martian", "author": "Andy Weir"}
]

# The generator function
def get_unique_authors_generator(books_list):
    """
    A generator that yields unique author names from a list of book dictionaries.
    It uses a set comprehension to efficiently find unique names and
    'yield from' to elegantly delegate to the resulting set.
    """
    yield from {book.get("author") for book in books_list if book.get("author")}

# Instantiate the generator
unique_authors = get_unique_authors_generator(books)
print(unique_authors)
# Iterate through the generator and print the results
print("Unique authors yielded by the generator:")
for author in unique_authors:
    print(author)

# You can also convert the generator to a list to see all the results at once
# Note: The order is not guaranteed because sets are unordered
list_of_authors = list(get_unique_authors_generator(books))
print("\nUnique authors as a list (order is not guaranteed):")
print(list_of_authors)

<generator object get_unique_authors_generator at 0x12ad65620>
Unique authors yielded by the generator:
William Gibson
Frank Herbert
Douglas Adams
Andy Weir
Isaac Asimov

Unique authors as a list (order is not guaranteed):
['William Gibson', 'Frank Herbert', 'Douglas Adams', 'Andy Weir', 'Isaac Asimov']
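
If a stable order matters, one option is to deduplicate with a dictionary instead of a set, since dictionaries preserve insertion order in Python 3.7 and later. This is a sketch of an alternative, not the technique from the original question; the function name `unique_authors_in_order` is hypothetical.

```python
def unique_authors_in_order(books_list):
    """Yield each author once, in the order first seen in books_list."""
    # dict.fromkeys keeps the first occurrence of each key and drops repeats;
    # iterating over a dict yields its keys in insertion order.
    yield from dict.fromkeys(
        book.get("author") for book in books_list if book.get("author")
    )

sample_books = [
    {"title": "The Hitchhiker's Guide to the Galaxy", "author": "Douglas Adams"},
    {"title": "The Restaurant at the End of the Universe", "author": "Douglas Adams"},
    {"title": "Dune", "author": "Frank Herbert"},
    {"title": None},  # no author
    {"title": "Foundation", "author": "Isaac Asimov"},
]

print(list(unique_authors_in_order(sample_books)))
# ['Douglas Adams', 'Frank Herbert', 'Isaac Asimov']
```

Like the set version, this builds the full collection of unique names before yielding the first one.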
