# End of week 1 exercise

To demonstrate your familiarity with OpenAI API, and also Ollama, build a tool that takes a technical question,  
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [4]:
# imports

import os
import requests
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [13]:
# constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

openai = OpenAI()

In [3]:
# set up environment

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")



API key looks good so far


In [8]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [18]:
# Define our system prompt - you can experiment with this later, changing the last sentence to 'Respond in markdown in Spanish."

system_prompt = "You are a python technical tutor for python, data science and LLM. \
                Explain the given question in detail. Also provides the example on the scenario if possible"

In [19]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Please give the detailed explanation of this question: "+ question}
    ]

In [20]:
# Get gpt-4o-mini to answer, with streaming
def stream_answer(messages):
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=messages,
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

In [21]:
stream_answer(messages)

The provided code snippet makes use of a Python generator with the `yield from` statement, and it includes a set comprehension. Let’s break down the components to understand what this code does in detail and why it is structured this way.

### Breakdown of the Code

1. **Set Comprehension:**
   The expression `{book.get("author") for book in books if book.get("author")}` is a set comprehension that creates a set of authors from a collection of `book` dictionaries.

   - `books`: This is expected to be an iterable (like a list) containing dictionaries, where each dictionary represents a book.
   - `book.get("author")`: This retrieves the value of the key `"author"` for each `book` dictionary. If the key doesn't exist, it will return `None`.
   - `if book.get("author")`: This condition filters out any books that do not have an author (i.e., where `book.get("author")` is `None` or evaluates to `False`).
   - The result of this comprehension is a set of unique authors since sets automatically eliminate duplicates. 

2. **yield from Statement:**
   The keyword `yield` is used in Python to create a generator. `yield from` is a specific form that allows a generator to yield all values from another iterable. 

   - In this case, `yield from` is used to yield each author from the set created by the set comprehension.
   - When `yield from` is used, it effectively creates a generator that produces each value from the provided iterable (which, in this case, is the set of authors).

### Overall Functionality
The entire line of code can be seen in the context of a generator function that produces a list of unique authors from the given `books` list. Each time the generator function is called, it will generate the next author until there are no more authors left to yield.

### Example Scenario

Let's say we have the following list of book dictionaries:

python
books = [
    {"title": "Book 1", "author": "Author A"},
    {"title": "Book 2", "author": "Author B"},
    {"title": "Book 3", "author": None},  # No author
    {"title": "Book 4", "author": "Author A"},  # Duplicate author
    {"title": "Book 5", "author": "Author C"},
    {"title": "Book 6"}  # No author key
]


In this case, if we use the code snippet you provided in the context of a generator function, it would look like this:

python
def get_unique_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}

# Using the generator function
for author in get_unique_authors(books):
    print(author)


### Output

The expected output of this code would be:


Author A
Author B
Author C


### Explanation of Output

Here’s how the output is derived:

- **Unique Authors:** The authors "Author A", "Author B", and "Author C" are included in the output because they are unique and exist in the `books` list. 
- **Filtered Out:** "None" values and books without the `"author"` key are filtered out, meaning that they don't appear in the final output.

This structured approach using set comprehension combined with `yield from` effectively allows for concise, readable, and efficient extraction of unique authors from a list of books. It's particularly useful in the context of large datasets where unique values are often needed without duplicates.

In [22]:
# Get Llama 3.2 to answer
ollama_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

def stream_ollama_answer(messages):
    stream = ollama_openai.chat.completions.create(        
        model=MODEL_LLAMA,
        messages=messages,
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)

In [23]:
stream_ollama_answer(messages)

**Detailed Explanation**

This line of code uses a combination of Python features to achieve a specific goal. It's part of a larger context called "list comprehension" or more specifically, an iterable expression.

Let's break down what this code does:

1. `{book.get("author") for book in books if book.get("author")}` is an **iterable expression**. This syntax uses the `for` loop to iterate over the elements of a collection (in this case, `books`) and applies a transformation to each element using the `get()` method.

2. `book.get("author")` is the **transformation** applied to each book in the `books` collection. The `get()` method retrieves the value associated with the key `"author"` from each book's dictionary (assuming that's how data is stored). If the key doesn't exist, it returns `None`.

3. `yield from ...` is a **context manager** technique used to **yields an iterator** created by the iterable expression.

So, what does this code actually do?

- It creates a generator of author names.
 
Here's how it works:

  - For each book in `books`, it gets the author using `book.get("author")`.
  
  - While iterating over these authors, if an author is found to be `None` during iteration (for example, some books have missing data), the subsequent code after the iterable expression will be skipped and move on with the next non-None book.
  
It then uses **`yield from`** before an iterable expression.

- It takes any existing generator function in the current namespace, which produces an iterator, and extends it using `yield from ...`.
- Here we’re iterating over each book’s author:
  - If a book is missing an ‘author’ key, skip to the next book
- Yield each result as if they were generated by your existing code – this way you avoid creating any overhead in memory while still being able to use iteration directly with this part of your function.



Here's a step-by-step example:

Let's assume we have a list of books (`books`) and a dictionary that stores book data. Each book is stored under the key "book". Below are two functions, `get_books()` and `extract_authors()`, which represent our code:



python
import json


# Function to get all books with their author

def get_books():
    books = [
        {"title": 'Book1', "author": 'Author 1'},
        {"title": 'Book2', "", },
        {"title": 'Book3', "author": 'Author 4'},
    ]

    # Use list comprehension for filtering and generating an iterator
    authors = {book['author'] for book in books if book.get('author')}
    return (
        yield from {book.get("author")
            for book in books 
            if book.get("author")}
    )


# Generate author names

def main():
    extracted_authors = get_books()

if __name__ == "__main__":
    for x in extracted_authors:
       print(x)
