# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
import ollama
from typing import List
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Set up the environment: load the API key, then define constants

load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
# constants 
MODEL_GPT = 'gpt-4o-mini'
OLLAMA_API = "http://localhost:11434/api/chat"
MODEL_LLAMA = 'llama3.2:1b'
HEADERS = {"Content-Type": "application/json"}
openai = OpenAI()

API key looks good so far


In [3]:
# Let's just make sure the model is loaded
!ollama pull llama3.2:1b

pulling manifest
pulling 74701a8c35f6... 100% 1.3 GB
pulling 966de95ca8a6... 100% 1.4 KB
pulling fcc5a6bec9da... 100% 7.7 KB
pulling a70ff7e570d9... 100% 6.0 KB
pulling 4f659a1e86d7... 100% 485 B
verifying sha256 digest
writing manifest
success


In [4]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [5]:
system_prompt = "You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs only"
user_prompt = "Please give a detailed explanation of the following question:" + question

In [16]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]
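
Because `messages` is built once from module-level variables, asking a new question means re-running the question, prompt, and messages cells. One way to avoid that is to build the list from the question each time. The sketch below uses a hypothetical helper, `build_messages`, which is not part of the original exercise; the prompt wording is copied from the cells above.

```python
from typing import Dict, List

# Same wording as the notebook's system prompt; repeated here so the sketch runs on its own.
SYSTEM_PROMPT = (
    "You are a helpful technical tutor who answers questions about python code, "
    "software engineering, data science and LLMs only"
)


def build_messages(question: str, system_prompt: str = SYSTEM_PROMPT) -> List[Dict[str, str]]:
    """Return the system + user message pair for one question (hypothetical helper)."""
    user_prompt = (
        "Please give a detailed explanation of the following question:\n"
        + question.strip()
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


example = build_messages("What does `yield from` do?")
print([m["role"] for m in example])  # ['system', 'user']
```

The returned list has the same shape as `messages`, so it could be passed to either model call in place of the global.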

In [17]:
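# Get gpt-4o-mini to answer, with streaming
def get_gpt_answer():
    """Stream the reply to `messages` into a single Markdown display that updates in place."""
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=messages,
        stream=True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        # delta.content can be None (for example on the final chunk), hence the `or ''`
        response += chunk.choices[0].delta.content or ''
        # Remove code fences and the word "markdown" so a reply wrapped in a markdown fence
        # still renders; note this also strips the fences around any code samples
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)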
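
The same accumulate-and-redisplay loop can in principle be pointed at the local model. The sketch below is an assumption rather than part of the original exercise: it relies on `ollama.chat(..., stream=True)` yielding chunks that expose `['message']['content']`, and the names `accumulate` and `stream_llama_answer` are hypothetical. The accumulation step is split into a pure helper so it can be checked without a running server.

```python
from typing import Iterable, Iterator, Optional


def accumulate(pieces: Iterable[Optional[str]]) -> Iterator[str]:
    """Yield the running text after each streamed piece; None counts as empty."""
    text = ""
    for piece in pieces:
        text += piece or ""
        yield text


def stream_llama_answer(messages, model="llama3.2:1b"):
    """Stream a reply from a local Ollama server into one Markdown display."""
    # Imported here so `accumulate` can be used without these packages installed.
    import ollama
    from IPython.display import Markdown, display, update_display

    # Assumption: with stream=True, ollama.chat returns an iterator of chunks.
    stream = ollama.chat(model=model, messages=messages, stream=True)
    handle = display(Markdown(""), display_id=True)
    pieces = (chunk["message"]["content"] for chunk in stream)
    text = ""
    for text in accumulate(pieces):
        update_display(Markdown(text), display_id=handle.display_id)
    return text


print(list(accumulate(["He", None, "llo"])))  # ['He', 'He', 'Hello']
```

Calling `stream_llama_answer(messages)` would need the Ollama server running and the model pulled, as in the cells above.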

In [18]:
get_gpt_answer()

Let's break down the given code snippet: `yield from {book.get("author") for book in books if book.get("author")}`.

### Explanation of the Components

1. **Set Comprehension**: 
   - The expression `{book.get("author") for book in books if book.get("author")}` is a set comprehension. This creates a set of unique authors from a list (or iterable) of `books`.
   - `book.get("author")` attempts to retrieve the value associated with the key "author" from each `book` dictionary.
   - The comprehension iterates through each `book` in the `books` collection (which we can assume is a list of dictionaries or similar structures).
   - The condition `if book.get("author")` filters out any `book` entries where the `"author"` key does not exist or its value is `None`, ensuring that only valid authors are included.

2. **The `yield from` Statement**:
   - `yield from` is a statement used in Python's generator functions. It allows the generator to yield all values from an iterable (in this case, the set of authors) one by one.
   - Using `yield from` is particularly useful for delegating the yielding process to another iterable while maintaining control within the generator function.

### Purpose of the Code

In the context of a generator function, this line of code does the following:

- Collects all unique authors from the `books` list, effectively filtering out any books that do not have an author specified.
- Each unique author is yielded one at a time to whoever is consuming the generator — for instance, it could be used in a loop to process each author separately.

### Why Use This Code?

- **Elimination of Duplicates**: By using a set comprehension, the code automatically ensures that each author is unique in the results. If two or more books are by the same author, only one instance of that author's name will be yielded.
- **Efficiency**: The `yield from` construct is efficient for producing a series of values and is particularly useful when working with large datasets or streams of data, allowing the consumer of the generator to process each author as they are needed without having to load everything into memory at once.

### Example Walkthrough

Suppose you have the following list of `books`:

python
books = [
    {"title": "Book A", "author": "Author 1"},
    {"title": "Book B", "author": "Author 2"},
    {"title": "Book C", "author": "Author 1"},
    {"title": "Book D"},  # No author
    {"title": "Book E", "author": "Author 3"},
]


When you run the comprehension part:

python
{book.get("author") for book in books if book.get("author")}


You would get a set: `{"Author 1", "Author 2", "Author 3"}`.

Then, if this code is part of a generator function and you call that generator, each time the generator is iterated over, it would yield each of these authors:

- "Author 1"
- "Author 2"
- "Author 3"

### Conclusion

This code efficiently extracts unique authors from a collection of book dictionaries while filtering out entries that do not have an associated author. It's an example of combining set comprehensions with Python's generator capabilities.
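
The behaviour described above can be checked directly. The sketch below wraps the line in a hypothetical generator function, `unique_authors`, using the sample `books` from the walkthrough plus one extra entry with an empty-string author (added here for illustration). Two details are worth seeing in practice. First, `if book.get("author")` drops any falsy value, not only missing keys and `None`. Second, the set is built in full before the first author is yielded, so nothing is produced until every book has been scanned. `unique_authors_lazy` is an assumed alternative, not from the original, that yields while it scans and keeps first-seen order.

```python
def unique_authors(books):
    """Yield each distinct, truthy author once; order is arbitrary because sets are unordered."""
    # The whole set is constructed first, then its items are yielded one by one.
    yield from {book.get("author") for book in books if book.get("author")}


def unique_authors_lazy(books):
    """Assumed alternative: yield each new author as soon as it is seen."""
    seen = set()
    for book in books:
        author = book.get("author")
        if author and author not in seen:
            seen.add(author)
            yield author


books = [
    {"title": "Book A", "author": "Author 1"},
    {"title": "Book B", "author": "Author 2"},
    {"title": "Book C", "author": "Author 1"},
    {"title": "Book D"},                # no author key
    {"title": "Book E", "author": "Author 3"},
    {"title": "Book F", "author": ""},  # empty string is falsy, so it is filtered too
]

print(sorted(unique_authors(books)))      # ['Author 1', 'Author 2', 'Author 3']
print(list(unique_authors_lazy(books)))   # ['Author 1', 'Author 2', 'Author 3']
```

The first call is sorted only to make the printed order predictable.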

In [23]:
# Get Llama 3.2 to answer
response = ollama.chat(model=MODEL_LLAMA, messages=messages)
reply = response['message']['content']
display(Markdown(reply))

This line of code uses Python's built-in `yield from` statement, which is a feature introduced in Python 3.10. Here's a breakdown of what it does:

1. `books`: This variable is expected to be an iterable (such as a list, tuple, or dictionary) that contains objects with the attribute `get("author")`.
2. The expression `{book.get("author") for book in books}` generates a new iterable containing the authors' names.
3. `yield from`: This keyword returns generators (i.e., iterables that can be used to generate values on-the-fly). It takes an iterable (in this case, the result of the previous expression) and returns a generator expression.

So, what's happening here?

- The code first gets all authors' names for each book in `books` using a generator expression.
- It then uses `yield from` to return this generated list as a new iterable. This means that when the outer loop (which iterates over the `books` iterable) finishes, it will yield each author's name one at a time.

Why is this useful?

- The code doesn't have to load all authors' names into memory at once.
- It can handle large datasets without running out of memory.
- It allows for efficient iteration over the data, as the results are generated on-the-fly rather than loaded into memory upfront.

Here's an example to illustrate how it works:

```python
books = [
    {"author": "Author1", "title": "Book1"},
    {"author": "Author2", "title": "Book2"},
    {"author": "Author3", "title": "Book3"}
]

for author in yield from books:
    print(author)

# Output:
# Author1
# Author2
# Author3
```

In this example, we have a list of dictionaries where each dictionary represents a book with an `author` key. We use the generator expression to get the authors' names for each book and then iterate over the resulting iterable.

This code is likely used in a data processing or data analysis pipeline where you need to process large datasets of books. The `yield from` statement makes it efficient and scalable, especially when dealing with large datasets.
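
The example in the reply above cannot run as written: `yield from` is only valid inside a function body, so `for author in yield from books:` is rejected by the parser. (`yield from` was also added in Python 3.3 by PEP 380, not 3.10.) A corrected sketch follows, using the same `books` list and a hypothetical generator function named `authors`. The original loop is kept as a string so the syntax check does not stop the cell.

```python
books = [
    {"author": "Author1", "title": "Book1"},
    {"author": "Author2", "title": "Book2"},
    {"author": "Author3", "title": "Book3"},
]

# The loop from the reply, compiled rather than executed so the error can be caught.
snippet = "for author in yield from books:\n    print(author)\n"
try:
    compile(snippet, "<reply>", "exec")
    compiles = True
except SyntaxError:
    compiles = False
print(compiles)  # False


def authors(books):
    # `yield from` inside a function body turns the function into a generator.
    yield from {book.get("author") for book in books if book.get("author")}


# Sorted because iteration order over a set is not guaranteed.
for author in sorted(authors(books)):
    print(author)
# Author1
# Author2
# Author3
```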