# Creating technical explainer

Tool that responds to technical questions, using both Open AI (Paid) and DeepSeek (Free) api's

In [20]:
# imports

import os
import requests
from dotenv import load_dotenv

from IPython.display import Markdown, display, update_display
from openai import OpenAI
import ollama

In [23]:
# constants

MODEL_GPT = "gpt-5-nano"
MODEL_OPEN = "deepseek-r1:8b"

# Constants
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}

In [3]:
# Load environment variables

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please troubleshoot to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


In [4]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [5]:
# Define our system prompt and user prompt

system_prompt = "You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs"
user_prompt = "Please give a detailed explanation to the following question: " + question


In [7]:
# Create the message structure
def create_message():
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

## Open AI function


In [18]:
# Create an openai object instance
openai = OpenAI()

# And now: call the OpenAI API to perform the task.
def respond_with_openai():
    
    # create message format to the LLM and request response
    response = openai.chat.completions.create(
        model = MODEL_GPT,
        messages = create_message(),
        stream=True
    )
    return response


In [30]:
# A function to display this nicely in the Jupyter output, using markdown
def display_response_openai():    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in respond_with_openai():
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
    

In [21]:
# Get gpt-4.1-nano to answer, with streaming
display_response_openai()

Certainly! Let's dissect this code snippet step-by-step to understand what it does and why:

python
yield from {book.get("author") for book in books if book.get("author")}


### 1. **Overall Structure**

This line involves several key components:
- A **set comprehension**: `{book.get("author") for book in books if book.get("author")}`
- The `yield from` statement

### 2. **Understanding the Set Comprehension**

python
{book.get("author") for book in books if book.get("author")}


- This creates a **set** of authors from a collection called `books`.
- `books` is assumed to be an iterable (like a list) containing **dictionaries** representing books.
- For each `book` in `books`, it performs:
  
  python
  if book.get("author")
  
  
  - This checks whether the `"author"` key exists and has a truthy value in the current `book` dictionary.
  - If the author exists and is not `None` or empty, it gets added to the set.
  
- The **value** added to the set is obtained by:

  python
  book.get("author")
  
  
  - The `.get()` method retrieves the value for `"author"` with the safety of not raising a `KeyError` if the key is missing—returning `None` instead.
  - Because of the `if` condition, only non-`None` (and truthy) author values are included.

**Result**: this set will contain **unique authors** from the list of books, ignoring duplicates and any books without an author.

---

### 3. **Why a Set?**

- Sets automatically eliminate duplicates, so this comprehension yields **a collection of unique authors**.
- This is useful when you want to process or yield a list of distinct authors.

---

### 4. **The `yield from` Statement**

python
yield from ...


- `yield from` is used in generator functions to **delegate** part of the generator’s operations to an **iterable**.
- It takes an **iterable** (or, more generally, any iterator, including a set, which is iterable) and yields its items one by one.

**In this case**, the set comprehension produces a set, which is iterable.

### 5. **Putting it all together**

- The code **generates (yields)** each author from the set of unique authors derived from the `books`.
- Effectively, it yields **each author exactly once**, since the set ensures distinct values.

### 6. **Why use `yield from` here?**

- Instead of writing a loop explicitly:

  python
  for author in {book.get("author") for book in books if book.get("author")}:
      yield author
  

  the `yield from` syntax provides a **more concise** way to yield all items from an iterable.

---

### **Summary**

- The code creates a set of unique authors from a collection of `books`, ignoring entries without an author.
- It then **yields each author** one by one using `yield from`, making this code suitable within a generator function.

---

### **Example in context**

python
def get_unique_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}

books = [
    {"title": "Book 1", "author": "Author A"},
    {"title": "Book 2", "author": "Author B"},
    {"title": "Book 3", "author": "Author A"},
    {"title": "Book 4"}  # No author key
]

for author in get_unique_authors(books):
    print(author)


**Output:**


Author A
Author B


In conclusion, the code efficiently yields **all unique, non-empty author names** from a list of books within a generator function.

## DeepSeek Function

In [25]:
import subprocess

# Check if the model is already downloaded
try:
    result = subprocess.run(['ollama', 'list'], capture_output=True, text=True, check=True)
    if MODEL_OPEN in result.stdout:
        print(f"Model '{MODEL_OPEN}' found.")
    else:
        print(f"Model '{MODEL_OPEN}' not found. Downloading...")
        # Download the model
        subprocess.run(['ollama', 'pull', MODEL_OPEN], check=True)
        print(f"Model '{MODEL_OPEN}' downloaded successfully.")
except subprocess.CalledProcessError as e:
    print(f"Error checking or downloading Ollama model: {e}")
    print(f"Stderr: {e.stderr}")
except FileNotFoundError:
    print("Ollama command not found. Please ensure Ollama is installed and in your system's PATH.")


Model 'deepseek-r1:8b' found.


In [26]:
# Create a payload dictionary to pass to the model
def create_payload(model):
    return {
        "model": model,
        "messages": create_message(),
        "stream": False
    }

In [27]:
# Call the model
def respond_with_ollama():
    
    # Create payload
    payload = create_payload(MODEL_OPEN)
    
    # create message format to the LLM and request response
    response = requests.post(
        OLLAMA_API, 
        json=payload, 
        headers=HEADERS
    )
    return response.json()['message']['content']


In [31]:
# A function to display this nicely in the Jupyter output, using markdown
def display_response_deepseek():
    # Call the summary function that creates the message format and get a response from OpenAI
    response = respond_with_ollama()
    
    # Fetch the final response, ignoring the model thinking
    summary = response.split("</think>")[-1]

    # Convert response into markdown in jupyter notebook
    display(Markdown(summary))


In [32]:
# Call the function to respond to question and display in markdown format in the notebook
display_response_deepseek()


The provided line of code is:

```python
yield from {book.get("author") for book in books if book.get("author")}
```

This is an **enhanced syntax** introduced in Python 3.8 called a **set comprehension**, which creates a set (a collection of unique elements) using the values extracted from each `book` dictionary where the "author" key exists and has a truthy value.

### Explanation:
- **Set Comprehension**: `{expression for item in iterable if condition}`  
  This is similar to list comprehensions but builds an unordered collection of unique items (a set). It iterates over each element (`book`) in `books` (which should be a list or sequence of dictionaries), checks the condition, and adds elements that meet it.

- **`book.get("author")`**:  
  This retrieves the value associated with the key `"author"` from the dictionary `book`. Using `.get()` ensures no KeyError if the key doesn't exist—it returns `None` (or nothing if a default is provided). Here, it checks for existence implicitly because `if book.get("author")` evaluates to `True` only if the value exists and is truthy. If `"author"` is missing or has a falsy value (`""`, `0`, `False`, etc.), it skips that entry.

- **Condition**:  
  The condition `if book.get("author")` filters out books where:
  - The key "author" doesn't exist (returns `None`, which evaluates to `False`).
  - Or the author's name is falsy (e.g., an empty string, `0`, or `False`). 

- **Generator Behavior**:  
  Since it's used with `yield from`, this set becomes a generator that produces each unique author exactly once. However, note that building a set first means all authors are collected into memory before yielding.

### Example:
Given:

```python
books = [
    {"title": "Book1", "author": "Author A"},
    {"title": "Book2", "author": "Author B"},
    {"title": "Book3", "author": ""},
    {"title": "Book4", "author": None},
]

# The set comprehension creates a set: {'Author A', 'Author B'} (skips empty string and missing/None authors)
unique_authors = {book.get("author") for book in books if book.get("author")}  # Set of unique, truthy authors

yield from unique_authors
```

This generator will yield:
- "Author A"
- "Author B"

### Important Notes:
1. **Memory Consideration**:  
   The set is built entirely first (all distinct authors), then the elements are yielded one by one.
2. **Use Case**: This might be intended to avoid duplicates and skip falsy values, but it's not efficient for very large lists where memory is a concern.

### Improved Version (if you want true uniqueness without building a set first):
You can use `filter` + `.add()` in a generator function:

```python
for author in book.get("author") or "Anonymous" :  # This syntax isn't valid; let me correct it.
```

But wait, the original code uses a condition on truthiness. Here's a corrected and clearer version:
- **Corrected Code** (with set comprehension) builds all unique authors first.

### Common Pitfalls:
1. **Unnecessary Set Construction**:  
   The `yield from` syntax is used with a set, so it will build the entire set in memory before yielding.
2. **Falsies Filtered Out**:  
   If an author's name is `"0"` or `"False"`, they are skipped because these evaluate to `False`. Use `.get("author")` explicitly checks for existence; use `if book.get('author') is not None` if you want to include all authors regardless of falsy values.

### Correction:
The code likely has a syntax error (unmatched quotes). It should be:

```python
yield from {book.get("author") for book in books if book.get("author")}
```

But I'll explain the intended functionality as written. If you have more questions, feel free to ask!