# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question
and responds with an explanation. You will be able to use this tool yourself throughout the course!

In [168]:
# imports
import pandas as pd
import numpy as np
import sys
import os

from openai import OpenAI
from IPython.display import Markdown, display
from dotenv import load_dotenv
import time
from tqdm.notebook import tqdm

In [169]:
print(sys.executable)

C:\Users\Prithvi\anaconda3\envs\udemyllm\python.exe


In [170]:
os.environ.get('CONDA_DEFAULT_ENV')

'udemyllm'

In [207]:
# print(os.environ.get('OPENAI_API_KEY'))

In [141]:
load_dotenv()

True

In [208]:
# os.environ.get('OPENAI_API_KEY')

In [142]:
# constants
# MODEL_GPT = 'gpt-4o-mini'  # earlier choice; the assignment below is the one in effect
MODEL_GPT = 'gpt-4.1-nano'
MODEL_LLAMA = 'tinyllama:latest'

In [143]:
# set up environment
os.chdir(r"C:\Users\Prithvi\Downloads\Practice\Udemy - LLM Engineering\llm_engineering")
os.getcwd()

'C:\\Users\\Prithvi\\Downloads\\Practice\\Udemy - LLM Engineering\\llm_engineering'

In [206]:
api_key = os.getenv('OPENAI_API_KEY')

In [145]:
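# os.getenv returns None when the variable is missing, so check for that before calling startswith
if api_key and api_key.startswith('sk-proj') and len(api_key) > 10: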
    print("API key looks correct")
else:
    print("API key is not correct")

API key looks correct
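The key check above can be wrapped in a small function that also handles a missing key and stray whitespace. This is a minimal sketch, not part of the original notebook: the name `check_api_key` and its `prefix` and `min_length` parameters are hypothetical, and the prefix and length thresholds simply mirror the cell above.

```python
import os

def check_api_key(key, prefix="sk-proj", min_length=10):
    """Return a short status message about the key without revealing the key itself."""
    if not key:
        # os.getenv returns None when the variable is not set
        return "No API key found; check that OPENAI_API_KEY is set in your .env file"
    if key != key.strip():
        return "API key has leading or trailing whitespace"
    if not key.startswith(prefix) or len(key) <= min_length:
        return "API key is not correct"
    return "API key looks correct"

print(check_api_key(os.getenv("OPENAI_API_KEY")))
```

Returning a message instead of printing inside the function keeps the check easy to reuse in other cells.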


In [161]:
# here is the question; type over this to ask something new

question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [171]:
system_prompt = "Please explain what this Python code does and why. Please format output in \
Markdown and with nice emojis like chatgpt output wherever appropriate to explain"

In [172]:
user_prompt = """yield from {i.get("author") for i in dictionary1 if i.get("author")}"""
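To check any model's explanation against real behavior, the snippet in the prompt can be run inside a generator function. The sketch below is illustrative only: the function name `unique_authors` and the sample `books` data are made up for this example.

```python
def unique_authors(books):
    """Yield each distinct, non-empty author found in an iterable of dicts."""
    # The set comprehension removes duplicates and skips missing or empty authors;
    # `yield from` then hands the set's elements to the caller one at a time.
    yield from {book.get("author") for book in books if book.get("author")}

books = [
    {"author": "Alice", "title": "Book 1"},
    {"author": "Bob", "title": "Book 2"},
    {"author": "Alice", "title": "Book 3"},
    {"title": "Untitled"},             # no "author" key
    {"author": "", "title": "Anon"},   # empty author is falsy, so it is skipped
]

# Sets are unordered, so sort before printing to get a stable result.
print(sorted(unique_authors(books)))  # ['Alice', 'Bob']
```

Because the values come from a set, the order in which authors are yielded is not guaranteed; only their uniqueness is.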

#### Getting the response using OpenAI or Ollama models (single function)

In [201]:
def get_response_from_llm(MODEL_ID, user_prompt):
    """Stream a Markdown answer from MODEL_ID, displaying at most 500 chunks."""
    # Local models go through Ollama's OpenAI-compatible endpoint; everything else goes to OpenAI
    if MODEL_ID.startswith('tinyllama') or 'mistral' in MODEL_ID or "llama" in MODEL_ID:
        openai = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    else:
        openai = OpenAI()
    response = openai.chat.completions.create(
        model=MODEL_ID,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        stream=True)
    output_md = None
    full_content = ""
    counter = 0
    pbar = tqdm(total=500, desc="Streaming", ncols=100)
    for chunk in response:
        delta = chunk.choices[0].delta

        if delta.content:
            full_content = full_content + delta.content

            # Create the Markdown display on the first chunk, then update it in place
            if output_md is None:
                output_md = display(Markdown(full_content), display_id=True)
            else:
                output_md.update(Markdown(full_content))

            counter = counter + 1
            pbar.update(1)
            if counter >= 500:
                break  # stop reading the stream once 500 chunks have been shown
            time.sleep(0.1)

    pbar.close()
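The accumulation logic in the streaming loop above can be tried out without calling any API. The sketch below is a hedged illustration: `accumulate_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks imitate only the `choices[0].delta.content` fields that the loop reads, not the full response objects.

```python
from types import SimpleNamespace

def accumulate_stream(chunks, max_chunks=500):
    """Join the text of streamed chunks, stopping after max_chunks non-empty pieces."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:        # defensive: a chunk may arrive with no choices
            continue
        text = chunk.choices[0].delta.content
        if text:                     # skip None or empty deltas
            parts.append(text)
            if len(parts) >= max_chunks:
                break
    return "".join(parts)

def fake_chunk(text):
    """Hypothetical stand-in that mimics only the fields read above."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

stream = [fake_chunk(None), fake_chunk("Hello"), fake_chunk(", "), fake_chunk("world")]
print(accumulate_stream(stream))                # Hello, world
print(accumulate_stream(stream, max_chunks=1))  # Hello
```

Collecting the pieces in a list and joining once avoids rebuilding the string on every chunk, although for a few hundred chunks the difference is small.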

In [202]:
MODEL_GPT

'gpt-4.1-nano'

In [203]:
MODEL_LLAMA

'tinyllama:latest'

In [205]:
get_response_from_llm(MODEL_ID = MODEL_LLAMA, user_prompt = user_prompt)

Streaming:   0%|                                                            | 0/500 [00:00<?, ?it/s]

In the Python code snippet provided, the "yield" statement (yielded) is executed for each entry (item) from the nested list `dictionary1` that satisfies the condition specified by the condition statement (`i.get("author")` in this case). The output generated by PyCharm's IntelliSense feature will have an arrow pointing to each item, indicating its position in the list; the output will be well-formed Markdown with emojis used accordingly to explain what is being explained, like in chatGPT.

In [183]:
# Get the OpenAI model in MODEL_GPT to answer, with streaming
openai = OpenAI()
response = openai.chat.completions.create(model = MODEL_GPT,
                              messages = [
                                  {"role": "system", "content": system_prompt},
                                  {"role": "user", "content": user_prompt}
                              ],
                                         stream = True)

In [184]:
output_md = None

In [185]:
full_content = ""
counter = 0
pbar = tqdm(total=500, desc="Streaming", ncols=100)
for chunk in response:
    delta = chunk.choices[0].delta

    if delta.content:
        full_content = full_content + delta.content

        # Create the Markdown display on the first chunk, then update it in place
        if output_md is None:
            output_md = display(Markdown(full_content), display_id = True)
        else:
            output_md.update(Markdown(full_content))

        counter = counter + 1
        pbar.update(1)
        if counter >= 500:
            break  # stop reading the stream once 500 chunks have been shown
        time.sleep(0.1)

pbar.close()

Streaming:   0%|                                                            | 0/500 [00:00<?, ?it/s]

Certainly! Let's break down this Python code:

```python
yield from {i.get("author") for i in dictionary1 if i.get("author")}
```

---

### What does this code do? 🤔

This line **generates values** from a **set** of authors extracted from `dictionary1`. Here's a step-by-step explanation:

### Step-by-step explanation 🧩

1. **Set comprehension**:  
   ```python
   {i.get("author") for i in dictionary1 if i.get("author")}
   ```
   - This creates a **set** of authors.
   - For each element `i` in `dictionary1` (which is presumably an iterable of dictionaries),  
     - It extracts the value associated with the key `"author"` **only if** that value exists and is truthy (`if i.get("author")`).

2. **Set**:  
   - Using a set (`{...}`) ensures **unique** authors—no duplicates.

3. **`yield from`** 🚀  
   - This statement **yields each element** of the set **one by one** to the caller.  
   - It's a **generator** operation, meaning this code is part of a generator function.

---

### Why? 🤔

- To **generate all unique authors** from the `dictionary1` list of dictionaries.
- 📚 This might be used in situations where you want to **iterate over authors** lazily (one at a time), such as in a larger data processing pipeline.

---

### Summary

✅ **The code:**

- Extracts all authors from a list of dictionaries (`dictionary1`)
- Filters out any entries without an author
- Ensures authors are unique (via set)
- Yields each author one by one to the caller 

---

### Example 📊

Suppose:

```python
dictionary1 = [
    {"author": "Alice", "title": "Book 1"},
    {"author": "Bob", "title": "Book 2"},
    {"author": "Alice", "title": "Book 3"},
    {"title": "Untitled"}  # No author key
]
```

Executing:

```python
yield from {i.get("author") for i in dictionary1 if i.get("author")}
```

Will **yield**:

```
'Alice', 'Bob'
```

**each one individually**.

---

Let me know if you'd like to see

In [None]:
# Get the local Ollama model (MODEL_LLAMA) to answer

#### Getting the response using Ollama models

In [192]:
openai = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
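The choice between the local Ollama endpoint and the default OpenAI endpoint can be isolated in a small pure function, which makes the routing rule easy to test. This is only a sketch: `client_kwargs_for` and `LOCAL_MODEL_HINTS` are hypothetical names, and matching on substrings of the model name is an assumption carried over from this notebook, not a general rule.

```python
OLLAMA_BASE_URL = "http://localhost:11434/v1"  # same endpoint as in the cell above
LOCAL_MODEL_HINTS = ("llama", "mistral")        # assumption: substrings that mark local models

def client_kwargs_for(model_id):
    """Return keyword arguments for OpenAI(...) depending on where the model is served."""
    name = model_id.lower()
    if any(hint in name for hint in LOCAL_MODEL_HINTS):
        # The key is a placeholder; the local Ollama server does not check it.
        return {"base_url": OLLAMA_BASE_URL, "api_key": "ollama"}
    return {}  # defaults: the client reads OPENAI_API_KEY from the environment

print(client_kwargs_for("tinyllama:latest"))
print(client_kwargs_for("gpt-4.1-nano"))
```

With the `openai` package imported as in the first cell, a client could then be created with `OpenAI(**client_kwargs_for(MODEL_LLAMA))`.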

In [193]:
MODEL_LLAMA = 'tinyllama:latest'

In [194]:
# Get the local Ollama model in MODEL_LLAMA to answer, with streaming
response = openai.chat.completions.create(model = MODEL_LLAMA,
                              messages = [
                                  {"role": "system", "content": system_prompt},
                                  {"role": "user", "content": user_prompt}
                              ],
                                         stream = True)

In [195]:
output_md = None

In [196]:
full_content = ""
counter = 0
pbar = tqdm(total=500, desc="Streaming", ncols=100)
for chunk in response:
    delta = chunk.choices[0].delta

    if delta.content:
        full_content = full_content + delta.content

        # Create the Markdown display on the first chunk, then update it in place
        if output_md is None:
            output_md = display(Markdown(full_content), display_id = True)
        else:
            output_md.update(Markdown(full_content))

        counter = counter + 1
        pbar.update(1)
        if counter >= 500:
            break  # stop reading the stream once 500 chunks have been shown
        time.sleep(0.1)
pbar.close()

Streaming:   0%|                                                            | 0/500 [00:00<?, ?it/s]

The code above is Python-based and uses a loop named "for" to retrieve and display some data (in this case, the names of authors) from a dictionary called "dictionary1", filtering by "author". The main output looks like the following:

```markdown
![](https://user-images.githubusercontent.com/68987014/131622959-bbc7d1f0-2eac-4adb-a69b-daebdbdf38ee.png)

Each image includes an author's name in a chatgpt style 🤖 underneath the picture - this is achieved by passing parameters to ChatGPT, namely the name of the author, and using a dedicated tokenizer and the provided model, "large", to generate random sentences with information about each author. The resulting output is a fun and informative way to explore some data associated with our dictionaries!