# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a brochure for a company, to be used for prospective clients, investors and potential recruits.

We will be provided with a company name and the URL of its primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [13]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI
import ollama

In [4]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which the prompt includes a single example of how the model should respond.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
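
As a rough illustration of the idea described above, here is a minimal sketch of a one-shot prompt and of parsing the JSON reply. The names `link_system_prompt`, `parse_links` and `sample_reply` are hypothetical, the reply is written by hand rather than returned by a model, and the JSON shape is an assumption chosen for this example.

```python
import json
from urllib.parse import urljoin

# One-shot system prompt: a single worked example shows the model the JSON shape we expect.
# The exact wording and the JSON schema are assumptions made for this sketch.
link_system_prompt = (
    "You are given a list of links found on a webpage. "
    "Decide which links are most relevant for a company brochure, "
    "such as an About page or a Careers page.\n"
    "Respond in JSON as in this example:\n"
    + json.dumps(
        {
            "links": [
                {"type": "about page", "url": "https://full.url/goes/here/about"},
                {"type": "careers page", "url": "https://another.full.url/careers"},
            ]
        },
        indent=2,
    )
)


def parse_links(reply_text: str, base_url: str) -> list[dict]:
    """Parse a JSON reply and turn any relative URL into an absolute one."""
    data = json.loads(reply_text)
    return [
        {"type": link["type"], "url": urljoin(base_url, link["url"])}
        for link in data.get("links", [])
    ]


# Hypothetical reply, written by hand for illustration (not real model output).
sample_reply = '{"links": [{"type": "about page", "url": "/about"}]}'
print(parse_links(sample_reply, "https://company.com"))
```

Resolving relative links in code with `urljoin` is a safety net in case the model returns "/about" unchanged; the prompt can still ask the model to return full URLs.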

In [5]:

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [6]:
# here is the question; type over this to ask something new

my_question = input("Please enter your detailed question for an appropriate answer:")

Please enter your detailed question for an appropriate answer: yield from {book.get("author") for book in books if book.get("author")}


In [7]:
system_prompt = (
    "You are a helpful technical tutor that answers questions related to Python code, "
    "software engineering, AI implementation and large language models."
)
user_prompt = "Please give a detailed explanation of the following question: " + my_question


In [8]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

In [10]:
# Get gpt-4o-mini to answer the question, streaming the reply

stream = openai.chat.completions.create(
    model=MODEL_GPT, messages=messages, stream=True
)

response = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    response += chunk.choices[0].delta.content or ''
    response = response.replace("```","").replace("markdown", "")
    update_display(Markdown(response), display_id=display_handle.display_id)

In the expression you provided: 

```python
yield from {book.get("author") for book in books if book.get("author")}
```


we're looking at a combination of Python's `yield from` statement, set comprehension, and the use of dictionary methods to filter and yield authors from a collection of "books". Let's break this down step by step.

### Components of the Expression

1. **Set Comprehension**:
   - The part inside the curly braces:
     ```python
     {book.get("author") for book in books if book.get("author")}
     ```
   - This is a *set comprehension*, which is a concise way to create a set from an iterable. 
   - This specific comprehension does the following:
     - Iterates over `books`, which is expected to be an iterable (like a list) of dictionaries.
     - For each `book`, it calls `book.get("author")` to retrieve the value associated with the key `"author"`.
     - The `if book.get("author")` condition ensures that only truthy author values are included in the set. This filters out any books that do not have an "author" key, or whose "author" value is `None` or an empty string.

   - The result is a set containing unique authors of the books because sets inherently do not allow duplicate values.

2. **`yield from`**:
   - The `yield from` statement is used in a generator function. It allows you to yield all values from an iterable, in this case, the set created by the comprehension.
   - This means that the generator will yield each unique author found in the set, one by one, when it is iterated over.
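
To make the delegation concrete, here is a minimal sketch comparing `yield from` with an explicit loop. The function names and `sample_books` are hypothetical and exist only for this comparison; for a simple iterable like a set, the two forms should yield the same values.

```python
def authors_with_yield_from(books):
    # Delegate to the set: each element of the set is yielded in turn.
    yield from {book.get("author") for book in books if book.get("author")}


def authors_with_loop(books):
    # The same behaviour for this simple case, written as an explicit loop.
    unique_authors = {book.get("author") for book in books if book.get("author")}
    for author in unique_authors:
        yield author


sample_books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": "Author A"},  # duplicate author
    {"title": "Book Three"},  # no author key
]

# Sets are unordered, so sort before comparing or printing.
print(sorted(authors_with_yield_from(sample_books)))
print(sorted(authors_with_loop(sample_books)))
```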

### Complete Context

To understand how this code snippet would be used in context, let's combine everything into a sample generator function:

```python
def get_unique_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}
```


In this function:
- `books` is a collection (like a list) of dictionaries where each dictionary represents a book with various details, including an "author".
- The function `get_unique_authors` will yield each unique author present in the provided `books` list when the generator is iterated.
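
The phrase "when the generator is iterated" matters: calling a generator function does not run its body. A small sketch of that timing, where the `events` list and `sample_books` are hypothetical additions used only to record the order of execution:

```python
events = []


def get_unique_authors(books):
    # This line runs only once iteration starts, not when the function is called.
    events.append("body started")
    yield from {book.get("author") for book in books if book.get("author")}


sample_books = [{"title": "Book One", "author": "Author A"}]

gen = get_unique_authors(sample_books)  # creates the generator; the body has not run yet
events.append("generator created")

first_author = next(gen)  # the body starts here, builds the whole set, then yields one author

print(events)        # ['generator created', 'body started']
print(first_author)  # Author A
```

Note that the set itself is built in full on the first `next()` call; only the start of the work is deferred.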

### Example

Let's consider a practical example for clarity:

```python
books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": "Author B"},
    {"title": "Book Three", "author": None},
    {"title": "Book Four", "author": "Author A"},  # Duplicate author
    {"title": "Book Five", "author": ""},
    {"title": "Book Six"},  # No author key here
]

def get_unique_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}

# Using the function
for author in get_unique_authors(books):
    print(author)
```


### Expected Output

The output of the above code would be:


```
Author A
Author B
```
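
Because sets are unordered, the two names may print in either order. If first-seen order matters, one possible variation (not part of the original expression) is to deduplicate with `dict.fromkeys`, since dictionary keys are unique and keep insertion order. The name `get_unique_authors_in_order` is hypothetical.

```python
def get_unique_authors_in_order(books):
    # dict keys are unique and preserve insertion order (guaranteed since Python 3.7),
    # so this keeps the first occurrence of each author in the order encountered.
    seen = dict.fromkeys(book.get("author") for book in books if book.get("author"))
    yield from seen


sample_books = [
    {"title": "Book One", "author": "Author A"},
    {"title": "Book Two", "author": "Author B"},
    {"title": "Book Three", "author": None},
    {"title": "Book Four", "author": "Author A"},  # duplicate author
    {"title": "Book Five", "author": ""},
    {"title": "Book Six"},  # no author key
]

print(list(get_unique_authors_in_order(sample_books)))  # ['Author A', 'Author B']
```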


### Summary

- `yield from`: This allows the function to yield values from another iterable.
- Set comprehension: Creates a set of unique authors from the books list while filtering out any None or empty author values.
- Combined together, this efficient approach provides a clean and effective way to gather and yield unique authors from a collection of book records.
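
The reply above arrived as a stream of small chunks that the display loop stitched together. Here is a minimal offline sketch of that accumulation step. `fake_chunk` and `fake_stream` are hypothetical stand-ins that only mimic the `chunk.choices[0].delta.content` attribute path; they are not objects from the OpenAI library, and the sample text is made up.

```python
from types import SimpleNamespace


def fake_chunk(text):
    """Build a stand-in object exposing .choices[0].delta.content like a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Hypothetical stream: a chunk can carry no text, in which case its content is None.
fake_stream = [
    fake_chunk("Sets "),
    fake_chunk("remove "),
    fake_chunk("duplicates."),
    fake_chunk(None),
]

assembled = ""
for chunk in fake_stream:
    # `or ""` turns a None delta into an empty string so concatenation never fails.
    assembled += chunk.choices[0].delta.content or ""

print(assembled)  # Sets remove duplicates.
```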

In [14]:
# Get Llama 3.2 to answer the question

response = ollama.chat(model=MODEL_LLAMA, messages=messages)
reply = response['message']['content']
display(Markdown(reply))

Let's break down this line of Python code step by step.

**Overview**

This line combines a set comprehension with the `yield from` syntax, which delegates iteration to a sub-iterator. We're also dealing with dictionaries (`book`) and an iterable of them (`books`).

**Line-by-Line Explanation**

```python
{book.get("author") for book in books if book.get("author")}
```

This is a set comprehension, which creates a new set from the elements of an existing iterable. Because the braces contain single values rather than `key: value` pairs, the result is a set, not a dictionary.

Here's what happens inside this comprehension:

* `for book in books`: Iterate over each item (`book`) in the `books` iterable.
* `if book.get("author")`: Filter out any items whose `"author"` value is missing or falsy. The `.get()` method returns `None` if the key is not present, so this condition skips those items, along with any whose author is `None` or an empty string.

Now, let's rewrite the set comprehension in an equivalent form:

```python
set(book.get("author") for book in books if book.get("author"))
```

This creates a new set containing each distinct author name (the value returned by `book.get("author")`); duplicates are stored only once.

**Now, let's incorporate the yield from syntax**

```python
yield from {book.get("author") for book in books if book.get("author")}
```

At this point, the set comprehension has produced a set of unique author names. The set is built in full, in memory, before anything is yielded; `yield from` then obtains an iterator over that set and delegates to it.

The generator yields each author name one by one, but because the whole set already exists, this form does not save memory compared with returning the set directly.

**What's happening behind the scenes**

When Python encounters `yield from`, it does the following:

1. Creates an iterator for the sub-iterator (here, an iterator over the set built by the comprehension).
2. Delegates iteration to this sub-iterator.
3. Yields each item from the sub-iterator, one by one.

In our case, there is no outer loop around the comprehension; it runs once, and then:

1. The set comprehension (`{book.get("author") for book in books if book.get("author")}`) loops over `books` and collects the unique, non-empty author names into a set.
2. `yield from` delegates iteration to this set, so each author name is yielded one by one.

**Example Use Case**

Suppose you have a list of dictionaries representing books:
```python
books = [
    {"title": "Book 1", "author": "John Doe"},
    {"title": "Book 2", "publisher": "ABC Inc."},
    {"title": "Book 3", "author": "Jane Smith"}
]
```
To extract all author names from these books, you could use the `yield from` version of the original line:
```python
def get_authors(books):
    yield from {book.get("author") for book in books if book.get("author")}

# Example usage:
for author in get_authors(books):
    print(author)  # prints John Doe and Jane Smith, in no guaranteed order
```
This code iterates over each unique author name. Note that the set of authors is built in memory before the first name is yielded.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you move to Week 2 (which is tons of fun)</h2>
            <span style="color:#900;">Please see the week1 EXERCISE notebook for your challenge for the end of week 1. This will give you some essential practice working with Frontier APIs, and prepare you well for Week 2.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">A reminder on 3 useful resources</h2>
            <span style="color:#f71;">1. The resources for the course are available <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">here.</a><br/>
            2. I'm on LinkedIn <a href="https://www.linkedin.com/in/eddonner/">here</a> and I love connecting with people taking the course!<br/>
            3. I'm trying out X/Twitter and I'm at <a href="https://x.com/edwarddonner">@edwarddonner</a> and hoping people will teach me how it's done.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../thankyou.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#090;">Finally! I have a special request for you</h2>
            <span style="color:#090;">
                My editor tells me that it makes a MASSIVE difference when students rate this course on Udemy - it's one of the main ways that Udemy decides whether to show it to others. If you're able to take a minute to rate this, I'd be so very grateful! And regardless - always please reach out to me at ed@edwarddonner.com if I can help at any point.
            </span>
        </td>
    </tr>
</table>