# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [None]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [3]:
# Initialize and constants

load_dotenv(override=True)
api_key = "ollama"    
MODEL = 'gemma3:4b'
openai = OpenAI(base_url='http://localhost:11434/v1', api_key=api_key)

In [4]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
ed.links

['https://edwarddonner.com/',
 'https://edwarddonner.com/connect-four/',
 'https://edwarddonner.com/outsmart/',
 'https://edwarddonner.com/about-me-and-about-nebula/',
 'https://edwarddonner.com/posts/',
 'https://edwarddonner.com/',
 'https://news.ycombinator.com',
 'https://nebula.io/?utm_source=ed&utm_medium=referral',
 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html',
 'https://patents.google.com/patent/US20210049536A1/',
 'https://www.linkedin.com/in/eddonner/',
 'https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/',
 'https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/',
 'https://edwarddonner.com/2024/12/21/llm-resources-superdatascience/',
 'https://edwarddonner.com/2024/12/21/llm-resources-superdatascience/',
 'https://edwarddonner.com/2024/11/13/llm-engineering-resources/',
 'https://edwarddonner.com/2024/11/13/llm-engineering-resources/',
 'ht

## First step: Have the LLM figure out which links are relevant

### Use a call to the model to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one shot prompting" in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
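
The relative-link part of this task can also be handled deterministically before the model ever sees the list. Here is a minimal sketch, assuming a hypothetical helper `absolutize_links` (not part of this notebook) built on the standard library's `urllib.parse.urljoin`; it also drops duplicates and non-web links such as `mailto:`. Deciding which links are *relevant* still needs the LLM.

```python
from urllib.parse import urljoin, urlparse

def absolutize_links(base_url, links):
    """Hypothetical helper: resolve relative links against base_url,
    keep only http(s) URLs, and drop duplicates while preserving order."""
    seen = set()
    result = []
    for link in links:
        absolute = urljoin(base_url, link)
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: and similar
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result

example = absolutize_links(
    "https://company.com",
    ["/about", "careers", "https://company.com/about", "mailto:hi@company.com"],
)
print(example)
```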

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
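
Hand-written JSON inside a prompt is easy to get subtly wrong. One option, shown here as a sketch rather than as this notebook's approach, is to build the one-shot example from a Python dict with `json.dumps`, so it is valid JSON by construction. The names `example_reply` and `link_system_prompt_alt` are made up for this illustration.

```python
import json

# The example reply we want the model to imitate, as a plain Python dict
example_reply = {
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"},
    ]
}

# Serializing the dict guarantees the example in the prompt parses as JSON
link_system_prompt_alt = (
    "You are provided with a list of links found on a webpage. "
    "You should respond in JSON as in this example:\n"
    + json.dumps(example_reply, indent=4)
)
print(link_system_prompt_alt)
```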



In [8]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [9]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2024/12/21/

In [10]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

In [11]:
# Anthropic has made their site harder to scrape, so I'm using HuggingFace..

huggingface = Website("https://huggingface.co")
huggingface.links[:4]

['/', '/models', '/datasets', '/spaces']

In [12]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': '/'},
  {'type': 'models', 'url': '/models'},
  {'type': 'datasets', 'url': '/datasets'},
  {'type': 'spaces', 'url': '/spaces'},
  {'type': 'blog', 'url': '/blog'},
  {'type': 'documentation', 'url': '/docs'},
  {'type': 'enterprise', 'url': '/enterprise'},
  {'type': 'endpoints', 'url': 'https://ui.endpoints.huggingface.co'},
  {'type': 'community', 'url': 'https://discuss.huggingface.co'},
  {'type': 'status', 'url': 'https://status.huggingface.co/'},
  {'type': 'github', 'url': 'https://github.com/huggingface'},
  {'type': 'social media', 'url': 'https://twitter.com/huggingface'},
  {'type': 'linkedin',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}
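
Since the reply is whatever JSON the model chose to write, it can help to check its shape before relying on it. Below is a small sketch, assuming a hypothetical `parse_links_reply` helper that is not part of this notebook. It only validates structure; relative URLs such as `/models` in the run above would still need resolving against the site URL before they can be fetched.

```python
import json

def parse_links_reply(raw_reply):
    """Hypothetical helper: parse the model's JSON reply and return a list of
    {"type", "url"} dicts, skipping anything that is not shaped as expected."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    links = data.get("links") if isinstance(data, dict) else None
    if not isinstance(links, list):
        return []
    cleaned = []
    for entry in links:
        # Each entry must be a dict with a string "url"; "type" is optional
        if isinstance(entry, dict) and isinstance(entry.get("url"), str):
            cleaned.append({"type": str(entry.get("type", "unknown")), "url": entry["url"]})
    return cleaned

reply = '{"links": [{"type": "models", "url": "/models"}, {"type": "broken"}]}'
parsed = parse_links_reply(reply)
print(parsed)
```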

## Second step: make the brochure!

Assemble all the details into another prompt for the model.

In [16]:
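from urllib.parse import urljoin

def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        # The model sometimes returns relative links (e.g. "/models"), so resolve them against the site URL
        result += Website(urljoin(url, link["url"])).get_contents()
    return result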

In [None]:
hf_details = get_all_details("https://huggingface.co")

In [18]:
print(hf_details)

Landing page:
Webpage Title:
Hugging Face – The AI community building the future.
Webpage Contents:
Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
NEW
Welcome Cohere on the Hub 🔥
Welcome Hyperbolic, Nebius AI Studio, and Novita on the Hub 🔥
Welcome Fireworks.ai on the Hub 🎆
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 1M+ models
Trending on
this week
Models
microsoft/bitnet-b1.58-2B-4T
Updated
3 days ago
•
7.31k
•
479
HiDream-ai/HiDream-I1-Full
Updated
4 days ago
•
22.8k
•
613
agentica-org/DeepCoder-14B-Preview
Updated
10 days ago
•
30.4k
•
579
moonshotai/Kimi-VL-A3B-Thinking
Updated
about 3 hours ago
•
25k
•
366
microsoft/MAI-DS-R1
Updated
3 days ago
•
113
•
140
Browse 1M+ models
Spaces
Running
4.81k
4.81k
DeepSite
🐳
Generate any application with DeepSeek
Running
on
Zero
435
435
UNO FLUX
⚡
Generate customized images using text and m

In [20]:
# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."

# The humorous version below is the one in use - this demonstrates how easy it is to incorporate 'tone'.
# For a more conventional brochure, comment it out and uncomment the lines above:

system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."


In [26]:
def get_brochure_user_prompt(company_name, url_prompt):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += url_prompt
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
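
The slice above keeps only the first 5,000 characters, so with a long landing page the linked pages may never reach the model. One alternative is to give each page an equal share of the budget. This is a sketch only, assuming a hypothetical `build_budgeted_prompt` helper and a list of `(label, text)` pairs, neither of which exists in this notebook.

```python
def build_budgeted_prompt(pages, max_chars=5_000):
    """Hypothetical helper: pages is a list of (label, text) tuples.
    Each page gets an equal share of the overall character budget."""
    if not pages:
        return ""
    per_page = max_chars // len(pages)
    sections = []
    for label, text in pages:
        header = f"\n\n{label}\n"
        # Whatever remains of this page's share after the header goes to its text
        room = max(0, per_page - len(header))
        sections.append(header + text[:room])
    # Final guard in case the headers alone exceed the budget
    return "".join(sections)[:max_chars]

pages = [("Landing page", "a" * 10_000), ("about page", "b" * 10_000)]
prompt = build_budgeted_prompt(pages, max_chars=1_000)
print(len(prompt))
```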

In [None]:
get_brochure_user_prompt("HuggingFace", hf_details)

In [23]:
def create_brochure(company_name, url_prompt):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url_prompt)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))
    return result  # so callers such as hf_brochure = create_brochure(...) get the text back

In [25]:
hf_brochure = create_brochure("HuggingFace", hf_details)

Okay, here’s a humorous brochure for Hugging Face, built from the information you provided. Let’s aim for a tone that’s enthusiastic, slightly nerdy, and highlights the collaborative spirit.

---

**(Brochure - Front Panel - Image: A stylized, slightly chaotic image of interconnected nodes and neural networks)**

**Hugging Face: Where Ideas Get Trained (and Sometimes Explode)**

**(Smaller Text):** Join the world's largest AI community and help build the future of machine learning! 

---

**(Inside Left Panel - Image: A person excitedly pointing at a screen filled with code)**

**Level Up Your ML Game**

*   **Massive Model Library:** Seriously, 1M+ models –  We’ve got everything from tiny, adorable models to behemoths that'll make your GPU weep (in a good way!).
*   **Collaborate Like a Boss:**  The Hugging Face Hub is where the magic happens.  Share your creations, learn from others, and contribute to the open-source revolution. 
*   **Experimentation Station:**  Spaces lets you deploy and test your models—perfect for quick prototyping and seeing results fast.
*   **It's a Party:** We have a huge and engaged community of users. 

**(Small Callout Box):**  "Don’t just build AI, *orchestrate* it!"



**(Inside Right Panel - Image: A hand adjusting knobs on a complex machine)**

**Hugging Face - Options For Everyone**

*   **For the Data Scientists & Researchers:**  Build and share models, datasets, and applications.  Run custom Spaces. Access enterprise-grade security.
*   **For the Developers:** Utilize our Python library,  Transformers.js, and other tools to accelerate your development.
*   **For Businesses:** Deploy solutions with Inference Endpoints and Scale efficiently. 

**Quick Facts:**

*   **Models:** 1M+ and counting!
*   **Community:**  Thousands of builders and learners.
*   **Tools:** We have the open source tools to get you started. 

**(Small section - quick links):**

*   [Hugging Face Hub](https://huggingface.co/hub) – “Your one-stop shop for AI models.”
*   [HuggingChat](https://huggingface.co/chat) – “Talk to an AI - and help build it”



**(Back Panel - Image: The Hugging Face Logo)**

**Hugging Face – Powered by Community, Driven by Innovation.**

**(Small print):**  “Warning: May cause excessive excitement about machine learning.”

**(Contact Information):**  [Website](https://huggingface.co), [Twitter](https://twitter.com/huggingface), [Discord](https://discord.com/invite/huggingface)


---

**Notes on why I made the choices:**

*   **Humor:**  I injected a bit of self-aware humor (“May cause excessive excitement"). Acknowledge the geeky nature of the field.
*   **Emphasis on Community:** The brochure heavily emphasizes the collaborative spirit, which is a core part of Hugging Face.
*   **Clear Benefits:** I focused on what users get out of joining the Hugging Face ecosystem.
*   **Visuals:** I described the types of images that would be effective alongside the text.

To help me tailor this brochure even more, tell me:

*   Who is the primary audience for this brochure? (e.g., researchers, developers, businesses)
*   What is the main goal of the brochure? (e.g., attract new users, promote a specific product/feature)

## Finally - a minor improvement

With a small adjustment, we can change this so that the results stream back from the model,
with the familiar typewriter animation.

In [27]:
def stream_brochure(company_name, url_prompt):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": get_brochure_user_prompt(company_name, url_prompt),
            },
        ],
        stream=True,
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        response = response.replace("```", "").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
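
The accumulation loop can be tried without a running model. The sketch below uses a made-up `fake_stream` generator that yields objects shaped like the chunks above (`chunk.choices[0].delta.content`); it is a stand-in for illustration, not a real client object.

```python
from types import SimpleNamespace

def fake_stream(pieces):
    """Illustrative stand-in for a streaming response."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def accumulate(stream):
    response = ""
    for chunk in stream:
        # Some chunks can carry content=None, hence the `or ""`
        response += chunk.choices[0].delta.content or ""
    return response

text = accumulate(fake_stream(["# Hugging", " Face", None]))
print(text)
```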

In [28]:
hf_brochure_2 = stream_brochure("HuggingFace", hf_details)

Okay, here's a humorous brochure for Hugging Face, built from the website content. Let’s call it “Unlock the Hug”:

---

**(Brochure Cover - Image: A cartoon hugging face with a giant, smiling neural network for an arm)**

**Hugging Face: Let's Get… Hugged.**

**(Inside Left Panel - Image: A diverse team of people excitedly looking at computer screens)**

**What *is* Hugging Face?**

Let’s be honest. You've probably heard the name. It's not just a cute face. It's a *massive* ecosystem for building, training, and deploying cutting-edge AI models. Think of us as the playground where all the coolest generative AI is being built. 

**For Our Customers:**

* **Stop Building From Scratch:**  We've got pre-trained models for everything from text generation (writing poems, scripts, marketing copy - you name it!), image creation, audio processing, and more.  It's like getting a super-smart assistant who already knows what you're trying to do. 
* **Hugging Face Hub:** Our central hub is *the* place to collaborate, share your models, datasets, and demos.  It’s where the AI community comes to play, learn, and build together. (Seriously, try it out!)
* **Easy Deployment:** Don’t get bogged down in infrastructure nightmares. We provide tools and services to help you deploy your models quickly and efficiently. 

**Customer Types:**  From startups to Fortune 500’s, we're serving a *very* diverse range of customers.  We recently worked with [mention a recognizable customer if available - e.g., Sephora], and we're powering innovation in industries like finance, healthcare, retail, and… well, pretty much everywhere AI is going.



**(Inside Right Panel - Image: A cartoon flowchart showing the Hugging Face workflow: Data -> Model -> Deployment)**

**For Investors:**

* **The Future of AI is Here:**  We're at the forefront of the generative AI revolution.  Demand for our tools and services is *explosive*. 
* **A Thriving Community:**  Our open-source ecosystem – fueled by passionate developers – is growing rapidly, creating huge network effects.
* **Massive Market Opportunity:** We’re tackling the biggest challenges in AI, and we’re doing it with a team of brilliant, collaborative people. (And we’re making a whole lot of people *really* happy.)

**For Potential Recruiters:**

* **Join the Huggle!** We're building the best team in AI. We value curiosity, collaboration, and a willingness to learn. 
* **Superb Culture:** We're committed to a flexible, inclusive, and supportive environment. (We even have a dedicated "Brain Fuel" pantry!) 
* **Career Paths:** We have roles in Machine Learning Engineering, Data Science, Research, Sales & Marketing, and more. Our engineers are involved in building some of the world’s most advanced AI models, and our data science team are helping to train them. 


**(Back Cover - Image: The Hugging Face logo with a small tagline: "Building the future of AI, one hug at a time.")**

**Learn More & Get Hugged:** [https://huggingface.co](https://huggingface.co)

---

**Notes on Building this Brochure:**

*   **Tone:** I've aimed for a playful, approachable, and slightly self-aware tone to reflect the company's brand.
*   **Specificity:** I've used placeholders like “[mention a recognizable customer if available]” to allow you to fill in specific details as they become available.  Adding real customer examples would significantly strengthen the brochure.
*   **Imagery:**  While I’ve described the images, high-quality visuals will be *crucial* for a printed brochure.
*   **Call to Action:**  The brochure ends with a clear call to action – visiting the Hugging Face website.

To refine this further, could you provide me with specific details about:

*   Any well-known customers Hugging Face is working with?
*   Specific areas of focus for their products (e.g., particular model types)?

In [None]:
# Try switching the system prompt between the standard and humorous versions when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", hf_details)

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this exercise we extended the Day 1 code to make multiple LLM calls, and generate a document.

This is perhaps the first example of Agentic AI design patterns, as we combined multiple calls to LLMs. This will feature more in Week 2, and then we will return to Agentic AI in a big way in Week 8 when we build a fully autonomous Agent solution.

Generating content in this way is one of the most common use cases. As with summarization, this can be applied to any business vertical. Write marketing content, generate a product tutorial from a spec, create personalized email content, and so much more. Explore how you can apply content generation to your business, and try making yourself a proof-of-concept prototype. See what other students have done in the community-contributions folder -- so many valuable projects -- it's wild!</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you move to Week 2 (which is tons of fun)</h2>
            <span style="color:#900;">Please see the week1 EXERCISE notebook for your challenge for the end of week 1. This will give you some essential practice working with Frontier APIs, and prepare you well for Week 2.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">A reminder on 3 useful resources</h2>
            <span style="color:#f71;">1. The resources for the course are available <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">here.</a><br/>
            2. I'm on LinkedIn <a href="https://www.linkedin.com/in/eddonner/">here</a> and I love connecting with people taking the course!<br/>
            3. I'm trying out X/Twitter and I'm at <a href="https://x.com/edwarddonner">@edwarddonner</a> and hoping people will teach me how it's done..  
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../thankyou.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#090;">Finally! I have a special request for you</h2>
            <span style="color:#090;">
                My editor tells me that it makes a MASSIVE difference when students rate this course on Udemy - it's one of the main ways that Udemy decides whether to show it to others. If you're able to take a minute to rate this, I'd be so very grateful! And regardless - always please reach out to me at ed@edwarddonner.com if I can help at any point.
            </span>
        </td>
    </tr>
</table>