# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a brochure about a company for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

In [1]:
import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [4]:
# A class to represent a Webpage

headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
ed.links

['https://edwarddonner.com/',
 'https://edwarddonner.com/connect-four/',
 'https://edwarddonner.com/outsmart/',
 'https://edwarddonner.com/about-me-and-about-nebula/',
 'https://edwarddonner.com/posts/',
 'https://edwarddonner.com/',
 'https://news.ycombinator.com',
 'https://nebula.io/?utm_source=ed&utm_medium=referral',
 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html',
 'https://patents.google.com/patent/US20210049536A1/',
 'https://www.linkedin.com/in/eddonner/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/04/21/the-complete-agentic-ai-engineering-course/',
 'https://edwarddonner.com/2025/04/21/the-

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
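
The mechanical half of this task, turning relative links such as "/about" into full URLs, can also be done deterministically before or after the LLM call. Here is a minimal sketch using the standard library's `urllib.parse.urljoin`. The helper name `absolutize_links` and the sample links are assumptions for illustration and are not part of this notebook's pipeline. Deciding which links are relevant still needs the model.

```python
from urllib.parse import urljoin

def absolutize_links(base_url, links):
    """Resolve relative links against base_url; absolute links pass through unchanged."""
    return [urljoin(base_url, link) for link in links]

# Hypothetical sample: two relative links and one absolute link
sample = ["/about", "careers", "https://github.com/huggingface"]
resolved = absolutize_links("https://company.com/", sample)
print(resolved)
```

Resolving links up front would mean the model only has to judge relevance, at the cost of one extra step in the pipeline.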

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [8]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [9]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddo

In [10]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

In [11]:
# Anthropic has made their site harder to scrape, so I'm using HuggingFace.

huggingface = Website("https://huggingface.co")
huggingface.links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 'inference/get-started',
 '/spaces',
 '/models',
 '/zai-org/GLM-4.5',
 '/tencent/HunyuanWorld-1',
 '/black-forest-labs/FLUX.1-Krea-dev',
 '/Qwen/Qwen3-30B-A3B-Instruct-2507',
 '/Qwen/Qwen3-Coder-30B-A3B-Instruct',
 '/models',
 '/spaces/enzostvs/deepsite',
 '/spaces/Qwen/Qwen3-Coder-WebDev',
 '/spaces/zumjoy/Multi-Style_Video-to-Anime_Generator',
 '/spaces/smola/higgs_audio_v2',
 '/spaces/Wan-AI/Wan-2.2-5B',
 '/spaces',
 '/datasets/Kratos-AI/KAI_handwriting-ocr',
 '/datasets/Kratos-AI/airline-customersupport-englishaudio',
 '/datasets/Kratos-AI/korean-voice-emotion-dataset',
 '/datasets/Kratos-AI/KAI_speech-recognition-data',
 '/datasets/Kratos-AI/medical-prescription-english-audio',
 '/datasets',
 '/join',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/allenai'

In [12]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': 'https://huggingface.co'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'blog page', 'url': 'https://huggingface.co/blog'},
  {'type': 'GitHub page', 'url': 'https://github.com/huggingface'},
  {'type': 'LinkedIn page',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}
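
The model's reply is not guaranteed to follow the example exactly: it may repeat a link, leave one relative, or omit a key. A small validation step before fetching each URL can guard against that. This is a sketch under assumptions: the helper name `parse_links_response` and the sample reply are invented for illustration, and the "must start with https://" rule is one possible policy rather than a requirement of the API.

```python
import json

def parse_links_response(raw):
    """Parse the model's JSON reply and keep only well-formed, de-duplicated link entries."""
    data = json.loads(raw)
    links = data.get("links") if isinstance(data, dict) else None
    if not isinstance(links, list):
        return {"links": []}
    seen = set()
    cleaned = []
    for item in links:
        # Skip entries that are not dicts or that lack a full https URL
        if not isinstance(item, dict):
            continue
        url = item.get("url")
        if not isinstance(url, str) or not url.startswith("https://"):
            continue
        # Skip URLs we have already kept
        if url in seen:
            continue
        seen.add(url)
        cleaned.append({"type": str(item.get("type", "unknown")), "url": url})
    return {"links": cleaned}

# Hypothetical reply containing a duplicate and a relative link
raw_reply = (
    '{"links": ['
    '{"type": "about page", "url": "https://huggingface.co/about"}, '
    '{"type": "about page", "url": "https://huggingface.co/about"}, '
    '{"type": "docs", "url": "/docs"}]}'
)
print(parse_links_response(raw_reply))
```

In `get_links`, such a helper could replace the bare `json.loads(result)` call.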

## Second step: make the brochure!

Assemble all the details into another prompt to gpt-4o-mini.

In [13]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [14]:
print(get_all_details("https://huggingface.co"))

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/about'}, {'type': 'company page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}]}
Landing page:
Webpage Title:
Hugging Face – The AI community building the future.
Webpage Contents:
Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
NEW
Get started with Inference in seconds 🚀
Reachy Mini: The Open Robot for AI Builders
Welcome Cohere on the Hub 🔥
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 1M+ models
Trending on
this week
Models
zai-org/GLM-4.5
Updated
6 days ago
•
9.66k
•
959
tencent/

In [15]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# The assignment below overrides the one above with a more humorous brochure - this demonstrates how easy it is to incorporate 'tone'.
# Comment it out to keep the standard version:

system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."


In [16]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
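
Note that the slice above truncates the whole prompt, so a long landing page can use up the entire 5,000 characters before any linked page is included. One possible alternative, sketched here under assumptions, gives each page its own character budget. The helper name `build_details` and the `per_page_limit` value are hypothetical, and the pages are passed in as plain `(label, text)` pairs so the sketch runs without any network access.

```python
def build_details(pages, per_page_limit=1_000):
    """Concatenate page texts, truncating each page separately."""
    parts = []
    for label, text in pages:
        # Each page keeps at most per_page_limit characters of its own text
        parts.append(f"{label}:\n{text[:per_page_limit]}")
    return "\n\n".join(parts)

# Hypothetical pages: a long landing page and a short about page
pages = [
    ("Landing page", "A" * 3_000),
    ("about page", "B" * 500),
]
details = build_details(pages)
print(len(details))
```

With a per-page budget, the short about page survives in full even though the landing page is far longer than its share.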

In [17]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/about'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community page', 'url': 'https://discuss.huggingface.co'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface'}]}


"You are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHugging Face – The AI community building the future.\nWebpage Contents:\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nNEW\nGet started with Inference in seconds 🚀\nReachy Mini: The Open Robot for AI Builders\nWelcome Cohere on the Hub 🔥\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\nzai-org/GLM-4.5\nUpdated\n6 days ago\n•\n9.66k\n•\n959\ntencent/HunyuanWorld-1\nUpdated\n3 days ago\n•\n10k\n•\n515\nblack-forest-labs/FLUX.1-Krea-dev\nUpdated\n3 days ago\n•\n29.9k\n•\n365\nQwen/Qwen3-30B-A3B-Instruct-2507\nUpdated\n4 days ago\n•\n51.3k\n•\n363\nQwen/Qwe

In [18]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [19]:
create_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/about'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'company page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'join page', 'url': 'https://huggingface.co/join'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'docs page', 'url': 'https://huggingface.co/docs'}]}


```markdown
# 😊 Welcome to the Hugging Face Brochure! 😊

**Hugging Face – The AI Community Building the Future!**

---

**🤖 What is Hugging Face?**

At Hugging Face, we are not just a company; we’re a movement! If you've ever wished to hug an intelligent bot, you’ve come to the right place! We’re the friendly folk behind a platform where the machine learning community can **collaborate, create, and caffeinate** over cutting-edge models, datasets, and applications.

---

**🔥 What’s Trending?**

- **Models? Oh, we’ve got millions!** Over **1 Million models** and counting! From **HunyuanWorld-1** by Tencent to weirdly specific **Multi-Style Video-to-Anime Generators**, if it exists, it’s probably on our platform. 

- **Spaces?** We’ve got lots of them! From coding to generating adorable anime-style videos from your family vacation footage! (Disclaimer: Results may vary. Your family might not be so cute in anime.)

---

**💼 Who’s Using Us?**

Just to name-drop a few of our VIP customers:
- **Google** (they’re kind of a big deal 🤷)
- **Microsoft** (yes, that Microsoft!)
- **Grammarly** (because sometimes we need a little help "their" with "there" 😉)

Over **50,000 organizations** have joined our hugging revolution! 

---

**👨‍💻 Career Opportunities**

Thinking of joining the Hugging Face family? We have **job openings** that welcome anyone with a love for AI, a knack for collaboration, and a utopian vision for the future! Who knows, you might just be our next superstar! 

- **Perks:**  
  - Work alongside brilliant minds who might just beat you at chess! 
  - Contribute to open-source projects and help make the world a little hug-gier!

---

**💖 Company Culture**

At Hugging Face, you won't find stuffy cubicles or suits—because, let’s be real, we’re too busy hugging it out! (Metaphorically, of course... unless you’re into that). We believe in open-source minds, kindness, and interdisciplinary team-ups that would make even the Avengers jealous!

---

**📈 Join Us!**

Whether you’re here to **delve into AI**, **explore models**, or just **see what’s the deal with the cute logo**; we’re thrilled to have you! 

**PS:** Don’t forget to bring your best ideas and maybe some donuts—you know how the AI community loves their treats! 🍩

---

**🚀 Get Started with Us Today** on our website and explore a world of **AI Applications** that’ll have you believing in the future—or at least laughing a little at our jokes. 

**Hugging Face: Where AI Meets Heart!**

---

*This brochure is powered by AI love and a sprinkle of human absurdity. Because who wouldn’t want to hug it out?*
```


## Finally - a minor improvement

With a small adjustment, we can change this so that the results stream back from OpenAI,
with the familiar typewriter animation.
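
To see what the loop over the stream does without calling the API, here is a small simulation. The `fake_stream` generator is a stand-in invented for this sketch. It only mimics the `chunk.choices[0].delta.content` shape used in the cell below, where `content` can be `None`, which is why the code falls back to an empty string.

```python
from types import SimpleNamespace

def fake_stream(pieces):
    """Yield objects shaped like streaming chunks: chunk.choices[0].delta.content."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

response = ""
snapshots = []
for chunk in fake_stream(["# Hug", "ging ", "Face", None]):
    # content can be None, so fall back to an empty string
    response += chunk.choices[0].delta.content or ""
    # In a notebook, each snapshot would be re-rendered in place
    snapshots.append(response)

print(snapshots)
```

Each iteration re-renders the full text accumulated so far, which is what produces the typewriter effect.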

In [20]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        # Strip code fences from a copy, so the word "markdown" in the brochure text itself is kept
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)

In [21]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog', 'url': 'https://huggingface.co/blog'}, {'type': 'community forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub', 'url': 'https://github.com/huggingface'}, {'type': 'LinkedIn', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'Twitter', 'url': 'https://twitter.com/huggingface'}]}


# 🤗 Welcome to the Hugging Face Brochure! 🤗

## Hugging Face: Where AI Dreams Come True! 

Are you tired of your regular *hug* not being intelligent enough? Well, fret not! Hugging Face is the AI community on a mission to provide the world with super-smart, huggable AI solutions! Whether you’re a developer, an investor, or just someone looking for a good time (with or without robots), we’ve got you covered!

---

## 🚀 What's Our Deal?

### Models, Datasets, and Spaces – Oh My! 
- **Models**: Browse over 1 million models, including our popular trending models such as **GLM-4.5** and **HunyuanWorld-1**! They may not help you win at hide-and-seek, but they’ll definitely give your projects a fighting chance!
  
- **Datasets**: Grab from 250,000+ datasets – our magical treasure trove! Want to know how many people panic when their Wi-Fi goes down? We might just have the data for you!

- **Spaces**: Run and create applications that are more fun than a barrel of robots! Check out wild experiments like generating anime-style videos – no confused octopuses are involved...yet!

---

## 👥 Meet Our Happy Customers

Over **50,000 organizations** believe in the magic of Hugging Face! We’ve ropes of loyal fans from companies like:
- **Meta**: They love models.
- **Amazon**: They’ve modeled on something.
- **Google**: Personally still searching for the best AI dessert recipe.

Our customers range from savvy enterprises to casual developers looking for their next big break (or just good memes to share). 

---

## 🌈 Join Our Hugtastic Community!

At Hugging Face, collaboration is our jam! Our playful and energetic culture revolves around open-source contributions, friendly community projects, and, of course, an occasional robot dance-off 🤖💃. 

### How to Join? 
- **Developers & Engineers**: We're actively looking for individuals who want to unleash AI magic! 
- **Investors**: Come and invest while wearing your favorite AI-themed socks. We promise to keep our moody robots in check!
- **Curious Minds**: If you just want to hang around and get hugs (the virtual kind), we’re open!

---

## 💼 Work with Us!

Why take a stuffy office job when you can join our team? Our perks include:
- **Flexible Working Hours**: We're more flexible than a yoga instructor at a pretzel factory. 
- **Transformative Projects**: You can actually make a difference! Make friends with both humans and robots. 
- **Community Events**: Get together for bot-building parties or gaming nights – where AI and fun multiply exponentially. 

---

## 🤝 You're Just One Sign-Up Away!

So whether you want to help build the future, invest wisely, or land that dream job, join us at Hugging Face! We're the best hugs AI has to offer without the awkwardness of a regular hug!

---

**Hugging Face – Where AI Meets Human-Like Warmth!**  

🌟 Sign up now and let’s build something magnificent together! 🌟

---

For more information, visit [Hugging Face](https://huggingface.co) and follow us on all social networks! Because, let’s face it, we even let AI do our marketing!

In [22]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'company blog', 'url': 'https://huggingface.co/blog'}, {'type': 'community forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'social media link', 'url': 'https://twitter.com/huggingface'}, {'type': 'social media link', 'url': 'https://www.linkedin.com/company/huggingface/'}]}



# 🎉 Welcome to Hugging Face! 🎉

**Where Big Ideas & Cuddly Community Meet!**

## 🚀 The AI Community Building the Future

At Hugging Face, we’re not just creating AI; we’re crafting a whole new world! Imagine a universe where you can collaborate on models, datasets, and applications. Spoiler alert: IT. IS. AMAZING.

### What Do We Offer? 

- **Models?** Over 1 Million! And no, that’s not a typo. We’re practically swimming in them! 🏊‍♂️
- **Datasets?** More than 250k! We’ve lost track of how many times people have “dataset” puns. 😂
- **Spaces?** Sure, we have those too! Run your cool AI apps in our designated *Spaces.* 

### Meet Our Customers

We’ve got a fan club of over 50,000 organizations including:
- **The Big Four**: Amazon, Google, Microsoft, and Meta. They visit more than your relatives during the holidays!
- **Cool Startups**: Like Ai2 and Grammarly, because even AI needs a little grammar checking! 📚

### 🏢 Work Culture 

At Hugging Face, we’re all about collaboration without the awkward small talk. Our culture revolves around teamwork, openness, and sharing (yes, we're like a good ol’ friendship circle but WAY cooler). 

- **Diversity is Key:** Different backgrounds create better AI. Bring your unique flavor!
- **Open Source Love**: Join the party that’s all about sharing and caring. Our tools are open source, just like grandma's cookie recipe. 🍪

### 👩‍💼 Careers 

Looking for a job? Well, you’re in luck! At Hugging Face, we’re always on the lookout for enthusiastic individuals who are ready to embrace the challenge of building the future!

- **Perks**: Competitive salaries, unlimited snacks (but really, who’s keeping track?), and a chance to work with the brightest minds in AI. 🌟
- **Positions**: From engineers to outreach specialists, everyone joins in this AI dance!

### Join Us!

Whether you're looking to create the next revolutionary AI model or just want to hang out in our free spaces, there's a place for you here at Hugging Face. 

Check out our [jobs page](https://huggingface.co/jobs) – because that's where the magic begins! ✨

---

💡 **Fun Fact**: Our mascot is a friendly face that hugs! Okay, it's not real. But wouldn’t that be adorable? 

**Hugging Face** – Where AI Meets Its Best Buddy! 🤗  

