# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a brochure about a company, aimed at prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [1]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [3]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [4]:
ed = Website("https://edwarddonner.com")
ed.links

['https://edwarddonner.com/',
 'https://edwarddonner.com/connect-four/',
 'https://edwarddonner.com/outsmart/',
 'https://edwarddonner.com/about-me-and-about-nebula/',
 'https://edwarddonner.com/posts/',
 'https://edwarddonner.com/',
 'https://news.ycombinator.com',
 'https://nebula.io/?utm_source=ed&utm_medium=referral',
 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html',
 'https://patents.google.com/patent/US20210049536A1/',
 'https://www.linkedin.com/in/eddonner/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/04/21/the-complete-agentic-ai-engineering-course/',
 'https://edwarddonner.com/2025/04/21/the-

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
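
Deciding which links are relevant needs the LLM, but turning relative links into full URLs can also be done deterministically as a safety net. Here is a minimal sketch, assuming the standard library's `urllib.parse.urljoin`; the helper name `absolutize_links` is hypothetical and not part of this notebook:

```python
from urllib.parse import urljoin


def absolutize_links(base_url, links):
    """Resolve relative links against base_url, dropping duplicates and non-web schemes."""
    seen = set()
    resolved = []
    for link in links:
        full = urljoin(base_url, link)
        # Skip mailto:, javascript:, tel: and similar non-web links
        if not full.startswith(("http://", "https://")):
            continue
        if full not in seen:
            seen.add(full)
            resolved.append(full)
    return resolved


print(absolutize_links("https://huggingface.co", ["/models", "/pricing#spaces", "/models", "mailto:hi@example.com"]))
```

Running something like this over `website.links` before building the prompt would also shorten the prompt, since repeated links are sent only once.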

In [5]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [6]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [7]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [8]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddo

In [9]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

In [10]:
# Anthropic has made their site harder to scrape, so I'm using HuggingFace instead.

huggingface = Website("https://huggingface.co")
huggingface.links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/moonshotai/Kimi-K2-Instruct',
 '/mistralai/Voxtral-Mini-3B-2507',
 '/mistralai/Voxtral-Small-24B-2507',
 '/Chain-GPT/Solidity-LLM',
 '/LGAI-EXAONE/EXAONE-4.0-32B',
 '/models',
 '/spaces/enzostvs/deepsite',
 '/spaces/llamameta/Grok-4-heavy-free',
 '/spaces/black-forest-labs/FLUX.1-Kontext-Dev',
 '/spaces/akhaliq/anycoder',
 '/spaces/multimodalart/wan2-1-fast',
 '/spaces',
 '/datasets/NousResearch/Hermes-3-Dataset',
 '/datasets/common-pile/caselaw_access_project',
 '/datasets/fka/awesome-chatgpt-prompts',
 '/datasets/microsoft/rStar-Coder',
 '/datasets/snorkelai/agent-finance-reasoning',
 '/datasets',
 '/join',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google',
 '/Intel',
 '/microsoft',
 '/gram

In [11]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': 'https://huggingface.co'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'company page', 'url': 'https://huggingface.co/enterprise'},
  {'type': 'blog', 'url': 'https://huggingface.co/blog'},
  {'type': 'join page', 'url': 'https://huggingface.co/join'}]}
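
The model's reply is only as reliable as the model, so it can be worth checking its shape before fetching any of the returned URLs. A small sketch, assuming the reply follows the example in the system prompt; `extract_valid_links` is a hypothetical helper, not part of this notebook:

```python
import json


def extract_valid_links(raw_json):
    """Parse the model reply and keep only entries with a string type and an http(s) URL."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    links = data.get("links")
    if not isinstance(links, list):
        return []
    valid = []
    for entry in links:
        if not isinstance(entry, dict):
            continue
        url = entry.get("url")
        link_type = entry.get("type")
        # Keep only well-formed entries that point at a full web URL
        if isinstance(url, str) and isinstance(link_type, str) and url.startswith(("http://", "https://")):
            valid.append({"type": link_type, "url": url})
    return valid


sample = '{"links": [{"type": "about page", "url": "https://huggingface.co/huggingface"}, {"type": "bad", "url": "/relative"}]}'
print(extract_valid_links(sample))
```

Entries that fail the check are silently dropped here; logging them instead might be a better choice while you are still tuning the prompt.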

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [12]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [13]:
print(get_all_details("https://huggingface.co"))

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'company page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}
Landing page:
Webpage Title:
Hugging Face – The AI community building the future.
Webpage Contents:
Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 1M+ models
Trending on
this week
Models
moonshotai/Kimi-K2-Instruct
Updated
3 days ago
•
145k
•
1.56k
mistralai/Voxtral-Mini-3B-2507
Updated
2 days ago
•
14.3k
•
349
mistralai/Voxtral-Small-24B-2507
Updated
2 days ago
•
552
•
297
Chain-GPT/Solidity-LLM
Updated
27 days ago
•
31
•
270
LGAI-EXAONE/EXAONE-4.0-32B
Updated
2 days ago
•
272k

In [14]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."


In [28]:
def get_brochure_image_prompt(company_name, brochure):
    user_prompt = f"Generate an image best showcasing company called {company_name} based on brochure:\n\n"
    user_prompt += brochure
    return user_prompt

In [45]:
import base64
from IPython.display import Image

def get_brochure_image(company_name, brochure):
    prompt = get_brochure_image_prompt(company_name, brochure)
    # The image_generation tool returns the picture as base64-encoded data
    response = openai.responses.create(
        model="gpt-4.1-mini",
        input=prompt,
        tools=[{"type": "image_generation"}],
    )
    image_data = [
        output.result
        for output in response.output
        if output.type == "image_generation_call"
    ]
    # Guard against responses that contain no image output
    if not image_data:
        print("No image was returned by the model")
        return
    image = base64.b64decode(image_data[0])
    display(Image(data=image, format="png"))

In [25]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
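
The blunt 5,000-character cut above can leave little or nothing of the other pages when the landing page is long. One possible alternative, sketched under the assumption that each page's text is available separately; `truncate_sections` and its parameters are hypothetical and not part of this notebook:

```python
def truncate_sections(sections, total_budget=5_000):
    """Give each (title, text) section an equal share of the character budget."""
    if not sections:
        return ""
    per_section = total_budget // len(sections)
    parts = []
    for title, text in sections:
        header = f"{title}\n"
        # Reserve room for the header, then cut the body to what remains
        body_budget = max(per_section - len(header), 0)
        parts.append(header + text[:body_budget])
    # The joining newlines add a few characters, so apply a final hard cap
    return "\n".join(parts)[:total_budget]


pages = [("Landing page:", "a" * 10_000), ("about page", "b" * 10_000)]
print(len(truncate_sections(pages, total_budget=100)))
```

An equal split is only one option; weighting the landing page more heavily would be just as reasonable.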

In [16]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'company page', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'docs page', 'url': 'https://huggingface.co/docs'}]}


'You are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHugging Face – The AI community building the future.\nWebpage Contents:\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\nmoonshotai/Kimi-K2-Instruct\nUpdated\n3 days ago\n•\n145k\n•\n1.56k\nmistralai/Voxtral-Mini-3B-2507\nUpdated\n2 days ago\n•\n14.3k\n•\n349\nmistralai/Voxtral-Small-24B-2507\nUpdated\n2 days ago\n•\n552\n•\n297\nChain-GPT/Solidity-LLM\nUpdated\n27 days ago\n•\n31\n•\n271\nLGAI-EXAONE/EXAONE-4.0-32B\nUpdated\n2 days ago\n•\n272k\n•\n174\nBrowse 1M+ models\nSpaces\nRunning\n10.5k\n10.5k\nDeep

In [41]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))
    return result

In [42]:
brochure = create_brochure("South Park", "https://www.southparkstudios.com/")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.southparkstudios.com/welcome'}, {'type': 'news page', 'url': 'https://www.southparkstudios.com/news'}, {'type': 'creator bios', 'url': 'https://southpark.cc.com/info/jveh1j/creator-bios'}, {'type': 'cast and crew', 'url': 'https://southpark.cc.com/legal/tkdxo6/cast-and-crew'}, {'type': 'contact page', 'url': 'https://southpark.cc.com/info/z67g4e/contact'}]}


```markdown
# Welcome to South Park

**The Right Kind of Humour**  
South Park is not just an animated television show; it stands as a cornerstone of animation, renowned for its razor-sharp satire and social commentary. Bringing you the misadventures of Stan, Kyle, Cartman, and Kenny, South Park expertly weaves humor with insights about contemporary culture, politics, and social dynamics. 

---

## Our Mission

At South Park Studios, we believe in creating content that challenges norms and provides a unique perspective on serious issues while keeping audiences entertained. Our Peabody and Emmy® Award-winning series continues to push boundaries, taking on topics that others might shy away from. 

---

## What We Offer

- **Full Episodes**: Stream uncensored episodes of your favorite show anytime on South Park Studios Global. 
- **Collections**: Explore curated video collections that celebrate iconic moments and themes, such as the "Tegridy Day" and "Best of Jesus." 
- **Events and Forums**: Engage with fans in forums, and stay updated on upcoming events and new collections. 

You can access a free trial on Paramount+ to dive straight into the South Park universe!

---

## Customers & Community

Our fans are a diverse community that ranges from casual viewers to cultural commentators. We cater to those who appreciate biting satire and the art of storytelling through animation.

Stay connected with us! Subscribe for regular updates, exclusive offers, and event news delivered directly to your inbox. Our community thrives on interaction, so don’t hold back—join the conversation!

---

## Join Our Team

We are always on the lookout for talented and creative individuals who share our passion for storytelling and humor. Careers at South Park Studios offer a dynamic and engaging work environment where innovation is encouraged. 

### Current Opportunities:
- Animation Artists
- Writers & Script Editors
- Marketing and Community Managers
- Production Assistants

If you’re looking to be part of a team that values creativity, wit, and collaboration, South Park Studios might just be the perfect place for you!

---

## Company Culture

At South Park, we foster a culture of creativity, resilience, and laughter. We encourage bold ideas, celebrate diverse perspectives, and appreciate all contributions that help us grow and improve. Collaboration is key, and we value an open environment where team members can express themselves freely.

---

**Contact Us**  
For inquiries, partnerships, or recruitment opportunities, please refer to our [Contact Page](#).

**Stay Connected**  
Follow us on our social media platforms for up-to-date news and interactions!

---

*South Park and all its related titles, logos, and characters are trademarks of Comedy Partners. © 2025 South Park Digital Studios LLC. All Rights Reserved.*
```


In [46]:
get_brochure_image("South Park", brochure)

PermissionDeniedError: Error code: 403 - {'error': {'message': 'Your organization must be verified to use the model `gpt-image-1`. Please go to: https://platform.openai.com/settings/organization/general and click on Verify Organization. If you just verified, it can take up to 15 minutes for access to propagate.', 'type': 'invalid_request_error', 'param': None, 'code': None}}

In [43]:
brochure

'```markdown\n# Welcome to South Park\n\n**The Right Kind of Humour**  \nSouth Park is not just an animated television show; it stands as a cornerstone of animation, renowned for its razor-sharp satire and social commentary. Bringing you the misadventures of Stan, Kyle, Cartman, and Kenny, South Park expertly weaves humor with insights about contemporary culture, politics, and social dynamics. \n\n---\n\n## Our Mission\n\nAt South Park Studios, we believe in creating content that challenges norms and provides a unique perspective on serious issues while keeping audiences entertained. Our Peabody and Emmy® Award-winning series continues to push boundaries, taking on topics that others might shy away from. \n\n---\n\n## What We Offer\n\n- **Full Episodes**: Stream uncensored episodes of your favorite show anytime on South Park Studios Global. \n- **Collections**: Explore curated video collections that celebrate iconic moments and themes, such as the "Tegridy Day" and "Best of Jesus." \n-

## Finally - a minor improvement

With a small adjustment, we can stream the results back from OpenAI, with the familiar typewriter animation.

In [23]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        # Strip only the code fence markers, so the word "markdown" survives in the brochure text
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)
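
To see the accumulation logic in isolation, here is a sketch that replaces the OpenAI stream with a plain list of text deltas. The `fake_deltas` list and the `accumulate` helper are made up for illustration; a `None` delta stands in for a chunk that carries no text, which is the case the `or ''` guard above handles:

```python
def accumulate(deltas):
    """Yield the growing response after each delta, treating None as an empty string."""
    response = ""
    for delta in deltas:
        response += delta or ""
        yield response


fake_deltas = ["# Hugging", " Face", " Brochure", None]
for partial in accumulate(fake_deltas):
    print(repr(partial))
```

Each yielded value is the full text so far, which is why the notebook can re-render the whole Markdown cell on every chunk.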

In [24]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'company page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Brochure

## About Us
Hugging Face is a pioneering community-driven platform committed to democratizing machine learning (ML) and artificial intelligence (AI). Our mission is to build the future of AI by fostering collaboration among developers, researchers, and enthusiasts through an extensive library of models, datasets, and applications.

## Our Offerings
- **Models**: Access and collaborate with over 1 million machine learning models, including state-of-the-art transformers, diffusion models, and more.
- **Datasets**: Explore an extensive collection of over 250,000 datasets tailored for various ML tasks.
- **Spaces**: Utilize our cloud-based environments to run applications and collaborate seamlessly with the community.

## Features
- **Inclusive Collaboration**: The Hugging Face platform encourages users to contribute and share their work, fostering a supportive and innovative environment.
- **Open-Source Tools**: We provide cutting-edge open-source libraries and frameworks such as Transformers, Diffusers, and Tokenizers to facilitate ML development.
- **Enterprise Solutions**: With dedicated support, enterprise-grade security, and optimized compute infrastructure, we serve over 50,000 organizations, including tech giants like Google, Amazon, and Microsoft.

## Community Engagement
Hugging Face is built on the contributions of an active community of over 49,737 members. Our users collaborate daily, updating models, datasets, and sharing insights through articles and discussions. We value open-source principles and continually strive to empower each individual in their ML journey.

## Company Culture
At Hugging Face, we cultivate an inclusive and innovative workplace. Our culture thrives on creativity, collaboration, and shared knowledge. We are passionate advocates for ethical AI practices and strive to build a workplace where diverse perspectives are valued and ideas can flourish.

## Careers
Join us in our mission! We are always on the lookout for talented individuals who are excited about AI and machine learning. We offer a range of positions across various departments with opportunities for growth and learning. If you're passionate about shaping the future of technology, consider becoming a part of our team.

## Contact Us
Discover more about our community and offerings by visiting our website [Hugging Face](https://huggingface.co). For inquiries about partnerships, careers, or other queries, feel free to reach out or sign up for our newsletter. Together, we can build the future of AI! 

---

*Hugging Face – The AI community building the future.*

In [None]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this exercise we extended the Day 1 code to make multiple LLM calls, and generate a document.

This is perhaps the first example of Agentic AI design patterns, as we combined multiple calls to LLMs. This will feature more in Week 2, and then we will return to Agentic AI in a big way in Week 8 when we build a fully autonomous Agent solution.

Generating content in this way is one of the most common use cases. As with summarization, this can be applied to any business vertical. Write marketing content, generate a product tutorial from a spec, create personalized email content, and so much more. Explore how you can apply content generation to your business, and try making yourself a proof-of-concept prototype. See what other students have done in the community-contributions folder -- so many valuable projects -- it's wild!</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you move to Week 2 (which is tons of fun)</h2>
            <span style="color:#900;">Please see the week1 EXERCISE notebook for your challenge for the end of week 1. This will give you some essential practice working with Frontier APIs, and prepare you well for Week 2.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">A reminder on 3 useful resources</h2>
            <span style="color:#f71;">1. The resources for the course are available <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">here.</a><br/>
            2. I'm on LinkedIn <a href="https://www.linkedin.com/in/eddonner/">here</a> and I love connecting with people taking the course!<br/>
            3. I'm trying out X/Twitter and I'm at <a href="https://x.com/edwarddonner">@edwarddonner</a> and hoping people will teach me how it's done.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../thankyou.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#090;">Finally! I have a special request for you</h2>
            <span style="color:#090;">
                My editor tells me that it makes a MASSIVE difference when students rate this course on Udemy - it's one of the main ways that Udemy decides whether to show it to others. If you're able to take a minute to rate this, I'd be so very grateful! And regardless - always please reach out to me at ed@edwarddonner.com if I can help at any point.
            </span>
        </td>
    </tr>
</table>