# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [1]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialize and constants

# Load the key from a .env file rather than hard-coding it in the notebook
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [3]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
ed.get_contents()

'Webpage Title:\nHome - Edward Donner\nWebpage Contents:\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,\nacquired in 2021\n.\nWe work with groundbreaking, proprietary LLMs verticalized for talent, we’ve\npatented\nour matching model, and our award-winning platfor

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
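
The model decides which links are relevant, but the mechanical part of the job (turning "/about" into a full URL) can also be done deterministically before the list is sent. The sketch below is one possible pre-processing step, not part of the original notebook: `normalize_links` is a hypothetical helper name, and it assumes we only want `http`/`https` links with duplicates removed.

```python
from urllib.parse import urljoin, urlparse


def normalize_links(base_url, links):
    """Resolve relative links against base_url; drop duplicates and non-web links."""
    seen = set()
    result = []
    for link in links:
        # urljoin leaves absolute URLs untouched and resolves relative ones
        absolute = urljoin(base_url, link)
        # Keep only http(s) URLs; this skips mailto:, tel: and similar schemes
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


example = ["/models", "/models", "mailto:hi@example.com", "/pricing#spaces"]
print(normalize_links("https://huggingface.co", example))
```

Sending a shorter, already-absolute list should reduce prompt size and leave the model with only the judgment call about relevance.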

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
    ]
}



In [9]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [10]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2024/12/21/

In [11]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)
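
Even with `response_format={"type": "json_object"}`, the reply is only guaranteed to be JSON, not to match the shape shown in the prompt. A minimal sketch of a defensive parser follows; `parse_links_response` is a hypothetical helper (not in the original notebook), and the rule that every entry needs a string `type` and an absolute `url` is an assumption about what the later steps need.

```python
import json


def parse_links_response(raw):
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    links = data.get("links", [])
    if not isinstance(links, list):
        return {"links": []}
    valid = [
        {"type": item["type"], "url": item["url"]}
        for item in links
        if isinstance(item, dict)
        and isinstance(item.get("type"), str)
        and isinstance(item.get("url"), str)
        # Relative URLs cannot be fetched later, so they are dropped here
        and item["url"].startswith(("http://", "https://"))
    ]
    return {"links": valid}
```

With a helper like this, the last line of `get_links` could return `parse_links_response(result)` instead of calling `json.loads` directly, so a malformed reply yields an empty link list rather than an exception further down.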

In [12]:
# Anthropic has made their site harder to scrape, so I'm using HuggingFace..

huggingface = Website("https://huggingface.co")
huggingface.links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/posts',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/deepseek-ai/DeepSeek-V3-0324',
 '/Qwen/Qwen2.5-Omni-7B',
 '/Qwen/Qwen2.5-VL-32B-Instruct',
 '/ds4sd/SmolDocling-256M-preview',
 '/manycore-research/SpatialLM-Llama-1B',
 '/models',
 '/spaces/enzostvs/deepsite',
 '/spaces/ByteDance/InfiniteYou-FLUX',
 '/spaces/3DAIGC/LHM',
 '/spaces/starvector/starvector-1b-im2svg',
 '/spaces/Qwen/Qwen2.5-Omni-7B-Demo',
 '/spaces',
 '/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset-v1',
 '/datasets/glaiveai/reasoning-v1-20m',
 '/datasets/a-m-team/AM-DeepSeek-R1-Distilled-1.4M',
 '/datasets/PixelAI-Team/TalkBody4D',
 '/datasets/FreedomIntelligence/medical-o1-reasoning-SFT',
 '/datasets',
 '/join',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google

In [None]:
get_links("https://huggingface.co")

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [13]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [14]:
print(get_all_details("https://huggingface.co"))

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'company profile on LinkedIn', 'url': 'https://www.linkedin.com/company/huggingface/'}]}
Landing page:
Webpage Title:
Hugging Face – The AI community building the future.
Webpage Contents:
Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 1M+ models
Trending on
this week
Models
deepseek-ai/DeepSeek-V3-0324
Updated
4 days ago
•
71.2k
•
2.05k
Qwen/Qwen2.5-Omni-7B
Updated
3 days ago
•
35.8k
•
893
Qwen/Qwen2.5-V

In [15]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."


In [17]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
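
Slicing the whole prompt at 5,000 characters means a long landing page can crowd out every page after it. One alternative, sketched here under the assumption that the page texts are available as a list of strings, is to give each page an equal share of the budget. `truncate_sections` is a hypothetical helper and is a different technique from the simple slice used above.

```python
def truncate_sections(sections, max_chars=5_000):
    """Give each section an equal share of max_chars so no page is dropped entirely."""
    if not sections:
        return ""
    per_section = max_chars // len(sections)
    # Each section is cut independently, so later pages always keep some text
    return "".join(section[:per_section] for section in sections)


landing = "a" * 10_000
careers = "b" * 10_000
combined = truncate_sections([landing, careers])
print(len(combined), "b" in combined)
```

The trade-off is that a short section leaves part of its share unused; a more careful version could redistribute the leftover budget.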

In [18]:
get_brochure_user_prompt("Anthropic", "https://anthropic.co")

Found links: {'links': [{'type': 'company page', 'url': 'https://anthropic.co'}, {'type': 'anthropic labs', 'url': 'https://anthropiclabs.com'}, {'type': 'anthropic labs', 'url': 'http://anthropiclabs.com/'}]}


'You are looking at a company called: Anthropic\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nAnthropic | Reaching the human element.\nWebpage Contents:\nCall Sales: +1 972 432 7040\ncontact@anthropiclabs.com\nMon - Fri 10:00 - 16:00 CT\nSat and Sun - CLOSED\nHome\nWe build prototypes.\nHardware is hard. We can help.\nUnique solutions are our specialty.\nConsult with us on your next project.\nResearch & Development\nWe aggregate data and known solutions from multiple industries to compile a fresh perspective on the challenge at hand. Balancing innovation and evolution is key.\nDesign Consulting\nWe reverse engineer design from the desired result. Comprehensive analysis of the final requirements ensures a seamless project strategy and leads to the most complete execution plan.\nPrototype Execution\nBalancing speed, accuracy, and cost are crucial. We iterate q

In [19]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [20]:
create_brochure("Anthropic", "https://anthropic.co")

Found links: {'links': [{'type': 'company page', 'url': 'https://anthropic.co'}, {'type': 'company page', 'url': 'https://anthropiclabs.com'}, {'type': 'company page', 'url': 'http://anthropiclabs.com'}]}


```markdown
# Welcome to Anthropic

**Reaching the Human Element**

---

## About Us

At Anthropic, we specialize in transforming complex hardware challenges into innovative prototypes. Based in Dallas, TX, our team is dedicated to providing unique solutions tailored to meet our clients' needs. 

---

## Our Services

### Research & Development
We aggregate data and insights from multiple industries to gather fresh perspectives on challenges. Our focus is on balancing innovation with evolution.

### Design Consulting
We reverse engineer designs based on desired outcomes. By analyzing final requirements comprehensively, we ensure seamless project strategies and execution plans.

### Prototype Execution
Focusing on speed, accuracy, and cost, we work closely with our clients, iterating quickly to keep the build process efficient and aligned with goals.

---

## Recent Projects

- **Autonomous Drone System**
- **Wireless Carrier CPE Enclosure**
- **Computer Vision Mounting System**
- **Robotic Track System**
- **Interactive Product Surface**
- **Racing Seat Mounts**
- **Variable Tension Hinges**
- **Automotive Part Prototypes**
- **Architectural Models**
- **Manifold Mounts**

---

## Client Testimonials

> "The Anthropic team has proven invaluable consulting for our launch into IoT. We especially appreciate the work ethic behind each deliverable. I highly recommend Anthropic’s services if true innovation is one of your goals."  
— **Skip Howard**, Co-founder & CEO, Spacee

---

## Company Culture 

At Anthropic, we foster a collaborative and innovative environment where creativity thrives. Our team is passionate about pushing the boundaries of technology and design. We believe in the power of diversity and inclusion to spark new ideas and solutions.

---

## Careers

We are always looking for talented individuals who are eager to take on challenges and drive innovation. If you're interested in joining our team, reach out via the contact information below.

---

## Contact Us

**Phone:** +1 972 432 7040  
**Email:** contact@anthropiclabs.com  
**Hours:** Mon - Fri: 10:00 AM - 4:00 PM CT  
**Address:** 17217 Waterview Pkwy, Dallas, TX 75252   

---

Join us in reaching new heights with innovative hardware solutions.
```


## Finally - a minor improvement

With a small adjustment, we can stream the results back from OpenAI, with the familiar typewriter animation.

In [21]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
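
Note that `.replace("markdown", "")` removes the word "markdown" wherever it appears, including in the body of the brochure. A narrower option, sketched below, strips only an opening fence line and a closing fence. `strip_markdown_fence` is a hypothetical helper name, and it assumes the model wraps the whole reply in a single fenced block.

```python
FENCE = "`" * 3  # three backticks, built this way so the example is easy to embed


def strip_markdown_fence(text):
    """Remove an opening fence line (with or without a language tag) and a closing fence."""
    stripped = text.strip()
    if stripped.startswith(FENCE):
        # Drop the whole first line; while it is still incomplete, show nothing
        first_newline = stripped.find("\n")
        stripped = stripped[first_newline + 1:] if first_newline != -1 else ""
    if stripped.endswith(FENCE):
        stripped = stripped[: -len(FENCE)]
    return stripped.strip()


sample = FENCE + "markdown\n# Title\nWe respond in markdown.\n" + FENCE
print(strip_markdown_fence(sample))
```

In the streaming loop, this would mean accumulating the raw chunks in `response` and passing `strip_markdown_fence(response)` to `Markdown(...)` on each update, instead of rewriting `response` itself.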

In [22]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'company page', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}]}


# Hugging Face: The AI Community Building the Future

**Welcome to Hugging Face!** We are more than just a company; we are a vibrant AI community dedicated to innovation and collaboration in the realm of machine learning. 

## Who We Are

At **Hugging Face**, we provide a platform where the machine learning community can come together to collaborate on models, datasets, and applications. Our mission is to democratize AI by building open-source tools and creating a welcoming environment for researchers, developers, and enthusiasts alike.

### Our Offerings
- **Models**: Access over 1 million high-quality models, including state-of-the-art architectures for text, image, video, audio, and even 3D applications.
- **Datasets**: Discover and share more than 250,000 datasets tailored for various machine learning tasks.
- **Spaces**: Engage with a variety of applications—742 are currently running and waiting for you to explore!
- **Enterprise Solutions**: Offering advanced features aimed at organizations, including enterprise-grade security and dedicated support.

## Our Customers

More than **50,000 organizations** use Hugging Face as their go-to platform, including industry giants like **Google**, **Amazon**, **Microsoft**, and **Meta**. Our community spans across various sectors—from non-profits to for-profit enterprises—showcasing the versatility and scalability of our solutions.

## Company Culture

Our culture is rooted in **collaboration**, **innovation**, and **inclusivity**. We are committed to fostering a supportive environment that encourages creativity and learning. Team members often engage in community-driven projects and share their findings through blog posts and forums, enabling knowledge transfer and continual growth.

### Join Us

We are continuously looking for passionate and talented individuals to join our team! At Hugging Face, you will have the opportunity to:

- **Grow Your Career**: Work alongside some of the brightest minds in AI and contribute to impactful projects.
- **Shape the Future of AI**: Play a critical role in advancing technology that will reshape the world.
- **Flexible Work Environment**: Embrace a culture that values work-life balance.

Explore our **[Careers Page](https://huggingface.co/jobs)** to see current openings and become part of our innovative journey!

## Connect with Us

Join our community to share knowledge, collaborate, and build the future of AI together:
- **GitHub** | **Twitter** | **LinkedIn** | **Discord**

---

### Start your journey with Hugging Face today!
**Explore our platform, dive into our resources, and help us shape the future of AI.**
**[Sign Up Now](https://huggingface.co/signup)**

For more information, visit our [website](https://huggingface.co).

In [23]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'models page', 'url': 'https://huggingface.co/models'}, {'type': 'datasets page', 'url': 'https://huggingface.co/datasets'}, {'type': 'spaces page', 'url': 'https://huggingface.co/spaces'}, {'type': 'forum page', 'url': 'https://discuss.huggingface.co'}, {'type': 'status page', 'url': 'https://status.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Brochure

## Welcome to Hugging Face
### The AI Community Building the Future

At Hugging Face, we are creating a collaborative platform where the machine learning (ML) community can come together to innovate, share, and advance AI technologies. Since our inception, we have positioned ourselves at the forefront of the AI revolution, enabling enthusiasts and professionals alike to contribute to and benefit from a vibrant ecosystem of models, datasets, and applications.

---

## Our Offerings

### Explore AI Innovations
- **1M+ Models**: Discover, develop, and train cutting-edge AI models across various applications. 
- **250k+ Datasets**: Access a diverse range of datasets tailored for any ML task.
- **Spaces**: Host and collaborate on unlimited public models and applications.

### Create and Collaborate
Join a community of over 50,000 organizations, including major tech leaders like Google, Microsoft, and Amazon, using our platform for their machine learning needs. With options for both individual users and enterprises, we ensure that everyone from hobbyists to large teams can find the right tools to accelerate their projects.

---

## Company Culture

### Empowering Innovation
Our culture at Hugging Face reflects the values of openness and collaboration. We believe that the best innovations come from diverse perspectives and shared knowledge. Whether you are an engineer, researcher, or creative, you’ll find a welcoming space to experiment, learn, and grow.

### Community Engagement
We actively encourage our users to contribute back to the community. Through forums, blogs, and collaborative projects, we promote continuous learning and collective progress. 

---

## Careers at Hugging Face

### Join Our Team
We are always looking for passionate individuals who are excited about AI and have a desire to make a meaningful impact. Our team of over 200 members is dedicated to building state-of-the-art open-source tools and fostering a culture of innovation. 

### Current Opportunities
We invite you to explore job openings on our [Careers Page](https://huggingface.co/jobs) and consider becoming a part of our mission to transform AI through community collaboration.

---

## Why Choose Hugging Face?

- **Open Source Leadership**: We are pioneers in open-source tools for machine learning, offering a suite of libraries including **Transformers**, **Diffusers**, and **Tokenizers** among others.
- **Enterprise Solutions**: With options for dedicated support and enterprise-grade security, we cater to organizations that require robust solutions tailored to their needs.
- **Continuous Growth**: With frequent updates and a commitment to evolving technologies, you will always have access to the latest advancements in AI.

---

Join us in building the future of AI. Together, we can achieve remarkable things!

[Visit Our Website](https://huggingface.co) | [Follow Us on Twitter](https://twitter.com/huggingface) | [Join Our Community on Discord](https://discord.com/invite/huggingface)

--- 

*Hugging Face: Your partner in AI innovation.*

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this exercise we extended the Day 1 code to make multiple LLM calls, and generate a document.

This is perhaps the first example of Agentic AI design patterns, as we combined multiple calls to LLMs. This will feature more in Week 2, and then we will return to Agentic AI in a big way in Week 8 when we build a fully autonomous Agent solution.

Generating content in this way is one of the most common use cases. As with summarization, this can be applied to any business vertical. Write marketing content, generate a product tutorial from a spec, create personalized email content, and so much more. Explore how you can apply content generation to your business, and try making yourself a proof-of-concept prototype. See what other students have done in the community-contributions folder -- so many valuable projects -- it's wild!</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you move to Week 2 (which is tons of fun)</h2>
            <span style="color:#900;">Please see the week1 EXERCISE notebook for your challenge for the end of week 1. This will give you some essential practice working with Frontier APIs, and prepare you well for Week 2.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">A reminder on 3 useful resources</h2>
            <span style="color:#f71;">1. The resources for the course are available <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">here.</a><br/>
            2. I'm on LinkedIn <a href="https://www.linkedin.com/in/eddonner/">here</a> and I love connecting with people taking the course!<br/>
            3. I'm trying out X/Twitter and I'm at <a href="https://x.com/edwarddonner">@edwarddonner</a> and hoping people will teach me how it's done..  
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../thankyou.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#090;">Finally! I have a special request for you</h2>
            <span style="color:#090;">
                My editor tells me that it makes a MASSIVE difference when students rate this course on Udemy - it's one of the main ways that Udemy decides whether to show it to others. If you're able to take a minute to rate this, I'd be so very grateful! And regardless - always please reach out to me at ed@edwarddonner.com if I can help at any point.
            </span>
        </td>
    </tr>
</table>