# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a brochure for a company, aimed at prospective clients, investors and potential recruits.

We will be given a company name and its primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [1]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialize and constants

selected_model = "local"

load_dotenv(override=True)
api_key = os.getenv('PERPLEXITY_API_KEY')

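if selected_model == "perplexity":
    MODEL, base_url = "sonar-pro", "https://api.perplexity.ai"
else:
    # Local Ollama server: the client needs an API key value, but Ollama does not check it
    MODEL, api_key, base_url = "gemma3", "ollama", "http://localhost:11434/v1"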

openai = OpenAI(base_url=base_url,api_key=api_key)
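
The provider choice above could also be written as a small lookup that fails early when the configuration is incomplete. This is only a sketch: `PROVIDERS` and `resolve_provider` are hypothetical names, not part of the notebook, and unlike the cell above it raises on an unknown provider name instead of falling back to the local model.

```python
import os

# Hypothetical provider table; the two entries mirror the values used in the cell above.
PROVIDERS = {
    "perplexity": {
        "model": "sonar-pro",
        "base_url": "https://api.perplexity.ai",
        "key_env": "PERPLEXITY_API_KEY",
    },
    "local": {
        "model": "gemma3",
        "base_url": "http://localhost:11434/v1",
        "key_env": None,  # no real key needed for the local server
    },
}

def resolve_provider(name, env=os.environ):
    """Return (model, api_key, base_url) for a provider name, failing early on bad config."""
    if name not in PROVIDERS:
        raise ValueError(f"Unknown provider {name!r}; expected one of {sorted(PROVIDERS)}")
    cfg = PROVIDERS[name]
    if cfg["key_env"] is None:
        api_key = "ollama"  # placeholder value, as in the original cell
    else:
        api_key = env.get(cfg["key_env"])
        if not api_key:
            raise RuntimeError(f"{cfg['key_env']} is not set; check your .env file")
    return cfg["model"], api_key, cfg["base_url"]
```

The result can be unpacked as `MODEL, api_key, base_url = resolve_provider(selected_model)` before creating the client.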


In [3]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        # A timeout stops a slow or unresponsive site from hanging the notebook
        response = requests.get(url, headers=headers, timeout=30)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [4]:
ed = Website("https://edwarddonner.com")
ed.links

['https://edwarddonner.com/',
 'https://edwarddonner.com/connect-four/',
 'https://edwarddonner.com/outsmart/',
 'https://edwarddonner.com/about-me-and-about-nebula/',
 'https://edwarddonner.com/posts/',
 'https://edwarddonner.com/',
 'https://news.ycombinator.com',
 'https://nebula.io/?utm_source=ed&utm_medium=referral',
 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html',
 'https://patents.google.com/patent/US20210049536A1/',
 'https://www.linkedin.com/in/eddonner/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/05/18/2025-ai-executive-briefing/',
 'https://edwarddonner.com/2025/04/21/the-complete-agentic-ai-engineering-course/',
 'https://edwarddonner.com/2025/04/21/the-

In [5]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [6]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [7]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [8]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddo

In [9]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)
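
`json.loads` raises if the model returns anything other than valid JSON, and even valid JSON may not follow the example shape in the system prompt. A minimal defensive sketch, assuming we would rather get an empty list than an exception (`parse_links_response` is a hypothetical helper, not part of the notebook):

```python
import json

def parse_links_response(text):
    """Parse the model's reply and return a list of {"type", "url"} dicts, or [] if unusable."""
    try:
        data = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return []
    links = data.get("links") if isinstance(data, dict) else None
    if not isinstance(links, list):
        return []
    valid = []
    for item in links:
        # Keep only entries that carry a full http(s) URL, as the prompt requests
        if isinstance(item, dict) and isinstance(item.get("url"), str) and item["url"].startswith("http"):
            valid.append({"type": str(item.get("type", "unknown")), "url": item["url"]})
    return valid
```

Whether to drop bad entries silently or log them is a design choice; for a brochure, missing one page is usually less harmful than failing the whole run.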

In [10]:
# Anthropic has made their site harder to scrape, so I'm using Hugging Face instead.

huggingface = Website("https://huggingface.co")
huggingface.links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/openai/gpt-oss-120b',
 '/openai/gpt-oss-20b',
 '/Qwen/Qwen-Image',
 '/tencent/Hunyuan-1.8B-Instruct',
 '/rednote-hilab/dots.ocr',
 '/models',
 '/spaces/Qwen/Qwen-Image',
 '/spaces/enzostvs/deepsite',
 '/spaces/black-forest-labs/FLUX.1-Krea-dev',
 '/spaces/Qwen/Qwen3-Coder-WebDev',
 '/spaces/Wan-AI/Wan-2.2-5B',
 '/spaces',
 '/datasets/fka/awesome-chatgpt-prompts',
 '/datasets/HuggingFaceH4/Multilingual-Thinking',
 '/datasets/nvidia/Nemotron-Post-Training-Dataset-v1',
 '/datasets/spatialverse/InteriorGS',
 '/datasets/spatialverse/InteriorAgent',
 '/datasets',
 '/join',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google',
 '/Intel',
 '/microsoft',
 '/grammarly',
 '/Writer',
 '/docs/transformers',
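
The list above contains relative links and many repeats. One way to tidy it before sending it to the model is to resolve each link against the page URL with the standard library's `urljoin` and drop duplicates. This is a sketch; `normalize_links` is a hypothetical helper, and the `mailto:` entry in the sample is made up for illustration.

```python
from urllib.parse import urljoin, urlparse

def normalize_links(base_url, links):
    """Resolve relative links against base_url, keep http(s) only, drop duplicates in order."""
    seen = set()
    result = []
    for link in links:
        absolute = urljoin(base_url, link)
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skips mailto:, javascript:, tel: and similar
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result

sample = ['/', '/models', '/models', '/pricing#endpoints', 'mailto:hi@example.com']
print(normalize_links("https://huggingface.co", sample))
```

Sending fewer, absolute links shortens the prompt and leaves the model less room to guess at full URLs.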


In [11]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': 'https://huggingface.co/'},
  {'type': 'models', 'url': 'https://huggingface.co/models/'},
  {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'},
  {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'models', 'url': 'https://huggingface.co/models/'},
  {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'},
  {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
  {'type': 'documentation', 'url': 'https://huggingface.co/docs/'},
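
The model's answer above repeats the same URLs several times, and each repeat would be fetched again in the next step. A small post-processing sketch that keeps the first occurrence of each URL (`dedupe_links` is a hypothetical helper, not part of the notebook):

```python
def dedupe_links(result):
    """Return a get_links-style dict with repeated URLs removed; the first occurrence wins."""
    seen = set()
    unique = []
    for link in result.get("links", []):
        key = link["url"].rstrip("/")  # treat trailing-slash variants as the same page
        if key not in seen:
            seen.add(key)
            unique.append(link)
    return {"links": unique}
```

It could be applied as `dedupe_links(get_links(url))` before looping over the links.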
 

## Second step: make the brochure!

Assemble all the details into another prompt for the model selected above.

In [63]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result
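
In `get_all_details`, a single page that fails to load would abort the whole run. A hedged sketch of a more forgiving loop, where the fetching step is passed in as a function so the logic can be tried without network access (`gather_pages` and its `fetch` parameter are hypothetical names):

```python
def gather_pages(links, fetch):
    """Fetch each link with fetch(url) -> str, skipping pages that raise.

    Returns the combined text and a list of (url, error message) for skipped pages.
    """
    sections = []
    skipped = []
    for link in links:
        try:
            sections.append(f"\n\n{link['type']}\n{fetch(link['url'])}")
        except Exception as exc:  # broad on purpose: one bad page should not abort the brochure
            skipped.append((link["url"], str(exc)))
    return "".join(sections), skipped
```

In the notebook this might be called as `gather_pages(links["links"], lambda u: Website(u).get_contents())`.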

In [64]:
print(get_all_details("https://huggingface.co"))

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/diffusers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/safetensors/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/huggingface_hub/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/tokenizers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/trl/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers.js/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/smolagents/'}, {'type': 'documentation', 'url': 'https://hu

In [71]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information. Do not ask any follow-up questions."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."
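
Rather than keeping two near-identical prompts and commenting one out, the tone could be a parameter. A minimal sketch, assuming the prompt wording above; `BASE_PROMPT` and `build_system_prompt` are hypothetical names:

```python
BASE_PROMPT = (
    "You are an assistant that analyzes the contents of several relevant pages from a company website "
    "and creates a short {style}brochure about the company for prospective customers, investors and recruits. "
    "Respond in markdown. "
    "Include details of company culture, customers and careers/jobs if you have the information."
)

def build_system_prompt(tone=None):
    """Return the brochure system prompt, optionally with a tone such as 'humorous'."""
    style = f"{tone} " if tone else ""
    return BASE_PROMPT.format(style=style)

print(build_system_prompt("humorous, entertaining, jokey"))
```

Using adjacent string literals inside parentheses also avoids the missing-space problem that backslash line continuations make easy to introduce.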


In [72]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
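
The `[:5_000]` slice cuts the combined prompt from the end, so when the landing page is long the later pages may be dropped entirely. One alternative, sketched here under the assumption that each page's text is available as a separate string, is to give every page an equal share of the budget (`truncate_sections` is a hypothetical helper):

```python
def truncate_sections(sections, total_budget=5_000):
    """Give each section an equal share of the character budget, then join them."""
    if not sections:
        return ""
    share = total_budget // len(sections)
    return "".join(section[:share] for section in sections)

pages = ["Landing page: " + "x" * 10_000, "About page: " + "y" * 10_000]
print(len(truncate_sections(pages)))
```

An equal split is the simplest policy; weighting the landing page more heavily would be an equally reasonable choice.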

In [73]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/diffusers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/safetensors/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/huggingface_hub/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/tokenizers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/trl/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers.js/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/smolagents/'}, {'type': 'documentation', 'url': 'https://hu

'You are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHugging Face – The AI community building the future.\nWebpage Contents:\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nNEW\nGet started with Inference in seconds 🚀\nReachy Mini: The Open Robot for AI Builders\nWelcome Cohere on the Hub 🔥\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\nopenai/gpt-oss-120b\nUpdated\nless than a minute ago\n•\n37.8k\n•\n2.36k\nopenai/gpt-oss-20b\nUpdated\nless than a minute ago\n•\n91.2k\n•\n1.99k\nQwen/Qwen-Image\nUpdated\nabout 10 hours ago\n•\n21.7k\n•\n1.11k\nblack-forest-labs/FLUX.1-Krea-dev\nUpdated\n6 days ago\n•\n

In [74]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [75]:
create_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'docs', 'url': 'https://huggingface.co/docs/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'docs', 'url': 'https://huggingface.co/docs/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'docs', 'url': 'https://huggingface.co/docs/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'docs', 'url': 'https://huggingface.co/docs/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'},

## Hugging Face: Building the Future of AI Together

**(Image: A dynamic graphic showcasing diverse AI applications - text generation, image creation, robotics - all powered by Hugging Face.)**

**Hugging Face** is the leading collaborative platform for the machine learning community, empowering developers and researchers to build, share, and deploy state-of-the-art AI models and applications.

**What We Offer:**

* **A Massive Model Ecosystem:** Discover and utilize over **1 Million+ pre-trained models** across diverse modalities including text, image, video, and audio.  Popular models include OpenAI’s GPT-OSS, Qwen/Qwen-Image, and many more.
* **Collaborative Spaces:** Our **Hub** brings together developers, researchers, and industry professionals to share knowledge, contribute to open-source projects, and accelerate innovation. 
* **Easy Deployment:** Quickly deploy your models with **Inference Endpoints** – optimized for speed and efficiency.  Start with a $0.60/hour GPU price!
* **Enterprise Solutions:**  Scale your AI initiatives with our Enterprise solutions, offering security, access controls, dedicated support, and priority services.  Starting at $20/user/month.
* **Community Driven:**  Join a vibrant community exploring cutting edge technology.

**Key Features:**

* **Spaces:** Create and share interactive AI applications.
* **Datasets:** Access and contribute a vast collection of datasets.
* **Models:**  Leverage a comprehensive suite of models across all current AI modalities. 
* **Quick Deployment:** Deploy with Inference Endpoints in seconds. 

**Who Uses Hugging Face?**

* **Companies:** Meta, Amazon, Google, Intel, Microsoft, Grammarly, and many more
* **Non-Profits:** AI at Meta
* **Teams:** Grammerly, Writer and more

**(Small images & logos of some of the companies listed above.)**

**Join the Movement:**

**Careers with Hugging Face:**

We're building the future of AI, and we're looking for passionate individuals to join our team!  Explore current opportunities here: [Link to Hugging Face Jobs Page]

**Resources:**

* **Website:** [https://huggingface.co/](https://huggingface.co/)
* **Hub:** [https://huggingface.co/](https://huggingface.co/)
* **Community Forum:** [Link to Forum]

**(Social Media Icons: GitHub, Twitter, LinkedIn, Discord)** 

---

**(Note: This brochure was created using only the information from the webpage snippets provided. It leans heavily on the breadth of the offerings and community aspects.)**

In [76]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
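        response += chunk.choices[0].delta.content or ''
        # Clean a copy of the full text so far: this removes code-fence markers (even when split
        # across chunks) without deleting the word "markdown" from ordinary prose
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)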
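
The accumulate-and-clean pattern in the loop above can be tried without a model by feeding it plain strings in place of streamed chunks. This is a sketch with made-up pieces; `accumulate_stream` is a hypothetical helper. This variant strips only the fence markers from the accumulated text, so a fence split across two pieces is still removed once it is complete, and the word "markdown" in normal prose is kept.

```python
def accumulate_stream(pieces):
    """Yield the cleaned text after each piece, the way a live display would be refreshed."""
    raw = ""
    for piece in pieces:
        raw += piece or ""  # a piece may be None, like an empty streamed delta
        yield raw.replace("```markdown", "").replace("```", "")

# Made-up pieces: the opening fence is deliberately split across the first two
fake_pieces = ["```mark", "down\n# Hugging", " Face\n", None, "Written in markdown.\n```"]
for snapshot in accumulate_stream(fake_pieces):
    pass
print(snapshot)
```

Because the cleaning runs on the full accumulated text each time, an intermediate snapshot can briefly show a partial marker, which is then corrected on the next refresh.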

In [77]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/diffusers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/safetensors/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/huggingface_hub/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/tokenizers/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/trl/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/transformers.js/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/smolagents/'}, {'type': 'documentation', 'url': 'https://hu

## Hugging Face: The AI Community Building the Future

**A Platform for Innovation in Machine Learning**

Hugging Face is a collaborative platform empowering the global machine learning community. We provide the tools and resources needed to build, share, and deploy cutting-edge AI models and applications.

**What We Offer:**

* **A Vast Ecosystem of Models:** Explore over 1 Million+ models across text, image, video, and audio modalities.  Trending models include OpenAI's GPT-OSS, Qwen/Qwen-Image and many more.
* **Datasets Galore:** Access over 250,000 datasets to fuel your machine learning projects.
* **Spaces for Experimentation:** Quickly deploy and showcase your models and applications using Hugging Face Spaces.
* **Enterprise Solutions:** Scale your AI initiatives with our paid Compute and Enterprise solutions, including priority support, dedicated security, and access controls.

**Key Features:**

* **Community-Driven:**  Join a vibrant community of AI builders, researchers, and developers.
* **Easy Deployment:**  Deploy models in seconds with Inference Endpoints.
* **Scalable Infrastructure:** Benefit from our optimized Compute solutions.
* **Accessible Pricing:** Starting at $0.60/hour for GPU usage and $20/user/month for Enterprise.


**Who Uses Hugging Face?**

Hugging Face is a favorite among:

* **Companies:**  Amazon, Meta, Google, Microsoft, Grammarly, and many more are leveraging our platform.
* **Research Institutions:**  Advance AI research with our open-source tools and extensive dataset offerings.
* **Startups:** Quickly prototype and launch innovative AI applications.


**Get Involved!**

* **Sign Up:** Start building today – [https://huggingface.co/](https://huggingface.co/)
* **Explore the Hub:** [https://huggingface.co/](https://huggingface.co/)
* **Join the Community:** [https://huggingface.co/community](https://huggingface.co/community)

**Hugging Face –  Building the Future of AI, Together.** 


In [78]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/'}, {'type': 'models', 'url': 'https://huggingface.co/models/'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'spaces', 'url': 'https://huggingface.co/spaces/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'inference', 'url': 'https://endpoints.huggingface.co'}, {'type': 'datasets', 'url': 'https://huggingface.co/datasets/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}, {'type': 'documentation', 'url': 'https://huggingface.co/docs/'}]}


# Hugging Face: The AI Community Building the Future

**A Collaborative Platform for Machine Learning Innovation**

Hugging Face is the leading platform empowering the global machine learning community. We provide the tools and resources needed to create, discover, and deploy cutting-edge AI models and applications.

**What We Offer:**

* **1 Million+ Models:** Explore a vast library of pre-trained models across diverse modalities – text, image, audio, video, and 3D – covering everything from large language models to diffusion models and more.
* **Comprehensive Datasets:** Access and share over 250,000 datasets to accelerate your ML projects.
* **Spaces:** Quickly deploy and showcase your AI applications with our user-friendly Spaces platform. 
* **Community-Driven:**  Join a vibrant community of researchers, developers, and enthusiasts.
* **Enterprise Solutions:**  Scale your AI initiatives with our dedicated solutions, offering enterprise-grade security, priority support, and access controls.

**Key Features:**

* **Easy Model Deployment:** Deploy your models with optimized Inference Endpoints or through Spaces applications within seconds.
* **Scalable Infrastructure:**  Leverage our infrastructure for seamless model deployment and scaling.
* **Open Source Tools:** Build your ML foundation with our core open-source libraries:
    * **Transformers:** State-of-the-art AI models for PyTorch
    * **Diffusers:** State-of-the-art Diffusion models in PyTorch
    * **Safetensors:** Safe way to store/distribute neural network weights
    * **Hub Python Library:** Python client to interact with the Hugging Face Hub
* **Financial Solutions:** Offers paid compute solutions.

**Who Uses Hugging Face?**

Hugging Face is trusted by a diverse range of organizations, including:

* **Meta AI**
* **Google**
* **Intel**
* **Microsoft**
* **Grammarly**
* **Amazon**
* **Non-profit organizations** 
* **AI at Meta**

**Get Involved!**

* **Explore the Hub:** Discover and experiment with thousands of models and datasets. [https://huggingface.co/](https://huggingface.co/)
* **Join the Community:**  Connect with fellow builders on our forums and social channels. ([GitHub](https://github.com/huggingface), [Twitter](https://twitter.com/huggingface), [LinkedIn](https://www.linkedin.com/huggingface/), [Discord](https://discord.gg/huggingface)) 


**Ready to build the future of AI?**
