## THE PROJECT

This project builds a brochure for a company, aimed at prospective clients, investors, and potential recruits. The inputs provided to the model are the company name and the company's primary website.



### INSTALLATION

0. Ensure you have an IDE installed; Cursor or VS Code is preferred.
1. Install `uv` from https://docs.astral.sh/uv/getting-started/installation/
2. Create an API key from OpenAI via https://platform.openai.com/
3. Clone this repo.
4. Run `uv sync`.
5. Add the API key created in step 2 to a `.env` file as `OPENAI_API_KEY=sk-your-copied-key`


In [None]:
# imports

import os
import json
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from scraper import fetch_website_links, fetch_website_contents # custom library to fetch website links and contents
from openai import OpenAI

In [4]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API OK")
else:
    print( "There is a problem with your API key. Make sure it's a valid OpenAI API key and try again." )
    
MODEL = 'gpt-5-nano'

openai = OpenAI()

API OK


In [5]:
links = fetch_website_links("https://solomonadebayo.org")
links

['https://www.solomonadebayo.org',
 'https://www.solomonadebayo.org/resume',
 'https://www.solomonadebayo.org/projects',
 'https://www.solomonadebayo.org/contact',
 'https://www.solomonadebayo.org/resume',
 'https://www.solomonadebayo.org/projects',
 'https://www.solomonadebayo.org/contact',
 'mailto:me@solomonadebayo.org',
 'http://www.instagram.com/j.soloz',
 'http://linkedin.com/in/soloz',
 'https://www.twitter.com/soloz']

## Step 1: Figure out which links are relevant using the gpt-5-nano model


Let gpt-5-nano decide which links on the web page are relevant and respond with a structured JSON document. The model should also replace relative links such as "/about" with "https://company.com/about". The system prompt in the next cell uses one-shot prompting: it includes a single example of the JSON response the model is expected to produce.
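
The notebook leaves relative-link resolution to the model. As a point of comparison, the same step can be done deterministically before the prompt is built. The sketch below is an assumption, not part of the notebook: `normalize_links` is a hypothetical helper, and it uses only the standard library's `urllib.parse`.

```python
from urllib.parse import urljoin, urlparse


def normalize_links(base_url, links):
    """Resolve relative links against base_url, drop non-web links, and de-duplicate.

    Hypothetical helper for illustration; it is not part of the scraper module.
    """
    seen = set()
    result = []
    for link in links:
        absolute = urljoin(base_url, link)  # "/about" -> "https://company.com/about"
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript:, and similar schemes
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


example = normalize_links(
    "https://company.com",
    ["/about", "https://company.com/careers", "mailto:hi@company.com", "/about"],
)
print(example)
```

Pre-processing the list this way would shorten the prompt and remove duplicates, while the model would still decide which of the remaining links are relevant.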



In [6]:
SYSMTEM_PROMPT = """
You are provided with a list of links found on a webpage.
You are able to decide which of the links would be most relevant to include in a brochure about the company,
such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:

{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [8]:
def prepare_user_prompt(url):
    user_prompt = f"""
Here is the list of links on the website {url} -
Please decide which of these are relevant web links for a brochure about the company, 
respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

"""
    links = fetch_website_links(url)
    user_prompt += "\n".join(links)
    return user_prompt

In [9]:
print( prepare_user_prompt("https://solomonadebayo.org") )


Here is the list of links on the website https://solomonadebayo.org -
Please decide which of these are relevant web links for a brochure about the company, 
respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

https://www.solomonadebayo.org
https://www.solomonadebayo.org/resume
https://www.solomonadebayo.org/projects
https://www.solomonadebayo.org/contact
https://www.solomonadebayo.org/resume
https://www.solomonadebayo.org/projects
https://www.solomonadebayo.org/contact
mailto:me@solomonadebayo.org
http://www.instagram.com/j.soloz
http://linkedin.com/in/soloz
https://www.twitter.com/soloz


In [12]:
def select_relevant_links(url):
    print(f"Selecting relevant links for {url} by calling {MODEL}")
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": SYSMTEM_PROMPT},
            {"role": "user", "content": prepare_user_prompt(url)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    links = json.loads(result)
    print(f"Found {len(links['links'])} relevant links")
    return links
    

In [13]:
select_relevant_links("https://solomonadebayo.org")

Selecting relevant links for https://solomonadebayo.org by calling gpt-5-nano
Found 4 relevant links


{'links': [{'type': 'home page', 'url': 'https://www.solomonadebayo.org'},
  {'type': 'projects page', 'url': 'https://www.solomonadebayo.org/projects'},
  {'type': 'contact page', 'url': 'https://www.solomonadebayo.org/contact'},
  {'type': 'twitter', 'url': 'https://www.twitter.com/soloz'}]}

In [14]:
select_relevant_links("https://huggingface.co")

Selecting relevant links for https://huggingface.co by calling gpt-5-nano
Found 5 relevant links


{'links': [{'type': 'brand page', 'url': 'https://huggingface.co/brand'},
  {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'},
  {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'careers page', 'url': 'https://huggingface.co/join'}]}
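
`select_relevant_links` trusts the model to return a `links` list whose entries all have `type` and `url` keys. A defensive parsing step might look like the sketch below. `parse_links_response` is a hypothetical helper, not part of the notebook, and the validation rules it applies are assumptions about what counts as a usable entry.

```python
import json


def parse_links_response(raw):
    """Parse the model's JSON reply and keep only well-formed link entries.

    Hypothetical helper: returns {"links": []} when the reply cannot be used.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    entries = data.get("links", []) if isinstance(data, dict) else []
    if not isinstance(entries, list):
        entries = []
    valid = [
        entry for entry in entries
        if isinstance(entry, dict)
        and isinstance(entry.get("type"), str)
        and isinstance(entry.get("url"), str)
        and entry["url"].startswith(("http://", "https://"))
    ]
    return {"links": valid}


good = '{"links": [{"type": "about page", "url": "https://example.com/about"}, {"type": "email", "url": "mailto:a@example.com"}]}'
print(parse_links_response(good))
print(parse_links_response("not json"))
```

With a check like this, a malformed reply would yield an empty link list instead of raising a `KeyError` or `JSONDecodeError` further down the pipeline.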

## Step 2: Make the Brochure!

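Assemble all the information into another prompt. Link selection still uses gpt-5-nano, while the brochure itself is generated by `gpt-4.1-mini` in the code below.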

In [15]:
def fetch_page_and_all_relevant_links(url):
    contents = fetch_website_contents(url)
    relevant_links = select_relevant_links(url)
    result = f"## Landing Page:\n\n{contents}\n## Relevant Links:\n"
    for link in relevant_links['links']:
        result += f"\n\n### Link: {link['type']}\n"
        result += fetch_website_contents(link["url"])
    return result

In [16]:
print(fetch_page_and_all_relevant_links("https://huggingface.co"))

Selecting relevant links for https://huggingface.co by calling gpt-5-nano
Found 11 relevant links
## Landing Page:

Hugging Face – The AI community building the future.

Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 1M+ models
Trending on
this week
Models
Tongyi-MAI/Z-Image-Turbo
Updated
10 days ago
•
323k
•
2.96k
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Updated
about 18 hours ago
•
51.3k
•
338
XiaomiMiMo/MiMo-V2-Flash
Updated
about 11 hours ago
•
3.12k
•
265
ResembleAI/chatterbox-turbo
Updated
3 days ago
•
249
microsoft/VibeVoice-Realtime-0.5B
Updated
6 days ago
•
194k
•
937
Browse 1M+ models
Spaces
Running
on
Zero
MCP
Featured
269
Chatterbox Turbo Demo
⚡
269
Chatterbox Turbo Demo
Running
on
Zero
228
TRELLIS.2
🏢
228
High-fidelity 3D Generation from images
R

In [17]:
BROCHURE_SYSTEM_PROMPT = """
You are an assistant that analyzes the contents of several relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
Respond in markdown without code blocks.
Include details of company culture, customers and careers/jobs if you have the information.
"""


In [18]:
def prepare_brochure_user_prompt(company_name, url):
    user_prompt = f"""
You are looking at a company called: {company_name}
Here are the contents of its landing page and other relevant pages;
use this information to build a short brochure of the company in markdown without code blocks.\n\n
"""
    user_prompt += fetch_page_and_all_relevant_links(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
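
The slice `user_prompt[:5_000]` can cut the prompt in the middle of a word or line. One possible refinement, shown here as a sketch under that assumption, is to cut at the last line break before the limit. `truncate_at_line` is a hypothetical helper and is not used elsewhere in this notebook.

```python
def truncate_at_line(text, limit=5_000):
    """Truncate text to at most `limit` characters, preferring a line boundary.

    Hypothetical helper: falls back to a hard cut when no newline is found.
    """
    if len(text) <= limit:
        return text
    cut = text.rfind("\n", 0, limit)  # last newline strictly before the limit
    return text[:cut] if cut > 0 else text[:limit]


sample = "aaa\nbbb\nccc"
print(repr(truncate_at_line(sample, limit=6)))
print(repr(truncate_at_line("abcdefgh", limit=3)))
```

Because the landing page comes first in the prompt, a long landing page can still use up the whole budget before any linked page is included; this sketch does not change that.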

In [19]:
prepare_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Selecting relevant links for https://huggingface.co by calling gpt-5-nano
Found 13 relevant links


'\nYou are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages;\nuse this information to build a short brochure of the company in markdown without code blocks.\n\n\n## Landing Page:\n\nHugging Face – The AI community building the future.\n\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\nTongyi-MAI/Z-Image-Turbo\nUpdated\n10 days ago\n•\n323k\n•\n2.96k\nnvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16\nUpdated\nabout 18 hours ago\n•\n51.3k\n•\n338\nXiaomiMiMo/MiMo-V2-Flash\nUpdated\nabout 11 hours ago\n•\n3.12k\n•\n265\nResembleAI/chatterbox-turbo\nUpdated\n3 days ago\n•\n249\nmicrosoft/VibeVoice-Realtime-0.5B\nUpdated\n6 days ago\n•\n194k\n•\n937\nBrowse 1M+ 

In [20]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": BROCHURE_SYSTEM_PROMPT},
            {"role": "user", "content": prepare_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [21]:
create_brochure("HuggingFace", "https://huggingface.co")

Selecting relevant links for https://huggingface.co by calling gpt-5-nano
Found 10 relevant links


# Hugging Face Brochure

## About Hugging Face
Hugging Face is the vibrant AI community and collaboration platform shaping the future of machine learning. It serves as the central hub where machine learning engineers, data scientists, researchers, and AI enthusiasts come together to create, share, and innovate on models, datasets, and applications. The platform empowers users across modalities such as text, image, video, audio, and 3D, fostering an open and ethical AI ecosystem.

With over 1 million models, 250,000+ datasets, and 400,000+ applications accessible, Hugging Face enables accelerated machine learning development through an extensive open-source stack and a collaborative environment.

---

## Key Features and Offerings

### Community and Collaboration
- **Open Hub:** Collaborate on unlimited public ML models, datasets, and applications.
- **User Contributions:** Share your work, build your portfolio, and engage with a global AI community.
- **Spaces:** Create and explore live applications powered by ML models hosted on the platform.

### Enterprise and Team Solutions
- **Advanced AI Platform:** Designed for organizations seeking enterprise-grade AI development with focus on security, scalability, and collaboration.
- **Security & Compliance:** Features like Single Sign-On (SSO), audit logs, granular access controls, private storage, and organization-wide security policies ensure safe data usage.
- **Enhanced Compute:** Advanced compute options such as ZeroGPU Quota Boost to scale applications efficiently.
- **Analytics & Billing:** Track usage, manage budgets, and gain insightful analytics through a centralized dashboard.
- **Support:** Priority support and dedicated assistance to maximize platform capabilities for teams and enterprises.

### Platform Modalities
- Text
- Image generation and processing
- Video
- Audio
- 3D models and applications

---

## Customers and Use Cases
Hugging Face caters to a broad spectrum of users from individual researchers and ML engineers to large enterprises in cutting-edge industries. Its ecosystem supports:
- AI research and academic projects
- Industry-specific AI application development
- Ethical and open-source AI initiatives
- Enterprise-scale AI integration with advanced security and management features

Clients include forward-thinking companies harnessing Hugging Face’s platform to accelerate AI innovation securely and collaboratively.

---

## Company Culture
Hugging Face nurtures a culture of openness, collaboration, and innovation. It prioritizes:
- Building an ethical AI future through transparency and community involvement.
- Empowering the next generation of machine learning professionals.
- Encouraging sharing and experimentation in a supportive and inclusive environment.

The company proudly supports open-source development and values diversity in its global community.

---

## Careers at Hugging Face
Hugging Face is growing and actively seeking talented individuals passionate about AI and machine learning. Current openings span various roles, including engineering, research, product management, and support functions.

Joining Hugging Face means becoming part of a mission-driven team dedicated to democratizing AI and fostering a collaborative technology culture. Employees enjoy:
- Working on state-of-the-art AI projects.
- Contributing to open source.
- Collaborating with leading AI experts worldwide.

---

## Branding and Visual Identity
- **Colors:** Bright yellow (#FFD21E), vibrant orange (#FF9D00), and subtle gray (#6B7280) define the friendly and innovative brand look.
- **Logos:** Available in various formats (SVG, PNG, AI) for flexible use.

---

## Get Involved
- **Sign Up:** Create your account to start exploring or contributing on the platform.
- **Explore:** Browse models, datasets, and Spaces to discover the latest innovations.
- **Enterprise:** Contact Hugging Face sales to learn about flexible enterprise plans tailored to your organization's needs.
- **Community:** Join discussions, contribute, and help build the future of AI together.

For more details, visit [huggingface.co](https://huggingface.co).

---

**Hugging Face** - Where the AI community builds the future.

## Finally - a minor improvement

With a small adjustment, the results can stream back from OpenAI and render incrementally,
with the familiar typewriter animation.

In [None]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": BROCHURE_SYSTEM_PROMPT},
            {"role": "user", "content": prepare_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        # delta.content can be None on some chunks, so fall back to an empty string
        response += chunk.choices[0].delta.content or ''
        update_display(Markdown(response), display_id=display_handle.display_id)

In [None]:
stream_brochure("HuggingFace", "https://huggingface.co")
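
The loop in `stream_brochure` builds the brochure by appending each chunk's text to a running string and re-rendering the whole string every time. The sketch below reproduces that accumulation offline, without calling the API. `fake_chunk` and `accumulate` are hypothetical names, and the stand-in objects only mimic the `chunk.choices[0].delta.content` shape used above.

```python
from types import SimpleNamespace


def fake_chunk(content):
    """Build a stand-in object shaped like a streamed chat completion chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def accumulate(stream):
    """Concatenate chunk texts, recording what would be displayed after each chunk."""
    response = ""
    snapshots = []
    for chunk in stream:
        # content may be None, so fall back to an empty string
        response += chunk.choices[0].delta.content or ""
        snapshots.append(response)
    return response, snapshots


stream = [fake_chunk("# Hugging"), fake_chunk(" Face"), fake_chunk(None)]
final, snapshots = accumulate(stream)
print(final)
print(snapshots)
```

Each entry in `snapshots` corresponds to one `update_display` call in the notebook, which is what produces the typewriter effect.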

In [None]:
# Try editing BROCHURE_SYSTEM_PROMPT to ask for a humorous tone, then regenerate the brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")