# Company Brochure Generator
## A full business solution

### BUSINESS CHALLENGE:

Create a product that builds a brochure about a company for prospective clients, investors, and potential recruits.

We are given a company name and the URL of its primary website.

See the end of this notebook for examples of real-world business applications.

In [1]:
# imports

import os
import json
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI
from scraper import fetch_website_contents_and_links

In [2]:
# Initialization and constants
load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
gemini_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins with {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

if gemini_api_key:
    print(f"Google API Key exists and begins with {gemini_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

OpenAI API Key exists and begins with sk-proj-
Google API Key exists and begins with AI


In [3]:
openai_model = "gpt-5-nano"
gemini_model = "gemini-2.5-flash"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"

In [4]:
# Connect to OpenAI client library
# A thin wrapper around calls to HTTP endpoints

openai = OpenAI()
gemini = OpenAI(api_key=gemini_api_key, base_url=gemini_url)

In [5]:
contents, links = fetch_website_contents_and_links("https://huggingface.co/", only_content=False)

In [6]:
contents

'Hugging Face – The AI community building the future.\n\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 2M+ models\nTrending on\nthis week\nModels\nQwen/Qwen-Image-2512\nUpdated\n6 days ago\n•\n14.3k\n•\n462\nLGAI-EXAONE/K-EXAONE-236B-A23B\nUpdated\nabout 6 hours ago\n•\n2.06k\n•\n378\nMiniMaxAI/MiniMax-M2.1\nUpdated\n10 days ago\n•\n197k\n•\n876\ntencent/HY-MT1.5-1.8B\nUpdated\n5 days ago\n•\n5.59k\n•\n310\nzai-org/GLM-4.7\nUpdated\n14 days ago\n•\n33.5k\n•\n1.47k\nBrowse 2M+ models\nSpaces\nRunning\nFeatured\n3.6k\nWan2.2 Animate\n👁\n3.6k\nWan2.2 Animate\nRunning\non\nZero\nMCP\nFeatured\n221\nQwen-Image-Edit-2511-LoRAs-Fast\n🎃\n221\nDemo of the Collection of Qwen Image Edit LoRAs\nRunning\non\nZero\n1.04k\nZ Image Turbo\n🖼\n1.04k\nGenerat

In [7]:
links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/Qwen/Qwen-Image-2512',
 '/LGAI-EXAONE/K-EXAONE-236B-A23B',
 '/MiniMaxAI/MiniMax-M2.1',
 '/tencent/HY-MT1.5-1.8B',
 '/zai-org/GLM-4.7',
 '/models',
 '/spaces/Wan-AI/Wan2.2-Animate',
 '/spaces/prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast',
 '/spaces/mrfakename/Z-Image-Turbo',
 '/spaces/selfit-camera/Omni-Image-Editor',
 '/spaces/Qwen/Qwen-Image-2512',
 '/spaces',
 '/datasets/facebook/research-plan-gen',
 '/datasets/wikimedia/wikipedia',
 '/datasets/llm-jp/jhle',
 '/datasets/Anthropic/hh-rlhf',
 '/datasets/Gourieff/ReActor',
 '/datasets',
 '/join',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/inference/models',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google',
 '/Intel',
 '/microsoft',
 '/grammarly',
 '/Writer',
 '/docs/transformers',

## Step 1: Have the LLM figure out which links are relevant

### Use a call to gemini-2.5-flash to read the links on a webpage and respond in structured JSON
The model should decide which links are relevant and replace relative links such as "/about" with "https://company.com/about".  
We will use one-shot prompting, in which the prompt includes a single example of how the model should respond.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!
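The relevance judgment is the part that needs an LLM; resolving relative links is mechanical and could also be done in code before the model sees them. The sketch below illustrates that option with the standard library. It is not what this notebook does, and `normalize_links` is a hypothetical helper name.

```python
from urllib.parse import urljoin, urldefrag


def normalize_links(base_url, links):
    """Resolve relative links against base_url and drop duplicates, keeping order."""
    seen = set()
    normalized = []
    for link in links:
        # Skip schemes that do not point at a web page
        if link.startswith(("mailto:", "javascript:", "tel:")):
            continue
        # urljoin makes the link absolute; urldefrag strips any "#fragment"
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if absolute not in seen:
            seen.add(absolute)
            normalized.append(absolute)
    return normalized


sample = ["/", "/models", "/pricing#spaces", "/pricing", "/models", "mailto:hi@example.com"]
print(normalize_links("https://huggingface.co/", sample))
```

Deduplicating first would also shorten the prompt, since the scraped list repeats entries such as `/enterprise` several times.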

In [8]:
link_system_prompt = """
You are provided with a list of links found on a webpage.
You are able to decide which of the links would be most relevant to include in a brochure about the company,
such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:

{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [9]:
def get_links_user_prompt(url, links):
    user_prompt = f"""
Here is the list of links on the website {url} -
Please decide which of these are relevant web links for a brochure about the company. 
Respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

"""
    # Append the links one per line; relative links are left for the model to resolve
    user_prompt += "\n".join(links)
    return user_prompt

In [10]:
print(get_links_user_prompt("https://huggingface.co/", links))


Here is the list of links on the website https://huggingface.co/ -
Please decide which of these are relevant web links for a brochure about the company. 
Respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

/
/models
/datasets
/spaces
/docs
/enterprise
/pricing
/login
/join
/spaces
/models
/Qwen/Qwen-Image-2512
/LGAI-EXAONE/K-EXAONE-236B-A23B
/MiniMaxAI/MiniMax-M2.1
/tencent/HY-MT1.5-1.8B
/zai-org/GLM-4.7
/models
/spaces/Wan-AI/Wan2.2-Animate
/spaces/prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
/spaces/mrfakename/Z-Image-Turbo
/spaces/selfit-camera/Omni-Image-Editor
/spaces/Qwen/Qwen-Image-2512
/spaces
/datasets/facebook/research-plan-gen
/datasets/wikimedia/wikipedia
/datasets/llm-jp/jhle
/datasets/Anthropic/hh-rlhf
/datasets/Gourieff/ReActor
/datasets
/join
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/inference/models
/pricing#endpoints
/pricing#spaces
/

In [11]:
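def select_relevant_links(url, links, model):
    """Ask the model which links belong in a brochure and return the parsed JSON dict."""
    print(f"Selecting relevant links for {url} by calling {model}")
    response = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(url, links)}
        ],
        # Request a JSON object so the reply can be passed straight to json.loads
        response_format={"type": "json_object"}
    )

    result = response.choices[0].message.content
    # Use a new name rather than overwriting the `links` argument
    relevant_links = json.loads(result)
    print(f"Found {len(relevant_links['links'])} relevant links")

    return relevant_links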

In [12]:
select_relevant_links(url="https://huggingface.co/", links=links, model=gemini_model)

Selecting relevant links for https://huggingface.co/ by calling gemini-2.5-flash
Found 11 relevant links


{'links': [{'type': 'homepage', 'url': 'https://huggingface.co/'},
  {'type': 'enterprise solutions', 'url': 'https://huggingface.co/enterprise'},
  {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'},
  {'type': 'about page', 'url': 'https://huggingface.co/huggingface'},
  {'type': 'brand guidelines', 'url': 'https://huggingface.co/brand'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'learn page', 'url': 'https://huggingface.co/learn'},
  {'type': 'blog', 'url': 'https://huggingface.co/blog'},
  {'type': 'github profile', 'url': 'https://github.com/huggingface'},
  {'type': 'twitter profile', 'url': 'https://twitter.com/huggingface'},
  {'type': 'linkedin profile',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}
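The number and choice of links varies from run to run, and nothing guarantees that every entry has the shape shown in the one-shot example. A minimal sketch of defensive parsing follows; `parse_links_response` is a hypothetical helper, and requiring an `https://` prefix is an assumption based on the prompt's request for full https URLs.

```python
import json


def parse_links_response(raw):
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    entries = data.get("links", [])
    if not isinstance(entries, list):
        return {"links": []}
    # Keep entries that have string "type" and "url" fields and a full https URL
    valid = [
        entry for entry in entries
        if isinstance(entry, dict)
        and isinstance(entry.get("type"), str)
        and isinstance(entry.get("url"), str)
        and entry["url"].startswith("https://")
    ]
    return {"links": valid}


reply = '{"links": [{"type": "about page", "url": "https://huggingface.co/huggingface"}, {"type": "docs", "url": "/docs"}]}'
print(parse_links_response(reply))
```

A helper like this could stand in for the bare `json.loads(result)` call, so that a malformed reply yields an empty link list instead of an exception.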

In [13]:
models = gemini.models.list()
for model in models:
    print(model.id)

models/embedding-gecko-001
models/gemini-2.5-flash
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-exp-1206
models/gemini-2.5-flash-preview-tts
models/gemini-2.5-pro-preview-tts
models/gemma-3-1b-it
models/gemma-3-4b-it
models/gemma-3-12b-it
models/gemma-3-27b-it
models/gemma-3n-e4b-it
models/gemma-3n-e2b-it
models/gemini-flash-latest
models/gemini-flash-lite-latest
models/gemini-pro-latest
models/gemini-2.5-flash-lite
models/gemini-2.5-flash-image-preview
models/gemini-2.5-flash-image
models/gemini-2.5-flash-preview-09-2025
models/gemini-2.5-flash-lite-preview-09-2025
models/gemini-3-pro-preview
models/gemini-3-flash-preview
models/gemini-3-pro-image-preview
models/nano-banana-pro-preview
models/gemini-robotics-er-1.5-preview
models/g

## Step 2: Make the brochure!

Assemble all the details into another prompt for the LLM.

In [14]:
def fetch_page_and_relevant_links(url, model):
    contents, links = fetch_website_contents_and_links(url)
    relevant_links = select_relevant_links(url, links, model)
    result = f"## Landing Page:\n\n{contents}\n## Relevant Links:\n"
    for link in relevant_links['links']:
        result += f"\n\n### Link: {link['type']}\n"
        sub_content, _ = fetch_website_contents_and_links(link["url"], only_content=True)
        result += sub_content
    
    return result
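Because the URLs in the loop above come from the model, some of them may fail to load, for example external profile pages that block scrapers. The notebook does not show how `fetch_website_contents_and_links` behaves on failure, so the sketch below assumes it raises an exception. `collect_pages` is a hypothetical helper that takes the fetch function as a parameter and skips any page that cannot be fetched.

```python
def collect_pages(relevant_links, fetch):
    """Fetch each link's text with `fetch`, skipping any page that raises."""
    sections = []
    for link in relevant_links["links"]:
        try:
            text = fetch(link["url"])
        except Exception as error:  # broad on purpose: any failure skips the page
            print(f"Skipping {link['url']}: {error}")
            continue
        sections.append(f"### Link: {link['type']}\n{text}")
    return "\n\n".join(sections)


# Stand-in for a real fetcher: a dict lookup that raises KeyError for unknown URLs
pages = {"https://a.example/about": "About text"}
selected = {
    "links": [
        {"type": "about page", "url": "https://a.example/about"},
        {"type": "careers page", "url": "https://a.example/careers"},
    ]
}
print(collect_pages(selected, pages.__getitem__))
```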

In [15]:
print(fetch_page_and_relevant_links("https://huggingface.co/", gemini_model))

Selecting relevant links for https://huggingface.co/ by calling gemini-2.5-flash
Found 15 relevant links
## Landing Page:

Hugging Face – The AI community building the future.

Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 2M+ models
Trending on
this week
Models
Qwen/Qwen-Image-2512
Updated
6 days ago
•
14.3k
•
462
LGAI-EXAONE/K-EXAONE-236B-A23B
Updated
about 6 hours ago
•
2.06k
•
378
MiniMaxAI/MiniMax-M2.1
Updated
10 days ago
•
197k
•
876
tencent/HY-MT1.5-1.8B
Updated
5 days ago
•
5.59k
•
310
zai-org/GLM-4.7
Updated
14 days ago
•
33.5k
•
1.47k
Browse 2M+ models
Spaces
Running
Featured
3.6k
Wan2.2 Animate
👁
3.6k
Wan2.2 Animate
Running
on
Zero
MCP
Featured
221
Qwen-Image-Edit-2511-LoRAs-Fast
🎃
221
Demo of the Collection of Qwen Image Edit LoRAs
Running
on
Ze

In [16]:
brochure_system_prompt = """
You are an assistant that analyzes the contents of several relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
Respond in markdown without code blocks.
Include details of company culture, customers and careers/jobs if you have the information.
"""

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# brochure_system_prompt = """
# You are an assistant that analyzes the contents of several relevant pages from a company website
# and creates a short, humorous, entertaining, witty brochure about the company for prospective customers, investors and recruits.
# Respond in markdown without code blocks.
# Include details of company culture, customers and careers/jobs if you have the information.
# """

In [17]:
def get_brochure_user_prompt(company_name, url, model):
    user_prompt = f"""
You are looking at a company called: {company_name}
Here are the contents of its landing page and other relevant pages;
use this information to build a short brochure of the company in markdown without code blocks.\n\n
"""
    user_prompt += fetch_page_and_relevant_links(url, model)
    user_prompt = user_prompt[:5_000] # truncate if more than 5,000 characters
    
    return user_prompt
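The slice `user_prompt[:5_000]` cuts from the end, so a long landing page can crowd out the linked pages entirely. One possible alternative, shown here only as a sketch, is to give each page an equal share of the character budget. `truncate_sections` is a hypothetical helper, and the equal split is an illustrative choice rather than the notebook's approach.

```python
def truncate_sections(sections, total_budget=5_000):
    """Give each section an equal share of the character budget."""
    if not sections:
        return ""
    share = total_budget // len(sections)
    # Slice every section to its share so later pages still contribute text
    return "".join(section[:share] for section in sections)


landing = "L" * 10
about = "A" * 10
print(truncate_sections([landing, about], total_budget=8))
```

With a budget of 8 characters and two sections, each contributes its first 4 characters.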

In [18]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co", gemini_model)

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 10 relevant links


'\nYou are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages;\nuse this information to build a short brochure of the company in markdown without code blocks.\n\n\n## Landing Page:\n\nHugging Face – The AI community building the future.\n\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 2M+ models\nTrending on\nthis week\nModels\nQwen/Qwen-Image-2512\nUpdated\n6 days ago\n•\n14.3k\n•\n462\nLGAI-EXAONE/K-EXAONE-236B-A23B\nUpdated\nabout 6 hours ago\n•\n2.06k\n•\n379\nMiniMaxAI/MiniMax-M2.1\nUpdated\n10 days ago\n•\n197k\n•\n876\ntencent/HY-MT1.5-1.8B\nUpdated\n5 days ago\n•\n5.59k\n•\n311\nzai-org/GLM-4.7\nUpdated\n14 days ago\n•\n33.5k\n•\n1.47k\nBrowse 2M+ models\nSpaces\nRunning\nFeatured\n3.

In [19]:
def create_brochure(company_name, url, model):
    response = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url, model)}
        ]
    )

    result = response.choices[0].message.content
    display(Markdown(result))

In [20]:
create_brochure("HuggingFace", "https://huggingface.co", gemini_model)

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 15 relevant links


**Hugging Face: The AI Community Building the Future**

Welcome to Hugging Face, the central hub for machine learning innovation and collaboration. We are dedicated to creating, discovering, and collaborating on ML better, serving as the home for an expansive community of AI builders, researchers, and developers.

**What We Offer:**
Hugging Face is the premier platform where the machine learning community comes together to collaborate on cutting-edge models, datasets, and applications. Our vast ecosystem includes:

*   **Models:** Access and contribute to over 2 million models across various modalities including text, image, video, audio, and 3D. Discover trending models and leverage an open-source stack to accelerate your projects.
*   **Datasets:** Explore over 500,000 public datasets, empowering your machine learning experiments and training with diverse and high-quality data.
*   **Spaces:** Showcase and interact with over 1 million AI applications. These "Spaces" allow you to run and demo ML applications, from image generation and editing tools to advanced research prototypes.
*   **Collaboration Platform:** Host and collaborate on unlimited public models, datasets, and applications, fostering a dynamic environment for shared progress and innovation.
*   **Open Source Stack:** Benefit from the Hugging Face open-source stack, designed to help you move faster and more efficiently in your ML development.

**Who We Serve:**
Hugging Face empowers a diverse range of users:

*   **The Machine Learning Community:** Individuals can build their ML portfolio, share their work with the world, and engage with peers on an advanced collaborative platform.
*   **Teams & Enterprises:** We provide advanced Compute and Enterprise solutions, offering a secure platform with enterprise-grade security and access control, tailored to accelerate your team's AI initiatives.

**Our Culture & Community:**
At our core, Hugging Face is about community. We are passionate about fostering an environment of open collaboration and shared knowledge. We believe in the power of collective effort to build the future of AI, encouraging transparency and contribution through our extensive open-source offerings. Our vibrant activity feed showcases continuous advancements from our contributors, driving innovation daily.

**Careers:**
While specific job postings are not detailed here, Hugging Face is built by and for the AI community. If you are passionate about machine learning, open source, and building the future of AI in a collaborative environment, we invite you to explore opportunities to contribute to our mission and potentially join our team of innovators.

## Step 3: A minor improvement
With a small adjustment, the results stream back from the LLM with the familiar typewriter animation.

In [21]:
def stream_brochure(company_name, url, model):
    stream = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url, model)}
        ],
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)

    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        update_display(Markdown(response), display_id=display_handle.display_id)
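The accumulation step in the loop above can be tried without calling an API. The sketch below builds fake chunks with `SimpleNamespace` that mimic the `choices[0].delta.content` shape used in `stream_brochure`; `fake_stream` and `accumulate` are hypothetical names. The guard for a chunk with no choices is a defensive assumption and may not be needed for every provider.

```python
from types import SimpleNamespace


def fake_stream(pieces):
    """Yield objects shaped like streamed chat-completion chunks."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def accumulate(stream):
    """Join the text deltas, treating a missing delta as an empty string."""
    response = ""
    for chunk in stream:
        if not chunk.choices:
            # Defensive: skip chunks that carry no choices
            continue
        response += chunk.choices[0].delta.content or ""
    return response


print(accumulate(fake_stream(["# Hugging", " Face", None])))
```

The `or ""` matters because a chunk's `content` can be `None`, and adding `None` to a string would raise a `TypeError`.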

In [31]:
stream_brochure("HuggingFace", "https://huggingface.co", gemini_model)

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 17 relevant links


**Hugging Face: The AI Community Building the Future**

Welcome to Hugging Face, the collaborative platform where the machine learning community comes together to create, discover, and accelerate the development of models, datasets, and applications. We are the home of machine learning, empowering individuals, teams, and enterprises to build the future of AI.

**What We Offer:**

*   **Models:** Dive into an expansive hub of over **2 million models** covering a multitude of tasks, from Text Generation and Image-to-Text to Text-to-Video and 3D. Our platform supports popular ML libraries like PyTorch, TensorFlow, and Diffusers, alongside various inference providers.
*   **Datasets:** Access and contribute to over **500,000 datasets**, providing the essential fuel for training and refining your AI projects across diverse domains.
*   **Spaces:** Host and explore over **1 million AI applications**. These interactive "Spaces" allow you to run demos, generate images, edit multimedia, and much more, showcasing the practical power of machine learning in real-time.
*   **Collaboration Platform:** Our core strength lies in fostering collaboration. You can host and work together on unlimited public models, datasets, and applications, enabling faster development and easier sharing of your work with the global ML community.
*   **Open Source & Multi-modal:** Leveraging the HF Open Source stack, you can move faster and explore all modalities – text, image, video, audio, and 3D – within your projects.

**For Teams & Enterprise:**

Beyond our community offerings, Hugging Face provides advanced paid Compute and Enterprise solutions designed to accelerate your team's ML initiatives. Benefit from enterprise-grade security, robust access controls, and dedicated resources to ensure your projects are secure, scalable, and efficient.

**Our Community:**

At its heart, Hugging Face is built on the spirit of community. We believe in open collaboration, continuous learning, and sharing innovation to collectively push the boundaries of artificial intelligence. It's a place where you can build your ML portfolio and contribute to groundbreaking advancements.

**Join the Future of AI:**

Whether you're an individual developer, a researcher, or part of a larger organization, Hugging Face provides the tools and community to bring your AI ideas to life.

Sign up today to explore millions of models, datasets, and applications, or to start building your next great AI project.

*(While Hugging Face fosters a strong community culture, specific details on internal company culture, customer success stories, or direct career opportunities were not explicitly provided in the source material.)*

In [22]:
stream_brochure("LangChain", "https://www.langchain.com/", gemini_model)

Selecting relevant links for https://www.langchain.com/ by calling gemini-2.5-flash
Found 7 relevant links


# LangChain: Ship Reliable Agents, Faster.

At LangChain, we empower developers to bring their AI agents from concept to production with speed and confidence. We are on a mission to make AI agents as reliable and indispensable as databases and APIs, building the leading platform for agent engineering.

## What We Do

LangChain offers a comprehensive suite of tools designed to support the entire agent development lifecycle:

### Agent Engineering Platform: LangSmith
LangSmith is our commercial platform, providing the essential capabilities for robust agent development:
*   **Observability:** Gain deep insights into agent execution with in-depth traces for debugging and monitoring.
*   **Evaluation:** Rapidly iterate on prompts and models with powerful online and offline evaluation tools.
*   **Deployment:** Confidently ship and scale your agents in production, designed for long-running workloads and human oversight.
*   **Agent Builder (Beta):** Quickly construct agents with our intuitive no-code solution.
*   **Key Benefits:** Enjoy visibility and control, fast iteration, durable performance, and framework neutrality – working seamlessly with your existing stack. Start for free with 5,000 traces per month!

### Open Source Frameworks
Our open-source frameworks give developers the flexibility and control they need:
*   **LangChain:** Accelerate development with less code, leveraging pre-built agent architectures and model integrations.
*   **LangGraph:** Build highly custom agent workflows with low-level primitives, putting you in control.
*   **Deep Agents (New):** Tackle complex, long-running tasks by utilizing advanced planning, memory, and sub-agents.

## Who We Serve

From innovative AI startups to global