# Company Brochure Generator
## A full business solution

### BUSINESS CHALLENGE:

Create a product that builds a brochure for a company, aimed at prospective clients, investors, and potential recruits.

We will be given a company name and its primary website.

See the end of this notebook for examples of real-world business applications.

In [1]:
# imports

import os
import json
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI
from scraper import fetch_website_links, fetch_website_contents

In [2]:
# Initialization and constants
load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
gemini_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins with {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

if gemini_api_key:
    print(f"Google API Key exists and begins with {gemini_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

OpenAI API Key exists and begins with sk-proj-
Google API Key exists and begins with AI


In [17]:
openai_model = "gpt-5-nano"
gemini_model = "gemini-2.5-flash"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"

In [4]:
# Connect to OpenAI client library
# A thin wrapper around calls to HTTP endpoints

openai = OpenAI()
gemini = OpenAI(api_key=gemini_api_key, base_url=gemini_url)

In [5]:
links = fetch_website_links("https://huggingface.co/")
links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/tencent/HY-MT1.5-1.8B',
 '/Qwen/Qwen-Image-2512',
 '/zai-org/GLM-4.7',
 '/MiniMaxAI/MiniMax-M2.1',
 '/LGAI-EXAONE/K-EXAONE-236B-A23B',
 '/models',
 '/spaces/Wan-AI/Wan2.2-Animate',
 '/spaces/prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast',
 '/spaces/mrfakename/Z-Image-Turbo',
 '/spaces/selfit-camera/Omni-Image-Editor',
 '/spaces/microsoft/TRELLIS.2',
 '/spaces',
 '/datasets/facebook/research-plan-gen',
 '/datasets/bigai/TongSIM-Asset',
 '/datasets/wikimedia/wikipedia',
 '/datasets/llm-jp/jhle',
 '/datasets/Anthropic/hh-rlhf',
 '/datasets',
 '/join',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/inference/models',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google',
 '/Intel',
 '/microsoft',
 '/grammarly',
 '/Writer',
 '/docs/transformers

## Step 1: Have the LLM figure out which links are relevant

### Use a call to gemini-2.5-flash to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which the prompt includes a single example of how the model should respond.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!
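
The judgment about relevance is the part that needs an LLM. The mechanical part, resolving relative links and dropping the many duplicates visible in the list above, could also be done before the call to shrink the prompt. The following is a minimal sketch of that preprocessing, assuming only the standard library; `normalize_links` is a hypothetical helper and is not part of the `scraper` module used in this notebook.

```python
from urllib.parse import urljoin, urldefrag

def normalize_links(base_url, links):
    """Resolve relative links against base_url and drop duplicates, keeping order."""
    seen = set()
    normalized = []
    for link in links:
        # urljoin leaves absolute URLs untouched and resolves relative ones;
        # urldefrag strips "#fragment" so /pricing#spaces collapses into /pricing
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if absolute not in seen:
            seen.add(absolute)
            normalized.append(absolute)
    return normalized

sample = ["/", "/models", "/enterprise", "/enterprise", "/pricing#spaces", "/pricing"]
print(normalize_links("https://huggingface.co/", sample))
```

Whether to strip fragments is a design choice: it removes near-duplicates such as `/pricing#endpoints`, at the cost of losing deep links into a page.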

In [6]:
link_system_prompt = """
You are provided with a list of links found on a webpage.
You are able to decide which of the links would be most relevant to include in a brochure about the company,
such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:

{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
def get_links_user_prompt(url):
    user_prompt = f"""
Here is the list of links on the website {url} -
Please decide which of these are relevant web links for a brochure about the company. 
Respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

"""
    links = fetch_website_links(url)
    user_prompt += "\n".join(links)
    return user_prompt

In [None]:
print(get_links_user_prompt("https://huggingface.co/"))


Here is the list of links on the website https://huggingface.co/ -
Please decide which of these are relevant web links for a brochure about the company. 
Respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

/
/models
/datasets
/spaces
/docs
/enterprise
/pricing
/login
/join
/spaces
/models
/tencent/HY-MT1.5-1.8B
/Qwen/Qwen-Image-2512
/zai-org/GLM-4.7
/MiniMaxAI/MiniMax-M2.1
/LGAI-EXAONE/K-EXAONE-236B-A23B
/models
/spaces/Wan-AI/Wan2.2-Animate
/spaces/prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
/spaces/mrfakename/Z-Image-Turbo
/spaces/selfit-camera/Omni-Image-Editor
/spaces/microsoft/TRELLIS.2
/spaces
/datasets/facebook/research-plan-gen
/datasets/bigai/TongSIM-Asset
/datasets/wikimedia/wikipedia
/datasets/llm-jp/jhle
/datasets/Anthropic/hh-rlhf
/datasets
/join
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/enterprise
/inference/models
/pricing#endpoints
/pricing#spaces

In [22]:
def select_relevant_links(url, model):
    print(f"Selecting relevant links for {url} by calling {model}")
    response = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(url)}
        ],
        response_format={"type": "json_object"}
    )

    result = response.choices[0].message.content
    links = json.loads(result)
    print(f"Found {len(links['links'])} relevant links")

    return links
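
The function above trusts the model to return well-formed JSON with a `links` key. A more defensive variant might validate the reply before using it. The sketch below is one possible approach; `parse_links_response` is a hypothetical helper, and filtering on the `https://` prefix is an assumption based on the prompt's request for full https URLs.

```python
import json

def parse_links_response(raw):
    """Parse the model's reply and return a list of {"type", "url"} dicts.

    Returns an empty list when the reply is not the expected JSON shape.
    """
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return []
    if not isinstance(data, dict):
        return []
    links = data.get("links", [])
    if not isinstance(links, list):
        return []
    # Keep only entries that carry a full https URL, as the prompt requested
    return [
        link for link in links
        if isinstance(link, dict)
        and isinstance(link.get("url"), str)
        and link["url"].startswith("https://")
    ]

reply = '{"links": [{"type": "about page", "url": "https://example.com/about"}, {"type": "bad", "url": "/relative"}]}'
print(parse_links_response(reply))
```

Inside `select_relevant_links`, the call `json.loads(result)` could then be swapped for `{"links": parse_links_response(result)}` so that downstream code keeps the same shape.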

In [None]:
select_relevant_links(url="https://huggingface.co/", model=gemini_model)

{'links': [{'type': 'product page', 'url': 'https://huggingface.co/models'},
  {'type': 'product page', 'url': 'https://huggingface.co/datasets'},
  {'type': 'product page', 'url': 'https://huggingface.co/spaces'},
  {'type': 'enterprise solutions page',
   'url': 'https://huggingface.co/enterprise'},
  {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'},
  {'type': 'brand page', 'url': 'https://huggingface.co/brand'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'learning resources page', 'url': 'https://huggingface.co/learn'},
  {'type': 'blog page', 'url': 'https://huggingface.co/blog'},
  {'type': 'social media (GitHub)', 'url': 'https://github.com/huggingface'},
  {'type': 'social media (Twitter)', 'url': 'https://twitter.com/huggingface'},
  {'type': 'social media (LinkedIn)',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}

In [19]:
models = gemini.models.list()
for model in models:
    print(model.id)

models/embedding-gecko-001
models/gemini-2.5-flash
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-exp-1206
models/gemini-2.5-flash-preview-tts
models/gemini-2.5-pro-preview-tts
models/gemma-3-1b-it
models/gemma-3-4b-it
models/gemma-3-12b-it
models/gemma-3-27b-it
models/gemma-3n-e4b-it
models/gemma-3n-e2b-it
models/gemini-flash-latest
models/gemini-flash-lite-latest
models/gemini-pro-latest
models/gemini-2.5-flash-lite
models/gemini-2.5-flash-image-preview
models/gemini-2.5-flash-image
models/gemini-2.5-flash-preview-09-2025
models/gemini-2.5-flash-lite-preview-09-2025
models/gemini-3-pro-preview
models/gemini-3-flash-preview
models/gemini-3-pro-image-preview
models/nano-banana-pro-preview
models/gemini-robotics-er-1.5-preview
models/g

## Step 2: Make the brochure!

Assemble all the details into another prompt to the LLM.

In [20]:
def fetch_page_and_relevant_links(url):
    contents = fetch_website_contents(url)
    relevant_links = select_relevant_links(url, gemini_model)
    result = f"## Landing Page:\n\n{contents}\n## Relevant Links:\n"
    for link in relevant_links['links']:
        result += f"\n\n### Link: {link['type']}\n"
        result += fetch_website_contents(link["url"])
    
    return result
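
As written, a single page that fails to load would abort the whole brochure. The sketch below shows one way to skip failing links instead. `fetch_all_tolerant` is a hypothetical helper; it takes the fetch function as a parameter because the `scraper` module is not shown here, so the exceptions it raises are unknown and the `except` clause is deliberately broad.

```python
def fetch_all_tolerant(links, fetch):
    """Fetch each link with `fetch`, skipping the ones that raise."""
    sections = []
    for link in links:
        try:
            contents = fetch(link["url"])
        except Exception as error:  # broad on purpose: one bad page should not abort the run
            print(f"Skipping {link['url']}: {error}")
            continue
        # Same section layout as fetch_page_and_relevant_links
        sections.append(f"\n\n### Link: {link['type']}\n{contents}")
    return "".join(sections)
```

In the notebook this could be called as `fetch_all_tolerant(relevant_links["links"], fetch_website_contents)` in place of the `for` loop.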

In [23]:
print(fetch_page_and_relevant_links("https://huggingface.co/"))

Selecting relevant links for https://huggingface.co/ by calling gemini-2.5-flash
Found 12 relevant links
## Landing Page:

Hugging Face – The AI community building the future.

Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Explore AI Apps
or
Browse 2M+ models
Trending on
this week
Models
tencent/HY-MT1.5-1.8B
Updated
4 days ago
•
4.75k
•
584
Qwen/Qwen-Image-2512
Updated
5 days ago
•
12.1k
•
411
MiniMaxAI/MiniMax-M2.1
Updated
9 days ago
•
195k
•
839
zai-org/GLM-4.7
Updated
13 days ago
•
32.7k
•
1.45k
LGAI-EXAONE/K-EXAONE-236B-A23B
Updated
about 23 hours ago
•
1.52k
•
319
Browse 2M+ models
Spaces
Running
Featured
3.5k
Wan2.2 Animate
👁
3.5k
Wan2.2 Animate
Running
on
Zero
MCP
Featured
202
Qwen-Image-Edit-2511-LoRAs-Fast
🎃
202
Demo of the Collection of Qwen Image Edit LoRAs
Running
on
Ze

In [24]:
brochure_system_prompt = """
You are an assistant that analyzes the contents of several relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
Respond in markdown without code blocks.
Include details of company culture, customers and careers/jobs if you have the information.
"""

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# brochure_system_prompt = """
# You are an assistant that analyzes the contents of several relevant pages from a company website
# and creates a short, humorous, entertaining, witty brochure about the company for prospective customers, investors and recruits.
# Respond in markdown without code blocks.
# Include details of company culture, customers and careers/jobs if you have the information.
# """

In [27]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"""
You are looking at a company called: {company_name}
Here are the contents of its landing page and other relevant pages;
use this information to build a short brochure of the company in markdown without code blocks.\n\n
"""
    user_prompt += fetch_page_and_relevant_links(url)
    user_prompt = user_prompt[:5_000]  # truncate if longer than 5,000 characters
    
    return user_prompt
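
The slice `[:5_000]` can cut the prompt in the middle of a word or line. A possible refinement, sketched below, backs up to the last complete line within the limit. `truncate_at_line` is a hypothetical helper, and the 5,000-character default simply mirrors the limit used above.

```python
def truncate_at_line(text, limit=5_000):
    """Cut text to at most `limit` characters, ending on a complete line when possible."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    last_newline = cut.rfind("\n")
    # Fall back to a hard cut when there is no usable newline inside the limit
    return cut if last_newline <= 0 else cut[:last_newline]

print(repr(truncate_at_line("abc\ndefgh\nijk", limit=7)))
```

Since the scraped page text is mostly one short item per line, cutting at a newline loses very little content while keeping the final item intact.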

In [26]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 17 relevant links


'\nYou are looking at a company called: {company_name}\nHere are the contents of its landing page and other relevant pages;\nuse this information to build a short brochure of the company in markdown without code blocks.\n\n\n## Landing Page:\n\nHugging Face – The AI community building the future.\n\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 2M+ models\nTrending on\nthis week\nModels\nQwen/Qwen-Image-2512\nUpdated\n5 days ago\n•\n12.1k\n•\n415\nMiniMaxAI/MiniMax-M2.1\nUpdated\n9 days ago\n•\n195k\n•\n842\nzai-org/GLM-4.7\nUpdated\n13 days ago\n•\n32.7k\n•\n1.45k\nLGAI-EXAONE/K-EXAONE-236B-A23B\nUpdated\nabout 24 hours ago\n•\n1.52k\n•\n320\ntencent/HY-MT1.5-1.8B\nUpdated\n4 days ago\n•\n4.75k\n•\n296\nBrowse 2M+ models\nSpaces\nRunning\nFeatured\

In [28]:
def create_brochure(company_name, url, model):
    response = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ]
    )

    result = response.choices[0].message.content
    display(Markdown(result))

In [29]:
create_brochure("HuggingFace", "https://huggingface.co", gemini_model)

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 16 relevant links


Hugging Face: The AI Community Building the Future

Hugging Face stands as the definitive home for machine learning, a vibrant platform where the global AI community actively collaborates on cutting-edge models, diverse datasets, and innovative applications. We are dedicated to accelerating the advancement of artificial intelligence, empowering everyone from individual researchers to large enterprises.

**Our Collaborative Platform**

At Hugging Face, we facilitate a powerful ecosystem for ML development:

*   **Models:** Explore over 2 million models, covering a vast array of tasks including text generation, image-to-text, text-to-image, text-to-video, and text-to-speech across all modalities like text, image, video, audio, and 3D. Our platform supports leading ML libraries like PyTorch, TensorFlow, and JAX, alongside popular inference providers.
*   **Datasets:** Access and utilize more than 500,000 datasets to train and refine your AI projects.
*   **Spaces (AI Applications):** Discover and run over 1 million AI applications, from image generation and editing tools to complex 3D generation. Our Spaces allow you to showcase and experiment with interactive ML demos.

**Empowering Innovation and Collaboration**

We are built on the principle of open collaboration, enabling users to:

*   **Create and Share:** Host and collaborate on unlimited public models, datasets, and applications, building a robust ML portfolio.
*   **Move Faster:** Leverage our powerful open-source stack to streamline your machine learning workflows.
*   **Explore All Modalities:** Push the boundaries of AI across text, image, video, audio, and even 3D.

**For Every Machine Learning Journey**

Whether you are an individual ML enthusiast, an academic researcher, a startup, or a large corporation, Hugging Face offers solutions tailored to your needs:

*   **Individuals:** Build your ML profile, share your work, and contribute to the collective intelligence of the AI community.
*   **Teams & Enterprises:** Accelerate your ML development with advanced platforms offering paid compute solutions, enterprise-grade security, and robust access controls.

**Our Culture: Community-Driven and Open**

Hugging Face thrives on a culture of openness, collaboration, and shared innovation. We believe in the power of community to collectively build the future of AI. Our platform is designed to foster connection, enable knowledge sharing, and drive progress in machine learning through an accessible and supportive environment.

*Note: Information regarding specific career opportunities or current job listings was not available on the provided pages.*

## Step 3: A minor improvement
With a small adjustment, we can stream the results back from the LLM, producing the familiar typewriter animation.

In [30]:
def stream_brochure(company_name, url, model):
    stream = gemini.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)

    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        update_display(Markdown(response), display_id=display_handle.display_id)
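
The accumulation logic in the loop above can be tried without calling any API. The sketch below isolates it in a generator and feeds it a fake stream built from `SimpleNamespace` objects that mimic only the `choices[0].delta.content` shape used above. `accumulate_stream` and `fake_chunk` are hypothetical names for illustration. The guard for empty `choices` is an extra precaution, since some streaming responses can include chunks without any choices.

```python
from types import SimpleNamespace

def accumulate_stream(stream):
    """Yield the growing response text after each chunk that carries content."""
    response = ""
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices at all
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content may be None or empty on some chunks
            response += delta
            yield response

def fake_chunk(text):
    """Build an object with the same attribute path as a streamed chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

fake_stream = [
    fake_chunk("# Hug"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("ging Face"),
]
for partial in accumulate_stream(fake_stream):
    print(repr(partial))
```

In `stream_brochure`, each yielded `partial` would be passed to `update_display(Markdown(partial), display_id=display_handle.display_id)`.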

In [31]:
stream_brochure("HuggingFace", "https://huggingface.co", gemini_model)

Selecting relevant links for https://huggingface.co by calling gemini-2.5-flash
Found 17 relevant links


**Hugging Face: The AI Community Building the Future**

Welcome to Hugging Face, the collaborative platform where the machine learning community comes together to create, discover, and accelerate the development of models, datasets, and applications. We are the home of machine learning, empowering individuals, teams, and enterprises to build the future of AI.

**What We Offer:**

*   **Models:** Dive into an expansive hub of over **2 million models** covering a multitude of tasks, from Text Generation and Image-to-Text to Text-to-Video and 3D. Our platform supports popular ML libraries like PyTorch, TensorFlow, and Diffusers, alongside various inference providers.
*   **Datasets:** Access and contribute to over **500,000 datasets**, providing the essential fuel for training and refining your AI projects across diverse domains.
*   **Spaces:** Host and explore over **1 million AI applications**. These interactive "Spaces" allow you to run demos, generate images, edit multimedia, and much more, showcasing the practical power of machine learning in real-time.
*   **Collaboration Platform:** Our core strength lies in fostering collaboration. You can host and work together on unlimited public models, datasets, and applications, enabling faster development and easier sharing of your work with the global ML community.
*   **Open Source & Multi-modal:** Leveraging the HF Open Source stack, you can move faster and explore all modalities – text, image, video, audio, and 3D – within your projects.

**For Teams & Enterprise:**

Beyond our community offerings, Hugging Face provides advanced paid Compute and Enterprise solutions designed to accelerate your team's ML initiatives. Benefit from enterprise-grade security, robust access controls, and dedicated resources to ensure your projects are secure, scalable, and efficient.

**Our Community:**

At its heart, Hugging Face is built on the spirit of community. We believe in open collaboration, continuous learning, and sharing innovation to collectively push the boundaries of artificial intelligence. It's a place where you can build your ML portfolio and contribute to groundbreaking advancements.

**Join the Future of AI:**

Whether you're an individual developer, a researcher, or part of a larger organization, Hugging Face provides the tools and community to bring your AI ideas to life.

Sign up today to explore millions of models, datasets, and applications, or to start building your next great AI project.

*(While Hugging Face fosters a strong community culture, specific details on internal company culture, customer success stories, or direct career opportunities were not explicitly provided in the source material.)*

In [32]:
stream_brochure("LangChain", "https://www.langchain.com/", gemini_model)

Selecting relevant links for https://www.langchain.com/ by calling gemini-2.5-flash
Found 16 relevant links


# LangChain: Powering the Future of Agent Engineering

Welcome to LangChain, the leading platform dedicated to helping engineering teams build, observe, evaluate, and deploy reliable AI agents with unprecedented speed and control. We are committed to developing the essential tools and frameworks that enable the next generation of intelligent applications.

## Our Comprehensive Solutions

LangChain offers a powerful stack for agent engineering, designed for both rapid development and fine-grained control:

### LangSmith: The Agent Engineering Platform
LangSmith is our comprehensive platform for the entire agent engineering lifecycle. It provides:
*   **Observability:** Debug and monitor in-depth traces of your agent's execution, giving you complete visibility into its behavior.
*   **Evaluation:** Iterate on prompts and models with online and offline evaluations to ensure optimal performance.
*   **Deployment:** Ship and scale agents in production with robust monitoring and alerting.
*   **Agent Builder (Beta):** Build no-code agents, making advanced AI capabilities accessible to more teams.
LangSmith ensures **visibility & control**, enables **fast iteration**, delivers **durable performance** at scale, and is **framework neutral**, integrating seamlessly with your existing stack. Get started with a free plan offering 5,000 traces per month!

### Open Source Frameworks
For developers who want to build custom agents and workflows, our open-source frameworks provide the flexibility you need:
*   **LangChain:** Kickstart your projects with a pre-built agent architecture and model integrations, enabling you to ship quickly with less code.
*   **LangGraph:** Take full control with low-level primitives to design and implement custom agent workflows.
*   **Deep Agents (New):** Tackle complex, long-running tasks using advanced planning, memory, and sub-agents.

## Who We Serve

LangChain empowers top engineering teams across the spectrum, from innovative **AI startups** to established **global enterprises**. Our solutions are proven to transform customer experiences, as demonstrated by companies like Fastweb + Vodafone, who revolutionized their customer service with AI agents built using LangGraph and LangSmith.

## Join Our Journey

We are building the platform for agent engineering, backed by significant investment. If you're passionate about the future of AI and want to contribute to groundbreaking technology, explore the [Careers](link-to-careers-page) section on our website. We also offer extensive [Documentation](link-to-docs-page), [Community Forums](link-to-community-forum), and [LangChain Academy](link-to-academy-page) resources to support your learning and development.

---

**Ready to ship reliable agents?**
[Try LangSmith for Free Today!](link-to-try-langsmith)