# Automated Company Brochure Generator

This project is based on the course "Become an LLM Engineer in 8 weeks: Build and deploy 8 LLM apps, mastering Generative AI, RAG, LoRA and AI Agents."

This notebook presents a full business solution that builds on our Day 1 project: given a company name and its website URL, it scrapes the landing page, asks an LLM to pick the links worth following, and turns the collected page text into a short brochure.

It demonstrates how LLMs can be integrated into a real-world product with direct business value. The final output covers key information such as company mission, services, values, and highlights, all structured into a clean, presentable brochure.



In [None]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialize and constants

load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [3]:
# A class to represent a Webpage

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [4]:
ziad = Website("https://ziadtamim.com")
ziad.links

['/',
 '/posts',
 '/projects',
 '/contact',
 '/resume',
 'https://www.instagram.com/art_zt/',
 'https://www.linkedin.com/in/ziad-tamim/',
 'https://x.com/Ziad_Tamim_',
 'https://github.com/Ziad-Tamim',
 '/posts/every-data-scientist-should-know-this-by-heart',
 '/posts/Introduction-to-tinyML',
 '/posts',
 '/projects/classification_HOML',
 '/projects/portfolio-website',
 '/projects',
 '/privacy',
 'https://www.instagram.com/art_zt/',
 'https://www.linkedin.com/in/ziad-tamim/',
 'https://x.com/Ziad_Tamim_',
 'https://github.com/Ziad-Tamim',
 '/privacy']

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which the prompt includes a single example of how the model should respond.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
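
Resolving relative links does not strictly need a model. As a complementary sketch (not part of the original notebook), the standard library's `urllib.parse.urljoin` could normalize the scraped links before they are sent to the LLM, leaving only the relevance judgment to the model. The helper name `absolutize_links` is hypothetical.

```python
from urllib.parse import urljoin


def absolutize_links(base_url, links):
    """Resolve relative links against base_url and drop duplicates, keeping order."""
    seen = set()
    resolved = []
    for link in links:
        # Skip in-page anchors and mail links, which are not separate pages
        if link.startswith("#") or link.startswith("mailto:"):
            continue
        absolute = urljoin(base_url, link)
        if absolute not in seen:
            seen.add(absolute)
            resolved.append(absolute)
    return resolved


sample = ["/", "/posts", "/posts", "#main", "https://github.com/Ziad-Tamim"]
print(absolutize_links("https://ziadtamim.com", sample))
```

Deduplicating here also shortens the user prompt, since the scraped list above repeats several links.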

In [5]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [6]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [7]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [8]:
print(get_links_user_prompt(ziad))

Here is the list of links on the website of https://ziadtamim.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
/
/posts
/projects
/contact
/resume
https://www.instagram.com/art_zt/
https://www.linkedin.com/in/ziad-tamim/
https://x.com/Ziad_Tamim_
https://github.com/Ziad-Tamim
/posts/every-data-scientist-should-know-this-by-heart
/posts/Introduction-to-tinyML
/posts
/projects/classification_HOML
/projects/portfolio-website
/projects
/privacy
https://www.instagram.com/art_zt/
https://www.linkedin.com/in/ziad-tamim/
https://x.com/Ziad_Tamim_
https://github.com/Ziad-Tamim
/privacy


In [9]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

In [10]:
anthropic = Website("https://anthropic.com")
anthropic.links

['#main',
 '#footer',
 'https://www.anthropic.com/',
 'https://www.anthropic.com/claude',
 'https://www.anthropic.com/team',
 'https://www.anthropic.com/enterprise',
 'https://www.anthropic.com/education',
 'https://www.anthropic.com/pricing',
 'https://claude.ai/download',
 'https://claude.ai/',
 'https://www.anthropic.com/news/claude-character',
 'https://www.anthropic.com/api',
 'https://docs.anthropic.com/',
 'https://www.anthropic.com/pricing#api',
 'https://console.anthropic.com/',
 'https://docs.anthropic.com/en/docs/welcome',
 'https://www.anthropic.com/solutions/agents',
 'https://www.anthropic.com/solutions/coding',
 'https://www.anthropic.com/solutions/customer-support',
 'https://www.anthropic.com/customers',
 'https://www.anthropic.com/research',
 'https://www.anthropic.com/economic-index',
 'https://www.anthropic.com/claude/sonnet',
 'https://www.anthropic.com/claude/haiku',
 'https://www.anthropic.com/news/claude-3-family',
 'https://www.anthropic.com/news/visible-extend

In [11]:
get_links("https://anthropic.com")

{'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://www.anthropic.com/team'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'},
  {'type': 'events page', 'url': 'https://www.anthropic.com/events'},
  {'type': 'education page', 'url': 'https://www.anthropic.com/education'},
  {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'},
  {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}]}
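
`get_links` trusts that the model returns exactly the shape shown in the prompt example. A minimal sketch of a more defensive parser follows, assuming the reply should be a JSON object with a `links` list of `type`/`url` entries. `parse_links` is a hypothetical helper, not part of the original notebook.

```python
import json


def parse_links(raw):
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    links = data.get("links", [])
    if not isinstance(links, list):
        return {"links": []}
    # Keep entries that have a string type and an absolute http(s) URL
    valid = [
        link for link in links
        if isinstance(link, dict)
        and isinstance(link.get("type"), str)
        and isinstance(link.get("url"), str)
        and link["url"].startswith("http")
    ]
    return {"links": valid}


reply = '{"links": [{"type": "about page", "url": "https://example.com/about"}, {"type": "broken"}]}'
print(parse_links(reply))
```

Returning an empty list on malformed input lets the later steps fall back to the landing page alone instead of raising.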

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [12]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [13]:
print(get_all_details("https://www.ziadtamim.com/"))

Found links: {'links': [{'type': 'about page', 'url': 'https://www.ziadtamim.com/'}, {'type': 'projects page', 'url': 'https://www.ziadtamim.com/projects'}, {'type': 'contact page', 'url': 'https://www.ziadtamim.com/contact'}, {'type': 'resume page', 'url': 'https://www.ziadtamim.com/resume'}, {'type': 'LinkedIn profile', 'url': 'https://www.linkedin.com/in/ziad-tamim/'}, {'type': 'GitHub profile', 'url': 'https://github.com/Ziad-Tamim'}]}
Landing page:
Webpage Title:
Ziad Tamim – AI Portfolio & Blog
Webpage Contents:
Z|T
Posts
Projects
Contact
Resume
Hey, I'm Ziad.
I'm an artificial intelligence graduate based in
Jeddah, Saudi Arabia
. I build
AI powered applications
and share my learning and projects here.
Instagram
LinkedIn
X
GitHub
Recent posts
Every Data Scientist Should Know This By Heart— Beginners Level Machine Learning Concept
A high-level overview of the fundamental machine learning concepts every data scientist should know before starting their ML journey.
Machine Learning
D

In [14]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."
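
Rather than commenting prompts in and out, the tone could be passed as a parameter. This is a small sketch under that assumption; `build_system_prompt` and its `tone` argument are hypothetical names, and the wording mirrors the prompts above.

```python
def build_system_prompt(tone=None):
    """Return the brochure system prompt, optionally asking for a given tone."""
    style = f"short {tone} brochure" if tone else "short brochure"
    return (
        "You are an assistant that analyzes the contents of several relevant pages from a company website "
        f"and creates a {style} about the company for prospective customers, investors and recruits. "
        "Respond in markdown. "
        "Include details of company culture, customers and careers/jobs if you have the information."
    )


print(build_system_prompt("humorous, entertaining, jokey"))
```

Calling `build_system_prompt()` with no argument gives the neutral version.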


In [15]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000] # Truncate if more than 20,000 characters
    return user_prompt
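
The 20,000-character cut applies to the whole prompt, so pages fetched last can be dropped entirely when earlier pages are long. One possible alternative, sketched here as an assumption rather than the notebook's method, is to give each page an equal share of the budget. `truncate_sections` is a hypothetical helper.

```python
def truncate_sections(sections, limit=20_000):
    """Trim each section to an equal share of limit so no page is dropped entirely."""
    if not sections:
        return ""
    per_section = limit // len(sections)
    return "".join(section[:per_section] for section in sections)


pages = ["A" * 30_000, "B" * 100, "C" * 30_000]
combined = truncate_sections(pages)
print(len(combined))
```

This simple version does not hand unused budget from short pages to longer ones, so the result can come in well under the limit.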

In [16]:
get_brochure_user_prompt("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'contact sales page', 'url': 'https://www.anthropic.com/contact-sales'}]}


'You are looking at a company called: Anthropic\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHome \\ Anthropic\nWebpage Contents:\nSkip to main content\nSkip to footer\nClaude\nChat with Claude\nOverview\nTeam plan\nEnterprise plan\nEducation plan\nExplore pricing\nDownload apps\nClaude log in\nNews\nClaude’s character\nAPI\nBuild with Claude\nAPI\xa0overview\nDeveloper docs\nExplore pricing\nConsole log in\nNews\nLearn how to build with Claude\nSolutions\nCollaborate with Claude\nAI\xa0agents\nCoding\nCustomer support\nCase studies\nHear from our customers\nResearch\nResearch\nOverview\nEconomic Index\nClaude model family\nClaude 3.7 Sonnet\nClaude 3.5 Haiku\nClaude 3 Opus\nResearch\nClaude’s extended thinking\nCommitments\nInitiatives\nTransparency\nResponsible scaling policy\nTrust center\nSecurity and compliance\nAnnouncement\nISO\xa042001 certification

In [17]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [18]:
create_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'company page', 'url': 'https://www.anthropic.com/company'}, {'type': 'about page', 'url': 'https://www.anthropic.com/about'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}


# Anthropic: Building the Future of Safe AI

---

## Company Overview

**Anthropic** is a pioneering AI safety and research company headquartered in San Francisco. Our mission is to develop reliable, interpretable, and steerable AI systems, focused on ensuring that AI serves humanity’s best interests. With a commitment to making responsible advancements in AI, we recognize the profound impact our technologies may have on society.

### Our Purpose

At Anthropic, we believe that AI has the potential to fundamentally alter the world. To navigate this, we prioritize building systems people can trust while simultaneously investigating the opportunities and risks associated with AI.

---

## Company Culture

### Our Values
1. **Act for the Global Good**: We aim to maximize positive outcomes for humanity, making bold decisions to guide the technological revolution responsibly.
2. **Hold Light and Shade**: Understanding both the potential risks and benefits of AI is crucial for crafting its future.
3. **Be Good to Our Users**: We foster kindness and generosity towards our customers, stakeholders, and everyone impacted by our technology.
4. **Ignite a Race to the Top on Safety**: We are dedicated to pioneering the safest and most reliable AI systems in the industry.
5. **Do the Simple Thing that Works**: We focus on effective, empirical solutions that drive significant impact.
6. **Be Helpful, Honest, and Harmless**: We value a low-ego, high-trust culture that encourages thoughtful communication and collaboration.
7. **Put the Mission First**: Our mission guides every decision and fosters a collaborative work environment.

### Interdisciplinary Team

Our diverse team includes researchers, engineers, policy experts, and operational leaders, each contributing unique insights from various fields to advance our mission.

---

## Products and Services

At the forefront of our offerings is **Claude**, our cutting-edge AI model designed to assist in a wide range of applications from support roles to coding and beyond. Claude exemplifies our commitment to user-centric AI development.

- **Claude's Model Family**: 
  - Claude 3.7 Sonnet — Our latest and most advanced model, offering exceptional capabilities.
  - Claude Code — Tools for creating AI-powered applications.
- **API Solutions**: Collaborate with Claude to build custom experiences.

### Research Initiatives

Our focus on AI safety includes rigorous research in areas such as interpretability, reinforcement learning, and societal impacts. We believe that transparency and responsible scaling are crucial as we develop and apply innovative safety techniques.

---

## Careers at Anthropic

Join us in shaping the future of AI! We are looking for passionate individuals committed to our mission.

### What We Offer

- **Health & Wellness**: Comprehensive health, dental, and vision coverage, along with generous mental health support and 22 weeks of paid parental leave.
- **Compensation & Support**: Competitive salaries, equity packages, and various benefits that support both personal and professional development.
- **Flexible Work Environment**: We foster a collaborative atmosphere that encourages creativity and innovation.

### Join Our Team

Whether you are experienced in machine learning or just starting out, we invite you to explore our open roles. Together, we can drive the future of responsible AI.

---

## Connect with Us

For more information on services, partnerships, and career opportunities, visit our [website](https://www.anthropic.com) or connect with us on social media.

---

At Anthropic, we strive to create AI technologies that not only advance our capabilities but prioritize safety and sustainability for all humankind. Join us in our mission to guide the world through the transformative power of AI responsibly!

## Finally - a minor improvement

With a small adjustment, the results can stream back from OpenAI as they are generated,
giving the familiar typewriter animation.
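
To see what the streaming loop does without calling the API, here is a minimal sketch that mimics the shape of streamed chunks (`chunk.choices[0].delta.content`) with `SimpleNamespace` objects. `fake_stream` and `collect` are stand-ins invented for illustration; a real stream comes from `openai.chat.completions.create(..., stream=True)`.

```python
from types import SimpleNamespace


def fake_stream(pieces):
    """Yield objects shaped like streamed chat completion chunks (illustrative stand-in)."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect(stream):
    """Accumulate streamed text; a chunk's content may be None, hence the `or ""`."""
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        # In the notebook, this is where update_display would re-render the text
    return response


print(collect(fake_stream(["# Welcome", " to ", None, "Anthropic"])))
```

Each iteration re-renders the full accumulated text rather than only the new piece, which is what produces the typewriter effect in the notebook.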

In [19]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        # Strip only code-fence markers for display; keep the word "markdown" in normal text
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)

In [20]:
stream_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}]}


# Welcome to Anthropic

## Who We Are
At Anthropic, we are committed to building safe and effective AI systems that serve humanity's long-term well-being. As an AI safety and research company, we focus on creating reliable, interpretable, and steerable AI technologies. Our flagship product, Claude, is designed with human benefit at its core, showcasing our commitment to responsible AI development.

## Our Mission
Our mission centers on ensuring AI enhances human life by balancing the vast potential benefits with appropriate safeguards. We believe that AI development should be a collaborative effort involving government, academia, and civil society, thereby promoting safety across the industry.

## Company Culture
Anthropic operates as a Public Benefit Corporation with a strong emphasis on personal responsibility towards our mission. Our core values include:

- **Global Good:** We prioritize long-term positive outcomes for humanity through our technology.
- **User-Centric Approach:** We treat all users, from customers to policy-makers, with generosity and kindness.
- **Safety First:** We aspire to ignite a ‘race to the top’ for AI safety, inspiring other companies to follow suit.
- **Simplicity:** We aim for the simplest effective solutions and iterate based on results.
- **Integrity:** Kind, honest communication is paramount as everyone contributes to our mission.

## Careers at Anthropic
We offer diverse, interdisciplinary opportunities for those passionate about AI safety and research. Our current team members come from various fields, including physics, machine learning, public policy, and business. 

### What We Offer:
- **Health & Wellness:** Comprehensive health, dental, vision insurance, and mental health support.
- **Flexibility:** Generous paid leave and flexible time-off policies.
- **Compensation:** Competitive salaries, equity packages with donation matching, and retirement plans.
- **Culture of Growth:** Daily meals, educational stipends, and support for your home office setup.
  
We actively seek individuals who are not afraid to think outside the box, and we embrace candidates from varied backgrounds – even those without prior machine learning experience are encouraged to apply.

## Our Customers
We serve a wide range of customers including businesses, nonprofits, and civil society organizations, enabling them to leverage advanced AI for better outcomes. By focusing on user feedback and case studies, we ensure that our offerings resonate with real-world needs.

## Join Us
If you’re enthusiastic about shaping the future of AI and believe in the importance of responsible development, we invite you to explore our [current job openings](#) and become part of our mission to build a safer future for everyone.

## Contact Us
For inquiries or to explore partnerships, please speak with our sales team or check out our website for more information on our offerings, including Claude’s capabilities.

---

Thank you for considering Anthropic as your partner in AI innovation. Together, we can ensure that artificial intelligence remains a force for good!

In [21]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")


Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}]}


# Hugging Face Company Brochure

---

## About Us
**Hugging Face** is a vibrant **AI community** committed to building the future of machine learning (ML) and artificial intelligence (AI). Our platform fosters collaboration among users to create, discover, and share models, datasets, and applications. Our mission is to democratize quality machine learning, making it accessible to everyone.

### Company Vision
We aspire to be the go-to platform for machine learning, empowering users to innovate, collaborate, and deploy their projects more effectively using open-source technologies.

---

## Our Offerings
- **Models**: Access over **1 million** models across various modalities, including text, image, audio, video, and 3D.
  
- **Datasets**: A hub for over **250,000** datasets for diverse ML tasks, easily shared and utilized.
  
- **Spaces**: Explore thousands of applications and tools that harness the power of ML to solve real-world problems.
  
- **Enterprise Solutions**: We offer comprehensive **compute and enterprise options**, starting at just **$20/user/month** for teams needing advanced features like Single Sign-On and dedicated support.

### Key Customers
Hugging Face is utilized by over **50,000 organizations**, including industry leaders like:
- **Google**
- **Meta**
- **Amazon**
- **Microsoft**

---

## Community Engagement
At Hugging Face, we strongly value our community. We regularly host forums and discussions where users can share insights, ask questions, and support one another in their machine learning journeys. Our blog features articles from community members about the latest advancements, tutorials, and innovations in AI and ML.

---

## Company Culture
Hugging Face prides itself on fostering an inclusive and collaborative work environment. We advocate for the following principles:
- **Open Source Collaboration**: We build our tools and models in the open, encouraging contributions from our community.
- **Innovation**: Employees and community members are encouraged to experiment and share their findings.
- **Diversity & Inclusion**: We believe in creating a diverse workplace where everyone’s voice is heard and valued.

---

## Careers at Hugging Face
Join us in building the future of AI! We are continually looking for passionate, innovative individuals to join our team. **Explore current openings** on our careers page and consider becoming a part of a community that is dedicated to pushing the boundaries of machine learning.

---

## Connect With Us
For more information and to explore our resources, visit our website: [HuggingFace.co](https://huggingface.co)

**Follow us** on social media to stay updated:
- [Twitter](https://twitter.com/huggingface)
- [LinkedIn](https://linkedin.com/company/huggingface)
- [Discord](https://discordapp.com/invite/huggingface)

Together, let’s build the future of AI!