In [None]:
# A full business solution


### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [4]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [6]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [7]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        # A timeout stops the notebook from hanging on an unresponsive site
        response = requests.get(url, headers=headers, timeout=10)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [10]:
Arhum = Website("https://arhumhussainportfolio.com/")
print(Arhum.get_contents())

Webpage Title:
Syed Arhum Hussain - AI/ML Engineer Portfolio | Syed Arhum Hussain – AI/ML Engineer
Webpage Contents:
Home
Portfolio
Skills
Services
Home
Portfolio
Skills
Services
I'm
Syed Arhum Hussain
Explore innovative AI solutions and projects by Syed Arhum Hussain,  a passionate AI/ML Engineer and Data Scientist with a strong foundation in Large Language Models (LLMs), OCR, and intelligent chatbot systems.
Download CV
Skills
N8N
Automating tasks using AI Agents
LLMs, LangChain , RAG
ChatBots
Run JavaScript on the server
API Integrations
RESTful AI automation
Agentic AI
Agents performing dynamic tasks
Twilio
For automated voice/SMS
NiFi
Data extraction, transformation, loading
AWS S3
Docker
Hosting Kafka, n8n workflow
SQL
MySQL , MangoDB , ChromaDB, Vector DB
React.js
Build UI in the browser
Node.js
Scikit-learn, PyTorch , OpenCV
Big data storage
PYTHON
Projects
Explore innovative AI solutions and machine learning projects here.
CareShare
Smart Donation website with Intelligent Chat

In [12]:
Arhum = Website("https://arhumhussainportfolio.com/")
Arhum.links

['/',
 '/',
 '/portfolio',
 '/skills',
 '/services',
 '/',
 '/',
 '/portfolio',
 '/skills',
 '/services',
 'https://assets.zyrosite.com/YanJKzBP7BfN5xVE/syed-arhum-hussain-resume-Awv9xBJ2pjFlDaBz.pdf',
 'mailto:arhumhussain212@gmail.com',
 'https://www.linkedin.com/in/arhum-hussain/',
 'https://github.com/SyedArhumHussain?tab=repositories',
 'https://x.com/ArhumHussain12',
 'https://github.com/SyedArhumHussain/CareShare',
 'https://github.com/SyedArhumHussain/2025-Monaco-Grand-Prix',
 'https://github.com/SyedArhumHussain/2025-Monaco-Grand-Prix',
 'https://www.coursera.org/account/accomplishments/verify/O3GS88QMVY9Z',
 'https://www.coursera.org/account/accomplishments/verify/XPKG2NUPFHQE',
 'https://www.coursera.org/account/accomplishments/professional-cert/L2DSQEMKYSEY',
 'https://www.coursera.org/account/accomplishments/verify/0T7AHXWE05JJ',
 'https://www.coursera.org/account/accomplishments/verify/CAOU7GOIUHU0']
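
The list above contains repeated entries and a `mailto:` link that can never belong in a brochure. As a minimal sketch (the helper name `tidy_links` and the list of skipped prefixes are assumptions, not part of the notebook), the links could be tidied before they are sent to the model:

```python
def tidy_links(links):
    """Drop non-web links and duplicates while preserving first-seen order."""
    # Prefixes chosen for illustration; extend or shrink the tuple as needed.
    skipped_prefixes = ("mailto:", "tel:", "javascript:", "#")
    kept = [link for link in links if not link.startswith(skipped_prefixes)]
    # dict.fromkeys keeps the first occurrence of each link, in order
    return list(dict.fromkeys(kept))


sample = [
    "/",
    "/",
    "/portfolio",
    "mailto:someone@example.com",
    "#main",
    "/portfolio",
    "https://example.com/about",
]
print(tidy_links(sample))
```

A shorter list means fewer tokens in the prompt; the model still makes the relevance decision.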

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one shot prompting" in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.
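
Judging relevance needs the model, but the mechanical part of the task, turning "/about" into a full URL, can also be done deterministically. A small sketch using the standard library's `urljoin` (the helper name `absolutize` and the example URLs are illustrative assumptions):

```python
from urllib.parse import urljoin


def absolutize(base_url, links):
    """Resolve each link against the page it was found on."""
    return [urljoin(base_url, link) for link in links]


resolved = absolutize(
    "https://company.com/",
    ["/about", "careers", "https://other.example/jobs"],
)
print(resolved)
```

Links that are already absolute pass through unchanged, so this could serve as a cross-check on the URLs the model returns.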

In [13]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [14]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [15]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [16]:
print(get_links_user_prompt(Arhum))

Here is the list of links on the website of https://arhumhussainportfolio.com/ - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
/
/
/portfolio
/skills
/services
/
/
/portfolio
/skills
/services
https://assets.zyrosite.com/YanJKzBP7BfN5xVE/syed-arhum-hussain-resume-Awv9xBJ2pjFlDaBz.pdf
mailto:arhumhussain212@gmail.com
https://www.linkedin.com/in/arhum-hussain/
https://github.com/SyedArhumHussain?tab=repositories
https://x.com/ArhumHussain12
https://github.com/SyedArhumHussain/CareShare
https://github.com/SyedArhumHussain/2025-Monaco-Grand-Prix
https://github.com/SyedArhumHussain/2025-Monaco-Grand-Prix
https://www.coursera.org/account/accomplishments/verify/O3GS88QMVY9Z
https://www.coursera.org/account/accomplishments/verify/XPKG2NUPFHQE
https://www.coursera.org/account/accomplishments/professional-cert/L2DSQEM

In [19]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

In [24]:
# Try it on a real company site: fetch Anthropic's homepage and list the links found on it

Anthropic = Website("https://www.anthropic.com/")
Anthropic.links

['#main',
 '#footer',
 'https://www.anthropic.com/',
 'https://www.anthropic.com/claude',
 'https://www.anthropic.com/max',
 'https://www.anthropic.com/team',
 'https://www.anthropic.com/enterprise',
 'https://www.anthropic.com/pricing',
 'https://claude.ai/download',
 'https://claude.ai/',
 'https://www.anthropic.com/news/claude-character',
 'https://www.anthropic.com/api',
 'https://docs.anthropic.com/',
 'https://www.anthropic.com/pricing#api',
 'https://console.anthropic.com/',
 'https://docs.anthropic.com/en/docs/welcome',
 'https://www.anthropic.com/solutions/agents',
 'https://www.anthropic.com/solutions/coding',
 'https://www.anthropic.com/solutions/customer-support',
 'https://www.anthropic.com/solutions/education',
 'https://www.anthropic.com/solutions/financial-services',
 'https://www.anthropic.com/solutions/government',
 'https://www.anthropic.com/customers',
 'https://www.anthropic.com/research',
 'https://www.anthropic.com/economic-index',
 'https://www.anthropic.com/cla

In [23]:
get_links("https://www.anthropic.com/")

{'links': [{'type': 'company page',
   'url': 'https://www.anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://www.anthropic.com/team'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'},
  {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions/'},
  {'type': 'events page', 'url': 'https://www.anthropic.com/events'},
  {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'},
  {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'},
  {'type': 'transparency page',
   'url': 'https://www.anthropic.com/transparency'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
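
`get_links` returns whatever `json.loads` produces, and the rest of the notebook assumes every entry has a `type` and a fetchable `url`. The sketch below shows one defensive check a cautious caller might add; `clean_link_result` is a hypothetical helper, and the rule "keep only http(s) URLs with a host" is an assumption about what counts as usable.

```python
from urllib.parse import urlparse


def clean_link_result(result):
    """Keep only well-formed {'type', 'url'} entries from a parsed model reply."""
    links = result.get("links") if isinstance(result, dict) else None
    if not isinstance(links, list):
        return {"links": []}
    cleaned = []
    for entry in links:
        if not isinstance(entry, dict):
            continue
        link_type, url = entry.get("type"), entry.get("url")
        if not isinstance(link_type, str) or not isinstance(url, str):
            continue
        parsed = urlparse(url)
        # Require an absolute web URL; relative paths and mailto: links are dropped
        if parsed.scheme in ("http", "https") and parsed.netloc:
            cleaned.append({"type": link_type, "url": url})
    return {"links": cleaned}


reply = {
    "links": [
        {"type": "about page", "url": "https://example.com/about"},
        {"type": "email", "url": "mailto:someone@example.com"},
        {"type": "careers page", "url": "/careers"},
        {"url": "https://example.com/no-type"},
    ]
}
print(clean_link_result(reply))
```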

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini

In [26]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [27]:
print(get_all_details("https://www.anthropic.com/"))

Found links: {'links': [{'type': 'company page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'about page', 'url': 'https://www.anthropic.com/about'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}]}
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Skip to main content
Skip to footer
Claude
Chat with Claude
Overview
Max plan
Team plan
Enterprise plan
Explore pricing
Download apps
Claude log in
News
Claude’s Character
API
Build with Claude
API overview
Developer docs
Explore pricing
Console log in
News
Learn how to build with Claude
Solution

In [29]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."


In [30]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt

In [31]:
get_brochure_user_prompt("Anthropic", "https://www.anthropic.com/")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'contact sales page', 'url': 'https://www.anthropic.com/contact-sales'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'engineering page', 'url': 'https://www.anthropic.com/engineering'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}]}


'You are looking at a company called: Anthropic\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHome \\ Anthropic\nWebpage Contents:\nSkip to main content\nSkip to footer\nClaude\nChat with Claude\nOverview\nMax plan\nTeam plan\nEnterprise plan\nExplore pricing\nDownload apps\nClaude log in\nNews\nClaude’s Character\nAPI\nBuild with Claude\nAPI\xa0overview\nDeveloper docs\nExplore pricing\nConsole log in\nNews\nLearn how to build with Claude\nSolutions\nCollaborate with Claude\nAI\xa0agents\nCoding\nCustomer support\nEducation\nFinancial services\nGovernment\nCase studies\nHear from our customers\nResearch\nResearch\nOverview\nEconomic Index\nClaude model family\nClaude Opus 4.1\nClaude Sonnet 4\nClaude Haiku 3.5\nResearch\nClaude’s extended thinking\nCommitments\nInitiatives\nTransparency\nResponsible scaling policy\nTrust center\nSecurity and compliance\nAnn
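
The slice `[:5_000]` keeps only the first 5,000 characters of the combined prompt, so pages fetched later can be cut off entirely. As a sketch of one alternative, assuming the page texts are available as separate strings (in this notebook `get_all_details` returns a single string, so it would need a small change), each page could get an equal share of the budget. The name `truncate_evenly` is hypothetical.

```python
def truncate_evenly(sections, limit=5_000):
    """Give each section an equal share of the overall character budget."""
    if not sections:
        return ""
    share = limit // len(sections)
    # Short sections keep all their text; long ones are cut at their share
    return "".join(section[:share] for section in sections)


pages = ["A" * 8_000, "B" * 100, "C" * 3_000]
combined = truncate_evenly(pages, limit=3_000)
print(len(combined), combined.count("A"), combined.count("B"), combined.count("C"))
```

This version does not redistribute the budget that short pages leave unused, which keeps it simple at the cost of sometimes sending less text than the limit allows.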

In [32]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [38]:
create_brochure("Figma", "https://www.figma.com/")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.figma.com/'}, {'type': 'careers page', 'url': 'https://www.figma.com/careers/'}, {'type': 'company news', 'url': 'https://www.figma.com/newsroom/'}, {'type': 'contact page', 'url': 'https://www.figma.com/contact/'}, {'type': 'pricing page', 'url': 'https://www.figma.com/pricing/'}]}


# Figma Company Brochure

## About Us
Welcome to Figma, the leading collaborative design tool that empowers teams to create, prototype, and iterate together seamlessly. Whether you are in design, engineering, or product management, Figma enables a shared environment where creativity and collaboration flourish. 

---

## Our Products
Figma provides a suite of tools designed to meet the diverse needs of modern product development:

- **Figma Design**: A comprehensive platform for designing and prototyping products in one place.
- **Dev Mode**: A specialized workspace that helps developers translate designs into code effortlessly.
- **FigJam**: A digital whiteboard for brainstorming, planning, and collaborating.
- **Figma Slides**: Co-create stunning presentations collaboratively.
- **Figma Draw**: Leverage advanced vector tools for creative illustrations.
- **Figma Buzz**: In beta, this feature allows for the production of on-brand assets at scale.
- **Figma Sites**: In beta, publish fully responsive websites quickly.
- **Figma Make**: An innovative tool that utilizes AI to convert your design ideas into code.

Explore use cases that enhance productivity across design systems, UX design, web development, and more.

---

## Our Customers
Figma serves a vast range of customers from varied sectors including:

- **Enterprises**: Large organizations looking for integrated solutions for their design and development teams.
- **Educational Institutions**: Schools and universities leveraging collaborative design tools to enhance learning and creativity.
- **Startups**: Agile teams aiming for rapid iterations in product development.

Join leading product teams in transforming ideas into effective solutions.  

---

## Our Culture
At Figma, we prioritize collaboration, creativity, and community. Our values include:

- **Inclusivity**: We foster an environment where diverse perspectives are welcomed and valued.
- **Empowerment**: We encourage our team members to take initiative and influence projects and decisions.
- **Growth**: We believe in continuous learning and offer opportunities for professional development through events, webinars, and a rich resource library.
  
Figma is a place where creativity takes center stage, driving innovative problem-solving across teams and disciplines.

---

## Careers
Join our dynamic team at Figma! We are always on the lookout for talented individuals who are passionate about design, technology, and collaboration. Whether you're a designer, developer, or product manager, you can find a role that matches your skills and aspirations. 

Explore our current job openings and become part of a fast-paced, innovative environment that values your contributions and nurtures your personal and professional growth.

---

## Connect With Us
Interested in learning more about Figma? 
- **Get Started**: Sign up for free and explore our tools today!
- **Contact Us**: Have questions? Reach out to our sales team for more information.

Experience the future of design with Figma—where great products are built collaboratively!  

--- 

**Figma**: Think bigger. Build faster.

## Finally - a minor improvement

With a small adjustment, we can change this so that the results stream back from OpenAI,
with the familiar typewriter animation.

In [34]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
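
The loop above relies on `delta.content or ''` because a chunk's content can be `None`. The sketch below isolates that accumulation step so it can be tried without an API call. The `fake_chunk` objects are stand-ins that mimic only the attributes the loop reads; they are not the real OpenAI chunk type.

```python
from types import SimpleNamespace


def accumulate(stream):
    """Yield the running text after each chunk, treating missing content as empty."""
    text = ""
    for chunk in stream:
        text += chunk.choices[0].delta.content or ""
        yield text


def fake_chunk(content):
    # Stand-in exposing only .choices[0].delta.content, for illustration
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk("# Bro"), fake_chunk("chure"), fake_chunk(None)]
snapshots = list(accumulate(fake_stream))
print(snapshots)
```

Each snapshot is what `update_display` would render at that moment, which is what produces the typewriter effect.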

In [36]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'documentation page', 'url': 'https://huggingface.co/docs'}, {'type': 'community forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}]}


# Hugging Face: The AI Community Building the Future

---

## About Us

At Hugging Face, we are more than just a company; we are a vibrant community dedicated to transforming the landscape of Artificial Intelligence (AI) and Machine Learning (ML). Our platform empowers researchers, developers, and organizations to collaborate seamlessly on models, datasets, and applications in the ever-evolving world of AI. With over 50,000 organizations using our services, including industry giants like Meta, Amazon, and Google, we are at the forefront of innovation.

## Products and Services

### Collaborate on Models and Datasets
- **1M+ Models and 250k+ Datasets:** Our open-source platform allows users to explore a vast library of machine learning models (like GPT and others) and datasets, ensuring you have all the resources needed to accelerate your projects.
- **Spaces:** Create, share, and run your applications easily. Whether it's generating images or web application code from descriptions, our Spaces feature allows for limitless creativity.

### Enterprise Solutions
We offer tailored packages for teams and enterprises, providing:
- Advanced security measures and access controls.
- Priority support and audit logs.
- Deployment on optimized inference endpoints for swift model integration.

### Community Driven
We encourage collaboration and contribution, making tools like **Transformers** and **Diffusers** available to everyone. Dive into our extensive libraries and join the discussion in our forums.

## Company Culture

At Hugging Face, our core belief is in the power of community. We foster an inclusive, collaborative environment where innovation thrives. Our diverse team of over 211 dedicated individuals is passionate about technology and committed to building accessible AI solutions. We value creativity, mutual respect, and the sharing of knowledge to ensure everyone’s success in their AI journey.

## Careers

Looking for a place where your skills can shine? Join us!
- **Open Positions:** We're always on the lookout for talented individuals who are eager to push the boundaries of AI.
- **Work Environment:** Embrace a culture of continuous learning, autonomy, and respect. Whether you’re an experienced professional or just starting your career, there’s a place for you at Hugging Face.

Explore career opportunities on our website and become a part of building the future of AI!

## Join Us

Are you ready to explore the possibilities of AI? Whether you are a researcher, developer, or enthusiast, join our growing community today!

- **Website:** [Hugging Face](https://huggingface.co)
- **Follow Us on Social Media:** [Twitter](https://twitter.com/huggingface) | [LinkedIn](https://linkedin.com/company/huggingface) | [Discord](https://discord.gg/huggingface)

Together, let’s build the future of AI!