### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

#### imports

In [1]:
import os
import requests
from dotenv import load_dotenv
import json
from typing import List
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

#### initialize and constants

In [2]:
load_dotenv()
def require_env(name):
    v = os.getenv(name)
    try:
        print(f"{name} found and returned")
        return v
    except:
        raise RuntimeError(f"{name} not found")

API_KEY = require_env("OPENAI_API_KEY")
MODEL = 'gpt-4o-mini'
openai = OpenAI()

print("task completed")

OPENAI_API_KEY found and returned
task completed


#### web scraping bit 

In [3]:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers = headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, "html.parser")
        self.title = soup.title.text if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(['script', 'style', 'input', 'img']):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Website Titile:\n{self.title}\n Webpage Content:\n{self.text}\n\n"

In [11]:
site = Website("https://protikmjalal.netlify.app/")
site.links

['#home',
 '#work',
 '#expertise',
 '#talks',
 '#publications',
 '#contact',
 '/privacy',
 'https://www.linkedin.com/in/mostafamohiuddin/',
 'https://github.com/mjsworks',
 '/Mostafa_Mohiuddin_Jalal_Resume.pdf',
 '#work',
 '#contact',
 'https://github.com/mjsworks',
 '#',
 '#publications',
 'https://github.com/mjsworks/t1_whistleOut',
 '#',
 'https://dagshub.com/protikcodes/machinelearningpipeline',
 'https://github.com/protikmostafa083/nlp_data-exploration_reddit',
 '#',
 'https://github.com/mjsworks/Big-Data-Projects',
 '#',
 '#',
 '#',
 '#',
 'mailto:MostafaMohiuddin.Jalal@uts.edu.au',
 'https://www.linkedin.com/in/mostafamohiuddin/',
 'https://github.com/mjsworks',
 '/Mostafa_Mohiuddin_Jalal_Resume.pdf',
 'https://github.com/mjsworks']

#### filter out the irrelevant links

In [5]:
link_system_prompt = """
you are provided with a list of links found on a website.

you are able to decide which of the links would be relevant to include in a brocure about the company,
such as links to an About page or a company page or Careers/jobs pages.

You should response in JSON as in the following example:
"""
link_system_prompt+="""{
    "links":[
        {"type":"about page", "url": "http:full.url/goes/here/about"},
        {"type":"career page", "url": "http:another.full.url/careers"}
    ]
}"""

link_system_prompt

'\nyou are provided with a list of links found on a website.\n\nyou are able to decide which of the links would be relevant to include in a brocure about the company,\nsuch as links to an About page or a company page or Careers/jobs pages.\n\nYou should response in JSON as in the following example:\n{\n    "links":[\n        {"type":"about page", "url": "http:full.url/goes/here/about"},\n        {"type":"career page", "url": "http:another.full.url/careers"}\n    ]\n}'

In [6]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of the links on the website of {website.title} -"
    user_prompt+= """please decide which of these are relevant web links fo a brochure about the company. 
    respond with full http url. Do not include terms of services, privacy and email likes and similar irrelevant links"""
    user_prompt +="Links are as follows: \n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [7]:
print(get_links_user_prompt(site))

Here is the list of the links on the website of Home - Edward Donner -please decide which of these are relevant web links fo a brochure about the company. 
    respond with full http url. Do not include terms of services, privacy and email likes and similar irrelevant linksLinks are as follows: 
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/
https://edwarddon

#### call to the model

In [8]:
def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role":"system", "content": link_system_prompt},
            {"role":"user", "content": get_links_user_prompt(website)},
        ],
        response_format = {"type": "json_object"} # claude does not have this. OpenAI does. 
        #But they mention in the documentation that we need to specify explicitly 
        #in the prompt that we need the response in JSON format. Otherwise it wont work
    )
    result = completion.choices[0].message.content
    # we can choose to have variations of answers as the model responds. We are taking the first response here.
    return json.loads(result)

In [16]:
get_links("https://anthropic.com")

{'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'},
  {'type': 'career page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://www.anthropic.com/team'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'},
  {'type': 'event page', 'url': 'https://www.anthropic.com/events'},
  {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'},
  {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}]}

#### make the brochure

In [21]:
def get_all_details(url):
    result = "landing page: \n"
    result += Website(url).get_contents()

    links = get_links(url)
    print(f"Found links: {links}")
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [22]:
print(get_all_details("https://anthropic.com"))

Found links: {'links': [{'type': 'company page', 'url': 'https://www.anthropic.com/company'}, {'type': 'career page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'about page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}]}
landing page: 
Website Titile:
Home \ Anthropic
 Webpage Content:
Skip to main content
Skip to footer
Claude
Chat with Claude
Overview
Max plan
Team plan
Enterprise plan
Explore pricing
Download apps
Claude log in
News
Claude’s Character
API
Build with Claude
API overview
Developer docs
Explore pricing
Console log in
News
Learn how to build with Claude
Solutions
Collaborate with Claude
AI agents
Coding
Customer support
Education
Financial services
Government
Case studies
Hear from our cu

In [46]:
system_prompt = """
You are an assistant that analyzes the content of several  relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
respond in markdown
include details of company culture, customers and careers/jobs if you have the information.
your tone has to be with short humors, entertaining and jokey. if there is any pdf file, try to read that. if you cannot, skip that
"""

In [47]:
system_prompt

'\nYou are an assistant that analyzes the content of several  relevant pages from a company website\nand creates a short brochure about the company for prospective customers, investors and recruits.\nrespond in markdown\ninclude details of company culture, customers and careers/jobs if you have the information.\nyour tone has to be with short humors, entertaining and jokey. if there is any pdf file, try to read that. if you cannot, skip that\n'

In [48]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"you are looking at a company called {company_name}\n"
    user_prompt += f"here are the contents of it's landing page anf other relevant pages; use this information to build a short brocure of the company in markdown"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20000] # truncate anything more than 20000 characters
    return user_prompt

In [26]:
get_brochure_user_prompt("Anthropic", "http://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'career page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'solutions page', 'url': 'https://www.anthropic.com/solutions'}]}


"you are looking at a company called Anthropic\nhere are the contents of it's landing page anf other relevant pages; use this information to build a short brocure of the company in markdownlanding page: \nWebsite Titile:\nHome \\ Anthropic\n Webpage Content:\nSkip to main content\nSkip to footer\nClaude\nChat with Claude\nOverview\nMax plan\nTeam plan\nEnterprise plan\nExplore pricing\nDownload apps\nClaude log in\nNews\nClaude’s Character\nAPI\nBuild with Claude\nAPI\xa0overview\nDeveloper docs\nExplore pricing\nConsole log in\nNews\nLearn how to build with Claude\nSolutions\nCollaborate with Claude\nAI\xa0agents\nCoding\nCustomer support\nEducation\nFinancial services\nGovernment\nCase studies\nHear from our customers\nResearch\nResearch\nOverview\nEconomic Index\nClaude model family\nClaude Opus 4.1\nClaude Sonnet 4\nClaude Haiku 3.5\nResearch\nClaude’s extended thinking\nCommitments\nInitiatives\nTransparency\nResponsible scaling policy\nTrust center\nSecurity and compliance\nAnnou

In [49]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ]
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [31]:
create_brochure("Anthropic", "http://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'career page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'customer page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'learning page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'engineering page', 'url': 'https://www.anthropic.com/engineering'}]}


```markdown
# Anthropic: Building the AI of the Future (Without the Apocalypse)

Welcome to Anthropic! We're an AI safety and research company with a mission as big as our ambitions (and our coffee mugs). We aim to create AI systems that are reliable, interpretable, and steerable—might we say, the good side of the robot uprising? 🤖💡

## Who are we?

Anthropic is fashioned from an eclectic mix of talents—from researchers who understand AI to policy experts who explain it to the rest of us mere mortals. Located in the sunny shores of San Francisco, we’re a team that even the Avengers would be jealous of—if they weren’t too busy saving the world in spandex. 

### Our Purpose

At Anthropic, we're driven by a deep belief that AI can change the world for the better (or worse, if you let it). That’s why we focus on building systems that benefit humanity, while trying to avoid the whole Skynet scenario. We don’t just build AI; we make sure it plays nice with the humans.

## Meet Claude

Meet Claude, our star AI model. Whether you need him to help out with coding, customer support, or even education, he’s always ready to lend a 'hand' (or whatever else AIs have). With Claude Opus 4.1, things just got a lot smarter around here—don’t worry, we still made sure he’s not plotting world domination… at least not today.

## Company Culture

We're not just about code and algorithms; we believe in people too! Our culture is built around seven core values that make us the fun-loving family you want to be part of—sorta like a sitcom, minus the laugh track.

- **Act for the Global Good:** We make bold moves to ensure our tech is a force for good. Think Captain Planet, but with more tech-savvy.
- **Hold Light and Shade:** We recognize that AI can be both a friend and foe—much like your average Facebook comment section.
- **Be Good to Our Users:** Our users, from customers to policy-makers, are treated like royalty—minus the awkward family drama.
- **Ignite a Race to the Top on Safety:** We set the standard high to inspire others to build safe AI. Let’s make it a challenge, shall we?
- **Do the Simple Thing That Works:** We embrace the idea that sometimes, the best solution is a good ol’ bicycle rather than an over-engineered spaceship.
- **Be Helpful, Honest, and Harmless:** Our organization prides itself on low-ego interactions. We’re all in this rollercoaster together, holding on for dear life!
- **Put the Mission First:** Collaboration and trust lead our way—because who else will keep those robots in check?

## Join Our Journey

Looking for a career that combines exciting challenges with meaningful impact? Join us to build safe AI and explore what it means to thrive in a mission-driven company. Check out our **open roles** (and maybe bring your best dad jokes to the interview).

### Perks of Being an Anthropic Ant

- Health & wellness packages that make you feel like a superhero.
- A competitive salary and equity—because what’s better than the sweet smell of shared success?
- Flexible paid time off, office snacks, and a daily shot of enthusiasm to keep the energy levels up.

## Customers, Not Just Users

We serve a range of industries, from education to financial services, ensuring that everyone can benefit from our AI tools. If you’re in need of reliable AI, we’re like that friend who always remembers your birthday and shows up with cake!

### Ready to Connect?

Curious to learn more? Whether you’re a prospective customer, an investor, or looking to join our ranks, [let's chat!](#)

Remember, at Anthropic, we aren’t just building AI; we’re creating a better future—one algorithm at a time. 

Now, who’s ready to try chatting with Claude? You might just find he’s more charming than most people you know! 😄
```


#### minor improvement

In [50]:
def simply_stream(company_name, url):
    stream = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role":"system", "content": system_prompt},
            {"role":"user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream= True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True) # reserve the billnoard
    for chunk in stream:
        response += chunk.choices[0].delta.content or "" # getting small bit of the entire story. delta is for incremental pieces
        response = response.replace("```", "").replace("markdown", "") # trim out the doodles
        display_handle.update(Markdown(response))

In [43]:
simply_stream("Anthropic", "http://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'career page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'customers page', 'url': 'https://www.anthropic.com/customers'}, {'type': 'engineering page', 'url': 'https://www.anthropic.com/engineering'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}


# Welcome to Anthropic

Where we believe that AI should be the reliability you can take to the next dinner party without making awkward eye contact... but don’t worry, we’ll also help you with that!

---

## Who We Are

At **Anthropic**, we’re not just building another AI; we’re crafting companions that put humanity first. Think of us as your friendly neighborhood AI developers—if your neighbor was also a public benefit corporation headquartered in sunny San Francisco, and always had an empanada recipe handy!

### Our mission? 
To build brighter, safer AI systems for a brighter future. 🤖🌟

---

## Meet Claude: Our Cutting-Edge AI

- 🎉 **Claude Opus 4.1**: Not just a fancy name! Our top-tier AI model ready to tackle problems you never even thought of. 
- 🛠️ **AI Agents**: Work with Claude and collaborate on coding, customer support, education, financial services, and even government tasks—yes, we might even make taxes fun! (Just kidding, taxes are never fun.)
- 🧠 **Transparency**: We share our research and results because what’s the point of keeping it all to ourselves?

---

## Our Culture: The Anthropic Way

At Anthropic, we’re a diverse bunch—from techies to policy whizzes. But one thing unites us: our commitment to making the world a safer place with AI. It’s basically our love language!

Here’s what we value at Anthropic:
1. **Act for the global good**: We channel our creative energies into outcomes that don’t just benefit us, but everyone. Think of it as a group hug for humanity 🤗.
2. **Be good to our users**: Kindness is key! We treat everyone—from customers to coworkers—with generosity and respect. No string attached, just kindness!
3. **Ignite a race to the top**: It’s like a friendly competition, but instead of trophies, we’re aiming for AI systems that protect and serve humanity responsibly—gold stars all around! ⭐

---

## Join the Anthropic Family

Got an interest in AI, a burning passion for problem-solving, or just love plants (we do too)? We're always on the lookout for new teammates to join us in our mission!

### What We Offer:
- **Health & Wellness**: Because you can't run an AI company if you're not feeling your best! Comprehensive insurance, 22 weeks of paid parental leave, and flexible paid time off. 🏖️
- **Compensation & Support**: Competitive salaries with equity packages—because everyone loves a slice of the pie! 🥧
- **Daily Snacks & Meals**: Yes, that means you won’t have to survive on vending machine pretzels. 🥨

### Think you’ve got what it takes?
👉 Check our open roles and join a team that’s just as passionate about building the future as they are about surviving on coffee and good vibes.



---

## Get in Touch

Ready to chat with Claude, or maybe just someone on our team? Visit us at [Anthropic](https://www.anthropic.com) and let's talk about how we can make AI truly awesome together!

Just remember: If AI starts plotting world domination, we hope it's friendly! 😄

In [44]:
simply_stream("HuggingFace", "http://huggingface.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'career page', 'url': 'https://apply.workable.com/huggingface/'}]}


# Welcome to Hugging Face - The AI Community Building the Future! 🎉

At Hugging Face, we are not just a company; we're a vibrant **community** of machine learning enthusiasts, builders, and dreamers determined to transform the world with the magic of AI. Our mission? To democratize machine learning—because everyone deserves a little *hug* from AI!

---

## Our Offerings 🤖

- **Models:** Over **1 million models** and counting! We’re practically hoarding them. Want a language model that understands sarcasm? We’ve got you covered!
  
- **Datasets:** With **250k+ datasets**, you’ll never run out of data to analyze or a chatbot to confuse!

- **Applications & Spaces:** Dive into **400k+ applications** that let you run, play, and create AI models in our wonderful Spaces. It's like a theme park for data!

---

## Who Uses Us? 🌍

Join the league of over **50,000 organizations**, including big names like:
- Meta (Yes, they didn’t just create “the metaverse”!)
- Amazon (Nope, we do NOT sell books…yet!)
- Google (Know we’re in their algorithm? We do!)
- Microsoft (Why yes, we do make your Word smarter!)

---

## Company Culture 🤗

At Hugging Face, we live by a simple motto: **"Collabing is the new black!"** Our diverse team of 211 (and growing) loves sharing ideas over code and coffee. We're big on **inclusivity**, **team spirit**, and the occasional group karaoke session! 🎤

And if you're worried about fitting in, don’t! We actively encourage quirks and oddities. Have an unwelcome pet rock? Bring it along—“it can serve as inspiration”!

---

## Careers – Join the Hug Squad! 🚀

If you’re excited about **building AI** and want to be part of something transformative, check out our careers page! Whether you're a seasoned ML wizard or a coding newbie looking for adventure, we have roles that suit every level of wizardry!

### 🌟 Current Openings Include:
- Machine Learning Engineer
- Data Scientist
- Community Manager (to keep the hugs flowing!)

*Come make magic with us, and you might even get some free coffee!*

---

## Get Involved! 🎊

**Discover, Collaborate, or Just Stalk Us!**
- Check our models, datasets, and applications on our platform
- Join our Discord community (where only good vibes are allowed)

Feel free to drop us a line or pop by our website to explore more! We’re always around to chat about AI, data, and the latest cat video craze.

---

Whether you’re looking for cutting-edge models, datasets, or a friendly community to collaborate with, Hugging Face is the place to be! **Remember, in the world of AI, a good hug goes a long way!** 🤗

In [None]:
simply_stream("Protik", "https://protikmjalal.netlify.app/")