In [1]:
import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

In [3]:
MODEL = 'gpt-4o-mini'
openai = OpenAI()

In [4]:
class Website:
    """
    A utility class to represent a Website that we have scraped
    """
    url: str
    title: str
    text: str
    links: List[str]
    
    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library 
        """
        self.url = url
        response = requests.get(url, timeout=30)  # fail rather than hang on an unresponsive server
        self.body = response.content
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
print(ed.get_contents())
print(ed.links)

Webpage Title:
Home - Edward Donner
Webpage Contents:
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers a

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Career/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Career/Jobs pages.
You should response in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [8]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of the links on the website of {website.url} - "
    user_prompt += "Please decide which of these are relevant web links for a brochure about the company, and respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, or email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [9]:
print(get_links_user_prompt(ed))

Here is the list of the links on the website of https://edwarddonner.com - Please decide which of these are relevant web links for a brochure about the company, response with the full https URL in JSON format: Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/09/15/ai-in-production-gen-ai-and-agentic-ai-on-aws-at-scale/
https://edwarddonner.com/2025/09/15/ai-in-production-gen-ai-and-agentic-ai-on-aws-at-scale/
https://e
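
The link list printed above repeats some URLs, and the prompt itself warns that some links might be relative. A minimal sketch of cleaning the list before it goes into the prompt, assuming a hypothetical helper `normalize_links` (not part of the notebook) built on the standard library's `urllib.parse`:

```python
from urllib.parse import urldefrag, urljoin, urlparse


def normalize_links(base_url, hrefs):
    """Resolve hrefs against base_url, drop non-web links, and remove duplicates."""
    seen = set()
    cleaned = []
    for href in hrefs:
        # Turn relative links into absolute ones and discard any #fragment
        absolute = urldefrag(urljoin(base_url, href.strip())).url
        # Keep only http(s) links; this drops mailto:, tel: and similar schemes
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            cleaned.append(absolute)
    return cleaned
```

It could be applied inside `get_links_user_prompt` as `normalize_links(website.url, website.links)`. Whether dropping fragments and non-http schemes is the right policy depends on the site being scraped.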

In [10]:

def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format = {"type": "json_object"}
    )
    result = completion.choices[0].message.content
    return json.loads(result)


In [11]:
get_links("https://anthropic.com")

{'links': [{'type': 'company page',
   'url': 'https://www.anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'about page', 'url': 'https://www.anthropic.com/learn'},
  {'type': 'events page', 'url': 'https://www.anthropic.com/events'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'}]}
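
`get_links` returns whatever `json.loads` produces, so a reply that lacks the `links` key or contains a relative URL would only surface later, inside `get_all_details`. One possible guard, sketched here with a hypothetical `parse_links` helper that is not part of the notebook:

```python
import json


def parse_links(raw_json):
    """Parse the model's reply and keep only well-formed link entries."""
    empty = {"links": []}
    try:
        data = json.loads(raw_json)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON, or the content was None
        return empty
    if not isinstance(data, dict) or not isinstance(data.get("links"), list):
        return empty
    valid = []
    for entry in data["links"]:
        if not isinstance(entry, dict):
            continue
        url = entry.get("url")
        # Only absolute http(s) URLs can be fetched directly later on
        if isinstance(url, str) and url.startswith(("http://", "https://")):
            valid.append({"type": str(entry.get("type", "page")), "url": url})
    return {"links": valid}
```

With this in place, the last line of `get_links` could become `return parse_links(result)`, and an unusable reply would yield an empty list rather than a `KeyError` further down.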

In [12]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    
    links = get_links(url) 
    print("Found links:", links)

    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result


In [13]:
print(get_all_details("https://anthropic.com"))

Found links: {'links': [{'type': 'main company page', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Skip to main content
Skip to footer
Research
Economic Futures
Commitments
Initiatives
Transparency
Responsible Scaling Policy
Trust center
Security and compliance
Learn
Learn
Anthropic Academy
Engineering at Anthropic
Developer docs
Company
About
Careers
Events
News
Try Claude
Try Claude
Try Claude
Learn more about Claude
Overview
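
In the run above, the model returned the home page itself as one of the relevant links, so the landing page is fetched and appended to the result a second time. A sketch of filtering out such entries, using hypothetical helpers `same_page` and `drop_landing_page`. The comparison rule (ignore the scheme, a leading `www.`, and a trailing slash) is an assumption, and it also ignores query strings:

```python
from urllib.parse import urlparse


def same_page(url_a, url_b):
    """Loosely compare two URLs by host (without 'www.') and path (without trailing slash)."""
    def key(url):
        parts = urlparse(url)
        host = parts.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        return host, parts.path.rstrip("/")
    return key(url_a) == key(url_b)


def drop_landing_page(landing_url, links):
    """Return a copy of the links dict without entries that point back at the landing page."""
    kept = [link for link in links["links"] if not same_page(landing_url, link["url"])]
    return {"links": kept}
```

In `get_all_details`, the result of `get_links(url)` could be passed through `drop_landing_page(url, links)` before the loop.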


In [14]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

In [15]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company:\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000]  # Truncate if more than 20,000 characters
    return user_prompt


In [16]:
print(get_brochure_user_prompt("Anthropic", "https://anthropic.com"))

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}]}
You are looking at a company called: Anthropic
Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company:
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Skip to main content
Skip to footer
Research
Economic Futures
Commitments
Initiatives
Transparency
Responsible Scaling Policy
Trust center
Security and compliance
Learn
Learn
Anthropic Academy
Engineering at Anthropic
Developer docs
Company
About
Careers
Events
News
Try Claude
Try Claude
Try Claude
Learn more 
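
Cutting the combined text at 20,000 characters keeps the start of the prompt and drops whatever comes last, so pages late in the list may be lost entirely. An alternative is sketched below, assuming the pages are available as (heading, text) pairs. `build_budgeted_text` is a hypothetical helper, and the equal split is only one simple policy:

```python
def build_budgeted_text(sections, max_chars=20_000):
    """Join (heading, text) sections, giving each an equal share of max_chars."""
    if not sections:
        return ""
    separator = "\n\n"
    # Reserve room for the separators so the total never exceeds max_chars
    budget = max_chars - len(separator) * (len(sections) - 1)
    per_section = max(budget // len(sections), 0)
    parts = []
    for heading, text in sections:
        block = f"{heading}\n{text}"
        parts.append(block[:per_section])
    return separator.join(parts)
```

Budget left unused by short pages is not redistributed to longer ones here; that refinement is left out to keep the sketch small.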

In [17]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [18]:
create_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'main page', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}]}


# Anthropic Brochure

---

## Welcome to Anthropic

At **Anthropic**, we're dedicated to making AI systems safer, more interpretable, and more reliable. As a **public benefit corporation**, we are committed to building technologies that serve humanity's long-term well-being. Our mission is simple yet profound: to harness the transformative potential of AI while ensuring its benefits are realized safely and ethically.

---

## Our Vision

Anthropic is founded on the belief that AI will have a vast impact on the world, bringing both opportunities and challenges. Our work emphasizes responsible scaling of AI technologies—focusing on creating systems that you can trust.

### Core Principles:
- **Safety First**: We prioritize AI safety through rigorous research and systematic approaches to mitigate risks.
- **Interdisciplinary Collaboration**: Our team consists of researchers, engineers, and policy experts from varied domains, fostering a rich environment for innovation.
- **Acting for Global Good**: We strive to maximize positive outcomes for humanity and promote safety industry-wide.

---

## What We Offer

**Products**:
- **Claude**: Our flagship AI product that values interpretability and alignment with human interests.
- **Claude Console & Code**: User-friendly platforms designed for seamless interaction with our AI technologies.
- **Anthropic Academy**: A resource for learning how to effectively utilize our technologies and enhance knowledge on AI.

**Research**:
Our **Anthropic Economic Index** tracks AI's impact on labor markets and the economy, fostering informed dialogue about AI's role in society.

---

## Company Culture

At Anthropic, we pride ourselves on a **high-trust, low-ego**, and **collaborative environment**. Our values guide us in our quest to shape the future responsibly. Some core tenets include:

- **Be Good to Our Users**: We consider everyone's needs—customers, policymakers, and civil society—ensuring kindness and support in all interactions.
- **Ignite a Race to the Top on Safety**: We strive to set industry standards for safety and reliability in AI development.

---

## Join Our Team

We’re on the lookout for talented individuals passionate about building safe AI. With roles across various disciplines, from engineering to policy analysis, you will find an inspiring and dynamic work environment. 

**Open Roles**: (Visit our [Careers Page](#) to see current job openings)

### Growth and Learning
At Anthropic, we encourage continual learning and support our employees' growth through various training programs and initiatives.

---

## Our Commitment to Transparency

We believe in transparency regarding our research, policies, and practices. Our **Responsible Scaling Policy** outlines how we approach the development and deployment of AI, ensuring safety is embedded in every decision we make.

---

## Get in Touch

Interested in learning more about our products, joining our team, or understanding our initiatives focused on AI safety?  
Visit us at: [Anthropic Website](#)

---

**Together, we can navigate the complexities of AI and unlock its full potential for a better future. Thank you for considering Anthropic—a leader in responsible AI development.**

In [19]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream = True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```", "").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
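
The `replace` calls in `stream_brochure` remove every occurrence of the word "markdown", including any that appear in the brochure's own sentences. A sketch of a narrower cleanup that drops only a wrapping code fence follows; `strip_wrapping_fence` is a hypothetical helper, not part of the notebook:

```python
FENCE = "`" * 3  # three backticks, built this way so the literal never appears in the source


def strip_wrapping_fence(text):
    """Remove one leading fence line (with or without a language tag) and one trailing fence."""
    lines = text.strip().split("\n")
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)
```

Inside the loop, the raw text would be accumulated unchanged and the display refreshed with `update_display(Markdown(strip_wrapping_fence(response)), display_id=display_handle.display_id)`.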

In [20]:
stream_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'homepage', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}]}


# Anthropic: Building Safer AI for Humanity

---

## About Us

**Anthropic** is a pioneering public benefit corporation dedicated to advancing the field of artificial intelligence (AI) while ensuring safety and responsibility. Established with the mission of developing reliable, interpretable, and steerable AI systems, our efforts focus on promoting the benefits of AI while addressing its potential risks.

- **Mission Statement**: To secure the benefits of AI and mitigate its risks, ensuring it serves humanity's long-term well-being.

- **Core Values**:
  - Act for the global good.
  - Hold light and shade.
  - Be good to our users.
  - Ignite a race to the top on safety.
  - Do the simple thing that works.
  - Be helpful, honest, and harmless.
  - Put the mission first.

---

## Our Products

At Anthropic, we strive to translate cutting-edge research into practical solutions. Our flagship product, **Claude**, is designed to be a reliable AI assistant for various applications. It operates using a safe and robust architecture, promoting high-quality interactions in fields such as:

- **Education**
- **Financial Services**
- **Customer Support**
- **Government Initiatives**

**Feature Models**:
- Claude
- Claude Code
- Claude Console
- Claude Developer Platform

---

## Research and Initiatives

We conduct thorough research in various AI modalities and safety techniques, integrating findings back into product development. Our dedicated **Anthropic Academy** provides educational resources for developers to learn how to effectively build with Claude.

### Key Initiatives
- **Anthropic Economic Index**: An ongoing study assessing AI's impact on labor markets and the economy.
- **Responsible Scaling Policy**: Frameworks set to ensure responsible AI development and governance.

---

## Company Culture

At Anthropic, we believe that cultivating a positive and inclusive work environment leads to greater creativity and collaboration. Our team comprises experienced researchers, engineers, policy experts, and operational leaders who come from diverse backgrounds, including startups, academia, and government organizations.

- **Team Dynamics**: A collaborative approach where every team member’s input is valued.
- **Commitment to Safety**: Our workplace encourages a safety-first mindset, where we actively inspire a culture of reliability in AI systems.

---

## Join Our Team

We are always on the lookout for passionate individuals eager to contribute to the future of safe AI. Our diverse teams are built on collaboration and innovation. If you are interested in making a meaningful impact on technology and society, explore our **open roles**.

- **Opportunities**: Roles span across various disciplines including research, engineering, policy, and operations.
- **Location**: Headquartered in San Francisco, but with a growing remote workforce.

**Let’s build the future of AI together!** 

--- 

For more information about our initiatives, products, or careers, visit us at **[Anthropic.com](http://www.anthropic.com)**. 

**Together, we can shape a safer and more equitable future with AI.**

In [21]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'documentation page', 'url': 'https://huggingface.co/docs'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Brochure

---

## 🌟 Welcome to Hugging Face

### **The AI Community Building the Future**
At Hugging Face, we are more than just a company; we are a vibrant community dedicated to democratizing artificial intelligence. Our platform serves as the central hub where machine learning enthusiasts collaborate on models, datasets, and applications.

### **What We Offer**
- **1 Million+ Models**: Access a vast library of machine learning models that cater to various needs.
- **250,000+ Datasets**: Explore diverse datasets to power your AI projects.
- **Spaces**: Collaborate and showcase your AI applications easily.
- **Enterprise Solutions**: Tailored resources for organizations looking to integrate advanced AI solutions.

### **Join Our Community**
Over **50,000 organizations** globally, including industry giants like Google, Amazon, and Microsoft, utilize Hugging Face to leverage machine learning innovations. Our collaboration platform enables users to create, discover, and share cutting-edge AI technology.

---

## 🤝 Company Culture

### **Our Mission**
We aim to democratize machine learning, making it accessible and manageable for everyone. Every commit we make is a step towards building open-source tools that individuals and organizations can rely on.

### **Community-Driven**
Our culture thrives on collaboration. Whether you’re a developer, researcher, or hobbyist, everyone is welcome to contribute. We believe in sharing knowledge and resources to foster innovation and progress.

### **Inclusive Work Environment**
Hugging Face prioritizes diversity and inclusivity in the workplace, welcoming people from all backgrounds. We encourage a healthy work-life balance, offer flexible working conditions, and provide opportunities for personal and professional growth.

---

## 🚀 Careers at Hugging Face

### **Join Us!**
If you are passionate about leveraging AI to create remarkable products, check out our current job openings. We are looking for candidates who are excited to grow within a collaborative, inclusive, and innovative environment.

**Why work with us?**
- Opportunities for growth and development
- Be part of a mission-driven organization
- Contribute to open-source projects and initiatives

### **Current Openings**
Visit our careers page to see available positions and apply today!

---

## 💼 Enterprise Solutions

### **Partner with Us**
For organizations looking to scale their AI capabilities, we provide specialized solutions including:
- **Team & Enterprise Plans**: Ideal for groups to access our platform with advanced features, security measures, and dedicated support.
- **Subscription Options**: Starting at just **$20/user/month**, our plans can accommodate teams of all sizes.

### **Enterprise Features Include**
- Granular access control
- Advanced analytics and auditing
- Dedicated onboarding support
- Managed billing with annual commitments

---

## 🎓 Learn More

Explore our extensive documentation, participate in forums, and engage with our community on platforms like GitHub, Discord, and LinkedIn.

**Ready to collaborate?** [Sign up today!](https://huggingface.co)

---

Hugging Face is where AI innovation happens, and we want you to be a part of it. Whether you're a potential customer, investor, or recruit, we look forward to working with you in shaping the future of AI! 