In [1]:
import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

In [3]:
MODEL = 'gpt-4o-mini'
openai = OpenAI()

In [4]:
class Website:
    """
    A utility class to represent a Website that we have scraped
    """
    url: str
    title: str
    body: bytes
    text: str
    links: List[str]

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Drop elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        # Collect every href on the page, skipping anchors without one
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
print(ed.get_contents())
print(ed.links)

Webpage Title:
Home - Edward Donner
Webpage Contents:
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers a

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Career/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Career/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [8]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of the links on the website of {website.url} - "
    user_prompt += "Please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [9]:
print(get_links_user_prompt(ed))

Here is the list of the links on the website of https://edwarddonner.com - Please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/09/15/ai-in-production-gen-ai-and-agentic-ai-on-aws-at-scale/
https://edwarddonner.com/2025/09/15/ai-in-production-gen-ai-and-agentic-ai-on-aws-at-scale/
https://e
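
The user prompt warns the model that some links might be relative. One alternative is to resolve them in Python before the prompt is built. The following is only a sketch: `absolutize_links` is a hypothetical helper that is not part of the notebook, and it uses the standard library's `urljoin`.

```python
from urllib.parse import urljoin, urlparse

def absolutize_links(base_url, links):
    """Resolve relative hrefs against base_url; keep http(s) URLs only, de-duplicated in order."""
    seen = set()
    result = []
    for href in links:
        absolute = urljoin(base_url, href)
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result

# Illustrative input, not scraped from a real page
sample = ["/about", "https://edwarddonner.com/posts/", "mailto:someone@example.com", "/about"]
print(absolutize_links("https://edwarddonner.com", sample))
```

Doing this up front would also remove the repeated entries visible in the list above, at the cost of no longer letting the model see the raw hrefs.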

In [10]:

def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format = {"type": "json_object"}
    )
    result = completion.choices[0].message.content
    return json.loads(result)


In [11]:
get_links("https://anthropic.com")

{'links': [{'type': 'home page', 'url': 'https://www.anthropic.com/'},
  {'type': 'about page', 'url': 'https://www.anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'events page', 'url': 'https://www.anthropic.com/events'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'},
  {'type': 'economic futures page',
   'url': 'https://www.anthropic.com/economic-futures'},
  {'type': 'transparency page',
   'url': 'https://www.anthropic.com/transparency'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
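
`get_links` trusts that the reply parses and has the expected shape. A defensive variant could validate the reply before it is used downstream. This is a sketch under assumptions: `parse_link_response` is a hypothetical helper, and the default type `"page"` is an arbitrary choice.

```python
import json
from urllib.parse import urlparse

def parse_link_response(raw):
    """Parse the model's JSON reply and keep only well-formed entries with an http(s) URL."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    links = data.get("links")
    if not isinstance(links, list):
        return {"links": []}
    cleaned = []
    for item in links:
        if not isinstance(item, dict):
            continue  # ignore stray strings or numbers in the list
        url = item.get("url")
        if isinstance(url, str) and urlparse(url).scheme in ("http", "https"):
            cleaned.append({"type": str(item.get("type", "page")), "url": url})
    return {"links": cleaned}

reply = '{"links": [{"type": "about page", "url": "https://full.url/goes/here/about"}, {"type": "bad", "url": "/relative"}]}'
print(parse_link_response(reply))
```

In `get_links`, this could stand in for the final `json.loads(result)` call if stricter handling is wanted.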

In [12]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    
    links = get_links(url) 
    print("Found links:", links)

    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result


In [13]:
print(get_all_details("https://anthropic.com"))

Found links: {'links': [{'type': 'homepage', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Skip to main content
Skip to footer
Research
Economic Futures
Commitments
Initiatives
Transparency
Responsible Scaling Policy
Trust center
Security and compliance
Learn
Learn
Anthropic Academy
Engineering at Anthropic
Developer docs
Company
About
Careers
Events
News
Try Claude
Try Claude
Try Claude
Learn more about Claude
Overview
Meet Claude
Pricing
Products
Claude
Claude Console
Claude Code
Claude Developer Platform
Models
Opus
Sonnet
Haiku
Log in to Claude
Log in to Claude
Log in to Claude
EN
This is s
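
In the run above, the model lists the home page among the relevant links, so `get_all_details` scrapes the landing page twice. A possible filter is sketched below. Both function names are hypothetical, and the normalization rule (ignore the scheme, a leading `www.`, and a trailing slash) is an assumption that may not suit every site.

```python
from urllib.parse import urlparse

def _page_key(url):
    """Hypothetical normalization: ignore scheme, a leading 'www.', and a trailing slash."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parsed.path.rstrip("/")

def drop_repeated_pages(landing_url, links):
    """Remove links that point at the landing page or repeat an earlier link."""
    seen = {_page_key(landing_url)}
    kept = []
    for link in links:
        key = _page_key(link["url"])
        if key not in seen:
            seen.add(key)
            kept.append(link)
    return kept

found = [
    {"type": "homepage", "url": "https://www.anthropic.com/"},
    {"type": "about page", "url": "https://www.anthropic.com/company"},
]
print(drop_repeated_pages("https://anthropic.com", found))
```

Applied to `links["links"]` before the loop in `get_all_details`, this would save one request and some of the character budget.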

In [14]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

In [15]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company:\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000]  # Truncate if more than 20,000 characters
    return user_prompt


In [16]:
print(get_brochure_user_prompt("Anthropic", "https://anthropic.com"))

Found links: {'links': [{'type': 'homepage', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'engineering page', 'url': 'https://www.anthropic.com/engineering'}]}
You are looking at a company called: Anthropic
Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company:
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Skip to main content
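
The 20,000-character cut in `get_brochure_user_prompt` is applied to the whole prompt, so a long landing page can crowd out every later page. One alternative, sketched here, gives each page an equal share of the budget. `truncate_sections` is a hypothetical helper, and equal shares are a deliberately simple policy that can leave budget unused when some pages are short.

```python
def truncate_sections(sections, max_chars=20_000):
    """Give each section an equal share of max_chars so later pages are not cut off entirely."""
    if not sections:
        return ""
    share = max_chars // len(sections)
    return "".join(section[:share] for section in sections)

# Illustrative strings standing in for scraped page contents
pages = ["landing " * 5000, "about " * 10, "careers " * 10]
prompt_body = truncate_sections(pages)
print(len(prompt_body))
```

Using it would require `get_all_details` to return a list of per-page strings rather than one concatenated string.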

In [17]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [18]:
create_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'home page', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}, {'type': 'engineering page', 'url': 'https://www.anthropic.com/engineering'}]}


# Anthropic: Pioneering Safe AI for Humanity

---

## About Anthropic

**Anthropic** is a public benefit corporation based in San Francisco, focused on creating reliable, interpretable, and steerable artificial intelligence systems. We believe in the profound impact AI will have on the world and are committed to researching and building tools that prioritize human safety and well-being.

### Our Purpose
At Anthropic, we are dedicated to understanding AI’s opportunities and risks, while ensuring that our systems serve humanity's long-term interests. Our guiding principle is to build systems that people can rely on.

### Our Research and Products
We conduct innovative AI safety research and translate those findings into practical tools like **Claude**, ensuring that our products benefit businesses, civil society, and the general public. We regularly engage in discussions about safety in AI, which includes collaborations with various entities like governments, nonprofits, and academic institutions.

## Company Culture

### Our Values
1. **Act for the Global Good:** We aim to take bold actions that maximize positive outcomes for humanity.
2. **Hold Light and Shade:** We acknowledge both the risks and benefits of AI, striving to mitigate risks while enhancing benefits.
3. **Be Good to Our Users:** Kindness and generosity define our interactions with users and each other.
4. **Ignite a Race to the Top on Safety:** We aspire to lead the industry in creating safe and reliable AI systems.
5. **Do the Simple Thing that Works:** We prioritize effective, straightforward solutions.
6. **Be Helpful, Honest, and Harmless:** Our high-trust, low-ego environment fosters open communication and a considerate approach to our work.
7. **Put the Mission First:** Our mission is our top priority, guiding all our decisions and actions.

### Our Team
Our team is diverse, bringing expertise from numerous disciplines such as engineering, research, policy, and operations. We emphasize collaboration and a shared commitment to our mission, creating an atmosphere that values innovative thinking and continuous improvement.

## Careers at Anthropic

We strive to recruit talented individuals from various fields, including machine learning, public policy, and business. Our unique company culture promotes teamwork and collective ownership of ideas, making it an exciting place to work.

### Open Roles
We are constantly looking for passionate individuals who want to be a part of our journey in building safe AI. Join us in making a positive impact through technology.

### Anthropic Academy
For those interested in learning, the **Anthropic Academy** offers resources and training designed to help individuals build on their understanding of AI and its applications.

## Customer Commitment

At Anthropic, our customers are essential to our mission. We provide robust solutions like **Claude Console**, **Claude Code**, and tailored AI products across various domains, including education, financial services, and government. Our commitment to transparency and excellence ensures that we address customer needs effectively.

## Join Us in Shaping the Future

Anthropic is at the forefront of AI development, and we invite you to be part of this transformative journey. Together, we can build technologies that prioritize safety and benefit humanity as a whole. To learn more about our initiatives and explore career opportunities, visit our website.

---

**Connect with Us Today!**

For inquiries, training resources, or to explore open roles in our team, don't hesitate to reach out or visit our [website](https://www.anthropic.com).

**_Anthropic - Building AI for Humanity's Future._**

In [19]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream = True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```", "").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
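
Note that the `replace` calls in `stream_brochure` remove the word "markdown" and every code-fence marker wherever they occur, including inside the brochure text itself. A narrower clean-up removes only a wrapping fence at the very start and end of the reply. This is a sketch, and `strip_markdown_fence` is a hypothetical helper.

```python
import re

FENCE = "`" * 3  # three backticks, written this way so the example is easy to paste

_LEADING = re.compile(r"\A\s*" + FENCE + r"(?:markdown)?[ \t]*\n?")
_TRAILING = re.compile(r"\n?" + FENCE + r"\s*\Z")

def strip_markdown_fence(text):
    """Remove a wrapping fence (optionally tagged 'markdown') from the start and end of text."""
    text = _LEADING.sub("", text)
    return _TRAILING.sub("", text)

wrapped = FENCE + "markdown\n# Title\nWe write markdown docs.\n" + FENCE
print(strip_markdown_fence(wrapped))
```

Inside the streaming loop, this could be applied to a copy of `response` just before `update_display`, so the accumulated text itself is never altered.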

In [20]:
stream_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'main website', 'url': 'https://www.anthropic.com/'}, {'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'transparency page', 'url': 'https://www.anthropic.com/transparency'}, {'type': 'events page', 'url': 'https://www.anthropic.com/events'}, {'type': 'economic futures page', 'url': 'https://www.anthropic.com/economic-futures'}, {'type': 'learn page', 'url': 'https://www.anthropic.com/learn'}]}


# Anthropic Company Brochure

---

## About Us

### Who We Are
Anthropic is a pioneering public benefit corporation dedicated to the research and development of safe and reliable artificial intelligence (AI). Our mission is to build AI systems that are interpretable, steerable, and aligned with human values, prioritizing long-term benefits for humanity. We envision a future where AI technology works for the betterment of society.

### Our Purpose
We recognize that AI can profoundly impact various facets of life. At Anthropic, we focus on understanding and securing the benefits of AI while mitigating its risks. Our commitment involves rigorous research that informs our product development and policy-making efforts, ensuring responsible AI integration into the modern economy.

---

## Products

### Claude Suite
Claude is our flagship AI product, designed to offer robust, human-centric solutions across multiple applications such as coding, customer support, education, and government services. The Claude family includes:

- **Claude Console**
- **Claude Code**
- **Claude Developer Platform**
- Models: Opus, Sonnet, Haiku

### Innovations
Our continuous innovation includes the Anthropic Economic Index, assessing AI's influence on the labor market and the broader economy, enabling businesses and policymakers to make informed decisions.

---

## Company Culture

### Collaborative Environment
At Anthropic, we pride ourselves on our interdisciplinary team of researchers, engineers, policy experts, and operational leaders who foster collaboration and innovation. Our environment thrives on diversity of thought and shared expertise, equipping us to tackle complex challenges in AI development.

### Core Values
We maintain a strong value system that guides our actions and decisions:
1. **Act for the Global Good**: Ensure that our technology promotes positive outcomes for humanity.
2. **Hold Light and Shade**: Balance the potential benefits and risks of AI.
3. **Be Good to Our Users**: Cultivate kindness and generosity towards everyone impacted by our work.
4. **Ignite a Race to the Top on Safety**: Encourage a competitive drive for heightened AI safety standards industry-wide.
5. **Do the Simple Thing That Works**: Strive for empirical and pragmatic solutions.
6. **Be Helpful, Honest, and Harmless**: Foster a culture of trust through open and respectful communication.
7. **Put the Mission First**: Align all actions with our mission to build safe AI systems.

---

## Careers at Anthropic

### Join Us
We are actively seeking passionate individuals to join our mission of ensuring that AI benefits everyone. Our team members come from diverse backgrounds—including physics, machine learning, public policy, and business—working cooperatively to push the boundaries of safe AI development.

### Open Roles
We encourage enthusiastic candidates to explore our current job openings and to join a team committed to making AI safe. 

---

## Making an Impact

Anthropic is more than just a technology provider; we are a research partner, a policy consultant, and a thought leader in the AI space. We actively engage with civil society, government entities, and other organizations to promote the safe and secure development of AI technologies. 

Together, let’s build a future where AI can truly enhance the human experience!

---

For more information, visit our website at [Anthropic](https://www.anthropic.com).

--- 

© 2024 Anthropic PBC. All rights reserved.

In [21]:
stream_brochure("HuggingFace", "https://huggingface.co") #Brochure HuggingFace

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'models page', 'url': 'https://huggingface.co/models'}, {'type': 'datasets page', 'url': 'https://huggingface.co/datasets'}, {'type': 'spaces page', 'url': 'https://huggingface.co/spaces'}, {'type': 'documentation page', 'url': 'https://huggingface.co/docs'}, {'type': 'community discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Brochure

## Welcome to Hugging Face
**_The AI community building the future._**

Hugging Face is at the forefront of the machine learning revolution, providing a collaborative platform for developers, researchers, and organizations. We are dedicated to democratizing machine learning through open-source initiatives and empowering the community to innovate and create state-of-the-art AI applications.

---

### **Our Offerings**

#### **Models**
With over **1 million models** and a robust repository, users can easily discover and experiment with cutting-edge algorithms tailored for different tasks and modalities, including text, image, video, and audio.

#### **Datasets**
Access a vast library of **250,000+ datasets** to fuel your AI projects. Our platform enables users to share and collaborate on datasets seamlessly.

#### **Spaces**
Hugging Face provides **Spaces**, an interactive environment that allows developers to host and run their machine learning applications effortlessly.

#### **Enterprise Solutions**
Our enterprise platform offers advanced features to ensure scalability, security, and collaboration for organizations. Key benefits include:
- Enterprise-grade security and access controls
- Dedicated support and priority service
- Private datasets and analytics for usage tracking

Pricing for enterprise solutions starts at **$20/user/month**.

---

### **Our Community**
Join over **50,000 organizations**, including notable names like Google, Microsoft, and Amazon, who trust Hugging Face for their machine learning needs. Our vibrant community contributes to ongoing projects, engages in insightful discussions, and collaborates on innovative solutions through forums and blogs.

### **Company Culture**
At Hugging Face, we foster a culture of:
- **Inclusivity**: We believe every voice in our community matters. 
- **Innovation**: Trial and experimentation are key to our ethos, encouraging team members to push the boundaries of what’s possible in AI.
- **Collaboration**: Through our open-source projects, we promote collective growth and knowledge sharing.

---

### **Career Opportunities**
We are continuously looking for passionate individuals who share our vision of making machine learning accessible to all. Explore current job openings on our careers page. Join our team and help us build cutting-edge tools that shape the future of AI.

---

### **Get Involved**
- Explore AI applications and models on our **[website](https://huggingface.co)**.
- Join our community discussions on platforms like GitHub and Discord.
- Follow us on **Twitter**, **LinkedIn**, and **GitHub** for updates and insights.

---

**Join us at Hugging Face and become part of a community committed to building the future through collaboration and innovation in AI!**

In [22]:
stream_brochure("HUST", "https://hust.edu.vn/")

Found links: {'links': [{'type': 'home page', 'url': 'https://hust.edu.vn'}, {'type': 'about page', 'url': 'https://www.hust.edu.vn/vi/about/'}, {'type': 'about page', 'url': 'https://www.hust.edu.vn/vi/about/su-mang-tam-nhin-gia-tri-cot-loi-va-chinh-sach-chat-luong.html'}, {'type': 'careers page', 'url': 'https://tuyendung.hust.edu.vn/'}, {'type': 'admissions page', 'url': 'https://www.hust.edu.vn/vi/tuyen-sinh/'}, {'type': 'admissions page', 'url': 'https://www.hust.edu.vn/vi/tuyen-sinh/dai-hoc/thong-tin-tuyen-sinh-dai-hoc-nam-2025-651881.html'}, {'type': 'admissions page', 'url': 'https://www.hust.edu.vn/vi/tuyen-sinh/cao-hoc/'}, {'type': 'events page', 'url': 'https://hust.edu.vn/vi/events/'}, {'type': 'news page', 'url': 'https://hust.edu.vn/vi/news/'}, {'type': 'research page', 'url': 'https://www.hust.edu.vn/vi/nghien-cuu/'}]}


# HUST: HaNoi University of Science and Technology

**Overview**  
HUST, also known as HaNoi University of Science and Technology, is a prestigious educational institution located in Hanoi, Vietnam. Founded with the mission of advancing science and technology, HUST is dedicated to providing high-quality education, fostering research, and nurturing innovative talents.

---

## Our Mission and Vision

At HUST, our mission is to cultivate intellectual leaders equipped to advance technology and innovation in Vietnam and beyond. We aim to create a dynamic learning environment that emphasizes practical knowledge and fosters creativity, critical thinking, and collaboration.

### Core Values:
- **Innovation**: Encouraging creativity and progressive thinking in education and research.
- **Quality**: Upholding high standards in teaching, learning, and administration.
- **Community**: Promoting teamwork among staff, students, and partners.

---

## Our Culture

HUST prides itself on a collaborative and inclusive culture. Faculty, students, and staff work together as a community to achieve common goals. The university actively encourages student engagement through various extracurricular activities, research initiatives, labs, and clubs. The spirit of innovation is emphasized through events like the Young Creative Competition and Student Startup.

### Staff and Student Engagement
- **Commendations**: HUST has received numerous accolades for its innovation in education and research.
- **Support Systems**: Programs like career guidance and scholarships are provided to support students' needs.

---

## Education and Research

HUST offers a comprehensive range of undergraduate and postgraduate programs, including specialized degrees in engineering, technology, and the sciences. The university is dedicated to groundbreaking research and has established numerous partnerships with businesses and international institutions for collaborative projects.

### Areas of Focus:
- Engineering and Technology
- Natural Sciences
- Research and Innovation

---

## Careers at HUST

HUST is not only a place for students but also a welcoming environment for potential employees. We are always on the lookout for passionate and dedicated individuals to join our diverse team. Opportunities range from faculty positions to administrative roles, each fostering growth and innovation.

### Why Work at HUST?
- **Professional Development**: Continuous learning opportunities for growth.
- **Impactful Work**: Be part of a team that shapes the future of science and technology in Vietnam.
- **Community**: Join a community that values collaboration and support.

---

## Join Us

We welcome prospective students, partners, and recruits to explore what HUST has to offer. Whether you’re looking to advance your career, embark on a transformative educational journey, or contribute to cutting-edge research, there’s a place for you at HaNoi University of Science and Technology.

---

**Contact Us**  
**Website**: [HUST Official Site](#)  
**Location**: 1 Dai Co Viet Street, Bạch Mai Ward, Hanoi, Vietnam  
**Phone**: 024 3869 4242  

Explore the future of science and technology with HUST, where innovation meets education!