## Full Business Solution: Build a Brochure for a Company

*[Coding along with the Udemy online course [LLM Engineering: Master AI & Large Language Models](https://www.udemy.com/course/llm-engineering-master-ai-and-large-language-models/) by Ed Donner; GitHub repo can be found at [github.com/ed-donner/llm_engineering](https://github.com/ed-donner/llm_engineering)]*

A practical project: build a product that generates a business brochure for a company, aimed at prospective clients, investors and potential recruits.

__We'll provide an LLM with information scraped from a website and take the following steps to achieve our goal:__

1. Have GPT-4o-mini figure out which links are relevant

2. Make the Brochure (Let the LLM do your Job for You)

3. Add the typewriter animation as a minor improvement

In [4]:
import requests
import json
from typing import List
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI
import pandas as pd

In [5]:
api_key = pd.read_csv("~/tmp/chat_gpt/agentic-design-1.txt", sep=" ", header=None)[0][0]
print("Don't be a fool and send your API key to GitHub")

Don't be a fool and send your API key to GitHub
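
One way to keep the key out of the repository altogether is to read it from an environment variable instead of a file. The following is a minimal sketch and not part of the course notebook: the helper name `load_api_key` is hypothetical, and `OPENAI_API_KEY` is assumed as the variable name because the OpenAI Python client looks for it by default when no `api_key` argument is passed.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    Hypothetical helper: raises instead of silently returning an empty key.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key
```

With the variable exported in the shell, `api_key = load_api_key()` could replace the `pd.read_csv(...)` line above.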


In [9]:
openai = OpenAI(api_key=api_key)
MODEL = 'gpt-4o-mini' # defining model for later usage

In [7]:
# once again, a class to represent a webpage
class Website:
    url: str
    title: str
    body: bytes  # raw response content
    text: str    # visible text extracted from the body
    links: List[str]

    def __init__(self, url):
        self.url = url
        response = requests.get(url)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [8]:
ed = Website("https://edwarddonner.com")
print(ed.get_contents())

Webpage Title:
Home - Edward Donner
Webpage Contents:
Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of pr

In [11]:
# printing the links we stored
print(ed.links)

['https://edwarddonner.com/', 'https://edwarddonner.com/outsmart/', 'https://edwarddonner.com/about-me-and-about-nebula/', 'https://edwarddonner.com/posts/', 'https://edwarddonner.com/', 'https://news.ycombinator.com', 'https://nebula.io/?utm_source=ed&utm_medium=referral', 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html', 'https://patents.google.com/patent/US20210049536A1/', 'https://www.linkedin.com/in/eddonner/', 'https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/', 'https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/', 'https://edwarddonner.com/2024/08/06/outsmart/', 'https://edwarddonner.com/2024/08/06/outsmart/', 'https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/', 'https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/', 'https://edwarddonner.com/2024/02/07/fine-tune-llm-on-texts-a-simulation
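
The printed list contains several links twice. Since every link ends up in the prompt, it may be worth removing duplicates first. A minimal sketch, assuming order should be preserved; `unique_links` is a hypothetical helper that is not part of the course code:

```python
from typing import List


def unique_links(links: List[str]) -> List[str]:
    """Remove duplicate links while keeping the first-seen order."""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(links))


print(unique_links([
    "https://edwarddonner.com/",
    "https://edwarddonner.com/posts/",
    "https://edwarddonner.com/",
]))
# ['https://edwarddonner.com/', 'https://edwarddonner.com/posts/']
```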

## Step 01: Have GPT-4o-mini figure out which links are relevant

__We'll call gpt-4o-mini, provide it with the links extracted from the webpage, and have it respond in a fixed format, in our case structured JSON.__

The LLM should decide for us which links are relevant and replace relative links such as "/about" with absolute links like "https://company.com/about".
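
The notebook leaves the relative-to-absolute conversion to the LLM. As a point of comparison, the same conversion can be done deterministically with the standard library before the links are sent. This is a sketch of that alternative, not the course's approach; `to_absolute` is a hypothetical helper name.

```python
from typing import List
from urllib.parse import urljoin


def to_absolute(base_url: str, links: List[str]) -> List[str]:
    """Resolve each link against base_url; absolute links pass through unchanged."""
    return [urljoin(base_url, link) for link in links]


print(to_absolute("https://company.com", ["/about", "https://news.ycombinator.com"]))
# ['https://company.com/about', 'https://news.ycombinator.com']
```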

In this example we will use __"one-shot prompting"__: the prompt includes one example of how we expect the LLM to respond.

__Sidenote:__ OpenAI now also lets you pass a JSON Schema (a specification) along with the request. This more advanced technique is called __"Structured Outputs"__: the model is required to respond according to the spec. We won't use it here (stay tuned for Week 8, the autonomous Agentic AI project, to learn more about this technique).
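
To give a rough idea of what such a spec could look like for the link format used here, the sketch below builds a `response_format` value in the shape the OpenAI documentation describes for Structured Outputs (`"type": "json_schema"` with a `name`, `strict` flag and `schema`). It only constructs the dictionary and does not call the API. The schema name `brochure_links` and the variable names are assumptions made for illustration; check the current API reference before relying on the details.

```python
# Sketch only: builds the `response_format` argument, it does not call the API.
LINKS_SCHEMA = {
    "type": "object",
    "properties": {
        "links": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "type": {"type": "string"},
                    "url": {"type": "string"},
                },
                # strict mode expects every property to be listed as required
                "required": ["type", "url"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["links"],
    "additionalProperties": False,
}

structured_response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "brochure_links",  # hypothetical schema name
        "strict": True,
        "schema": LINKS_SCHEMA,
    },
}
```

Under these assumptions, `structured_response_format` would be passed as `response_format=...` in place of `{"type": "json_object"}`.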

In [12]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
# specifying concrete example of the JSON format that we want
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [13]:
link_system_prompt

'You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.\nYou should respond in JSON as in this example:\n{\n    "links": [\n        {"type": "about page", "url": "https://full.url/goes/here/about"},\n        {"type": "careers page", "url": "https://another.full.url/careers"}\n    ]\n}\n'

In [14]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [15]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/
https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/
https://edwarddonner.com/2024/08/06/outsmart/
https://e

In [20]:
def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        # instruct the LLM to respond in JSON;
        # in addition, the prompt itself must also ask for JSON
        response_format={"type": "json_object"}
    )
    result = completion.choices[0].message.content # will come back as a string
    return json.loads(result) # converting the string into a dictionary
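
`get_links` trusts that the reply is valid JSON with the expected keys. JSON mode makes that likely but does not guarantee the structure, so a defensive parsing step can help before the URLs are fetched. The following is a minimal sketch under that assumption; `parse_links` is a hypothetical helper and not part of the course code.

```python
import json
from typing import Dict, List


def parse_links(raw: str) -> List[Dict[str, str]]:
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    links = data.get("links", [])
    if not isinstance(links, list):
        return []
    cleaned = []
    for item in links:
        if not isinstance(item, dict):
            continue
        url = item.get("url")
        # keep only absolute http(s) URLs that can actually be fetched
        if isinstance(url, str) and url.startswith(("http://", "https://")):
            cleaned.append({"type": str(item.get("type", "page")), "url": url})
    return cleaned
```

Calling `parse_links(result)` instead of `json.loads(result)` would return a plain list of entries, so the later loop would iterate over it directly rather than over `links["links"]`.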

In [21]:
get_links("https://anthropic.com")

{'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://anthropic.com/team'},
  {'type': 'research page', 'url': 'https://anthropic.com/research'},
  {'type': 'news page', 'url': 'https://anthropic.com/news'}]}

## Step 02: Make the Brochure (Let the LLM do your Job for You)

Assemble all the details into another prompt to GPT-4o-mini while coding along with [week 01, section 20](https://www.udemy.com/course/llm-engineering-master-ai-and-large-language-models/learn/lecture/45772439#search).

In [23]:
# Assemble all the details into another prompt to GPT-4o-mini
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        # fetch the page behind each link and append its contents to the result
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [24]:
# getting info for https://anthropic.com
print(get_all_details("https://anthropic.com"))

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}, {'type': 'team page', 'url': 'https://anthropic.com/team'}, {'type': 'research page', 'url': 'https://anthropic.com/research'}, {'type': 'enterprise page', 'url': 'https://anthropic.com/enterprise'}]}
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Claude
Overview
Team
Enterprise
API
Pricing
Research
Company
Careers
News
AI
research
and
products
that put safety at the frontier
New
Meet Claude 3.5 Sonnet
Claude 3.5 Sonnet, our most intelligent AI model, is now available.
Talk to Claude
API
Build with Claude
Start using Claude to drive efficiency and create new revenue streams.
Get started now
Our Work
Announcements
Claude 3.5 Sonnet
Jun 21, 2024
Alignment
·
Research
Constitutional AI: Harmlessness from AI Feedback
Dec 15, 2022
Announcements
Core Views on AI Safety: When, Why, What, and How
Mar 8, 2023
Work with Anthropi

__Now everything is in place and we can come to producing the brochure itself.__

In [25]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."

In [26]:
system_prompt

'You are an assistant that analyzes the contents of several relevant pages from a company website and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. Include details of company culture, customers and careers/jobs if you have the information.'

In [28]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000] # Truncate if more than 20,000 characters
    return user_prompt

In [29]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [30]:
create_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}, {'type': 'team page', 'url': 'https://anthropic.com/team'}, {'type': 'research page', 'url': 'https://anthropic.com/research'}, {'type': 'news page', 'url': 'https://anthropic.com/news'}]}


# Anthropic Company Brochure

## Welcome to Anthropic

Anthropic is an innovative AI safety and research company based in San Francisco. Our mission is to build reliable, interpretable, and steerable AI systems while prioritizing safety and ethical considerations in the development of artificial intelligence. We believe that AI has the potential to reshape the world, and we are dedicated to ensuring it does so in a beneficial manner.

---

## Our Core Offerings

### Meet Claude
Our cutting-edge AI model, **Claude 3.5 Sonnet**, is designed to enhance productivity and streamline workflows across various industries. Claude serves as both a virtual teammate and an expert assistant, enabling teams to achieve their objectives more efficiently by leveraging shared knowledge.

### API and Enterprise Solutions
With our robust **API**, developers can integrate Claude into their own applications, driving efficiency and creating new revenue opportunities. Whether you're in marketing, engineering, or customer support, Claude's capabilities transform the way teams work.

---

## A Culture of Collaboration and Trust

At Anthropic, we pride ourselves on our unique company culture characterized by:

- **Mission-Driven Work**: Our focus is on creating transformative AI that benefits society. Every team member is committed to our mission.
  
- **High Trust Environment**: We foster a culture where honesty, empathy, and intellectual openness are prioritized. Team members are encouraged to disagree kindly and contribute their unique perspectives.

- **Collaborative Spirit**: We view ourselves as one big team, breaking silos to work together towards common goals. Everyone has a role in defining our direction and the success of our projects.

- **Empirical Approach**: Instead of chasing the cleverest solutions, we prioritize sensible, practical approaches that have proven effective.

---

## Research and Innovation

Anthropic is committed to conducting frontier AI research. Our interdisciplinary teams explore safety techniques and assess the societal impacts of AI technology. We regularly share our findings with the global community to promote safe and responsible AI development.

**Key Research Areas:**
- AI interpretability
- Reinforcement learning from human feedback
- Policy and societal impacts analysis

---

## Join Our Team

We are on the lookout for passionate individuals who want to contribute to the future of AI safety. At Anthropic, you will find:

### Benefits and Perks
- Comprehensive health, dental, and vision insurance
- Unlimited PTO policy
- Competitive salary and equity packages
- Flexible wellness stipends and commuter coverage
- Paid parental leave up to 22 weeks

### Career Growth
Our hiring process is designed to minimize bias and showcase your actual capabilities. We value diverse experiences, and you don't need a PhD or prior ML experience to apply! **Explore our open roles and apply to join us in making AI safe.**

---

## Company Governance

As a Public Benefit Corporation, our primary aim is the responsible development of advanced AI for the long-term benefit of humanity. Our Board of Directors and Long-Term Benefit Trust oversee our mission to ensure ethical practices in every facet of our business.

---

## Connect with Us

For more information, partnerships, or inquiries, feel free to reach out through our website or follow us on:
- [Twitter](#)
- [LinkedIn](#)
- [YouTube](#)

Let’s work together to build a safer AI future!

## Step 03: A Minor Improvement

With a small adjustment, we can change this so that the results stream back from OpenAI with the familiar typewriter animation.

In [32]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    # a bit of extra work here to ensure that the streamed markdown renders properly
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
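
Note that `replace("markdown", "")` removes the word "markdown" wherever it occurs, including in the brochure text itself. The sketch below shows one possible alternative that strips only a wrapping code fence. It runs without an API call by simulating the stream: `fake_stream`, `strip_wrapping_fence` and `accumulate` are hypothetical names, and the simulated chunks only mimic the `chunk.choices[0].delta.content` shape used above.

```python
from types import SimpleNamespace

FENCE = "`" * 3  # three backticks, built this way to keep the example fence-safe


def fake_stream(pieces):
    """Yield objects shaped like streaming chunks (chunk.choices[0].delta.content)."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def strip_wrapping_fence(text: str) -> str:
    """Drop an opening fence line (with or without 'markdown') and a closing fence."""
    stripped = text.strip()
    if stripped.startswith(FENCE):
        # discard the whole first line, e.g. the fence followed by "markdown"
        stripped = stripped.split("\n", 1)[1] if "\n" in stripped else ""
    if stripped.endswith(FENCE):
        stripped = stripped[: -len(FENCE)]
    return stripped.strip()


def accumulate(stream) -> str:
    """Concatenate the text deltas of a stream, treating None as empty."""
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
    return strip_wrapping_fence(response)


demo = fake_stream([FENCE + "markdown\n", "# Title\n", "Respond in markdown.", None, "\n" + FENCE])
print(accumulate(demo))
```

In `stream_brochure`, the same idea would mean keeping the raw accumulated text in `response` and passing `strip_wrapping_fence(response)` to `update_display` on each chunk.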

In [34]:
stream_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}, {'type': 'team page', 'url': 'https://anthropic.com/team'}, {'type': 'research page', 'url': 'https://anthropic.com/research'}]}


# Anthropic Company Brochure

---

## About Us
Anthropic is an AI safety and research company, headquartered in San Francisco, committed to developing reliable, interpretable, and steerable AI systems. Through our interdisciplinary team of experts in machine learning, physics, policy, and business, we strive to create beneficial AI that people can trust.

### Mission
Our mission is to ensure transformative AI technologies help both individuals and society flourish. We recognize the vast potential and risks laid out by rapidly advancing AI systems, and we actively pursue the safe and responsible development of these technologies.

---

## Our Product: Claude
### Meet Claude 3.5 Sonnet
Claude 3.5 Sonnet, our flagship AI model, offers enterprises unparalleled efficiency and the ability to innovate revenue streams. Claude functions as a collaborative teammate, empowering teams to create ideas, draft documents, write code, and tackle multiple tasks effectively.

**Key Features:**
- Generate and debug code.
- Create marketing campaigns effortlessly.
- Draft personalized communications in seconds.
- Enhance productivity across various functions from engineering to marketing.

---

## Company Culture
At Anthropic, we foster an environment built on trust, collaboration, and pragmatism. Our core values include:

- **Here for the mission:** We prioritize our commitment to building safer AI systems.
- **Unusually high trust:** We assume good faith and value honest communication.
- **One big team:** Collaboration is paramount, and every team member contributes to our collective goals.
- **Do the simple thing that works:** We celebrate practical solutions over overly complex strategies.

### Diversity and Inclusion
Our interdisciplinary team comprises diverse backgrounds, reflecting our commitment to inclusivity and broad perspectives in our work.

---

## Customers
We serve a variety of clients including businesses, nonprofits, and civil society organizations. Through our product offerings and partnerships, we are continuously translating our research into valuable tools that enhance AI safety and efficacy across different domains.

---

## Careers at Anthropic
Join a team that is making strides in AI safety! We are always on the lookout for passionate individuals who share our mission to responsibly develop AI. 

### Open Roles
- Research Scientists
- Software Engineers
- Policy Experts
- Business Operations

### What We Offer
- **Health & Wellness:** Comprehensive health, dental, and vision plans, inclusive family benefits, unlimited PTO.
- **Compensation:** Competitive salary with equity options and a 401(k) matching plan.
- **Additional Benefits:** Flexible wellness stipend, relocation support, daily office lunches, and more!

---

## Join Us!
Are you ready to help shape the future of AI? Check our current job openings and become part of our mission-driven team at Anthropic.

### [Explore Open Roles](https://www.anthropic.com/careers)

---

### Stay Connected
Follow us for the latest updates and research insights:
- [Twitter](https://www.twitter.com/Anthropic)
- [LinkedIn](https://www.linkedin.com/company/Anthropic)
- [YouTube](https://www.youtube.com/c/Anthropic)

---

**Anthropic PBC**  
[www.anthropic.com](https://www.anthropic.com)  
© 2024 Anthropic PBC

In [35]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Company Brochure

---

## About Us
**Hugging Face** is the pioneering AI community dedicated to building the future of machine learning. Our mission is to democratize machine learning, empowering individuals and organizations to create, discover, and collaborate on models, datasets, and applications seamlessly. 

With a platform hosting over **400k models** and **100k datasets**, we are at the forefront of technological innovation, catering to a dynamic and ever-growing machine learning community.

---

## Our Community & Culture
At Hugging Face, we foster an inclusive and vibrant community where collaboration is key. Our team comprises **218 passionate members**, united in a common goal: to advance AI technology for everyone. Our culture emphasizes transparency, open-source collaboration, and collective growth. 

Join us in creating a supportive environment where ideas flourish, and contributions are celebrated! 

- **Team Spirit**: Daily collaborations and community discussions foster innovation.
- **Diversity & Inclusion**: We welcome diverse perspectives to strengthen our solutions.
- **Continuous Learning**: Opportunities for professional development through shared knowledge.

---

## Customer Base
With more than **50,000 organizations** utilizing our platform, including prominent names like **Google**, **Amazon Web Services**, **Microsoft**, and **NVIDIA**, Hugging Face provides AI solutions that span various industries such as healthcare, finance, and technology.

Our customers range from small startups to large enterprises, all benefitting from our cutting-edge technologies and user-friendly platforms.

---

## Career Opportunities
**Join Our Team!**
We are always looking for innovative and driven individuals ready to contribute to our mission. Discover numerous opportunities across technical and non-technical roles, including data scientists, software engineers, and product managers.

- **Current Openings**: Explore our careers page for the latest job listings.
- **Benefits**: We offer competitive salaries, flexible working hours, and a supportive work environment to help you achieve your career goals.

---

## Services & Solutions
Our platform includes:
- **Models**: An extensive repository to create and utilize machine learning models.
- **Datasets**: Access and share a vast range of datasets for various machine learning tasks.
- **Spaces**: Build and host applications seamlessly through our dedicated spaces.

### Enterprise Solutions
For organizations seeking advanced capabilities, our **Enterprise Hub** provides:
- Enterprise-grade security and access controls.
- Dedicated support options for maximizing platform usage.
- Customizable compute options to enhance scalability.

---

## Connect with Us
Ready to explore more? Visit our website or follow our discussions in the community forums and social media.

- **Website**: [Hugging Face](https://huggingface.co)
- **Twitter**: [@huggingface](https://twitter.com/huggingface)
- **LinkedIn**: [Hugging Face LinkedIn](https://www.linkedin.com/company/huggingface)

Join us on this exciting journey as we build the future of AI together!

--- 

**Hugging Face** - The AI community building the future.

__Wrapping things up__

In this example we chained multiple LLM calls to reach our goal: one call selects the relevant links, and a second call writes the brochure from the scraped pages. This can be considered an example of an __Agentic AI design pattern__.
