## Full Business Solution: Build a Brochure for a Company

*[Coding along with the Udemy online course [LLM Engineering: Master AI & Large Language Models](https://www.udemy.com/course/llm-engineering-master-ai-and-large-language-models/) by Ed Donner; GitHub repo can be found at [github.com/ed-donner/llm_engineering](https://github.com/ed-donner/llm_engineering)]*

A practical project: build a product that generates a business brochure for a company. The brochure is aimed at prospective clients, investors and potential recruits.

__We'll provide an LLM with information scraped from a website and take the following steps to achieve our goal:__

1. Have GPT-4o-mini figure out which links are relevant

2. Make the Brochure (Let the LLM do the Job for You)

3. Add the typewriter animation as a minor improvement

In [1]:
import requests
import json
from typing import List
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI
import pandas as pd

In [2]:
api_key = pd.read_csv("~/tmp/chat_gpt/agentic-design-1.txt", sep=" ", header=None)[0][0]
print("Don't be a fool and send your api key to github")

Don't be a fool and send your api key to github
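An alternative to reading the key from a file is an environment variable, which keeps the secret out of the notebook and the repository altogether. The following is a minimal sketch, assuming the key has been exported as `OPENAI_API_KEY` (the variable the `openai` client also reads by default); `load_api_key` is a hypothetical helper and not part of the course code.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of a confusing API error later
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```

With this in place, `OpenAI(api_key=load_api_key())` would replace the `pd.read_csv` call above.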


In [3]:
openai = OpenAI(api_key=api_key)
MODEL = 'gpt-4o-mini' # defining model for later usage

In [4]:
# once again, a class to represent a Webpage
class Website:
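    url: str
    title: str
    body: bytes  # raw response content as returned by requests
    text: str    # visible text extracted from the page body
    links: List[str]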

    def __init__(self, url):
        self.url = url
        response = requests.get(url)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [5]:
ed = Website("https://edwarddonner.com")
print(ed.get_contents())

Webpage Title:
Home - Edward Donner
Webpage Contents:
Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers 

In [6]:
# printing the links we stored
print(ed.links)

['https://edwarddonner.com/', 'https://edwarddonner.com/outsmart/', 'https://edwarddonner.com/about-me-and-about-nebula/', 'https://edwarddonner.com/posts/', 'https://edwarddonner.com/', 'https://news.ycombinator.com', 'https://nebula.io/?utm_source=ed&utm_medium=referral', 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html', 'https://patents.google.com/patent/US20210049536A1/', 'https://www.linkedin.com/in/eddonner/', 'https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/', 'https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/', 'https://edwarddonner.com/2024/08/06/outsmart/', 'https://edwarddonner.com/2024/08/06/outsmart/', 'https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/', 'https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/', 'https://edwarddonner.com/2024/02/07/fine-tune-llm-on-texts-a-simulation

## Step 01: Have GPT-4o-mini figure out which links are relevant

__We'll make a call to gpt-4o-mini, provide it with the links extracted from the webpage, and have it respond in a fixed format, in our case structured JSON.__

The LLM should decide for us which links are relevant and replace relative links such as "/about" with absolute links like "https://company.com/about".
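In this notebook the LLM does the conversion from relative to absolute links. For comparison, the same step can also be done deterministically before the links are sent to the model. The following is only a sketch using the standard library; `normalize_links` is a hypothetical helper, not part of the course code.

```python
from urllib.parse import urljoin, urlparse


def normalize_links(base_url, links):
    """Resolve relative links against base_url, dropping non-web links and duplicates."""
    seen = {}
    for link in links:
        absolute = urljoin(base_url, link)
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: and similar schemes
        seen.setdefault(absolute, None)  # a dict keeps the first-seen order
    return list(seen)


print(normalize_links("https://company.com", ["/about", "careers", "mailto:hi@company.com", "/about"]))
```

Doing this up front also shortens the prompt, since the scraped link list contains several duplicates.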

In this example we will use __"one shot prompting"__: we provide the LLM with an example (in the prompt) of how we expect it to respond.

__Sidenote:__ OpenAI now also lets you pass a JSON Schema (a specification) along with the message. This more advanced technique is called __"Structured Outputs"__: the model is required to respond according to the spec. We won't use it here (stay tuned for Week 8, the autonomous Agentic AI project, to learn more about this technique).
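To give an idea of what such a spec could look like for our links, here is a sketch of a JSON Schema with the same shape as the one-shot example used in this notebook. It assumes the `json_schema` response format as documented by OpenAI at the time of writing; the schema name `brochure_links` is made up for illustration, and whether a given model supports this should be checked against the current documentation.

```python
import json

# JSON Schema for {"links": [{"type": ..., "url": ...}, ...]}
LINKS_SCHEMA = {
    "type": "object",
    "properties": {
        "links": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "type": {"type": "string"},
                    "url": {"type": "string"},
                },
                "required": ["type", "url"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["links"],
    "additionalProperties": False,
}

# Would be passed as response_format=... in place of {"type": "json_object"}
structured_response_format = {
    "type": "json_schema",
    "json_schema": {"name": "brochure_links", "strict": True, "schema": LINKS_SCHEMA},
}

print(json.dumps(structured_response_format, indent=2))
```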

In [7]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
# specifying concrete example of the JSON format that we want
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [8]:
link_system_prompt

'You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.\nYou should respond in JSON as in this example:\n{\n    "links": [\n        {"type": "about page", "url": "https://full.url/goes/here/about"},\n        {"type": "careers page", "url": "https://another.full.url/careers"}\n    ]\n}\n'

In [9]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [10]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/
https://edwarddonner.com/2024/10/16/from-software-engineer-to-ai-data-scientist-resources/
https://edwarddonner.com/2024/08/06/outsmart/
https://e

In [11]:
def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        # now instructing the llm to respond in json
        # in addition to this the instruction to respond in json must be included in the prompt
        response_format={"type": "json_object"}
    )
    result = completion.choices[0].message.content # will come back as a string
    return json.loads(result) # converting the string into a dictionary

In [12]:
get_links("https://anthropic.com")

{'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://anthropic.com/team'},
  {'type': 'enterprise page', 'url': 'https://anthropic.com/enterprise'},
  {'type': 'research page', 'url': 'https://anthropic.com/research'},
  {'type': 'api page', 'url': 'https://anthropic.com/api'},
  {'type': 'pricing page', 'url': 'https://anthropic.com/pricing'},
  {'type': 'news page', 'url': 'https://anthropic.com/news'}]}
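`response_format={"type": "json_object"}` makes the reply parseable JSON, but it does not guarantee the keys we asked for in the prompt. A defensive check before using the result could look like the sketch below; `parse_links` is a hypothetical helper, not part of the course code.

```python
import json


def parse_links(raw: str) -> list:
    """Parse the model's reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    links = data.get("links", [])
    if not isinstance(links, list):
        return []
    return [
        link for link in links
        if isinstance(link, dict)
        and isinstance(link.get("type"), str)
        and isinstance(link.get("url"), str)
        and link["url"].startswith(("http://", "https://"))  # drop relative leftovers
    ]
```

In `get_links`, this would take the place of the bare `json.loads(result)` call, returning a plain list of link dictionaries.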

## Step 02: Make the Brochure (Let the LLM do your Job for You)

Assemble all the details into another prompt to GPT-4o-mini while coding along with [week 01, section 20](https://www.udemy.com/course/llm-engineering-master-ai-and-large-language-models/learn/lecture/45772439#search).

In [13]:
# Assemble all the details into another prompt to GPT-4o-mini
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        # finding the webpage for each link and add its contents to the result
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result
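`get_all_details` fetches every link the model returns, so a single unreachable page raises an exception and aborts the whole run. A more forgiving variant is sketched below. The `fetch` parameter and the `max_pages` cap are assumptions added for illustration; in this notebook `fetch` could be `lambda url: Website(url).get_contents()`.

```python
from typing import Callable, Dict, List


def collect_details(landing_url: str, links: List[Dict[str, str]],
                    fetch: Callable[[str], str], max_pages: int = 5) -> str:
    """Concatenate page contents, skipping linked pages that fail to load."""
    result = "Landing page:\n" + fetch(landing_url)
    for link in links[:max_pages]:
        try:
            contents = fetch(link["url"])
        except Exception as error:  # e.g. timeouts, HTTP errors, malformed URLs
            print(f"Skipping {link['url']}: {error}")
            continue
        result += f"\n\n{link['type']}\n{contents}"
    return result
```

Passing the fetch function in as an argument also makes the function easy to try out without any network access.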

In [14]:
# getting info for https://anthropic.com
print(get_all_details("https://anthropic.com"))

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}]}
Landing page:
Webpage Title:
Home \ Anthropic
Webpage Contents:
Claude
Overview
Team
Enterprise
API
Pricing
Research
Company
Careers
News
AI
research
and
products
that put safety at the frontier
New
Meet Claude 3.5 Sonnet
Claude 3.5 Sonnet, our most intelligent AI model, is now available.
Talk to Claude
API
Build with Claude
Start using Claude to drive efficiency and create new revenue streams.
Get started now
Our Work
Announcements
Claude 3.5 Sonnet
Jun 21, 2024
Alignment
·
Research
Constitutional AI: Harmlessness from AI Feedback
Dec 15, 2022
Announcements
Core Views on AI Safety: When, Why, What, and How
Mar 8, 2023
Work with Anthropic
Anthropic is an AI safety and research company based in San Francisco. Our interdisciplinary team has experience across ML, physics, policy, and product. Together, we generate research and create r

__Now everything is in place and we can come to producing the brochure itself.__

In [15]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."

In [16]:
system_prompt

'You are an assistant that analyzes the contents of several relevant pages from a company website and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. Include details of company culture, customers and careers/jobs if you have the information.'

In [17]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000] # Truncate if more than 20,000 characters
    return user_prompt
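Note that the limit above is measured in characters, not tokens, and that a plain slice can cut the prompt off in the middle of a word. A small sketch of a gentler truncation that ends on a complete line whenever possible is shown below; `truncate_at_line` is a hypothetical helper, not part of the course code.

```python
def truncate_at_line(text: str, limit: int = 20_000) -> str:
    """Cut text to at most `limit` characters, ending on a complete line if possible."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    last_newline = cut.rfind("\n")
    # Fall back to a hard cut when there is no usable newline before the limit
    return cut[:last_newline] if last_newline > 0 else cut


print(truncate_at_line("first line\nsecond line\nthird line", limit=15))
```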

In [18]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [19]:
create_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}, {'type': 'team page', 'url': 'https://anthropic.com/team'}, {'type': 'research page', 'url': 'https://anthropic.com/research'}, {'type': 'news page', 'url': 'https://anthropic.com/news'}]}


# Anthropic: Pioneering AI Safety and Innovation

Welcome to **Anthropic**, an AI safety and research company headquartered in San Francisco. Our mission is to build reliable, interpretable, and steerable AI systems that prioritize safety and benefit society. Join us as we explore the opportunities and challenges presented by advanced AI.

---

## Meet Claude

**Claude 3.5 Sonnet** is our most advanced AI model, designed to streamline workflows across various sectors and elevate productivity. Claude acts as a virtual teammate, helping teams generate insights and complete tasks efficiently.

- **Transformative Collaboration**: Claude revolutionizes how teams work by synthesizing shared knowledge and driving creativity.
- **Empowered Workflows**: From drafting documents to troubleshooting code, Claude reduces time spent on tasks, ensuring teams can focus on impactful projects.

---

## Our Purpose

At Anthropic, we recognize the extensive impact AI will have on the world, and we are committed to developing systems that can be trusted. Our interdisciplinary team conducts frontier research to enhance AI safety, tackling various aspects from human feedback to societal implications.

### Core Values

1. **Mission-Driven**: We exist to ensure AI helps humanity flourish.
2. **Trust-Centric**: An environment of honesty and intellectual openness fosters better decision-making.
3. **Collaborative Spirit**: We believe in leveraging the diverse skills of our entire team.
4. **Pragmatic Approach**: We favor simple, effective solutions for complex problems.

---

## Company Culture

At Anthropic, our culture celebrates:
- **Interdisciplinary Collaboration**: Our team consists of engineers, researchers, policy experts, and business leaders, fostering rich collaborations.
- **Employee Wellbeing**: We offer comprehensive health insurance, parental leave, and unlimited PTO, aiming to support the wellness of our team and their families.
- **Professional Growth**: Continuous learning is encouraged, with educational stipends and flexible career development options.

---

## Join Our Team

We are always looking for talented individuals who are eager to contribute to our mission. Our hiring process is designed to assess skill and potential without bias while welcoming various backgrounds and experiences.

### Why Work At Anthropic?
- **Competitive Compensation**: Attractive salary packages with significant equity options.
- **Flexible Work Arrangements**: Remote work flexibility with support for relocation.
- **Comprehensive Benefits**: Health, wellness stipends, daily office lunches, and a nurturing work environment.

Explore current job openings and apply to join our team dedicated to shaping the future of AI.

---

## Customer Commitment

We serve a diverse array of clients across sectors, empowering businesses to harness AI effectively. Our partnerships and product offerings are crafted to enhance capabilities, promote efficiency, and ensure safety.

---

## Follow Us

Stay updated with our latest research, product developments, and news:
- [Twitter](#)
- [LinkedIn](#)
- [YouTube](#)

For inquiries, contact us at **press@anthropic.com**.

---

### **Anthropic**: Where AI innovation meets safety and reliability. Explore the future with us!

## Step 03: A Minor Improvement

With a small adjustment, we can change this so that the results stream back from OpenAI with the familiar typewriter animation.

In [20]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    # some extra work here to make sure the markdown renders properly
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
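One caveat with the cleanup inside the loop: `replace("markdown", "")` removes the word "markdown" everywhere, including from regular sentences in the brochure. Below is a sketch of a narrower cleanup that only strips a surrounding code fence. `strip_markdown_fence` and `fake_chunk` are hypothetical helpers; the fake chunks merely imitate the `chunk.choices[0].delta.content` attribute path, so the example runs without an API call.

```python
from types import SimpleNamespace

FENCE = "`" * 3  # three backticks, spelled out this way to keep the example easy to embed


def strip_markdown_fence(text: str) -> str:
    """Remove a leading code fence (with or without 'markdown') and a trailing fence."""
    stripped = text.strip()
    for opener in (FENCE + "markdown", FENCE):
        if stripped.startswith(opener):
            stripped = stripped[len(opener):]
            break
    if stripped.endswith(FENCE):
        stripped = stripped[:-len(FENCE)]
    return stripped.strip()


def fake_chunk(content):
    """Stand-in for a streamed chunk; the final chunks of a stream may carry None."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=content))])


stream = [fake_chunk(FENCE + "markdown\n# Acme\n"), fake_chunk("We love markdown."),
          fake_chunk(None), fake_chunk("\n" + FENCE)]

response = ""
for chunk in stream:
    response += chunk.choices[0].delta.content or ""  # None becomes an empty string
    cleaned = strip_markdown_fence(response)  # this is what would go to update_display

print(cleaned)
```

In `stream_brochure`, the accumulated `response` would stay untouched and only the cleaned copy would be passed to `update_display`.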

In [21]:
stream_brochure("Anthropic", "https://anthropic.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://anthropic.com/company'}, {'type': 'careers page', 'url': 'https://anthropic.com/careers'}, {'type': 'team page', 'url': 'https://anthropic.com/team'}, {'type': 'research page', 'url': 'https://anthropic.com/research'}]}


# Anthropic Brochure

## Who We Are
**Anthropic** is an AI safety and research company headquartered in San Francisco. Our mission is to develop reliable, interpretable, and steerable AI systems that prioritize safety at every level.

## Our Purpose
The rapid advancement of AI technologies presents both incredible opportunities and significant challenges. At Anthropic, we are dedicated to ensuring that transformative AI technologies help individuals and society flourish. We conduct rigorous research to evaluate the impacts of AI, with a commitment to transparency as we share our findings with the global community.

## Our AI Model: Claude
Introducing **Claude 3.5 Sonnet**, our most advanced AI model yet. Claude is designed to function as a virtual teammate, enhancing productivity by collaborating on tasks ranging from writing to coding. We invite you to [talk to Claude](#) and see the capabilities of our AI firsthand.

## Company Culture
At Anthropic, our workplace is characterized by:
- **Mission-Driven Work**: Every decision is driven by our commitment to AI safety and advancement.
- **High Trust Environment**: We prioritize honesty, kindness, and emotional maturity. Our culture encourages open disagreement and candid discussions.
- **Collaboration**: We believe that success comes from teamwork. While we have specialized teams, we operate as one cohesive unit toward our shared goals.

## Our Team
Our interdisciplinary team comprises:
- Researchers
- Engineers
- Policy Experts
- Business Leaders

Members come from diverse backgrounds, including machine learning, physics, public policy, and more, contributing to a rich blend of ideas and perspectives.

## Customer Base
Anthropic collaborates with a variety of sectors including businesses, nonprofits, and civil society organizations. Our products empower clients by harnessing AI to drive efficiency and unlock new revenue streams.

## Careers at Anthropic
We are always looking for passionate individuals to join our mission. We offer a supportive work environment with benefits that include:
- Competitive salaries and equity packages
- Unlimited Paid Time Off (PTO)
- Comprehensive health, dental, and vision insurance
- Substantial parental leave
- Flexible wellness stipends

### Hiring Process
Our interview process is designed to minimize bias while thoroughly assessing candidates’ fit for our team. If you’re interested in working with us, please visit our [careers page](#) to explore open positions.

## Research and Development
At Anthropic, we engage in frontier research focused on improving AI safety. Our research explores areas such as interpretability, human feedback reinforcement learning, and policy impact analysis. We strive to communicate insights to policymakers and the public, influencing responsible AI development industry-wide.

## Get In Touch
Interested in learning more? Join us on our journey to make AI safer for all:
- [Website](#)  
- [LinkedIn](#)  
- [Twitter](#)  

---

**Anthropic**: Building AI that you can trust.  
**Together, let’s shape the future responsibly.**

In [22]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'company page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'discussion forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Company Brochure

## Welcome to Hugging Face

### The AI Community Building the Future

At Hugging Face, we are on a mission to democratize good machine learning, one commit at a time. Hugging Face is a collaborative platform for the machine learning community to share models, datasets, and applications, enabling innovation in AI and ML.

### What We Offer

- **Models**: Over 400,000 models available to explore, including state-of-the-art solutions from leading organizations like NVIDIA, Meta, and Google.
- **Datasets**: Access and share over 100,000 datasets tailored for various machine learning tasks across different modalities such as text, image, video, and audio.
- **Spaces**: Utilize over 150,000 applications to create, discover, and collaborate on ML projects.
- **Compute Solutions**: Affordable pricing starting at $0.60/hour for GPU compute, with enterprise-ready options for teams seeking dedicated support.

### Our Customers

More than **50,000 organizations** use Hugging Face, including renowned names such as:
- **Amazon Web Services**
- **Microsoft**
- **Google**
- **Intel**
- **Uber**

### Company Culture

At Hugging Face, we pride ourselves on fostering a collaborative and inclusive environment that drives creativity and innovation. With a team of 218 dedicated members, we continuously strive for open-source contributions and community engagement. Our culture emphasizes:

- **Collaboration**: We work together to innovate and enhance machine learning tools.
- **Inclusivity**: We welcome contributions from diverse perspectives in the community.
- **Mission-Driven**: Everyone is motivated by the shared goal of democratizing AI.

### Careers at Hugging Face

Join our rapidly growing team and become part of a vibrant community passionate about machine learning. We are always looking for talented individuals to contribute to our mission:

- **Current Openings**: Explore exciting career opportunities [here](https://huggingface.co/jobs).
- **Why Work With Us?**:
  - Access to cutting-edge ML technologies.
  - Opportunities for professional growth and learning.
  - Flexible work arrangements to foster well-being.

### Get in Touch!

Visit us at: [huggingface.co](https://huggingface.co)

Follow us on social media to stay updated on our latest developments:
- [GitHub](https://github.com/huggingface)
- [Twitter](https://twitter.com/huggingface)
- [LinkedIn](https://www.linkedin.com/company/huggingface)

Join us in building the future of AI and machine learning! 🌟

__Wrapping things up__

In this example we combined multiple LLM calls to achieve our goal: one call to pick the relevant links and a second one to write the brochure from the scraped pages. Chaining calls like this can be considered an example of an __Agentic AI design pattern__.
