# A Full Business Solution

## Now taking our *Scraper Summarizer* project to the next level

### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

A company name and its primary website will be provided.

See the end of this notebook for examples of real-world business applications.

In [1]:
# imports

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [3]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [None]:
bpcs = Website("https://bpcs.com")
bpcs.links

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which the prompt includes one example of how the model should respond.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. 
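The notebook leaves the rewriting of relative links to the model. As a cross-check, the same transformation can be done deterministically with the standard library. The sketch below is not part of the original notebook: `absolutize_links` is a hypothetical helper name, and the list of skipped prefixes is an assumption about which links are not useful for a brochure.

```python
from urllib.parse import urljoin

# Hypothetical helper (not in the original notebook): resolve relative links
# against the page URL and drop links that are not web pages.
SKIPPED_PREFIXES = ("mailto:", "tel:", "javascript:", "#")  # assumed list

def absolutize_links(base_url, links):
    resolved = []
    for link in links:
        if link.startswith(SKIPPED_PREFIXES):
            continue  # email, phone, script and in-page anchors are ignored
        resolved.append(urljoin(base_url, link))  # absolute URLs pass through unchanged
    return resolved

print(absolutize_links(
    "https://company.com",
    ["/about", "https://other.com/jobs", "mailto:hello@company.com"],
))
```

Deciding which links are *relevant* still needs the model; this only covers the mechanical part of the task.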

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page": "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page": "url": "https://another.full.url/careers"}
    ]
}



In [9]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [None]:
print(get_links_user_prompt(bpcs))

In [11]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)
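
`get_links` trusts that the reply is valid JSON with the expected shape. A minimal sketch of a more defensive parse is shown below, assuming we would rather get an empty list than an exception. `parse_links` is a hypothetical name, not part of the notebook, and the rule that URLs must start with `http://` or `https://` is an assumption.

```python
import json

# Hypothetical helper (not in the original notebook): parse the model's reply
# and keep only entries shaped like {"type": str, "url": "http(s)://..."}.
def parse_links(raw):
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}  # reply was not valid JSON
    if not isinstance(data, dict) or not isinstance(data.get("links"), list):
        return {"links": []}  # reply was JSON but not the expected object
    valid = [
        item for item in data["links"]
        if isinstance(item, dict)
        and isinstance(item.get("type"), str)
        and isinstance(item.get("url"), str)
        and item["url"].startswith(("http://", "https://"))
    ]
    return {"links": valid}

reply = '{"links": [{"type": "about page", "url": "https://full.url/about"}, {"type": "broken"}]}'
print(parse_links(reply))
```

In `get_links`, this could stand in for the final `json.loads(result)` call.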

In [None]:
bpcs = Website("https://bpcs.com")
bpcs.links

In [14]:
get_links("https://Anthropic.co")

{'links': [{'type': 'company page', 'url': 'https://anthropiclabs.com'},
  {'type': 'company page', 'url': 'https://anthropiclabs.com/'}]}

In [15]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'blog', 'url': 'https://huggingface.co/blog'},
  {'type': 'company page',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}

In [None]:
get_links("https://bpcs.com")

## Second Step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [17]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result
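
The model sometimes returns the same page twice, for example once with and once without a trailing slash, and `get_all_details` would then fetch it twice. The sketch below removes such duplicates before the loop. `dedupe_links` is a hypothetical helper, and treating URLs that differ only by a trailing slash as identical is a simplifying assumption.

```python
# Hypothetical helper (not in the original notebook): drop link entries whose
# URL was already seen, ignoring a trailing slash (a simplifying assumption).
def dedupe_links(links):
    seen = set()
    unique = []
    for link in links:
        key = link["url"].rstrip("/")
        if key in seen:
            continue  # same page already queued for fetching
        seen.add(key)
        unique.append(link)
    return unique

found = [
    {"type": "company page", "url": "https://example.com"},
    {"type": "company page", "url": "https://example.com/"},
    {"type": "careers page", "url": "https://example.com/careers"},
]
print(dedupe_links(found))
```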

In [None]:
print(get_all_details("https://bpcs.com"))

In [None]:
print(get_all_details("https://huggingface.co"))

In [19]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# Or uncomment the lines below for a more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."


In [20]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
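
The slice above can cut the prompt in the middle of a word or line. One possible refinement, shown as a sketch only, is to cut at the last line break before the limit. `truncate_at_line` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical helper (not in the original notebook): truncate text to at most
# `limit` characters, preferring to cut at the last newline before the limit.
def truncate_at_line(text, limit=5_000):
    if len(text) <= limit:
        return text
    cut = text.rfind("\n", 0, limit)  # last newline inside the allowed range
    return text[:cut] if cut > 0 else text[:limit]  # fall back to a hard cut

print(repr(truncate_at_line("Landing page:\nAbout us\nCareers", limit=25)))
```

Because the scraped text is joined with newlines, this keeps whole lines and avoids a dangling fragment at the end of the prompt.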

In [21]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'status page', 'url': 'https://status.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


'You are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHugging Face – The AI community building the future.\nWebpage Contents:\nHugging Face\nModels\nDatasets\nSpaces\nPosts\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\nQwen/Qwen2.5-Omni-7B\nUpdated\n1 day ago\n•\n53k\n•\n998\ndeepseek-ai/DeepSeek-V3-0324\nUpdated\n6 days ago\n•\n86.6k\n•\n2.18k\nmanycore-research/SpatialLM-Llama-1B\nUpdated\n12 days ago\n•\n12.6k\n•\n844\nds4sd/SmolDocling-256M-preview\nUpdated\n9 days ago\n•\n57.8k\n•\n1.09k\nByteDance/InfiniteYou\nUpdated\n7 days ago\n•\n514\nBrowse 1M+ models\nSpaces\nRunning\n1.18k\n1.18k\nDeepSite\n🐳\nG

In [22]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [23]:
create_brochure("Blueprint Technologies", "https://bpcs.com")

Found links: {'links': [{'type': 'about page', 'url': 'https://bpcs.com/our-approach'}, {'type': 'careers page', 'url': 'https://bpcs.com/careers'}, {'type': 'services page', 'url': 'https://bpcs.com/application-development'}, {'type': 'services page', 'url': 'https://bpcs.com/cloud-and-infrastructure'}, {'type': 'services page', 'url': 'https://bpcs.com/data-platform-modernization'}, {'type': 'services page', 'url': 'https://bpcs.com/data-governance'}, {'type': 'services page', 'url': 'https://bpcs.com/data-management'}, {'type': 'services page', 'url': 'https://bpcs.com/data-science'}, {'type': 'services page', 'url': 'https://bpcs.com/intelligent-sop'}, {'type': 'services page', 'url': 'https://bpcs.com/video-analytics'}, {'type': 'services page', 'url': 'https://bpcs.com/lakehouse-optimization'}, {'type': 'services page', 'url': 'https://bpcs.com/data-migration-experts'}]}


# Blueprint Technologies Brochure

## Delivering Intelligence that Matters

### Who We Are
Blueprint Technologies is at the forefront of technological innovation, providing a range of services powered by Artificial Intelligence, data analytics, and cloud infrastructure. Our mission is to deliver intelligence that transforms businesses, drives efficiency, and enables innovation across various industries.

### Our Solutions
We specialize in a comprehensive array of services tailored to meet the unique challenges of diverse sectors:

- **Artificial Intelligence:** Implementing cutting-edge AI solutions, including generative AI and intelligent Standard Operating Procedures (SOPs).
- **Engineering & Application Development:** Developing robust applications and modernizing cloud infrastructure.
- **Data & Analytics:** From data platform modernization to governance and migration, we optimize data management to drive actionable insights.
- **Industry-Specific Solutions:** 
  - **Manufacturing**: Boosting productivity and operational efficiency.
  - **Retail**: Creating seamless shopping journeys.
  - **Financial Services**: Enhancing security and customer personalization.
  - **Health & Life Sciences**: Improving healthcare outcomes and innovation.
  - **CMEG**: Transforming digital experiences with AI.
  - **Federal Government**: Advancing IT modernization and cybersecurity.

### Our Approach
At Blueprint Technologies, we believe that providing the right information to the right people at the right time can change lives and innovate industries. Our holistic approach uncovers strategic opportunities to maximize value for organizations swiftly and efficiently.

### Success Stories
We pride ourselves on delivering transformative solutions:
- **Case Study**: A national convenience store chain's data strategy was modernized, enabling quicker insights and agile reporting.
- **Client Snapshot**: Implemented a medical service pricing transparency platform that resulted in $140k annual cloud savings.

### Company Culture
Our culture is built on collaboration, innovation, and a commitment to delivering excellence. We encourage our team to tackle complex challenges and foster an environment where creativity thrives. The spirit of teamwork drives our success, and we celebrate diversity in thoughts and perspectives.

### Join Our Team
At Blueprint Technologies, we are always on the lookout for passionate individuals ready to make an impact. Explore exciting career opportunities to solve unique challenges and help shape the future of technology. 

**Apply Today! [Join Us](#)**

### Connect With Us
Stay updated on industry insights, webinars, and company news:
- **Insights & Thought Leadership**: Visit our [blog](#) for the latest in tech industry trends.
- **Upcoming Events**: Don't miss our informative [webinars](#) designed to enhance your knowledge.

---

For more information about our services and how we can help your business thrive, please visit [Blueprint Technologies](#). 

Together, let's deliver intelligence that matters!

In [19]:
create_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'discussion forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face: Connecting the AI Community

### Welcome to Hugging Face
Hugging Face is a leading platform in the field of artificial intelligence and machine learning, dedicated to building collaborative tools for the community. Our mission is simple: to empower and support the creators, researchers, and innovators shaping the future of AI.

---

### What We Offer
- **Models:** Access a vast library of over 1 million models, enabling efficient experimentation and deployment in various applications.
- **Datasets:** Browse through more than 250,000 datasets tailored for machine learning tasks across multiple modalities—text, image, video, audio, and 3D.
- **Spaces:** Create and share applications in an interactive environment, fostering collaboration among developers and data scientists.

---

### Our Customers
Join a diverse group of over 50,000 organizations utilizing Hugging Face, including industry giants like:
- **Meta**
- **Amazon Web Services**
- **Google**
- **Microsoft**
- **Intel**
- **Grammarly**

---

### Company Culture
At Hugging Face, we pride ourselves on fostering an inclusive and dynamic work environment. Our culture emphasizes:
- **Collaboration:** We believe in collective intelligence. Our platform encourages teamwork and knowledge sharing.
- **Innovation:** We are at the forefront of AI research and applications, driving technological advancements and exploration.
- **Community Driven:** Our open-source nature allows users to contribute to and benefit from shared knowledge, helping all members grow and succeed.

---

### Careers at Hugging Face
We are continuously looking for passionate individuals to join our team. Whether you're a data scientist, ML engineer, or product manager, there are various opportunities to grow within our organization. Explore our [careers page](https://huggingface.co/jobs) to learn more about joining our community.

---

### Join Us
Be part of the AI revolution. Whether you’re a researcher, developer, or an enthusiast, Hugging Face is here to support your journey in artificial intelligence. Explore our [platform](https://huggingface.co) today and start collaborating with a vibrant community!

--- 

For more information, follow us on [GitHub](https://github.com), [Twitter](https://twitter.com), and [LinkedIn](https://linkedin.com) for the latest updates and developments.

## Finally - a minor improvement

**With a small adjustment, we can change this so that the results *stream* back from *OpenAI*, with the familiar typewriter animation**

In [24]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
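
The accumulation pattern in `stream_brochure` can be tried without calling the API. The sketch below uses a stand-in generator that mimics only the attributes the loop reads (`choices[0].delta.content`). `fake_stream` and `collect_stream` are hypothetical names, and the guard against chunks with no choices is a defensive assumption rather than something the notebook does.

```python
from types import SimpleNamespace

# Stand-in for the API stream (hypothetical): yields objects exposing
# chunk.choices[0].delta.content, which is all the loop needs.
def fake_stream(pieces):
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# Hypothetical helper: accumulate streamed text, treating a None delta as "".
def collect_stream(stream):
    response = ""
    for chunk in stream:
        if not chunk.choices:
            continue  # defensive: skip chunks that carry no choices
        response += chunk.choices[0].delta.content or ""
    return response

print(collect_stream(fake_stream(["# Bro", "chure", None])))
```

The `or ""` matters because a chunk's `content` can be `None`, and concatenating `None` to a string raises a `TypeError`.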

In [21]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'documentation page', 'url': 'https://huggingface.co/docs'}]}


# Hugging Face Brochure

## Welcome to Hugging Face

**The AI community building the future.**  
Hugging Face serves as a collaboration platform for the machine learning community, assisting in the creation, discovery, and sharing of models, datasets, and applications. Our tools are at the forefront of AI innovation, welcoming diverse users from researchers to enterprise organizations.

---

### Our Offerings

#### **Models**  
Explore over **1M+ models** covering various modalities including text, images, video, audio, and even 3D.

#### **Datasets**  
Access **250k+ datasets** designed for multiple ML tasks, facilitating better training and application of AI.

#### **Spaces**  
Create and run interactive applications through our **Spaces** platform, making it easier to generate and experiment with ML ideas in real time.

#### **Enterprise Solutions**  
Over **50,000 organizations** leverage Hugging Face for its enterprise-grade security, dedicated support, and optimized compute resources, which you can access starting at **$20/user/month**.

#### **Open Source Tools**  
Join the movement to build tooling for machine learning with our open-source libraries:
- **Transformers**: Cutting-edge models for PyTorch, TensorFlow, and JAX.
- **Diffusers**: State-of-the-art models for image and audio generation.

---

### Company Culture

At Hugging Face, we foster an inclusive and innovative culture. Our team thrives on collaboration, community engagement, and a shared passion for advancing AI technologies. We believe in contributing to open-source projects and prioritize transparent processes to elevate collective knowledge.

---

### Our Customers

Hugging Face is trusted by industry leaders such as:
- **Microsoft**
- **Google**
- **Amazon Web Services**
- **Meta**

These partnerships highlight our commitment to providing advanced tools that accommodate the diverse needs of those at the forefront of AI research and application.

---

### Join Our Team

Hugging Face is on the lookout for **talented individuals** who share our passion for AI and open-source technology. Explore open positions in software engineering, data science, and more on our **Jobs** page. We encourage diversity and seek innovators who strive to push the boundaries of what’s possible in machine learning.

---

### Connect With Us
Ready to explore the future of AI?  
Visit us at [Hugging Face](https://huggingface.co) to learn more about our services, community initiatives, and career opportunities.

---

**Hugging Face - Together, we're building the future of AI.**  


## Let's add a bit of humor to our brochure by modifying the *system_prompt*
This demonstrates how easy it is to incorporate 'tone'.

In [26]:

system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."
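
Since the two system prompts differ only in their tone words, the tone could also be passed in as a parameter. This is a sketch under that assumption; `build_system_prompt` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical helper (not in the original notebook): build the brochure
# system prompt with an optional tone description inserted.
def build_system_prompt(tone=""):
    style = f"short {tone} brochure" if tone else "short brochure"
    return (
        "You are an assistant that analyzes the contents of several relevant pages from a company website "
        f"and creates a {style} about the company for prospective customers, investors and recruits. "
        "Respond in markdown. "
        "Include details of company culture, customers and careers/jobs if you have the information."
    )

print(build_system_prompt("humorous, entertaining, jokey"))
```

Using adjacent string literals inside parentheses, rather than backslash continuations, makes the spacing between sentences explicit.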


In [27]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}]}



# Welcome to Hugging Face: The AI Community Building the Future! 🚀

## Get Ready to Hug It Out!

Are you ready to embrace the future with a bear hug of innovation? At **Hugging Face**, we’re all about building a community of intelligent minds (and intelligent machines) who are as passionate about AI as they are comfortable with a good pun. Whether you're a budding data scientist, a seasoned engineer, or just someone who appreciates hugging imaginary faces, there's a place for you here!

### What We Offer

- **1M+ Models:** That’s right! We have a model for all occasions, from "How to ruin dinner with bad datasets" to "Impress your boss with machine learning wizardry!"
  
- **250k+ Datasets:** Our library has more than enough data to keep your algorithms busy. Feeling adventurous? You can even say you’ve been “data-diving” while scoring those datasets!

- **Countless Applications:** Dive into **Spaces!** Whether it’s generating 3D models or crafting the perfect meme generator, you’ll find where tech and creativity collide!

### Company Culture: We Hug, Therefore We Are! 🤗

- **Collaborative Community:** Think of us as the friendship ball of AI—every time you pass ideas around, you get smarter. We even have our own **HuggingChat** so you can chat as snugly as you like!

- **Open Source Spirit:** We believe sharing is caring, especially when it comes to ML tools. Our open-source libraries are like candy—sweet, satisfying, and meant for sharing!

- **Flexible Working Environment:** Tired of rigid office hours? With us, you can work anywhere from your cozy couch to your favorite coffee shop as long as you keep that creativity flowing!

### Join the Cool Kids in AI 🌟

- **Careers:** Expand your horizons with a job at Hugging Face! We’re on the lookout for AI aficionados who want to sprinkle a little stardust over models and datasets. Bonus points if you can name at least three animated movies involving talking toys!

- **Diversity Matters:** We appreciate different flavors and backgrounds! Our team is as diverse as the models we host— come be part of a melting pot of talents, ideas, and yes, spirited discussions!

### Our Customers: The Who’s Who of Tech 🌍

Join **50,000+ organizations** that are already part of the Hugging Face family, including major players like **Google, Microsoft, and even Santa’s Workshop (if they ever decided to go digital)**! They trust us to keep their AI solutions fresh out of the oven and full of creativity!

---

So, what are you waiting for? Come give us a virtual hug by signing up today! Let’s build the future together, one smile and one model at a time! 

**[Sign Up Here!](https://huggingface.co/)**

---

*Hugging Face: Because a day without AI is like a day without sunshine! (And who wants to be in the dark?)*


<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../images/business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this notebook the code for Scraper Summarizer project is extended to make multiple LLM calls, and generate a document.

This is also an example of Agentic AI design patterns, as we combined multiple calls to LLMs.

Generating content in this way is one of the very most common Use Cases. As with summarization, this can be applied to any business vertical. Write marketing content, generate a product tutorial from a spec, create personalized email content, and so much more. Explore how you can apply content generation to your business, and try making yourself a proof-of-concept prototype.</span>
        </td>
    </tr>
</table>