# A full business solution

## Now we will take our project from Day 1 to the next level

### BUSINESS CHALLENGE:

Create a product that builds a Brochure for a company to be used for prospective clients, investors and potential recruits.

We will be provided a company name and their primary website.

See the end of this notebook for examples of real-world business applications.

And remember: I'm always available if you have problems or ideas! Please do reach out.

In [26]:
# imports
# If these fail, please check you're running from an 'activated' environment with (llms) in the command prompt

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [27]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


In [28]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [29]:
ed = Website("https://bixtecnologia.com.br/")
ed.links

['#content',
 'https://bixtecnologia.com.br/',
 '#',
 '/sobre-nos/',
 'https://bixtecnologia.com.br/trabalhe-conosco/',
 '#',
 'https://bixtecnologia.com.br/engenharia-de-dados/',
 'https://bixtecnologia.com.br/business-intelligence/',
 'https://bixtecnologia.com.br/ciencia-de-dados/',
 'https://bixtecnologia.com.br/desenvolvimento-de-software/',
 '#',
 'https://bixtecnologia.com.br/google-cloud/',
 'https://bixtecnologia.com.br/databricks/',
 'https://bixtecnologia.com.br/qliksense/',
 'https://bixtecnologia.com.br/powerbi/',
 'https://bixtecnologia.com.br/blog/',
 'https://bixtecnologia.com.br/contato/',
 'https://evento.semanadedados.com/inicio',
 'https://bixtecnologia.com.br/',
 '#',
 '/sobre-nos/',
 'https://bixtecnologia.com.br/trabalhe-conosco/',
 '#',
 'https://bixtecnologia.com.br/engenharia-de-dados/',
 'https://bixtecnologia.com.br/business-intelligence/',
 'https://bixtecnologia.com.br/ciencia-de-dados/',
 'https://bixtecnologia.com.br/desenvolvimento-de-software/',
 '#',


## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which we provide an example of how it should respond in the prompt.

This is an excellent use case for an LLM, because it requires nuanced understanding. Imagine trying to code this without LLMs by parsing and analyzing the webpage - it would be very hard!
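
The relevance judgement is where the LLM is most useful, but the mechanical part of the task (resolving relative links, dropping in-page anchors and duplicates) can also be done in plain Python before the prompt is built. Here is a minimal sketch, assuming a hypothetical helper `normalize_links` that is not part of the original notebook and uses only the standard library's `urllib.parse`:

```python
from urllib.parse import urljoin, urldefrag

def normalize_links(base_url, links):
    """Resolve relative links, skip in-page anchors and non-http links, de-duplicate in order."""
    result = []
    seen = set()
    for link in links:
        # Links such as "#" or "#content" only point inside the current page
        if link.startswith("#"):
            continue
        # urljoin resolves "/sobre-nos/" against the base; urldefrag drops any "#fragment"
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        # Skip mailto:, tel:, javascript: and similar schemes
        if not absolute.startswith(("http://", "https://")):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result

sample = ["#content", "https://bixtecnologia.com.br/", "#", "/sobre-nos/", "/sobre-nos/", "mailto:someone@example.com"]
print(normalize_links("https://bixtecnologia.com.br/", sample))
```

Pre-cleaning the list this way would shorten the prompt, while still leaving the decision about which pages matter to the model.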

Sidenote: there is a more advanced technique called "Structured Outputs" in which we require the model to respond according to a spec. We cover this technique in Week 8 during our autonomous Agentic AI project.

In [None]:
# one shot prompting 

In [30]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [31]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
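
In one-shot prompting, the example in the prompt is what the model imitates, so it helps if that example is guaranteed to be valid JSON. One way to ensure this, sketched below, is to build the example from a Python dictionary with `json.dumps` rather than typing it by hand. The names `example_reply` and `build_link_system_prompt` are hypothetical and not part of the original notebook; the prompt wording is copied from the cell above.

```python
import json

# The single worked example ("one shot") that the model should imitate
example_reply = {
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"},
    ]
}

def build_link_system_prompt(example):
    """Return the instructions followed by the example serialized as JSON."""
    prompt = (
        "You are provided with a list of links found on a webpage. "
        "You are able to decide which of the links would be most relevant to include in a brochure about the company, "
        "such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
        "You should respond in JSON as in this example:\n"
    )
    # json.dumps always emits syntactically valid JSON
    return prompt + json.dumps(example, indent=4) + "\n"

print(build_link_system_prompt(example_reply))
```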



In [32]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [33]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://bixtecnologia.com.br/ - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
#content
https://bixtecnologia.com.br/
#
/sobre-nos/
https://bixtecnologia.com.br/trabalhe-conosco/
#
https://bixtecnologia.com.br/engenharia-de-dados/
https://bixtecnologia.com.br/business-intelligence/
https://bixtecnologia.com.br/ciencia-de-dados/
https://bixtecnologia.com.br/desenvolvimento-de-software/
#
https://bixtecnologia.com.br/google-cloud/
https://bixtecnologia.com.br/databricks/
https://bixtecnologia.com.br/qliksense/
https://bixtecnologia.com.br/powerbi/
https://bixtecnologia.com.br/blog/
https://bixtecnologia.com.br/contato/
https://evento.semanadedados.com/inicio
https://bixtecnologia.com.br/
#
/sobre-nos/
https://bixtecnologia.com.br/trabalhe-conosco/
#
https://bixtecnolo

In [34]:
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)
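
Even with `response_format={"type": "json_object"}`, the reply is only guaranteed to be JSON, not to follow the shape shown in the example. A defensive sketch, assuming a hypothetical helper `parse_links_response` (not part of the original notebook) that could be used in place of the bare `json.loads(result)`:

```python
import json

def parse_links_response(raw):
    """Parse the model's reply and keep only well-formed {"type", "url"} entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    links = data.get("links", [])
    if not isinstance(links, list):
        return {"links": []}
    clean = [
        {"type": item["type"], "url": item["url"]}
        for item in links
        if isinstance(item, dict)
        and isinstance(item.get("type"), str)
        and isinstance(item.get("url"), str)
        and item["url"].startswith("https://")  # the prompt asks for full https URLs
    ]
    return {"links": clean}

reply = '{"links": [{"type": "about page", "url": "https://bixtecnologia.com.br/sobre-nos/"}, {"type": "broken", "url": "/relative"}]}'
print(parse_links_response(reply))
```

Returning an empty list on malformed replies means downstream code that loops over `links["links"]` keeps working, at the cost of silently producing a brochure from the landing page alone.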

In [35]:
# The original example used HuggingFace (Anthropic has made their site harder to scrape); here we use BIX Tecnologia instead.

# huggingface = Website("https://huggingface.co")
# huggingface.links

bixtecnologia = Website("https://bixtecnologia.com.br/")
bixtecnologia.links

['#content',
 'https://bixtecnologia.com.br/',
 '#',
 '/sobre-nos/',
 'https://bixtecnologia.com.br/trabalhe-conosco/',
 '#',
 'https://bixtecnologia.com.br/engenharia-de-dados/',
 'https://bixtecnologia.com.br/business-intelligence/',
 'https://bixtecnologia.com.br/ciencia-de-dados/',
 'https://bixtecnologia.com.br/desenvolvimento-de-software/',
 '#',
 'https://bixtecnologia.com.br/google-cloud/',
 'https://bixtecnologia.com.br/databricks/',
 'https://bixtecnologia.com.br/qliksense/',
 'https://bixtecnologia.com.br/powerbi/',
 'https://bixtecnologia.com.br/blog/',
 'https://bixtecnologia.com.br/contato/',
 'https://evento.semanadedados.com/inicio',
 'https://bixtecnologia.com.br/',
 '#',
 '/sobre-nos/',
 'https://bixtecnologia.com.br/trabalhe-conosco/',
 '#',
 'https://bixtecnologia.com.br/engenharia-de-dados/',
 'https://bixtecnologia.com.br/business-intelligence/',
 'https://bixtecnologia.com.br/ciencia-de-dados/',
 'https://bixtecnologia.com.br/desenvolvimento-de-software/',
 '#',


In [36]:
get_links("https://bixtecnologia.com.br/")

{'links': [{'type': 'about page',
   'url': 'https://bixtecnologia.com.br/sobre-nos/'},
  {'type': 'careers page',
   'url': 'https://bixtecnologia.com.br/trabalhe-conosco/'},
  {'type': 'blog page', 'url': 'https://bixtecnologia.com.br/blog/'}]}

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [37]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [38]:
print(get_all_details("https://bixtecnologia.com.br/"))

Found links: {'links': [{'type': 'about page', 'url': 'https://bixtecnologia.com.br/sobre-nos/'}, {'type': 'careers page', 'url': 'https://bixtecnologia.com.br/trabalhe-conosco/'}, {'type': 'data engineering page', 'url': 'https://bixtecnologia.com.br/engenharia-de-dados/'}, {'type': 'business intelligence page', 'url': 'https://bixtecnologia.com.br/business-intelligence/'}, {'type': 'data science page', 'url': 'https://bixtecnologia.com.br/ciencia-de-dados/'}, {'type': 'software development page', 'url': 'https://bixtecnologia.com.br/desenvolvimento-de-software/'}, {'type': 'Google Cloud page', 'url': 'https://bixtecnologia.com.br/google-cloud/'}, {'type': 'QlikSense page', 'url': 'https://bixtecnologia.com.br/qliksense/'}, {'type': 'Power BI page', 'url': 'https://bixtecnologia.com.br/powerbi/'}, {'type': 'Databricks page', 'url': 'https://bixtecnologia.com.br/databricks/'}, {'type': 'blog page', 'url': 'https://bixtecnologia.com.br/blog/'}, {'type': 'contact page', 'url': 'https://b

In [46]:
# The standard system prompt is commented out here; uncomment it (and comment out the humorous one below) for a conventional brochure:

# system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
# and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."

# A more humorous brochure - this demonstrates how easy it is to incorporate 'tone':

system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short humorous, entertaining, jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."


In [47]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
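
Because the truncation above is applied to the whole prompt, a long landing page can use up most of the 5,000 characters and leave little or nothing from the linked pages. One alternative, sketched here with a hypothetical helper `build_prompt_sections` that is not part of the original notebook, is to give each page an equal share of the budget:

```python
def build_prompt_sections(sections, total_budget=5_000):
    """Join (label, text) pairs, giving each text an equal share of total_budget characters."""
    if not sections:
        return ""
    per_section = total_budget // len(sections)
    parts = []
    for label, text in sections:
        # Each page's text is cut to its share; labels are kept in full
        parts.append(f"{label}\n{text[:per_section]}")
    # Labels and separators add a few characters, so apply a final cap as well
    return "\n\n".join(parts)[:total_budget]

pages = [("Landing page:", "a" * 10_000), ("about page", "b" * 10_000)]
prompt = build_prompt_sections(pages)
print(len(prompt), "about page" in prompt)
```

An equal split is only one possible policy; weighting the landing page more heavily would be just as reasonable.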

In [48]:
# get_brochure_user_prompt("HuggingFace", "https://huggingface.co")
get_brochure_user_prompt("BixTecnologia", "https://bixtecnologia.com.br/")

Found links: {'links': [{'type': 'about page', 'url': 'https://bixtecnologia.com.br/sobre-nos/'}, {'type': 'careers page', 'url': 'https://bixtecnologia.com.br/trabalhe-conosco/'}, {'type': 'services page', 'url': 'https://bixtecnologia.com.br/engenharia-de-dados/'}, {'type': 'services page', 'url': 'https://bixtecnologia.com.br/business-intelligence/'}, {'type': 'services page', 'url': 'https://bixtecnologia.com.br/ciencia-de-dados/'}, {'type': 'services page', 'url': 'https://bixtecnologia.com.br/desenvolvimento-de-software/'}, {'type': 'blog page', 'url': 'https://bixtecnologia.com.br/blog/'}, {'type': 'contact page', 'url': 'https://bixtecnologia.com.br/contato/'}]}


'You are looking at a company called: BixTecnologia\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nBIX Tecnologia | Consultoria de Dados | Brasil\nWebpage Contents:\nIr para o conteúdo\nA BIX Tecnologia\nAlternar menu\nSobre nós\nCarreiras\nSoluções\nAlternar menu\nEngenharia de Dados\nBusiness Intelligence\nCiência de Dados\nDesenvolvimento de Software\nFerramentas\nAlternar menu\nGoogle Cloud\nDatabricks\nQlik Sense\nPower BI\nBlog\nContato\nSemana de Dados\nMain Menu\nA BIX Tecnologia\nAlternar menu\nSobre nós\nCarreiras\nSoluções\nAlternar menu\nEngenharia de Dados\nBusiness Intelligence\nCiência de Dados\nDesenvolvimento de Software\nFerramentas\nAlternar menu\nGoogle Cloud\nDatabricks\nQlik Sense\nPower BI\nBlog\nContato\nSemana de Dados\nDesbloqueie o\nmáximo potencial\ndos seus dados!\nOferecemos soluções em Engenharia de Dados, Business I

In [49]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [50]:
# create_brochure("HuggingFace", "https://huggingface.co")
create_brochure("BixTecnologia", "https://bixtecnologia.com.br/")

Found links: {'links': [{'type': 'about page', 'url': 'https://bixtecnologia.com.br/sobre-nos/'}, {'type': 'careers page', 'url': 'https://bixtecnologia.com.br/trabalhe-conosco/'}, {'type': 'blog page', 'url': 'https://bixtecnologia.com.br/blog/'}]}


# Welcome to BIX Tecnologia: Where Data Meets Delight!

---

## Who Are We?

At **BIX Tecnologia**, we don’t just crunch numbers; we sprinkle a bit of magic on data and transform it into gold! 💼✨ We're a sparkling data consultancy based in the land of samba and sun (also known as Brazil). Our team of highly qualified, multidisciplinary wizards, licensed in the best tools in town, is dedicated to helping businesses unleash the full potential of their data. 

### Our Mission
**Generate positive impact through technology!** 🚀 Because why settle for mediocrity when you can skyrocket your business?

---

## What We Do

Hold onto your seats as we take you through our dazzling journey from **Data Engineering** to **Artificial Intelligence**! 

- **Data Engineering**: We build the stairways to Heaven... uh, we mean data highways, to ensure your data flows like a Brazilian carnival parade!
- **Business Intelligence**: Turning confusion into clarity, one data point at a time. Who knew data could be this much fun?
- **Data Science**: Our scientists use their brainy magic to conduct advanced analyses that would make Einstein jealous.
- **Software Development**: We don our capes to create stellar software solutions to digitally transform your empire!

---

## Our Crowning Glory - Tools of the Trade

We’ve got a toolbox that would make any handyman weep:
- **Power BI**
- **Qlik Sense**
- **Databricks**
- **And many more**! 
Because why use a hammer when you can have a laser-guided, precision data-extracting device?

---

## Customer Wonderland 🌈 

Join the ranks of over **60 leading companies** that have been dazzled by our data insights! Our clients have experienced faster processes, lower costs, and fatter profit margins. Don't you want in on that action? 

---

## Our Culture 

At BIX, we’re not just about business; we’re about **community**! Think of us as the family reunion you actually want to attend. We celebrate diversity, hop on deadlines like it’s a dance floor, and encourage our team to bring their best selves every day. Every day is a new opportunity to *BIX* things up!

---

## Join Us! 

Looking for a career that feels more like an adventure? 🎉 
- Check out our **Careers Page**! 
- You might just find your next calling with us (we promise to keep the data puns to a minimum… maybe).

---

## **Get In Touch!**

Ready for a transformative coffee chat? ☕ We love to talk data trends, solutions, and how we can boost your business. Slide into our DMs (or whatever mode of classiness you prefer), and let’s dive into the data ocean together!

🌊 **Say hello on WhatsApp**!

---

So, whether you're an overwhelmed business owner, a data enthusiast, or someone just searching for a place to belong, **BIX Tecnologia** has something for everyone. Let's make some data magic together! ✨

## Finally - a minor improvement

With a small adjustment, we can change this so that the results stream back from OpenAI,
with the familiar typewriter animation.

In [44]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
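
To see what the loop above is doing without calling the API, the sketch below replays it against stand-in chunks. `fake_stream`, `accumulate` and `strip_outer_fence` are hypothetical helpers, not part of the original notebook; the stand-in objects only mimic the `chunk.choices[0].delta.content` shape used above. Note that `replace("markdown", "")` in the cell above also removes the word "markdown" wherever it appears in the brochure text, so `strip_outer_fence` shows a narrower alternative that only removes a wrapping code fence.

```python
from types import SimpleNamespace

FENCE = "`" * 3  # three backticks, built this way to keep this example readable

def fake_stream(pieces):
    """Yield objects shaped like streaming chunks: chunk.choices[0].delta.content."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def accumulate(stream):
    """Return the running text after each chunk; a content of None counts as empty."""
    response = ""
    snapshots = []
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        snapshots.append(response)
    return snapshots

def strip_outer_fence(text):
    """Remove a leading fence line (with or without 'markdown') and a trailing fence line."""
    lines = text.strip().split("\n")
    if lines and lines[0].strip() in (FENCE, FENCE + "markdown"):
        lines = lines[1:]
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)

for snapshot in accumulate(fake_stream(["# Hi", None, " there"])):
    print(repr(snapshot))
print(strip_outer_fence(FENCE + "markdown\n# Title\nAbout markdown\n" + FENCE))
```

Each snapshot corresponds to one `update_display` call, which is what produces the typewriter effect.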

In [45]:
# stream_brochure("HuggingFace", "https://huggingface.co")
stream_brochure("BixTecnologia", "https://bixtecnologia.com.br/")

Found links: {'links': [{'type': 'about page', 'url': 'https://bixtecnologia.com.br/sobre-nos/'}, {'type': 'careers page', 'url': 'https://bixtecnologia.com.br/trabalhe-conosco/'}, {'type': 'blog', 'url': 'https://bixtecnologia.com.br/blog/'}]}



# BIX Tecnologia Brochure

## Unlock the Power of Your Data

Welcome to **BIX Tecnologia**, your trusted partner in data consulting and technology solutions based in Brazil. Our mission is to generate positive impact through data and innovative technology, helping organizations to harness the full potential of their data landscape.

---

## About Us

Founded with the vision of transforming how businesses leverage data, BIX Tecnologia offers end-to-end services throughout the analytical journey, from data engineering to software development. Our multidisciplinary and highly qualified team is certified in leading market tools, allowing us to deliver top-tier solutions designed to optimize business operations and drive efficiency.

### Our Purpose

Our primary goal is simple: to generate a positive impact through technology. We assist organizations across various sectors in extracting valuable insights from their data, leading to cost reductions, faster processes, and increased profitability.

---

## Our Services

### **Data Engineering**
Building robust data structures and processes that ensure a successful analytical journey for your organization.

### **Business Intelligence**
Developing data analysis projects that transform your company’s data into efficient decision-making and innovation opportunities.

### **AI and Data Science**
Providing artificial intelligence and machine learning solutions for advanced analytical processes with efficiency and precision.

### **Software Development**
Offering solutions in data acquisition systems, user interface design, and systems integration for digital transformation.

---

## Our Tools

We leverage a powerful technology stack to deliver the best results for our clients. Our expertise includes tools like:
- **Google Cloud**
- **Databricks**
- **Qlik Sense**
- **Power BI**
- **Python, Django, and FastAPI**
- **AWS, Azure, and Docker**

---

## Our Culture

At BIX Tecnologia, we foster a culture of continuous learning and collaboration. Our team is dedicated to empowering professionals from various sectors, promoting a data-driven culture that enables organizations to thrive in a competitive market. 

We believe in:
- **Innovation**: Driving creativity and forward-thinking approaches.
- **Empowerment**: Enabling our clients to make data-driven decisions.
- **Collaboration**: Working together to achieve outstanding results.

---

## Join Us!

We are always looking for talented individuals to join our team of experts. If you are passionate about data and technology, consider a career with BIX Tecnologia! 

### Open Positions:
- Data Engineers
- Business Analysts
- Software Developers
- Data Scientists

---

## Our Clients

We proudly serve over 60 leading companies in various industries, providing them with insights that enhance their operational efficiency and strategic decision-making.

---

## Connect with Us

Are you ready to revolutionize your data strategy? Let’s have a coffee! Reach out to our team, and let's discuss the latest trends in data analysis and how your company can benefit from them.

**Contact Us Today!**

---

**Follow Us:**
- [Website](#)
- [Blog](#)
- WhatsApp: Connect with our team for immediate assistance!

---

**BIX Tecnologia** - Your partner in data-driven success.



In [None]:
# Try changing the system prompt to the humorous version when you make the Brochure for Hugging Face:

stream_brochure("HuggingFace", "https://huggingface.co")

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this exercise we extended the Day 1 code to make multiple LLM calls, and generate a document.

This is perhaps the first example of Agentic AI design patterns, as we combined multiple calls to LLMs. This will feature more in Week 2, and then we will return to Agentic AI in a big way in Week 8 when we build a fully autonomous Agent solution.

Generating content in this way is one of the very most common Use Cases. As with summarization, this can be applied to any business vertical. Write marketing content, generate a product tutorial from a spec, create personalized email content, and so much more. Explore how you can apply content generation to your business, and try making yourself a proof-of-concept prototype. See what other students have done in the community-contributions folder -- so many valuable projects -- it's wild!</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you move to Week 2 (which is tons of fun)</h2>
            <span style="color:#900;">Please see the week1 EXERCISE notebook for your challenge for the end of week 1. This will give you some essential practice working with Frontier APIs, and prepare you well for Week 2.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">A reminder on 3 useful resources</h2>
            <span style="color:#f71;">1. The resources for the course are available <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">here.</a><br/>
            2. I'm on LinkedIn <a href="https://www.linkedin.com/in/eddonner/">here</a> and I love connecting with people taking the course!<br/>
            3. I'm trying out X/Twitter and I'm at <a href="https://x.com/edwarddonner">@edwarddonner</a> and hoping people will teach me how it's done.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../thankyou.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#090;">Finally! I have a special request for you</h2>
            <span style="color:#090;">
                My editor tells me that it makes a MASSIVE difference when students rate this course on Udemy - it's one of the main ways that Udemy decides whether to show it to others. If you're able to take a minute to rate this, I'd be so very grateful! And regardless - always please reach out to me at ed@edwarddonner.com if I can help at any point.
            </span>
        </td>
    </tr>
</table>