# A full business solution

Create a product that builds a brochure for a company, aimed at prospective clients, investors and potential recruits.

We will be given a company name and its primary website.

In [1]:
# imports

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
# Initialization and constants. Here, we are using OpenAI

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')
MODEL = 'gpt-4o-mini'
openai = OpenAI()

In [3]:
# A class to represent a Webpage

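class Website:
    """Fetches a page and keeps its title, visible text and links."""
    url: str
    title: str
    body: bytes  # raw response content, as returned by requests
    text: str    # visible text with scripts, styles, images and inputs removed
    links: List[str]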

    def __init__(self, url):
        self.url = url
        response = requests.get(url)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

## First step: Have GPT-4o-mini figure out which links are relevant

### Use a call to gpt-4o-mini to read the links on a webpage, and respond in structured JSON.  
It should decide which links are relevant, and replace relative links such as "/about" with "https://company.com/about".  
We will use "one-shot prompting", in which the prompt includes an example of how the model should respond.
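
The prompt leaves URL resolution to the model. As a hedged alternative, relative links could also be resolved deterministically before the model sees them. The sketch below uses `urljoin` from the standard library; `normalize_links` is a hypothetical helper that is not part of this notebook, and the example URLs are placeholders.

```python
from urllib.parse import urljoin, urlparse


def normalize_links(base_url, links):
    """Resolve relative links against base_url, dropping non-web links and duplicates."""
    seen = set()
    normalized = []
    for link in links:
        absolute = urljoin(base_url, link)  # "/about" becomes "https://company.com/about"
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: and similar schemes
        if absolute not in seen:
            seen.add(absolute)
            normalized.append(absolute)
    return normalized


example = normalize_links(
    "https://company.com",
    ["/about", "mailto:hi@company.com", "/about", "https://other.example/careers"],
)
print(example)
```

Doing this in code would also shrink the prompt, since duplicate links like those in the listing further down are sent only once.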

In [5]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [6]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [7]:
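# `ed` must be a Website object; the output below shows it wraps https://edwarddonner.com
ed = Website("https://edwarddonner.com")
print(get_links_user_prompt(ed))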

Here is the list of links on the website of https://edwarddonner.com - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2024/08/06/outsmart/
https://edwarddonner.com/2024/08/06/outsmart/
https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/
https://edwarddonner.com/2024/06/26/choosing-the-right-llm-resources/
https

In [8]:
def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = completion.choices[0].message.content
    return json.loads(result)

In [16]:
get_links("https://www.idrb.kerala.gov.in/")

{'links': [{'type': 'about page',
   'url': 'http://www.idrb.kerala.gov.in/about-us'},
  {'type': 'mission and vision page',
   'url': 'http://www.idrb.kerala.gov.in/mission-and-vision'},
  {'type': 'organogram page',
   'url': 'http://www.idrb.kerala.gov.in/organogram'},
  {'type': 'officers profile page',
   'url': 'http://www.idrb.kerala.gov.in/officers-profile-0'},
  {'type': 'careers page', 'url': 'http://www.idrb.kerala.gov.in/internship'},
  {'type': 'company page',
   'url': 'http://www.idrb.kerala.gov.in/establishment_finance'}]}
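
`get_links` passes the model's reply straight to `json.loads`, which assumes the reply always has the expected shape. A minimal sketch of a more defensive parser follows. `parse_link_response` is a hypothetical helper that is not part of this notebook, and keeping only `http`/`https` URLs is an assumption about what counts as a usable link.

```python
import json


def parse_link_response(raw):
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    links = data.get("links", [])
    if not isinstance(links, list):
        return {"links": []}
    valid = []
    for entry in links:
        if not isinstance(entry, dict):
            continue
        link_type, url = entry.get("type"), entry.get("url")
        # Assumption: a usable entry has a string type and an absolute web URL
        if isinstance(link_type, str) and isinstance(url, str) and url.startswith(("http://", "https://")):
            valid.append({"type": link_type, "url": url})
    return {"links": valid}


sample = '{"links": [{"type": "about page", "url": "https://example.com/about"}, {"type": "email", "url": "mailto:a@example.com"}]}'
print(parse_link_response(sample))
```

Returning an empty `links` list on malformed input keeps downstream loops over `result["links"]` working without extra checks.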

## Second step: make the brochure!

Assemble all the details into another prompt to GPT-4o-mini.

In [10]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result
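
In `get_all_details`, a single page that fails to load (for example, a connection error) would raise and abort the whole run. The sketch below shows one possible way to skip such pages instead. `collect_pages` and `fake_fetch` are hypothetical names that are not part of this notebook; `fake_fetch` stands in for `Website(url).get_contents()` so the example runs without network access.

```python
def collect_pages(links, fetch):
    """Build the details text, skipping any page whose fetch raises."""
    result = ""
    for link in links:
        try:
            contents = fetch(link["url"])
        except Exception as error:  # broad on purpose: any failure skips just this page
            print(f"Skipping {link['url']}: {error}")
            continue
        result += f"\n\n{link['type']}\n{contents}"
    return result


def fake_fetch(url):
    """Stand-in for Website(url).get_contents(); fails for one URL to show the skip."""
    if "broken" in url:
        raise ConnectionError("simulated network failure")
    return f"Contents of {url}\n"


demo_links = [
    {"type": "about page", "url": "https://example.com/about"},
    {"type": "careers page", "url": "https://example.com/broken"},
]
details = collect_pages(demo_links, fake_fetch)
print(details)
```

In the notebook, the same helper could be called as `collect_pages(links["links"], lambda u: Website(u).get_contents())`.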

In [17]:
print(get_all_details("https://www.idrb.kerala.gov.in/"))

Found links: {'links': [{'type': 'about page', 'url': 'http://www.idrb.kerala.gov.in/about-us'}, {'type': 'mission and vision', 'url': 'http://www.idrb.kerala.gov.in/mission-and-vision'}, {'type': 'organogram', 'url': 'http://www.idrb.kerala.gov.in/organogram'}, {'type': 'officers profile', 'url': 'http://www.idrb.kerala.gov.in/officers-profile-0'}, {'type': 'careers page', 'url': 'https://cwc.gov.in/vacancies'}, {'type': 'annual plan progress', 'url': 'http://www.idrb.kerala.gov.in/annual-plan-progress-2024-2025'}, {'type': 'internship page', 'url': 'http://www.idrb.kerala.gov.in/internship'}]}
Landing page:
Webpage Title:
WELCOME TO IDRB | Irrigation Design and Research Board
Webpage Contents:
Skip to main content
Visitor Counter: 249877
Last Updated: 30/09/2024, 08:29am
Search
Main navigation
Home
About IDRB
About us
Vision & Mission
Who's Who
Organogram
Officers Profile
Know IDRB
Activities
Events
Infrastructure
Annual Plan
Daily Reports
Maps
River Basin
Drainage Network
Hydromet N

In [18]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short witty brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

In [19]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000] # Truncate if more than 20,000 characters
    return user_prompt
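
The slice `user_prompt[:20_000]` counts characters, not tokens, and can cut the text in the middle of a word or line. As a hedged refinement, the sketch below cuts at the last newline before the limit. `truncate_at_line` is a hypothetical helper that is not part of this notebook.

```python
def truncate_at_line(text, limit=20_000):
    """Return text unchanged if short enough, else cut at the last newline before limit."""
    if len(text) <= limit:
        return text
    cut = text.rfind("\n", 0, limit)
    # Fall back to a hard cut when no newline is found in the first `limit` characters
    return text[:cut] if cut > 0 else text[:limit]


print(truncate_at_line("abc\ndefgh", limit=6))
```

With a limit of 6, the example keeps only `abc`, since the second line would not fit whole.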

In [20]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [21]:
create_brochure("Idrb Kerala", "https://www.idrb.kerala.gov.in/")

Found links: {'links': [{'type': 'about page', 'url': 'http://www.idrb.kerala.gov.in/about-us'}, {'type': 'mission and vision', 'url': 'http://www.idrb.kerala.gov.in/mission-and-vision'}, {'type': 'organogram', 'url': 'http://www.idrb.kerala.gov.in/organogram'}, {'type': 'internship page', 'url': 'http://www.idrb.kerala.gov.in/internship'}, {'type': 'careers page', 'url': 'https://cwc.gov.in/vacancies'}]}


```markdown
# Welcome to IDRB: Where Water Wisdom Flows!

## About Us
At the **Irrigation Design and Research Board (IDRB)**, we turn water woes into water wins! Operating under the Kerala Government's Irrigation Department, we oversee everything from groundbreaking research to top-notch quality control, ensuring that Kerala’s water resources are managed efficiently and sustainably. Founded in 1986, we’ve been keeping the state hydrated and happy ever since!

## Our Vision
> **"Towards a dynamic centre striving for excellence in water resources management."**

We don’t just design; we innovate, adapt, and contribute to sustainable irrigation practices while boldly tackling climate-induced challenges.

## Our Mission
> **"Enhancing capabilities for Water Conservation and Sustainable Irrigation Management."**

From foreseeing monsoons to executing dam safety workshops, our mission is as deep as the reservoirs we protect!

## Company Culture
We pride ourselves on our **dynamic teamwork**, fostering a culture where creativity and innovation flow as freely as the rivers we manage. At IDRB, every employee’s voice matters, and we encourage fresh ideas that challenge the status quo to meet the changing needs of our environment.

## Our Customers
Our key customers and stakeholders include:
- The **Government of Kerala**
- Local farmers and agricultural firms counting on our irrigation designs
- Environmental agencies looking for collaborative and sustainable management methods
- Engineering professionals keen to innovate through well-informed practices

## Careers at IDRB
Join us in our watery pursuits! We are always on the lookout for motivated individuals who are ready to dive into a career rich in opportunity and exploration. We offer internships that provide real-life insights into water management and engineering practices - because why just learn about water when you can become a *splash* in the industry?

### Current Opportunities:
- **Dam Safety Experts**: Help us tackle the vital aspects of dam safety in line with the Dam Safety Act 2021.
- **Internships**: Coming Soon! Stay tuned for immersive opportunities that give you hands-on experience in water resources management.

## Get In Touch!
Whether you’re looking to partner with us, explore job opportunities, or just want to learn more about our research, we’re here to connect.

**Contact us:**  
*Irrigation Design & Research Board*  
Third Floor, Vikas Bhavan,  
Thiruvananthapuram - 695033  
📞 +91 471 2784001  
📧 idrbtvm@gmail.com  

Let's collaborate to make waves in water management!
```


## Finally - a minor improvement

With a small adjustment, we can stream the results back from OpenAI, with the familiar typewriter animation.

In [22]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        # Strip only the code-fence markers, so the word "markdown" survives in the brochure text
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)
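
The accumulation loop can be tried without calling the API. The sketch below simulates a stream with plain objects shaped like the chunks used above (`chunk.choices[0].delta.content`). `fake_stream` and `accumulate` are hypothetical helpers that are not part of this notebook, and the `None` piece is an assumption standing in for a chunk that carries no text.

```python
from types import SimpleNamespace


def fake_stream(pieces):
    """Yield objects shaped like streaming chunks: chunk.choices[0].delta.content."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def accumulate(stream):
    """Collect streamed text, treating a missing (None) delta as an empty string."""
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
    return response


text = accumulate(fake_stream(["# Welcome", " to ", None, "IDRB"]))
print(text)
```

The `or ""` guard is what lets the loop pass over chunks without text instead of raising a `TypeError` on string concatenation.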

In [24]:
stream_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'docs page', 'url': 'https://huggingface.co/docs'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'community forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}



# Welcome to Hugging Face: The AI Community Building the Future! 🤗

### About Us
At Hugging Face, we're on a mission to **democratize machine learning**, one delightful commit at a time! Our platform is where **collaboration, innovation, and fun** meet; where machine learning enthusiasts can create, share, and deploy cutting-edge models, datasets, and applications. 

### Why Choose Hugging Face?
- **A Community of Innovators:** Join over 50,000 organizations, from non-profits like Ai2 to giants like Google and Microsoft, all utilizing our tools for their AI and ML projects.
- **Endless Resources:** With **400k+ models** and **100k+ datasets** available, the possibilities for building your next AI solution are virtually limitless!
- **Open Source Heaven:** Our tools, including Transformers and Diffusers, are designed to be community-driven and easily accessible – because sharing is caring!

### What’s Cooking?
- **Trending Models**: Explore the latest and greatest like meta-llama and stepfun-ai in the AI space.
- **Spaces**: Create and host your AI applications effortlessly with scalable compute resources starting at just $0.60/hour for GPU access.

### Company Culture
At Hugging Face, we believe in **collaboration** over competition. Our team thrives in a flexible, inclusive environment where innovation reverberates from every corner. We encourage **curiosity and creativity**, with opportunities to work on groundbreaking projects that can truly make an impact on artificial intelligence's future. 

### Join Us!
Looking for a career that not only excites you but also lets you make a difference? Check out our **open positions** and become part of a forward-thinking team that's shaping the world of AI!

### Affordable Pricing Plans
Whether you’re a hobbyist or an enterprise looking for robust AI solutions, we have plans tailored just for you:
- **Free Forever**: Host and collaborate with models, datasets, and Spaces.
- **Pro Plan**: Unlock advanced features for just $9/month.
- **Enterprise Solutions**: Custom offers starting at $20/user/month for organizations needing serious AI horsepower.

### Get In Touch
Curious about how we can help you? Dive into our resources, join our forums, and let’s chat on our platforms like Discord and GitHub! 

**Let’s make AI accessible to all – one cozy hug at a time!**

