# Company Website Brochure Generator Tool

**Author**: [Pouria Ebrahimnezhad]  
**Date**: [2025-02-12]  

## Overview
This notebook scrapes the content of a company's landing page and generates a brochure from it, using either a cloud-hosted LLM (OpenAI's `gpt-4o-mini`) or a local one (`llama3.2` served by Ollama). A Gradio interface provides the interactive front end.

## Key Technologies:
- **Web Scraping**: `requests`, `BeautifulSoup`
- **LLM Integration**: `OpenAI` (cloud), Ollama (local `llama3.2`)
- **Environment Management**: `dotenv`
- **UI & Display**: `gradio`, `IPython.display.Markdown`

---


In [47]:
# imports

import os
import requests
from bs4 import BeautifulSoup
from typing import List
from dotenv import load_dotenv
from openai import OpenAI
import gradio as gr
from IPython.display import Markdown, display

## Testing the OpenAI API key and the local LLM model

This section checks that the OpenAI API key works and that a response can be retrieved from an OpenAI model, in this case `gpt-4o-mini`.
Note that the key must be set as `OPENAI_API_KEY` in a `.env` file in your environment for this to work.

It also tests the local model, in this case `llama3.2`, served through Ollama.

In [13]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

OpenAI API Key exists and begins sk-proj-


In [14]:
# Connect to OpenAI
openai = OpenAI()
system_message = "You are a helpful assistant that responds in markdown"

In [39]:
def message_gpt(prompt):
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": prompt}
      ]
    completion = openai.chat.completions.create(
        model='gpt-4o-mini',
        messages=messages,
    )
    return completion.choices[0].message.content

def display_markdown(response):
    display(Markdown(response))

In [40]:
display_markdown(message_gpt("What is a good recipe for carrot cake?"))

## Delicious Carrot Cake Recipe

### Ingredients

#### For the Cake:
- **2 cups** all-purpose flour
- **2 tsp** baking powder
- **1 tsp** baking soda
- **1/2 tsp** salt
- **1 tsp** ground cinnamon
- **1/2 tsp** ground nutmeg
- **1/2 tsp** ground ginger
- **1 cup** vegetable oil
- **1 cup** granulated sugar
- **1 cup** packed brown sugar
- **4 large** eggs
- **2 cups** grated carrots (about 4 medium carrots)
- **1/2 cup** crushed pineapple, drained
- **1/2 cup** chopped walnuts or pecans (optional)
- **1/2 cup** raisins (optional)
- **1 tsp** vanilla extract

#### For the Cream Cheese Frosting:
- **8 oz** cream cheese, softened
- **1/2 cup** unsalted butter, softened
- **4 cups** powdered sugar
- **1 tsp** vanilla extract

### Instructions

#### For the Cake:
1. **Preheat your oven** to 350°F (175°C). Grease and flour two 9-inch round cake pans.
2. In a medium bowl, **whisk together** the flour, baking powder, baking soda, salt, cinnamon, nutmeg, and ginger. Set aside.
3. In a large bowl, **mix together** the vegetable oil, granulated sugar, and brown sugar until well combined.
4. **Add the eggs**, one at a time, beating well after each addition. Stir in the grated carrots, crushed pineapple, walnuts, raisins, and vanilla extract.
5. Gradually **add the dry ingredients** to the wet mixture, stirring just until combined.
6. **Divide the batter** evenly between the two prepared cake pans.
7. **Bake** for 25-30 minutes, or until a toothpick inserted into the center comes out clean.
8. Allow the cakes to **cool in the pans** for 10 minutes, then transfer to wire racks to cool completely.

#### For the Cream Cheese Frosting:
1. In a medium bowl, **beat the softened cream cheese and butter** together until smooth and creamy.
2. Gradually add the powdered sugar, continuing to beat until fully combined and fluffy.
3. Stir in the vanilla extract.

### Assembly:
1. Once the cakes are completely cool, place one layer on a serving plate.
2. **Spread a layer of frosting** over the top of the first layer.
3. Place the second layer on top and frost the top and sides of the cake.
4. **Decorate** with additional walnuts or grated carrots, if desired.

### Enjoy!
Serve at room temperature, and enjoy a slice of this moist and flavorful carrot cake with family and friends!

In [41]:
# testing if local ollama is running and able to get a response from the model
# Constants

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

messages = [
    {"role": "user", "content": "What is the meaning of life?"}
]

payload = {
        "model": MODEL,
        "messages": messages,
        "stream": False
    }

response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
print(response.json()['message']['content'])

The question of the meaning of life is one of the most profound and enduring questions in human history. It has been debated, explored, and contemplated by philosophers, theologians, scientists, and countless individuals throughout time. There is no one definitive answer to this question, as it can be subjective, context-dependent, and multifaceted.

That being said, here are some perspectives that might offer insights into the meaning of life:

1. **Religious or spiritual perspectives**: Many religions and spiritual traditions provide their own answers to this question. For example, in Christianity, the purpose of life is often seen as serving God and fulfilling one's role in the plan of salvation. In Buddhism, it is believed that the goal of life is to attain enlightenment and escape the cycle of rebirth.
2. **Humanistic perspectives**: Some philosophers argue that the meaning of life can be found in human experience itself. According to this view, life has value because of our relat
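
The local-model check above builds a request payload and reads the reply from `response.json()['message']['content']`. As a hedged sketch, the two steps can be wrapped in small helpers so that an unexpected response shape produces a readable error. `build_chat_payload` and `extract_reply` are hypothetical names introduced here, and `sample_response` is a hand-written stand-in rather than real Ollama output.

```python
from typing import Any, Dict


def build_chat_payload(model: str, user_prompt: str, stream: bool = False) -> Dict[str, Any]:
    """Build the JSON body used for the local chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "stream": stream,
    }


def extract_reply(data: Dict[str, Any]) -> str:
    """Return the reply text, or raise a clear error if the shape is unexpected."""
    try:
        return data["message"]["content"]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"Unexpected response shape: {data!r}") from exc


payload = build_chat_payload("llama3.2", "What is the meaning of life?")

# Hand-written example response (an assumption for illustration, not real output)
sample_response = {"message": {"role": "assistant", "content": "An example reply."}}
reply = extract_reply(sample_response)
```

In the notebook, `payload` would be passed to `requests.post(OLLAMA_API, json=payload, headers=HEADERS)` and `extract_reply` applied to `response.json()`.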

## Building the website scraper and brochure builder

This section defines a `Website` class that scrapes a page from its URL and returns text containing the page title along with its content.
It then defines a system message for the large language model and two streaming functions, one for the OpenAI model and one for the local Llama model.
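
To make the scraping step concrete, here is a minimal offline sketch of the same idea: keep the page title, drop `<script>` and `<style>` content, and join the remaining text with newlines. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies, so it is an illustration rather than the notebook's implementation. `TextExtractor`, `extract`, and `SAMPLE_HTML` are hypothetical names introduced here.

```python
from html.parser import HTMLParser

# Tags whose text content should not appear in the extracted text
SKIPPED_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect the page title and the visible text fragments."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.parts = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return  # ignore whitespace and anything inside a skipped tag
        if self._in_title:
            self.title = text
        else:
            self.parts.append(text)


def extract(html):
    """Return (title, text) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title or "No title found", "\n".join(parser.parts)


SAMPLE_HTML = """<html><head><title>Acme Corp</title><style>body {color: red}</style></head>
<body><h1>Welcome</h1><script>console.log('hi')</script><p>We build rockets.</p></body></html>"""

title, text = extract(SAMPLE_HTML)
print(f"Webpage Title:\n{title}\nWebpage Contents:\n{text}")
```

The printed string follows the same layout that `get_contents` returns, which is what ends up in the prompt sent to the model.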

Finally, `stream_brochure` streams the output of the selected model, and a Gradio interface serves as the front-end GUI for the tool.
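
Both streaming functions yield the accumulated text so far rather than only the newest fragment, because Gradio re-renders the output component with each value a generator yields. The sketch below shows that pattern in isolation, with a plain list standing in for the model stream. `accumulate_chunks` and `fake_stream` are hypothetical names, and the `None` entry mimics a chunk that carries no text.

```python
from typing import Iterable, Iterator, Optional


def accumulate_chunks(chunks: Iterable[Optional[str]]) -> Iterator[str]:
    """Yield the full text received so far after each incoming chunk."""
    result = ""
    for chunk in chunks:
        result += chunk or ""  # treat empty or missing chunks as no text
        yield result


# Stand-in for the fragments a streaming API would deliver (illustrative data)
fake_stream = ["# Acme", " Corp", None, "\nWe build rockets."]

snapshots = list(accumulate_chunks(fake_stream))
for snapshot in snapshots:
    print(repr(snapshot))
```

Each snapshot is a complete Markdown document in its own right, so the interface can display it directly without tracking earlier fragments.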

In [42]:
# A class to represent a Webpage

class Website:
    url: str
    title: str
    text: str

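    def __init__(self, url):
        self.url = url
        # The 10-second timeout is an assumed value; it keeps a slow site from hanging the notebook
        response = requests.get(url, timeout=10)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title and soup.title.string else "No title found"
        if soup.body:
            # Drop elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Some pages have no <body>; avoid an AttributeError on None
            self.text = ""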

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"

In [43]:
# creating a system_message that will be provided to the llm
system_message = "You are an assistant that analyzes the contents of a company website landing page \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown."

In [44]:
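# Ollama exposes an OpenAI-compatible endpoint, so the OpenAI client can talk to the local model.
# The base URL assumes Ollama's default port; the api_key is a placeholder that Ollama ignores.
ollama_via_openai = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def stream_llama(prompt):
    stream = ollama_via_openai.chat.completions.create(
        model="llama3.2",  # Replace with your local Llama model name if different
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": prompt}
        ],
        stream=True  # Enable streaming
    )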

    response = ""
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            text = chunk.choices[0].delta.content
            response += text
            yield response

def stream_gpt(prompt):
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": prompt}
      ]
    stream = openai.chat.completions.create(
        model='gpt-4o-mini',
        messages=messages,
        stream=True
    )
    result = ""
    for chunk in stream:
        result += chunk.choices[0].delta.content or ""
        yield result

In [45]:
def stream_brochure(company_name, url, model):
    prompt = f"Please generate a company brochure for {company_name}. Here is their landing page:\n"
    prompt += Website(url).get_contents()
    if model=="GPT":
        result = stream_gpt(prompt)
    elif model=="llama3.2":
        result = stream_llama(prompt)
    else:
        raise ValueError("Unknown model")
    yield from result

In [46]:
force_dark_mode = """
function refresh() {
    const url = new URL(window.location);
    if (url.searchParams.get('__theme') !== 'dark') {
        url.searchParams.set('__theme', 'dark');
        window.location.href = url.href;
    }
}
"""

view = gr.Interface(
    fn=stream_brochure,
    inputs=[
        gr.Textbox(label="Company name:"),
        gr.Textbox(label="Landing page URL including http:// or https://"),
        gr.Dropdown(["GPT", "llama3.2"], label="Select model")],
    outputs=[gr.Markdown(label="Brochure:")],
    flagging_mode="never",
    js=force_dark_mode
)
view.launch()

* Running on local URL:  http://127.0.0.1:7866

To create a public link, set `share=True` in `launch()`.


