# Groq Brochure Maker

This notebook is an adapted version of the "Brochure Maker" project (Week 1, Day 5), configured to use the **Groq API** instead of OpenAI.

It is structured so you can easily modify the **AI Specs** (System Prompts) and **Prompts** (User Prompts).

In [5]:
# Run this cell once to install required packages
%pip install python-dotenv groq beautifulsoup4 requests


In [6]:
# Imports
import os
import json
from pathlib import Path
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from groq import Groq
from scraper import fetch_website_links, fetch_website_contents

# Load .env from the current directory (keep your .env in the same folder as this notebook)
load_dotenv(Path.cwd() / ".env", override=True)
# Fallback: let python-dotenv locate a .env file using its default search
load_dotenv(override=True)

api_key = os.getenv('GROQ_API_KEY')

if not api_key or not api_key.startswith('gsk_'):
    print("⚠️ Please check your GROQ_API_KEY in .env (in the same folder as this notebook)")
    client = None
    MODEL = None
else:
    print("✅ Groq API key found")
    client = Groq(api_key=api_key)
    MODEL = 'llama-3.3-70b-versatile'

⚠️ Please check your GROQ_API_KEY in .env


GroqError: The api_key client option must be set either by passing api_key to the client or by setting the GROQ_API_KEY environment variable
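
The import cell relies on a local `scraper` module that is not shown in this notebook. As a rough idea of what such a module might contain, here is a minimal stand-in built only on the standard library. It is an assumption, not the original code: only the two imported names (`fetch_website_links`, `fetch_website_contents`) come from the notebook, while `parse_html`, `_PageParser` and `_download` are hypothetical helpers, and the original may well use `requests` and `beautifulsoup4` instead.

```python
# Hypothetical stand-in for the local `scraper` module (not the original code).
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _PageParser(HTMLParser):
    """Collects href values from <a> tags and visible text from a page."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore text inside <script>/<style> and whitespace-only runs
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def parse_html(html):
    """Returns (links, text) extracted from an HTML string."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    return parser.links, "\n".join(parser.text_parts)


def _download(url):
    """Fetches a URL and returns its body decoded as text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def fetch_website_links(url):
    """Returns the raw href values found on the page (may be relative)."""
    return parse_html(_download(url))[0]


def fetch_website_contents(url):
    """Returns the visible text of the page."""
    return parse_html(_download(url))[1]
```

Keeping the parsing in `parse_html` separate from the download step means the extraction logic can be checked on a literal HTML string without any network access.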

## 1. AI Specs

Define the personality and instructions for your AI agents here.

In [None]:
# System Prompt for the Link Selector Agent
link_system_prompt = """
You are provided with a list of links found on a webpage.
You are able to decide which of the links would be most relevant to include in a brochure about the company,
such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:

{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

# System Prompt for the Brochure Creator Agent
brochure_system_prompt = """
You are an assistant that analyzes the contents of several relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
Respond in markdown without code blocks.
Include details of company culture, customers and careers/jobs if you have the information.
"""

## 2. Prompts

Define how you structure the user messages here.

In [None]:
def get_links_user_prompt(url):
    """Creates the prompt to ask the AI to select relevant links"""
    links = fetch_website_links(url)
    user_prompt = f"""
Here is the list of links on the website {url} -
Please decide which of these are relevant web links for a brochure about the company, 
respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):
"""
    user_prompt += "\n".join(links)
    return user_prompt

def get_brochure_user_prompt(company_name, url, relevant_links):
    """Creates the prompt to ask the AI to generate the brochure"""
    # Fetch main page content
    contents = fetch_website_contents(url)
    
    user_prompt = f"""
You are looking at a company called: {company_name}
Here are the contents of its landing page and other relevant pages;
use this information to build a short brochure of the company in markdown without code blocks.\n\n
"""
    result = f"## Landing Page:\n\n{contents}\n## Relevant Links:\n"
    
    # Fetch content from relevant selected links
    for link in relevant_links['links']:
        result += f"\n\n### Link: {link['type']}\n"
        result += fetch_website_contents(link["url"])
        
    user_prompt += result
    user_prompt = user_prompt[:15000]  # Truncate to 15,000 characters to stay within the model's context window
    return user_prompt
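
The links prompt notes that some links might be relative and asks the model to leave out email links. One option is to do part of that work before the model sees the list. The following is a minimal sketch, assuming the scraped links are plain href strings; `normalize_links` is a hypothetical helper that is not part of the notebook.

```python
from urllib.parse import urljoin, urlparse


def normalize_links(base_url, links):
    """Resolves relative links against base_url and drops non-web and duplicate links."""
    seen = set()
    normalized = []
    for link in links:
        absolute = urljoin(base_url, link.strip())
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: and similar
        # Drop the #fragment so the same page is not listed twice
        absolute = parsed._replace(fragment="").geturl()
        if absolute not in seen:
            seen.add(absolute)
            normalized.append(absolute)
    return normalized
```

Inside `get_links_user_prompt`, this could be applied as `links = normalize_links(url, fetch_website_links(url))`, which also shortens the prompt when a page repeats the same link many times.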

## 3. Logic

The core logic that ties everything together.

In [None]:
def select_relevant_links(url):
    """Calls Groq to pick the best links"""
    print(f"Selecting relevant links for {url}...")
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(url)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    links = json.loads(result)
    print(f"Found {len(links['links'])} relevant links")
    return links

def create_brochure(company_name, url):
    """Main function to orchestrate the brochure creation"""
    # 1. Select Links
    relevant_links = select_relevant_links(url)
    
    # 2. Build User Prompt with all content
    prompt = get_brochure_user_prompt(company_name, url, relevant_links)
    
    # 3. Generate Brochure
    print("Generating brochure... (streaming)")
    stream = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": prompt}
        ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            response += content
            update_display(Markdown(response), display_id=display_handle.display_id)
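
The streaming loop in `create_brochure` builds the brochure by appending each chunk's text delta. The sketch below isolates that accumulation step so it can be tried without an API key. It assumes chunks have the same shape the loop already relies on (`chunk.choices[0].delta.content`); `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks are stand-ins, not real Groq objects.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Concatenates the text deltas from a chat-completion stream."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # defensive: skip chunks that carry no choices
        content = chunk.choices[0].delta.content
        if content:  # content may be None or empty on some chunks
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Builds a stand-in object shaped like a streamed chunk (for illustration only)."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


brochure_text = collect_stream([fake_chunk("# Acme"), fake_chunk(None), fake_chunk("\nWe build things.")])
print(brochure_text)
```

Collecting the pieces in a list and joining once is a common alternative to repeated string concatenation. The notebook's loop concatenates on every chunk because it needs the running text to refresh the display.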

## 4. Usage

Run the cell below to generate a brochure.

In [None]:
create_brochure("Anthropic", "https://www.anthropic.com")