### Import required classes and functions 

In [1]:
# imports

import os
import sys
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display

# Add the notebook's parent directory to sys.path so that functions in scraper.py can be imported
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

# Modules in the parent directory can now be imported
from scraper import fetch_website_contents

# If you get an error running this cell, then please head over to the troubleshooting notebook!
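
Re-running the import cell appends the same directory to `sys.path` each time. A minimal sketch of a guarded version, assuming the notebook's working directory is the folder that contains it (the name `parent_dir` is introduced here for illustration):

```python
import os
import sys

# Resolve the parent of the current working directory (assumed to be the notebook's folder)
parent_dir = os.path.abspath(os.path.join(os.getcwd(), ".."))

# Only append when the entry is missing, so repeated runs do not add duplicates
if parent_dir not in sys.path:
    sys.path.append(parent_dir)

print(parent_dir in sys.path)
```

The behaviour is otherwise the same as the cell above; the guard only keeps `sys.path` tidy across repeated runs.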

### Load environment variables and set api_key for OpenAI

In [2]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
# Check for stray whitespace first; otherwise a leading space is misreported as a wrong prefix
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start with sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!
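
The key checks can also be packaged as a small function, which makes them easy to try out without a real key. This is only a sketch: `check_api_key` and its status strings are hypothetical names introduced here, and the whitespace check is placed before the prefix check so that a leading space is reported as whitespace.

```python
def check_api_key(key):
    """Return a short status string for an API key (hypothetical helper, not part of any library)."""
    if not key:
        return "missing"
    if key.strip() != key:
        return "whitespace"
    if not key.startswith("sk-proj-"):
        return "unexpected prefix"
    return "ok"

# Made-up values for illustration only; none of these are real keys
for candidate in [None, " sk-proj-example", "sk-example", "sk-proj-example"]:
    print(repr(candidate), "->", check_api_key(candidate))
```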


### Prepare prompts (messages) for OpenAI

In [3]:
# Define our system prompt

system_prompt = """
You are a snarky assistant that analyzes the contents of a website,
and provides a short summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

# Define our user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.
"""

### Define a function that returns messages for the website

In [4]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]
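
To see the shape of the list that `messages_for` builds, here is a self-contained sketch. The `demo_` names and the shortened prompt strings are stand-ins introduced for illustration, so running it does not overwrite the notebook's real prompts.

```python
# Stand-in prompts (assumptions for this sketch; the notebook defines the real ones above)
demo_system_prompt = "You are an assistant that summarizes websites."
demo_user_prompt_prefix = "Here are the contents of a website.\n"

def demo_messages_for(website):
    # One system message that sets the behaviour, one user message carrying the page text
    return [
        {"role": "system", "content": demo_system_prompt},
        {"role": "user", "content": demo_user_prompt_prefix + website},
    ]

demo_messages = demo_messages_for("Example Domain. This domain is for use in examples.")
for message in demo_messages:
    print(message["role"], "->", repr(message["content"][:40]))
```

The result is a plain list of two dictionaries, each with a `role` and a `content` key, which is the format passed as `messages` in the next cell.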

### Define a function that summarizes the website at a given URL using the OpenAI Chat Completions API

In [5]:
def summarize(url):
    openai = OpenAI()
    website = fetch_website_contents(url)
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(website)
    )
    return response.choices[0].message.content
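
The way `summarize` unpacks the response (`response.choices[0].message.content`) can be exercised without a network call or an API key by swapping in a stub. The sketch below is an illustration under stated assumptions: `FakeClient`, `FakeCompletions`, and `summarize_with` are hypothetical names, and the stub mimics only the attributes that the function above actually reads, not the full OpenAI client.

```python
from types import SimpleNamespace

class FakeCompletions:
    """Stub that mimics only the create() call used in this notebook."""
    def create(self, model, messages):
        text = f"stub summary from {model} for {len(messages)} messages"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

class FakeClient:
    """Stub exposing client.chat.completions, the path used by summarize()."""
    def __init__(self):
        self.chat = SimpleNamespace(completions=FakeCompletions())

def summarize_with(client, website_text, model="gpt-4.1-mini"):
    # Same flow as summarize(), but the client and the page text are passed in
    messages = [
        {"role": "system", "content": "Summarize the website."},
        {"role": "user", "content": website_text},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

result = summarize_with(FakeClient(), "Example page text")
print(result)
```

Passing the client in as an argument, rather than constructing it inside the function, is a design choice made for this sketch; it lets the same function run against either a stub or a real client.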

### Check that the summarize function returns a summary for a website

In [6]:
summarize("https://edwarddonner.com")

"# Summary of Edward Donner's Website\n\nThis site belongs to Edward Donner, a tech enthusiast who codes, experiments with large language models (LLMs), and dabbles in DJing and amateur electronic music. He is the co-founder and CTO of Nebula.io, an AI-driven talent discovery platform used by recruiters, boasting patented technology and solid press coverage. Previously, he founded AI startup Untapt, which was acquired in 2021.\n\nThe site features unique projects like **Connect Four** and **Outsmart**, the latter being a clever arena where LLMs compete in diplomacy and deviousness.\n\n### Recent Announcements and Posts\n- **Nov 11, 2025:** The Unique Energy of an AI Live Event\n- **Sep 15, 2025:** AI in Production: Gen AI and Agentic AI on AWS at Scale\n- **May 28, 2025:** Be an AI Engineer and Leader: The Curriculum\n- **May 18, 2025:** 2025 AI Executive Briefing\n\n### Contact & Socials\nEd provides his email and social media links (LinkedIn, Twitter, Facebook) and offers a newslette

### Define a function to display the summary in the desired format

In [7]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

### Now we can call the display_summary function to display a summary of any website by passing its URL as an argument

In [8]:
display_summary("https://www.bbc.com/news")

# BBC News Website Summary

The website is the British Broadcasting Corporation's (BBC) official news portal, offering breaking news, videos, and the latest top stories from the U.S. and around the world. It covers a broad range of topics including global conflicts (Israel-Gaza War, War in Ukraine), regional news (US & Canada, UK, Africa, Asia, Europe, Latin America, Middle East), politics, business, technology, culture, arts, travel, earth sciences, and audio/video content.

## Recent News Highlights
- New calls to release all Jeffrey Epstein files following the release of messages suggesting Epstein claimed he could "take down" Donald Trump.
- Analysis of emails from the Epstein document release that allege Trump "knew about the girls," with the White House denying and calling it a smear attempt.
- Coverage of a deadly police raid in Rio de Janeiro that failed to disrupt the power of a criminal gang despite over 120 fatalities.

The site integrates multimedia with live news, podcasts, and videos, and features in-depth and verified reporting sections.

In [9]:
display_summary("https://www.mbrdi.co.in/")

# Mercedes-Benz Research & Development India

This website is for Mercedes-Benz's R&D center in India. It appears to focus on advanced automotive research and development activities carried out locally. No specific news or announcements are provided on the site.

In [10]:
display_summary("https://cnn.com")

# Website Summary: CNN

This website is CNN, a comprehensive news platform offering the latest breaking news, videos, and in-depth coverage across various topics including US and world news, politics, business, health, entertainment, sports, science, climate, and weather. It also covers major ongoing conflicts like the Ukraine-Russia War and the Israel-Hamas War.

**Features:**
- Multimedia content: live TV, videos, and podcasts.
- Personalized news experience with account sign-in and topic subscriptions.
- Sections for specific interests such as tech innovations, lifestyle, travel, and sports.
- Tools like polls, calculators, and market updates.
- News editions available in multiple languages including Arabic and Spanish.

**User Interaction:**
- Option to provide detailed feedback on ads, including relevance and technical issues.
- Personalized newsletters and topic tracking.

No specific news or announcements were highlighted in the provided content, but the site continuously updates with current events and major global issues.

In [11]:
display_summary("https://edwarddonner.com")

# Summary of Edward Donner's Website

Edward Donner is a coder, AI enthusiast, and co-founder/CTO of Nebula.io, a company leveraging AI to revolutionize talent discovery and management. He has a background as the founder and CEO of an AI startup, untapt, acquired in 2021. The site features his experiments with large language models (LLMs), including projects like **Connect Four** and **Outsmart**, the latter being an arena where LLMs compete in diplomacy and strategy.

## News & Announcements
- **Nov 11, 2025:** The Unique Energy of an AI Live Event
- **Sep 15, 2025:** AI in Production: Gen AI and Agentic AI on AWS at Scale
- **May 28, 2025:** Be an AI Engineer and Leader: The Curriculum
- **May 18, 2025:** 2025 AI Executive Briefing

The site also offers ways to connect via social media and subscribe to a newsletter.