### Import required classes and functions 

In [1]:
# imports

import os
import sys
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display

# Add the notebook's parent directory to sys.path so that scraper.py can be imported
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

# Now we can import modules from the parent directory
from scraper import fetch_website_contents

# If you get an error running this cell, then please head over to the troubleshooting notebook!
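
`scraper.py` itself isn't shown in this notebook, so here is a rough idea of what a helper like `fetch_website_contents` might do once it has the page HTML. This is only a sketch under assumptions: `extract_text`, `_TextExtractor`, and the `max_chars` limit are hypothetical names, it uses the standard library rather than whatever the real scraper uses, and it skips the download step entirely.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.title = ""
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title = text
        else:
            self.parts.append(text)


def extract_text(html, max_chars=2000):
    """Return the page title plus visible text, truncated to max_chars (an assumed limit)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    body = "\n".join(parser.parts)
    return (parser.title + "\n\n" + body)[:max_chars]


sample_html = (
    "<html><head><title>Demo</title><style>p{}</style></head>"
    "<body><script>var x=1;</script><p>Hello</p><p>World</p></body></html>"
)
extracted = extract_text(sample_html)
print(extracted)
```

Truncating the text keeps the prompt small, which matters once the contents are sent to a model.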

### Load environment variables and set api_key for OpenAI

In [2]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start with sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!
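
If you want to try these checks without a real key, they can be wrapped in a small pure function. This is a sketch, and `check_api_key` and its return labels are hypothetical names, not part of any library. It tests for stray whitespace before testing the prefix, so that a key with a leading space is reported as a whitespace problem rather than a wrong prefix.

```python
def check_api_key(api_key):
    """Classify a key as 'missing', 'whitespace', 'unexpected-prefix', or 'ok'."""
    if not api_key:
        return "missing"
    if api_key.strip() != api_key:
        return "whitespace"
    if not api_key.startswith("sk-proj-"):
        return "unexpected-prefix"
    return "ok"


# Made-up keys, for illustration only
for candidate in [None, " sk-proj-abc", "sk-abc", "sk-proj-abc"]:
    print(repr(candidate), "->", check_api_key(candidate))
```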


### Prepare prompts (messages) for OpenAI

In [3]:
# Define our system prompt

system_prompt = """
You are a snarky assistant that analyzes the contents of a website,
and provides a short summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

# Define our user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.
"""

### Define a function that returns messages for the website

In [4]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]
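
To see the structure this function returns without calling the API, here is a self-contained sketch. The two prompt strings are shortened stand-ins (assumptions for illustration), and the website text is made up.

```python
# Shortened stand-in prompts, not the full ones defined earlier
system_prompt = "You are a snarky assistant that summarizes websites."
user_prompt_prefix = "Here are the contents of a website.\n"


def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website},
    ]


# Made-up page text standing in for real scraped contents
messages = messages_for("Example Domain\n\nThis domain is for use in examples.")
for message in messages:
    print(message["role"], "->", repr(message["content"][:40]))
```

The result is a two-item list: one system message that sets the tone, and one user message that carries the instructions followed by the page text.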

### Define a function that summarizes the website at a given URL using the OpenAI Chat Completions API

In [5]:
def summarize(url):
    openai = OpenAI()
    website = fetch_website_contents(url)
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(website)
    )
    return response.choices[0].message.content
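
The reply text sits at `response.choices[0].message.content`. To make that shape visible without a network call or an API key, here is a sketch that passes the client in as a parameter and uses a stand-in. `FakeClient`, `FakeCompletions`, and `summarize_with` are hypothetical names, and the fake mimics only the attributes used here, not the real OpenAI client.

```python
from types import SimpleNamespace


class FakeCompletions:
    """Stand-in for client.chat.completions; returns a canned reply."""

    def create(self, model, messages):
        reply = f"# Summary\n\nReceived {len(messages)} messages for {model}."
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


class FakeClient:
    """Stand-in exposing only the chat.completions.create path."""

    def __init__(self):
        self.chat = SimpleNamespace(completions=FakeCompletions())


def summarize_with(client, messages, model="gpt-4.1-mini"):
    # Same call pattern as summarize(), but the client is injected
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


fake_messages = [
    {"role": "system", "content": "placeholder system prompt"},
    {"role": "user", "content": "placeholder website text"},
]
result = summarize_with(FakeClient(), fake_messages)
print(result)
```

Injecting the client this way also makes it easy to swap in a real `OpenAI()` instance later.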

### Check that the summarize function returns a summary for a website

In [6]:
summarize("https://edwarddonner.com")

'# Website Summary: Edward Donner\n\nThis is Edward Donner’s personal and professional site. Ed is a coder and LLM enthusiast who also dabbles in DJing and amateur electronic music production. He’s the co-founder and CTO of Nebula.io, an AI company focused on talent discovery and management, employing proprietary LLMs and patented matching models. Previously, he founded and sold an AI startup called untapt in 2021.\n\nThe site features projects like **Connect Four** and **Outsmart**, the latter described as a competitive arena where LLMs battle through diplomacy and strategy.\n\n## Recent Posts / Announcements\n- **September 15, 2025:** AI in Production: Gen AI and Agentic AI on AWS at scale\n- **May 28, 2025:** Connecting my courses – become an LLM expert and leader\n- **May 18, 2025:** 2025 AI Executive Briefing\n- **April 21, 2025:** The Complete Agentic AI Engineering Course\n\nYou can connect with Ed via email and follow him on LinkedIn, Twitter, and Facebook. A newsletter sign-up

### Define a function to display the summary in the desired format

In [7]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

### Now we can call the display_summary function to display a summary of any website by passing its URL as an argument

In [8]:
display_summary("https://www.bbc.com/news")

# BBC News Website Summary

The BBC News website offers comprehensive coverage of global and regional news, including breaking stories, videos, and top headlines. It covers a wide range of topics such as politics, business, sport, innovation, culture, science, and travel. The site features in-depth sections on major ongoing conflicts like the Israel-Gaza War and the War in Ukraine, alongside dedicated regional news from the US, UK, Africa, Asia, Europe, Latin America, and the Middle East.

## Recent News Highlights:
- **Illegal Gambling Crackdown:** Over 30 people arrested, including NBA stars Chauncey Billups, Terry Rozier, and Damon Jones.
- **US-Russia Diplomatic Tensions:** Trump's cancellation of Ukraine talks with Putin and new oil sanctions are expected to worsen relations, according to the BBC's Russia editor.
- **Nature Event:** Millions of red crabs begin their annual migration on Christmas Island, Australia, a spectacular natural phenomenon.

The site also features multimedia content such as podcasts, live updates, and videos, along with sections on climate, sustainable living, art, and culture.

In [9]:
display_summary("https://www.mbrdi.co.in/")

# Mercedes-Benz Research & Development India

This website represents the Indian arm of Mercedes-Benz's global R&D division. It focuses on innovation, technology development, and engineering excellence to support Mercedes-Benz's automotive advancements. No specific news or announcements are provided.

In [10]:
display_summary("https://cnn.com")

# Summary of the Website

This is the homepage of **CNN**, a major news network offering breaking news, latest news, and videos across a wide range of topics including US and world politics, business, health, entertainment, sports, climate, and more. The site features live TV and audio content, subscription options, and personalized account settings. It covers major ongoing events such as the Ukraine-Russia War and the Israel-Hamas War. 

There are interactive elements like ad feedback and user account management. The site organizes content by geography (e.g., Africa, Americas, Asia) and themes (e.g., Politics, Tech, Entertainment, Science), and includes specialized coverage for elections, markets, and lifestyle topics. 

No specific current news or announcements were detailed in the content provided, only broad category listings and user interface elements.

In [11]:
display_summary("https://edwarddonner.com")

# Website Summary: Edward Donner

This personal/professional site belongs to Ed Donner, a coder and AI enthusiast who experiments with Large Language Models (LLMs). Ed is also co-founder and CTO of Nebula.io, an AI startup focused on talent discovery and recruitment using proprietary, patented LLM technology. His background includes founding and selling AI startup untapt.

The site features:
- Personal interests like DJing and electronic music production.
- Projects such as "Connect Four" and "Outsmart," the latter being an LLM arena for strategic battles.
- Blog posts and announcements on AI topics and courses, with recent updates including:
  - AI in Production: Gen AI and Agentic AI on AWS at scale (Sept 2025)
  - Connecting courses to become an LLM expert and leader (May 2025)
  - 2025 AI Executive Briefing (May 2025)
  - The Complete Agentic AI Engineering Course (April 2025)

Contact info and social media links are provided for networking and subscriptions.