In [13]:
pip install -q -U google-generativeai

Note: you may need to restart the kernel to use updated packages.


In [14]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
import google.generativeai as genai

In [15]:
load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")

if not api_key:
    print("API key not found. Please set the GEMINI_API_KEY environment variable.")
else:
    print("API key loaded successfully.")

API key loaded successfully.


In [16]:
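# Register the key with the SDK explicitly, then create the model handle
genai.configure(api_key=api_key)
client = genai.GenerativeModel('gemini-2.5-pro')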

In [17]:
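# A class to represent a scraped webpage

class Website:
    """
    A webpage fetched with requests and reduced to its title and visible text.
    """
    def __init__(self, url):
        """
        Fetches the page at `url` and stores its title and cleaned text.
        """
        self.url = url
        # A timeout keeps the cell from hanging on an unresponsive server
        response = requests.get(url, timeout=30)
        # Fail loudly on 4xx/5xx instead of summarizing an error page
        response.raise_for_status()
        soup = BeautifulSoup(response.content, 'html.parser')
        # soup.title.string can be None when <title> is empty or nested
        self.title = soup.title.string if soup.title and soup.title.string else 'No title found'
        # Remove elements that contribute no readable text
        for irrelevant in soup(['script', 'style', 'img', 'input']):
            irrelevant.decompose()
        self.text = soup.get_text(separator='\n', strip=True)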

In [18]:
page = Website("https://cnn.com")
print(page.title)
print(page.text[:500])

Breaking News, Latest News and Videos | CNN
Breaking News, Latest News and Videos | CNN
CNN values your feedback
1. How relevant is this ad to you?
2. Did you encounter any technical issues?
Video player was slow to load content
Video content never loaded
Ad froze or did not finish loading
Video content did not start after ad
Audio on ad was too loud
Other issues
Ad never loaded
Ad prevented/slowed the page from loading
Content moved around while ad loaded
Ad was repetitive to ads I've seen previously
Other issues
Cancel
Submit
Thank You!
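
The first 500 characters above are mostly feedback-form text, not news, and a full homepage can run to many thousands of characters. A minimal sketch of one way to tidy the scraped text before it goes into a prompt follows. The helper name `compact_text` and the `max_chars` limit are assumptions made for illustration; they are not part of the notebook.

```python
def compact_text(text: str, max_chars: int = 20_000) -> str:
    """Drop blank and repeated lines, then cap the result at max_chars.

    Hypothetical helper: the 20,000-character default is an arbitrary
    illustration, not a limit imposed by any model.
    """
    seen = set()
    kept = []
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines and lines that already appeared earlier
        if not line or line in seen:
            continue
        seen.add(line)
        kept.append(line)
    return "\n".join(kept)[:max_chars]


sample = "Other issues\nAd never loaded\n\nOther issues\nCancel"
print(compact_text(sample))
```

This only removes exact repeats such as the second "Other issues" line. It does not know which lines are navigation or ad boilerplate, so the model still has to ignore some noise.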


System prompt: tells the model what task it is performing and what tone it should use.

User prompt: the conversation starter that the model should reply to.
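
The split between the two prompts can be sketched as plain data. This is only an illustration: `PromptPair` and `build_prompts` are hypothetical names, and the wording of the prompts is adapted from the cells in this notebook.

```python
from dataclasses import dataclass


@dataclass
class PromptPair:
    """Hypothetical container that keeps the two roles apart."""
    system: str  # task and tone, stated once
    user: str    # the page-specific request the model replies to


def build_prompts(title: str, text: str) -> PromptPair:
    # The system prompt describes the task and the output format
    system = (
        "You are a helpful assistant that analyses and summarizes web pages. "
        "Respond in markdown."
    )
    # The user prompt carries only the content to be summarized
    user = (
        f"You are looking at a website titled: {title}.\n\n"
        f"The content of this website is as follows:\n{text}"
    )
    return PromptPair(system=system, user=user)


prompts = build_prompts("Example page", "Hello world")
print(prompts.system)
print(prompts.user)
```

With `google.generativeai`, the system text is usually supplied when the model object is created (through the `system_instruction` argument, assuming the installed SDK version supports it), while the user text is what gets passed to `generate_content`.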

In [19]:
# Create a system prompt for the model

sys_prompt = (
    "You are a helpful assistant that analyses and summarizes web pages "
    "and provides a concise summary of the content. "
    "Respond in markdown."
)

In [20]:
# A function that writes a user prompt asking for a summary

def user_prompt(website):
    prompt = f"""You are a helpful assistant that analyses and summarizes web pages.
    Please provide a concise summary of the content in markdown format.

    You are looking at a website titled: {website.title}.

    The content of this website is as follows:
    {website.text}
    """
    return prompt

In [21]:
def summarize_website(url):
    website = Website(url)

    # Create the prompt
    content = user_prompt(website)

    # Generate the content
    response = client.generate_content(content)

    # Return the text from the response
    return response.text
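
Both the page fetch and the model call go over the network, so either can fail transiently. A minimal sketch of a retry wrapper with exponential backoff is shown below. The name `call_with_retries` and its parameters are assumptions for illustration, and catching every `Exception` is a simplification; in practice it would be narrowed to the error types that are actually worth retrying.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an exception, wait and retry, doubling the delay each time.

    Hypothetical helper: `sleep` is injectable so the wrapper can be
    exercised without real waiting.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Re-raise once the final attempt has failed
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with a stand-in function that fails twice, then succeeds
state = {"calls": 0}

def flaky_summary():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("temporary failure")
    return "summary text"

print(call_with_retries(flaky_summary, sleep=lambda seconds: None))
```

In the notebook this could wrap the existing function, for example `call_with_retries(lambda: summarize_website("https://cnn.com"))`.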

In [22]:
summarize_website("https://cnn.com")

'Of course! Here is a concise summary of the content on the CNN homepage.\n\n***\n\n### Summary of CNN Homepage\n\nThis is the main portal for CNN, a major news organization. The page provides a broad overview of the day\'s top news stories, featuring a mix of US and international events, politics, business, and entertainment.\n\n**Top Stories:**\n*   **US Government Shutdown:** A major focus is on a government shutdown, which is causing travel delays at major airports. There are related stories about President Trump considering withholding back pay from federal workers and the political messaging from the GOP.\n*   **Supreme Court:** The court is reportedly prepared to rule against a ban on conversion therapy.\n*   **Pam Bondi Hearing:** The attorney general nominee is facing a contentious Senate confirmation hearing, with questions about National Guard deployments and various investigations.\n*   **Gaza Ceasefire Talks:** Live updates indicate that talks are making progress.\n\n**Key

In [23]:
# A function to display the summary as rendered Markdown

def display_summary(url):
    summary = summarize_website(url)
    display(Markdown(summary))

In [24]:
display_summary("https://cnn.com")

Of course. Here is a concise summary of the CNN webpage content in markdown format.

### Overview

This is the main homepage for CNN, a major news organization. It serves as a portal to a wide range of global and US news, featuring top headlines, live updates, videos, and various content sections.

### Top News Stories

*   **US Politics:** The lead stories focus on a potential **government shutdown**, its impact on travelers and federal workers, and related political maneuvering. There are also articles on a Supreme Court ruling concerning conversion therapy and the contentious Senate hearing for Pam Bondi.
*   **International Affairs:** **Live updates on Gaza ceasefire talks** and the ongoing **Israel-Hamas War** are prominently featured. The Ukraine-Russia war is also a key navigation topic.
*   **Donald Trump:** Several headlines involve the former president, including his comments on the shutdown, National Guard deployments, and the Epstein investigation.
*   **Investigations:** There is a dedicated section on the fallout from the **Jeffrey Epstein investigations**, including a Supreme Court appeal from Ghislaine Maxwell.

### Featured Content & Sections

*   **CNN Underscored:** A significant portion of the page is dedicated to product reviews and deals, with a heavy emphasis on **Amazon Prime Day** sales on items like shoes, electronics, and kitchen essentials.
*   **Videos:** A large collection of recent video clips covering news events, interviews, and human-interest stories.
*   **Live Updates:** Dedicated, real-time coverage of breaking news events like the Gaza talks and the government shutdown.
*   **Analysis & Exclusives:** In-depth articles and opinion pieces are available, some of which are marked for subscribers only.
*   **Other Categories:** The site is organized into broad sections including World, Business, Health, Entertainment, Sports, Climate, and Style.

### Website Features

*   **Multimedia:** The site offers content via articles, live TV, video clips, and podcasts ("Listen").
*   **Interactivity:** Users can play games (crosswords, Sudoku) and take quizzes.
*   **Personalization:** Options to sign in, create an account, and follow specific topics.