In [1]:
# imports
import requests
import ollama
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [2]:
# Constants

MODEL = "llama3.2:1b"

In [3]:
# A class to represent a Webpage

# Some websites need browser-like headers when you fetch them:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
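        response = requests.get(url, headers=headers, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        # soup.title.string can be None (e.g. an empty <title>), so check both
        self.title = soup.title.string if soup.title and soup.title.string else "No title found"
        if soup.body:
            # Drop elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Pages without a <body> would otherwise raise an AttributeError
            self.text = ""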
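
The class above removes non-text elements and then joins the remaining text with newlines. The sketch below illustrates that idea offline on a small inline HTML string. It deliberately swaps in the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages; `TextExtractor`, `visible_text`, and `sample_html` are hypothetical names for illustration, not part of the notebook.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Assumed sample input, made up for this sketch
sample_html = (
    "<body><h1>Hello</h1><script>var x = 1;</script>"
    "<style>p {color: red}</style><p>World</p></body>"
)
extracted = visible_text(sample_html)
print(extracted)
```

BeautifulSoup handles malformed markup far more gracefully than this stand-in, which is likely why the notebook uses it for real pages.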

In [4]:
# Define system prompt

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [5]:
# A function that writes a user prompt asking for a summary of a website:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}.\n"
    user_prompt += "The contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [6]:
# Build the messages list: a system prompt followed by the user prompt
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
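
To check the shape of the list that `messages_for` produces without fetching a page, one option is to pass in a stand-in object that only has `title` and `text` attributes. The sketch below is self-contained, so it redefines a shortened prompt and a builder under hypothetical names (`SYSTEM_PROMPT`, `build_messages`, `fake_site`); the notebook's own functions should behave the same way for any object with those two attributes.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a Website instance: only .title and .text are used
fake_site = SimpleNamespace(
    title="Example Domain",
    text="This domain is for use in illustrative examples.",
)

# Shortened system prompt, assumed for this sketch only
SYSTEM_PROMPT = "You summarize websites. Respond in markdown."


def build_messages(site):
    """Return a two-entry chat list: system instructions, then the user request."""
    user_prompt = (
        f"You are looking at a website titled {site.title}.\n"
        "Please provide a short summary of this website in markdown.\n\n"
        f"{site.text}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages(fake_site)
for message in messages:
    print(message["role"], "->", len(message["content"]), "characters")
```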

In [7]:
# Summarize a website by sending the messages to the local Ollama model

def summarize(url):
    website = Website(url)
    messages = messages_for(website)
    response = ollama.chat(model=MODEL, messages=messages)
    return response['message']['content']
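
`summarize` needs a running Ollama server with the model pulled. For a dry run of the same flow, a stub can stand in for the chat call. This is a minimal sketch: `fake_chat` and `summarize_with` are hypothetical helpers, and the stub only mimics the nested `['message']['content']` shape that the notebook indexes into.

```python
def fake_chat(model, messages):
    """Hypothetical stand-in for the chat call; returns the nested shape used above."""
    user_text = messages[-1]["content"]
    reply = f"**Stub summary** ({len(user_text)} characters of input, model {model})"
    return {"message": {"role": "assistant", "content": reply}}


def summarize_with(chat_fn, model, messages):
    """Send messages through chat_fn and return only the reply text."""
    response = chat_fn(model=model, messages=messages)
    return response["message"]["content"]


# Assumed demo input, made up for this sketch
demo_messages = [
    {"role": "system", "content": "You summarize websites. Respond in markdown."},
    {"role": "user", "content": "Example page text."},
]

summary = summarize_with(fake_chat, "llama3.2:1b", demo_messages)
print(summary)
```

Passing the chat function as an argument keeps the rest of the pipeline testable; swapping `fake_chat` for the real client call should be the only change needed, assuming the response keeps the same shape.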

In [8]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [9]:
display_summary("https://www.bbc.com")

This is a list of news stories from various sources, including the BBC, that cover a range of topics such as sports, business, arts and culture, science and health, travel, and more. Here are some key points:

**Sports**

* Arsenal's manager Mikel Arteta has confirmed that his team will challenge for the Premier League title this season.
* French skier Cyprien Sarrazin is out of the United Cup after suffering a serious injury.

**Business and Finance**

* BYD, a Chinese electric car manufacturer, has closed in on Tesla as sales jump.
* A British house price index shows that UK house prices rose by 4.7% in 2024.

**Arts and Culture**

* The BBC has announced plans to produce new content in six new languages, including Korean, Spanish, and Arabic.
* A US artist is looking back on his childhood stardom after being banned from driving for mobile phone use.

**Health and Science**

* Scientists have developed a new test that could rewrite British history by identifying cancers at an earlier stage.
* The UK's National Health Service (NHS) has reported an increase in hangovers over the past year, with 6% of adults experiencing severe hangovers.

**Environment and Travel**

* Researchers are racing to study Antarctica before it melts, as its fragile ecosystem is vulnerable to climate change.
* A Swiss hotel has been closed for 12 hours a day due to weather conditions.

**Technology and Innovation**

* The BBC's Newsbeat team has reported on the development of a new technology that helps protect against avalanches and snowfall in Arctic communities.