# Website Summarizer using Ollama locally

Building a website summarizer that uses Llama 3.2 running locally on my machine.

Created: 3/11/2025

In [1]:
import ollama
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
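
# Constants
# OLLAMA_API and HEADERS are only needed when calling the REST endpoint directly;
# the ollama package used below talks to the local server on its own.
OLLAMA_API = "http://localhost:11434/api/chat"  # chat endpoint of the local Ollama server
HEADERS = {"Content-Type": "application/json"}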
MODEL = "llama3.2"
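
The `OLLAMA_API` and `HEADERS` constants are defined here, but the cells that follow go through the `ollama` package instead. As a rough sketch of how the same chat request could be sent straight to the local REST endpoint, the snippet below builds the JSON body and posts it with the standard library. The helper names `build_chat_payload` and `chat_via_http` are made up for illustration, the 120-second timeout is an arbitrary choice, and the HTTP call assumes an Ollama server is already running on port 11434 with `llama3.2` pulled.

```python
import json
import urllib.request

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"


def build_chat_payload(model, messages):
    """Return the JSON body for a single, non-streaming chat request."""
    return {"model": model, "messages": messages, "stream": False}


def chat_via_http(messages, model=MODEL, url=OLLAMA_API, timeout=120):
    """POST the payload to the local Ollama server and return the reply text.

    Hypothetical helper: it assumes the server is reachable at `url`.
    """
    body = json.dumps(build_chat_payload(model, messages)).encode("utf-8")
    request = urllib.request.Request(url, data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["message"]["content"]


# Building the payload needs no server, so it can be inspected offline.
payload = build_chat_payload(MODEL, [{"role": "user", "content": "Say hello"}])
print(json.dumps(payload, indent=2))
```

With the server running, `chat_via_http([{"role": "user", "content": "Say hello"}])` should return the model's reply as a plain string. Setting `"stream": False` asks for one complete JSON response rather than a sequence of partial chunks.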

In [2]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [3]:
# Some websites reject requests that lack a browser-like User-Agent header:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

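    def __init__(self, url):
        """
        Fetch the page at the given url and extract its title and visible text
        using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error page
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title and soup.title.string else "No title found"
        if soup.body:
            # Drop elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Some responses have no <body>; avoid an AttributeError on None
            self.text = ""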

ourwebsite = Website("https://espn.com")

In [4]:
# A function that writes a user prompt asking for a summary of a website:

def user_prompt_for(ourwebsite):
    user_prompt = f"You are looking at a website titled {ourwebsite.title}"
    # Adjacent string literals are joined without the stray indentation
    # that a backslash continuation inside the string would embed.
    user_prompt += (
        "\nThe contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += ourwebsite.text
    return user_prompt

In [5]:
def messages_for(ourwebsite):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(ourwebsite)}
    ]
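
To make the shape of the request concrete, here is a small sketch that builds the same two-message list for a stand-in page object rather than a live site. `SamplePage`, `build_messages`, and the sample contents are invented for illustration, and the prompts are shortened versions of the ones used in this notebook.

```python
from dataclasses import dataclass


@dataclass
class SamplePage:
    """Hypothetical stand-in for Website: only the fields the prompts read."""
    title: str
    text: str


SYSTEM_PROMPT = "You summarize websites in markdown, ignoring navigation text."


def build_messages(page):
    # The system message sets the model's behavior; the user message carries the page.
    user_prompt = (
        f"You are looking at a website titled {page.title}\n"
        "Please provide a short summary of this website in markdown.\n\n"
        f"{page.text}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages(SamplePage(title="Example Site", text="Welcome to the example."))
for message in messages:
    print(message["role"], "->", len(message["content"]), "characters")
```

Each entry is a plain dict with a `role` and a `content` key, and the order matters: the system message comes first, followed by the user message that holds the scraped text.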

In [6]:
def summarize(url):
    website = Website(url)
    response = ollama.chat(
        model = MODEL,
        messages = messages_for(website)
    )
    return response['message']['content']

# Call the function; the notebook echoes the returned string as raw text
summarize("https://espn.com")

"It appears that the main section of the page is dedicated to various sports news and updates from around the world. Here are some of the top headlines and sections:\n\n**Sports News**\n\n* Jones joins Colts, to battle Richardson for QB job\n* Sources: Texans acquire Eagles' Gardner-Johnson\n* Vikings add Allen, Fries to strengthen both lines\n* 49ers' purge continues with release of DE Floyd\n* Cowboys, Turpin agree to 3-year deal\n\n**Fantasy Football**\n\n* Reactivate A League\n* Create A League\n* Join a Public League\n* Practice With a Mock Draft\n\n**NBA News**\n\n* Baron Davis found his voice as rapper Bart Oatmeal (feature article)\n* Kevin Durant has to be separated from Cam Spencer, Desmond Bane (video clip)\n\n**Hockey News**\n\n* NHL trade grades: Report cards for the biggest deals of 2025 (article)\n* Ovechkin career goal record chase: Tracker, highlights, more (tracker and video)\n\n**Baseball News**\n\n* How A's, Rays moving to minor league ballparks means for MLB (featu

In [7]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [8]:
display_summary("https://espn.com")

There are many news articles and updates on various sports and entertainment topics in the provided text. Here's a summary of some of the key headlines:

**NFL**

* Jones joins Colts to battle Richardson for QB job
* Sources: Texans acquire Eagles' Gardner-Johnson
* Vikings add Allen, Fries to strengthen both lines
* 49ers' purge continues with release of DE Floyd

**NBA**

* NBA intel: Scouts, execs on Lakers', Warriors', Mavs' next steps
* Coach changes affect production on offense and defense (Florida State)

**Hockey**

* NHL trade grades: Report cards for the biggest deals of 2025
* Ovechkin career goal record chase: Tracker, highlights, more

**Baseball**

* A's, Rays moving to minor league ballparks means for MLB
* How to watch ESPN+ on Disney+: Frequently asked questions

**Football**

* Fantasy Football Home
* NFL Free Agency
* Men's Bracketology
* Women's Bracketology
* NFL Draft
* Men's Tournament Challenge
* Women's Tournament Challenge

**Other Sports**

* Unrivaled 3-on-3
* X Games
* SEC Network

**Entertainment**

* How Baron Davis found his voice as rapper Bart Oatmeal (Baron Davis)
* Kevin Durant has to be separated from Cam Spencer, Desmond Bane (NBA game)

You can find more news and updates on these topics by visiting the ESPN website or checking out their social media channels.

# Some pages, such as openai.com, cannot be scraped with requests alone and would need Selenium.