### Web Page Summarizer using Ollama

This project uses the open-source LLM llama3.2.

***What does it do?*** It scrapes the relevant text from a web page, then uses llama3.2, running locally through Ollama, to summarize the content.



In [1]:
# imports
import requests
import ollama  # used by summarize() further down
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

# Constants
OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"
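
The `OLLAMA_API` and `HEADERS` constants are not used by the cells below, which go through the `ollama` Python package instead. As a rough sketch of what they are for, the same chat request can be sent to the local REST endpoint directly. The helper names `build_chat_payload` and `chat_via_http` are hypothetical and not part of the original notebook. The standard library's `urllib` is used here so the block runs without extra packages; `requests.post(OLLAMA_API, json=payload, headers=HEADERS)` should work just as well.

```python
import json
import urllib.request

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"


def build_chat_payload(messages, model=MODEL):
    """Build the JSON body for Ollama's /api/chat endpoint (hypothetical helper)."""
    # "stream": False asks for one complete JSON reply instead of a stream of chunks
    return {"model": model, "messages": messages, "stream": False}


def chat_via_http(messages, model=MODEL, timeout=120):
    """POST the messages to a locally running Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(messages, model)).encode("utf-8")
    request = urllib.request.Request(OLLAMA_API, data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply["message"]["content"]


# Building the payload needs no server; chat_via_http() needs Ollama running locally
payload = build_chat_payload([{"role": "user", "content": "Say hello in one sentence."}])
print(json.dumps(payload, indent=2))
```

Calling `chat_via_http(...)` only succeeds when an Ollama server is listening on port 11434 and the `llama3.2` model has been pulled.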

#### Scraping the website

In [2]:
# These headers are needed to scrape some websites properly
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
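        # A timeout keeps the notebook from hanging on an unresponsive site
        response = requests.get(url, headers=headers, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        # Some pages have no <body>; fall back to the whole document in that case
        root = soup.body if soup.body is not None else soup
        # Remove elements that carry no readable text before extracting it
        for irrelevant in root(["script", "style", "img", "input"]):
            irrelevant.decompose()
        self.text = root.get_text(separator="\n", strip=True)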
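
To see what the cleaning step in `Website` does without fetching a live page, here is a minimal sketch that applies the same BeautifulSoup calls to an inline HTML string. `SAMPLE_HTML` and `extract_title_and_text` are made up for illustration and are not part of the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, standing in for a downloaded response body
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <script>console.log("ignored");</script>
    <style>p { color: red; }</style>
    <h1>Heading</h1>
    <p>First paragraph.</p>
    <img src="x.png" alt="ignored image">
  </body>
</html>
"""


def extract_title_and_text(html):
    """Return (title, visible text) using the same steps as the Website class."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        return title, ""
    # Calling a tag with a list of names is shorthand for find_all()
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    return title, soup.body.get_text(separator="\n", strip=True)


title, text = extract_title_and_text(SAMPLE_HTML)
print(title)
print(text)
```

Only the heading and the paragraph survive; the script, style, and image elements are removed before the text is extracted.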


The model expects its input as two prompts:

* A system prompt that tells it what task it is performing and what tone to use
* A user prompt, the conversation starter that it should reply to

In [7]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

Function to create the user prompt:

In [6]:
def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [9]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
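
To check the shape of the message list without scraping anything, the sketch below builds it from a stand-in object. The prompt definitions are repeated in condensed form so the block runs on its own, and `fake_site` is a hypothetical placeholder for a real `Website` instance.

```python
from types import SimpleNamespace

system_prompt = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related. "
    "Respond in markdown."
)


def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += (
        "\nThe contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += website.text
    return user_prompt


def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)},
    ]


# Stand-in with the two attributes the prompt functions read (title and text)
fake_site = SimpleNamespace(
    title="Example Domain",
    text="This domain is for use in illustrative examples.",
)

messages = messages_for(fake_site)
for message in messages:
    print(f"[{message['role']}]")
    print(message["content"])
    print()
```

The result is a two-element list, system message first and user message second, which is the structure passed to `ollama.chat` in `summarize`.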

In [22]:
def summarize(url):
    website = Website(url)
    response = ollama.chat(model=MODEL, messages=messages_for(website))
    display(Markdown(response['message']['content']))

In [23]:
import ollama


url = "https://www.britannica.com/place/India/The-Indian-Paleolithic"
summarize(url)

The text discusses the early human settlement and cultural development of India and its surrounding regions, from the Paleolithic period to the Neolithic period. Here's a summary:

**Paleolithic Period (c. 2.5 million - 10,000 BCE)**

* The Great Indian Desert was an important source of archaeological materials in the late 20th century.
* Hand axes found at Didwana, Rajasthan, and other sites date back to around 400,000 years ago.
* Mesolithic cultures are evident throughout India, with a wide variety of subsistence patterns, including hunting, gathering, fishing, herding, and small-scale agriculture.
* Many caves in central India contain rock paintings depicting various subjects, including game animals and human activities.

**Mesolithic Period (c. 10,000 - 5,000 BCE)**

* The progressive diminution in the size of stone artifacts reached its climax with the small parallel-sided blades and microliths of what has been called the Indian Mesolithic.
* Cultures during this period exhibited a wide variety of subsistence patterns, including hunting, gathering, fishing, and herding.

**Neolithic Period (c. 5,000 - 2,000 BCE)**

* The Indo-Iranian borderlands form the eastern extension of the Iranian plateau and mirror the environment of the Fertile Crescent in the Middle East.
* Excavations at Mehrgarh in Baluchistan provided evidence of early settlements on the borders of the Indus system and Baluchistan, with two major periods of occupation dating back to around 8,000 BCE.
* The first examples of pottery, granary structures, and monumental architecture appeared during this period.

**Other Sites**

* Sarai Khola (Pakistan Punjab) had the earliest known settlements, dating back to around 4,000 BCE.
* Burzahom in the Vale of Kashmir also had early settlements, with deep pit dwellings associated with ground stone axes and bone tools.
* Gufkral, another site in the Kashmir region, has been dated by radiocarbon to the 3rd millennium BCE.

Overall, this text provides a brief overview of the early human settlement and cultural development of India and its surrounding regions, from the Paleolithic period to the Neolithic period.