## Importing the required libraries and connecting to a locally hosted Ollama server through the OpenAI library

In [1]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

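# Ollama serves an OpenAI-compatible API under /v1. The client requires an
# api_key value, but Ollama does not check it.
ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')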


## Defining a Website class that extracts text from a given website using the BeautifulSoup library

In [2]:
# Some sites reject requests that lack a browser-like User-Agent header.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

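class Website:
    """Fetch a web page and keep its title and visible text."""

    def __init__(self, url):
        self.url = url
        # A timeout keeps the call from hanging; raise_for_status surfaces HTTP errors.
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Drop elements that carry no readable text before extracting it.
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""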
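
To see what the extraction step keeps and discards, the same BeautifulSoup calls can be run on an inline HTML string instead of a live page. This is a minimal sketch, not part of the original notebook: `sample_html` and `extract_title_and_text` are hypothetical names introduced for illustration, and the `soup.body is None` check is an added assumption for documents that have no `<body>` element.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page used only for this illustration.
sample_html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <h1>Welcome</h1>
    <script>console.log("tracking");</script>
    <style>h1 { color: red; }</style>
    <p>First paragraph.</p>
    <img src="logo.png" alt="logo">
    <input type="text" name="q">
  </body>
</html>
"""

def extract_title_and_text(html):
    """Return (title, visible_text) using the same steps as the Website class."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        # Assumption: treat a document without <body> as having no text.
        return title, ""
    # Remove elements that carry no readable text.
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    # Join the remaining text fragments with newlines, stripping whitespace.
    text = soup.body.get_text(separator="\n", strip=True)
    return title, text

title, text = extract_title_and_text(sample_html)
print(title)       # Example page
print(repr(text))  # 'Welcome\nFirst paragraph.'
```

Only the heading and paragraph text survive; the script, style, image, and input elements are removed before the text is collected.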

## Defining the system and user prompts, and displaying the response as Markdown for readability

In [3]:
# Adjacent string literals are concatenated, so no indentation leaks into the prompt.
system_prompt = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related. "
    "Respond in markdown."
)

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += (
        "\nThe contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += website.text
    return user_prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

def summarize(url):
    website = Website(url)
    # The model must already be available locally (e.g. via `ollama pull llama3.2`).
    response = ollama_via_openai.chat.completions.create(
        model="llama3.2",
        messages=messages_for(website)
    )
    return response.choices[0].message.content

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))
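
The message list can be inspected without fetching a page or calling the model. The sketch below is a minimal illustration, not part of the original notebook: it repeats the prompt builders from this cell so that it runs on its own, and `fake_site` is a hypothetical stand-in built with `types.SimpleNamespace` that provides only the `title` and `text` attributes the builders read.

```python
from types import SimpleNamespace

system_prompt = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related. "
    "Respond in markdown."
)

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += (
        "\nThe contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += website.text
    return user_prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)},
    ]

# Hypothetical stand-in for a Website instance; no network request is made.
fake_site = SimpleNamespace(title="Example page", text="Welcome\nFirst paragraph.")
messages = messages_for(fake_site)

# Print each message so the exact prompt text can be reviewed.
for message in messages:
    print(f"--- {message['role']} ---")
    print(message["content"])
```

Reviewing the printed prompts this way makes it easier to spot stray whitespace or wording problems before spending time on a model call.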

## Output

In [4]:
display_summary("https://cnn.com")

Here are some news articles and topics from the CNN website:

**US Politics**

* Trump administration targets Harvard's patents (business)
* Donald Trump says Qatari jet could be ready for use as Air Force One in 6 months (politics)
* Republicans reprise anti-transgender 'Kamala is for they/them' ads for the midterms (politics)

**World News**

* China claims South China Sea island's environmental concerns were fabricated (Asia)
* Beijing denies China has 'super-embassy' plans in London (Europe)
* In India, opposition leader gets life sentence over sedition charges (India)

**Crime and Investigations**

* Man accused of killing 4 people at Montana bar is in custody following weeklong manhunt (crime)
* US District Court asks to release Epstein grand jury transcripts (justice)
* Airman charged in fatal firearm incident at Wyoming Air Force Base (crime)

**Entertainment and Lifestyle**

* Billie Eilish insists on special rules at all her concerts due to environmental concerns (entertainment)
* Private Welsh island with 19th-century fort goes on the market (real estate)
* Sex toys thrown at WNBA games reveal weird window into online culture and reopen old wounds for women's sports (features)

**Business**

* Waterslide cracks open on world's largest cruise ship, injuring one passenger (travel)
* Uber CEO announces departure from the company after 15 years (business)

**Health**

* A Nicaraguan immigrant and his family have been barricaded at home for days after he outran ICE agents (health)

**Environmental News**

* At 100 years old, Japan's atomic bomb survivors are ready to share long-buried stories about Hiroshima nuclear bombing (climate)
* Private Welsh island with 19th-century fort goes on the market (sustainability)