In [2]:
# imports

import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display

In [3]:
# Constants

OLLAMA_API = "http://localhost:11434/api/chat"
HEADERS = {"Content-Type": "application/json"}
MODEL = "llama3.2"

In [4]:
# Let's just make sure the model is loaded

#!ollama pull llama3.2

In [5]:
# Some websites need you to use proper headers when fetching them:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

# A class to represent a Webpage
class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
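        # Some pages have no <body>; fall back to empty text instead of crashing
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""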

In [6]:
# Let's check one website as an example
ex = Website("https://vg247.com")
print(ex.title)
print(ex.text)

VG247
Skip to main content
Monopoly Go Dice Links
Roblox Hunt code
Volleyball Legends codes
Fragpunk Tier List
Monster Hunter Wilds Guide
Roblox game codes
Basketball Showdown Codes
News
Reviews
Articles
Guides
Baldur's Gate 3
Diablo IV
Fallout 4
Elden Ring
Black Myth: Wukong
Deals
Sign in / Create account
Support us
Buy eShop codes
Home
For you
Build-A-Battlefield
I’ve watched enough leaked Battlefield Labs gameplay to convince me that Battlefield is finally getting this one fundamental thing right, at least
DICE and all the other teams currently working on the next Battlefield are, finally, building on what came before.
Sherif Saed
3 days ago
1
Temple of the mind
Want to know how Warframe got its own bizarre Sci-Fi David Bowie? Here's how Temple came to be, from a sketch to your Sortie
"This one in particular was one of the most self-indulgent, coolest set of things I could put into a Warframe"
Connor Makar
3 days ago
0
Barging Onto Market
A Nintendo Switch 2 may set you back $400 or

In [7]:
# Let's define our system prompt

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [8]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [9]:
# Check the function

print(user_prompt_for(ex))

You are looking at a website titled VG247
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Skip to main content
Monopoly Go Dice Links
Roblox Hunt code
Volleyball Legends codes
Fragpunk Tier List
Monster Hunter Wilds Guide
Roblox game codes
Basketball Showdown Codes
News
Reviews
Articles
Guides
Baldur's Gate 3
Diablo IV
Fallout 4
Elden Ring
Black Myth: Wukong
Deals
Sign in / Create account
Support us
Buy eShop codes
Home
For you
Build-A-Battlefield
I’ve watched enough leaked Battlefield Labs gameplay to convince me that Battlefield is finally getting this one fundamental thing right, at least
DICE and all the other teams currently working on the next Battlefield are, finally, building on what came before.
Sherif Saed
3 days ago
1
Temple of the mind
Want to know how Warframe got its own bizarre Sci-Fi David Bowie? Here's how Temple came to be, from a sketch to your Sor

# Let's ask Ollama

In [10]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [11]:
def summarize(url):
    website = Website(url)
    payload = {
        "model": MODEL,
        "messages": messages_for(website),
        "stream": False
    }
    response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)
    response.raise_for_status()  # fail early if Ollama returned an HTTP error
    return response.json()['message']['content']
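
The `summarize` function sets `"stream": False`, so the whole reply arrives as one JSON object. As a rough sketch of the alternative, and assuming that a streamed chat reply arrives as one JSON object per line, each carrying a `message.content` fragment and a final `"done": true`, the fragments could be joined like this. The helper name `collect_streamed_content` and the sample lines are hypothetical and only there so the sketch runs without a server.

```python
import json


def collect_streamed_content(lines):
    """Join the content fragments of a streamed chat reply.

    `lines` is any iterable of JSON strings (or bytes), one object per line.
    Assumption: each object looks like
    {"message": {"content": "..."}, "done": false}.
    """
    parts = []
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # the last object marks the end of the reply
    return "".join(parts)


# Simulated stream (made-up data, not real model output)
sample = [
    '{"message": {"role": "assistant", "content": "# Sum"}, "done": false}',
    '{"message": {"role": "assistant", "content": "mary"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(collect_streamed_content(sample))
```

With `requests`, the lines would probably come from `requests.post(..., stream=True)` followed by `response.iter_lines()`; that wiring is left out here so the sketch stays testable offline.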

In [12]:
summarize("https://geekozmos.com")

'# Geekozmos Website Summary\n### Overview\nGeekozmos is a website founded on August 1st, 2023, focusing on tech, gaming, and entertainment news.\n\n### Latest News and Announcements\n\n*   Apple has halted sales of the Apple Watch Series 9 and Ultra 2 due to a patent dispute with Masimo.\n*   Android 15 will introduce a "Battery Health" feature, providing users insights into their device\'s performance and longevity.\n*   Insomniac Games\' next game will be Marvel\'s Venom.\n\n### Gaming Section\n\n*   PS4 games are backward compatible with the PS5.\n*   Death Stranding 2: On The Beach is scheduled for release in 2025 exclusively for PlayStation 5.\n*   Prince of Persia: The Lost Crown system requirements have been announced.\n\n### Entertainment Section\n\n*   The Deadpool 3 trailer has been released, showcasing Marvel\'s Jesus and Wade Wilson.\n*   Matthew Perry, a beloved \'Friends\' star, passed away at 54.\n*   Léa Seydoux\'s best movies are featured on the website, highlighting 

In [13]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [14]:
display_summary("https://geekozmos.com")

**Summary of Geekozmos**
=====================================

Geekozmos is a website that covers various topics including tech, gaming, and entertainment. The website was founded on August 1st, 2023, and provides news, reviews, and content on the latest developments in these fields.

### News and Announcements

*   **Android 15**: Will introduce a "Battery Health" feature, offering users insights into their device's performance and longevity.
*   **Apple Watch Series 9 and Apple Watch Ultra 2**: Sales have been halted due to a patent dispute with Masimo over blood oxygen sensing technology.
*   **Insomniac Games' Next Game**: Will be Marvel's Venom, according to leaked files from the company's servers.
*   **Deadpool 3 Trailer**: Has been released, featuring Marvel's Jesus, Wade Wilson embarking on a journey to reshape the MCU.
*   **Death Stranding 2: On The Beach**: Will be released exclusively for PlayStation 5 in 2025.

### Gaming News

*   **PS4 Games on PS5**: Yes, PS5 is backward compatible with most PS4 titles.
*   **Prince of Persia: The Lost Crown**: System requirements have been announced, shedding light on the hardware needed to unlock the full potential of this epic journey.

### Entertainment News

*   **The Last of Us Multiplayer Project**: Has been cancelled, with Naughty Dog choosing a single-player narrative focus.
*   **Matthew Perry's Passing**: Beloved 'Friends' star Matthew Perry has passed away at 54.
*   **Best Léa Seydoux Movies**: A journey through the actress's remarkable career explores her filmography.

### Tech News

*   **iOS 17 Bug**: New iPhone 15 models are experiencing a freeze due to an iOS 17 bug.
*   **Microsoft Recruiting**: Former OpenAI CEO Sam Altman is joining Microsoft amid board controversy.

# BONUS: Try with DeepSeek R1

Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B.  
This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.

Other sizes of DeepSeek R1 are listed [here](https://ollama.com/library/deepseek-r1), all the way up to the full 671B-parameter version, which would use up 404GB of your drive and is far too large for most machines!

In [None]:
#!ollama pull deepseek-r1:1.5b

In [15]:
MODEL = "deepseek-r1:1.5b"
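
Reassigning the global `MODEL` works because `summarize` looks the name up each time it is called. A possible alternative, sketched here with a hypothetical helper `build_payload` that is not part of the notebook, is to pass the model name explicitly so two models can be compared without mutating shared state:

```python
def build_payload(model, messages, stream=False):
    """Build the request body for the chat endpoint (hypothetical helper)."""
    return {"model": model, "messages": messages, "stream": stream}


# Placeholder messages, only for illustration
demo_messages = [
    {"role": "system", "content": "Summarize the page in markdown."},
    {"role": "user", "content": "Page text goes here."},
]

# Same messages, two different models, no global reassignment needed
for name in ("llama3.2", "deepseek-r1:1.5b"):
    payload = build_payload(name, demo_messages)
    print(payload["model"], len(payload["messages"]))
```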

In [16]:
display_summary("https://geekozmos.com")

<think>
Okay, so I'm looking at this website called Geekozmos. The user provided a list of articles and news about it, including the homepage with links to various topics like Tech, Gaming, etc. My task is to act as an AI assistant that summarizes this information in markdown format.

Hmm, first I need to understand what each section covers. On the homepage, there's a "Homepage" section which seems to list several articles and news items related to tech, gaming, entertainment, etc. Then under "Tech," there are subheadings like Battery Health Feature May Come With Android 15, Apple Halts Sales of Apple Watch Series 9 and Ultra 2, iOS 17 Bug is Causing New iPhone 15 Models to Freeze, and others.

In the "Gaming" section, there's a lot about different games like Dead Stranding 2,Prince of Persia: The Lost Crown System Requirements Announced, Insomniac Games’ Next Game Will Be Marvel’s Venom!, and Anomaly Agent Review. These seem to be news items or summaries of game updates.

Under "Entertainment," it talks about Deadpool 3 trailer release, Matthew Perry passing away at 54, the Best Léa Seydoux Movies review, and exploring Michael Caine movies.

In "News," there are a few articles like Hideo Kojima passes away, Alan Wake 2, and The Day Before’s Developer Studio closes doors. Also, some news about PC, PlayStation 5, Xbox series x, etc.

I need to condense this information into markdown without any markdown formatting but using headers for each main category: Tech, Gaming, Entertainment, News. Under each category, bullet points with the key subheadings and the relevant content from the website.

Wait, the user's instructions mention that I should act as an AI assistant with a certain level of language, perhaps more conversational than just formal or technical. So maybe I can elaborate on how I approached this task, like identifying each main section and summarizing it into markdown.

I also need to ensure that the markdown is correctly formatted, using headers and bullet points without any extra formatting like <>, which may have been used in the original response. The user provided an example of how they did it, so I should follow that structure closely.

So putting it all together: Start with a header for " Geekozmos " or something similar, but since the example used headers like "Home," perhaps just using sections separated by hyphens. Then under each section, bullet points with the main heading and key information.

I think this covers everything from the user's request to structure it properly in markdown as they provided.
</think>

** Geekozmos**

- **Homepage**: 
  - Articles on Tech, Gaming, Entertainment, etc.

- **Tech**: 
  - Battery Health Feature May Come With Android 15
  - Apple Halts Sales of Apple Watch Series 9 and Ultra 2
  - iOS 17 Bug is Causing New iPhone 15 Models to Freeze
  - Microsoft Recruits Former OpenAI CEO Sam Altman Amidst Board Controversy

- **Gaming**: 
  - Dead Stranding 2: Reveal Coming Soon?
  - Prince of Persia: The Lost Crown System Requirements Announced
  - Insomniac Games’ Next Game Will Be Marvel’s Venom!
  - Anomaly Agent Review

- **Entertainment**: 
  - Deadpool 3 Trailer Release
  - Matthew Perry Passing Away at 54
  - Best Léa Seydoux Movies Review
  - Exploring Michael Caine Movies

- **News**: 
  - Hideo Kojima Passes Away
  - Alan Wake 2 Unveiled
  - The Day Before’s Developer Studio Closes Doors
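
The DeepSeek output above begins with a `<think>...</think>` block containing the model's reasoning. If only the final summary should be displayed, one option is to strip that block before rendering. This is a minimal sketch that assumes the reasoning is always wrapped in a single pair of literal `<think>` tags; `strip_reasoning` is a hypothetical helper, not part of the notebook.

```python
import re

# Non-greedy match across newlines, plus any whitespace after the closing tag
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_reasoning(text):
    """Return `text` without its <think>...</think> section."""
    return THINK_BLOCK.sub("", text).strip()


# Made-up sample in the same shape as the output above
raw = "<think>\nLet me plan the summary.\n</think>\n\n**Geekozmos**\n- Tech news"
print(strip_reasoning(raw))
```

Inside `display_summary`, this could be applied as `display(Markdown(strip_reasoning(summary)))`; text without a `<think>` block passes through unchanged.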