### 1. Basic Web Scraper using BeautifulSoup

In [23]:
import requests
from bs4 import BeautifulSoup

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
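        self.url = url
        # Fail fast on slow servers and HTTP errors instead of parsing an error page
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body is not None:
            # Drop elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # A page without <body> would otherwise raise a TypeError above
            self.text = ""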

ed = Website("https://edwarddonner.com")
print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connec
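
The cleaning steps inside `Website.__init__` can be tried without any network access. The sketch below applies the same BeautifulSoup calls to an inline HTML string; `SAMPLE_HTML` and `extract_title_and_text` are hypothetical names introduced only for this illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
SAMPLE_HTML = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <script>console.log("tracking");</script>
    <style>p { color: red; }</style>
    <h1>Hello</h1>
    <p>Visible <b>text</b> here.</p>
    <img src="logo.png" alt="logo">
  </body>
</html>
"""

def extract_title_and_text(html):
    """Mirror the parsing steps of Website.__init__ on an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        return title, ""
    # Remove elements that carry no readable text
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    return title, soup.body.get_text(separator="\n", strip=True)

title, text = extract_title_and_text(SAMPLE_HTML)
print(title)
print(text)
```

Because `get_text` is called with `separator="\n"`, text inside inline tags such as `<b>` lands on its own line. This is why words like "very" and "Hacker News" appear on separate lines in the output above.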

### 2. Testing a connection to Ollama

In [24]:
import os
from dotenv import load_dotenv
import requests

load_dotenv()

messages = [{
    "role": "system",
    "content":
    "You are a helpful assistant. You reply with very short answers."
}, {"role": "user", "content": 'How are you'}]

payload = {
    "model": "llama3.2",
    "messages": messages,
    "stream": False,
}

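# OLLAMA_API is read from .env; fall back to Ollama's default local chat endpoint
ollama_api = os.environ.get("OLLAMA_API", "http://localhost:11434/api/chat")
response = requests.post(ollama_api,
                         headers={"Content-Type": "application/json"},
                         json=payload,
                         timeout=60)
response.raise_for_status()  # surface HTTP errors before reading the JSON body

print("Assistant:", response.json()['message']['content'])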

Assistant: I'm functioning properly.
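
The request and response handling in the cell above can be separated from the HTTP call, which makes it easier to check without a running server. This is a minimal sketch; `build_chat_payload` and `extract_reply` are hypothetical helper names, and the handling of an `"error"` key assumes that Ollama reports failures as a JSON object of that shape.

```python
def build_chat_payload(model, messages, stream=False):
    """Build the JSON body for a non-streaming chat request."""
    return {"model": model, "messages": messages, "stream": stream}

def extract_reply(body):
    """Return the assistant text from a decoded chat response."""
    # Assumption: failures arrive as {"error": "..."} rather than a message
    if isinstance(body, dict) and "error" in body:
        raise RuntimeError(f"Ollama returned an error: {body['error']}")
    try:
        return body["message"]["content"]
    except (KeyError, TypeError) as exc:
        raise ValueError("Unexpected response shape") from exc

payload = build_chat_payload(
    "llama3.2",
    [{"role": "user", "content": "How are you"}],
)
# Hand-written response body in the shape the cell above reads from
sample_body = {"message": {"role": "assistant", "content": "I'm functioning properly."}}
print("Assistant:", extract_reply(sample_body))
```

With a live server, `payload` would be passed as `json=payload` to `requests.post`, and `extract_reply(response.json())` would replace the direct dictionary lookup.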


### 3. Basic Prompts

In [25]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a
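
Printing the full message list dumps the entire page text, as the output above shows. One way to inspect the structure without the bulk is a short preview helper. The sketch below uses a hypothetical `preview_messages` function and a made-up `sample_messages` list, not the real output of `messages_for(ed)`.

```python
def preview_messages(messages, limit=80):
    """Return one short line per message: role plus the first `limit` characters."""
    lines = []
    for message in messages:
        # Flatten newlines so each message stays on a single line
        content = message["content"].replace("\n", " ")
        if len(content) > limit:
            content = content[:limit] + "..."
        lines.append(f"{message['role']}: {content}")
    return lines

# Made-up messages standing in for messages_for(website)
sample_messages = [
    {"role": "system", "content": "You are an assistant that summarizes websites."},
    {"role": "user", "content": "Home\nConnect Four\nOutsmart\n" + "x" * 200},
]
for line in preview_messages(sample_messages):
    print(line)
```

In the notebook, `preview_messages(messages_for(ed))` would give the same kind of overview for the real page.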

### 4. Summarize the website

In [26]:
import ollama

def summarize(url):
    website = Website(url)
    messages = messages_for(website)
    response = ollama.chat(model="llama3.2", messages=messages)
    return response['message']['content']

print(summarize("https://edwarddonner.com"))

**Website Summary**

*   The website belongs to Edward Donner, a co-founder and CTO of Nebula.io, an AI company applying AI to help people discover their potential.
*   Edward is also the founder and CEO of AI startup untapt, acquired in 2021.

### Recent News and Announcements

*   **The Complete Agentic AI Engineering Course**: Announced on April 21, 2025.
*   **LLM Workshop – Hands-on with Agents – resources**: Released on December 21, 2024.
*   **Welcome, SuperDataScientists!**: Published on November 13, 2024.
*   **Mastering AI and LLM Engineering – Resources**: Launched in January 2025.
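
The `summarize` flow can also be exercised without a running Ollama server by passing the chat function in as a parameter. This is a sketch under that assumption; `summarize_messages` and `fake_chat` are hypothetical names, and `fake_chat` only imitates the response shape that the cell above reads from `ollama.chat`.

```python
def summarize_messages(messages, chat, model="llama3.2"):
    """Send messages through `chat` and return the assistant's reply text."""
    response = chat(model=model, messages=messages)
    return response["message"]["content"]

def fake_chat(model, messages):
    # Hypothetical stand-in for ollama.chat: reports how many messages it received
    return {
        "message": {
            "role": "assistant",
            "content": f"{model} saw {len(messages)} messages",
        }
    }

reply = summarize_messages(
    [
        {"role": "system", "content": "Summarize."},
        {"role": "user", "content": "Some page text"},
    ],
    chat=fake_chat,
)
print(reply)
```

With the real client, the call would be `summarize_messages(messages_for(website), chat=ollama.chat)`. Keeping the chat function injectable also makes it simple to swap in a different model or backend later.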


### 5. Display summary of website as markdown

In [27]:
from IPython.display import display, Markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

display_summary("https://cnn.com")

**Summary of CNN Website**
==========================

### Introduction

The CNN website is a news and media organization that provides breaking news, in-depth analysis, and feature stories on a wide range of topics, including politics, business, health, entertainment, and more.

### Features

* Latest news and updates from around the world
* In-depth analysis and commentary on current events
* Exclusive interviews with newsmakers and thought leaders
* Breaking news alerts and notifications
* Video content, including live streams and documentaries
* Interactive features, such as quizzes and crosswords

### Sections

The website is organized into several sections, including:

* **News**: Latest breaking news from around the world
* **Politics**: In-depth analysis and commentary on politics and government
* **Business**: News and analysis on business and economics
* **Health**: News and features on health and wellness
* **Entertainment**: Entertainment news and reviews
* **Sports**: Sports news and highlights
* **Science**: News and features on science and technology

### Topics

The website covers a wide range of topics, including:

* **Ukraine-Russia War**: Latest updates from the conflict zone
* **Israel-Hamas War**: Analysis and commentary on the conflict
* **US-China Trade Talks**: Updates on trade negotiations between the US and China
* **Climate Change**: News and features on climate change and sustainability

### Videos

The website features a range of video content, including:

* **Live Streams**: Live coverage of news events and conferences
* **Documentaries**: In-depth documentaries on current events and issues
* **Interviews**: Exclusive interviews with newsmakers and thought leaders
* **News Analysis**: Video analysis of breaking news stories

### Interactive Features

The website offers a range of interactive features, including:

* **Quizzes**: Trivia quizzes on various topics
* **Crosswords**: Crossword puzzles for entertainment
* **Games**: Games and challenges on various topics

In [28]:
display_summary("https://anthropic.com")

# Anthropic Website Summary

## Mission and Philosophy

Anthropic aims to build AI that serves humanity's long-term well-being, prioritizing responsible development and considering the effects of powerful technologies.

### Core Views on AI Safety

* Designing powerful technologies requires bold steps forward and intentional pauses.
* Focus on building tools with human benefit at their foundation, like Claude.
* Responsible AI development involves daily research, policy work, and product design.

## Research and Projects

Anthropic publishes various research papers, articles, and announcements, including:

* **Claude 3.7 Sonnet**: The most intelligent AI model, now available.
* **Alignment faking in large language models**: A recent alignment science paper.
* **Introducing the Model Context Protocol**: A new protocol for improving model context awareness.

## Products and Tools

* **Claude**: An AI platform for building custom experiences and applications.
* **Anthropic Academy**: A learning resource for building with Claude.
* **Download apps**: Accessible through the Anthropic website.

## Events and News

* **Featured news articles**:
	+ Tracing the thoughts of a large language model (March 27, 2025)
	+ Societal impacts (March 27, 2025)
	+ Alignment faking in large language models (December 18, 2024)

## Transparency and Security

Anthropic has implemented various security measures, including:

* **Responsible scaling policy**: A guiding framework for responsible AI development.
* **ISO 42001 certification**: Anthropic has received this standard for its quality management system.

This summary provides a brief overview of the key aspects of the Anthropic website.