# 1 Imports

In [1]:
import os
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
import ollama

# 2 Ollama Connection

In [2]:
ollama_api = "http://localhost:11434/api/chat"
headers = {"Content-Type": "application/json"}
model = "llama3.2"
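The `ollama_api` and `headers` values above are not used by the `ollama` Python package, which is what the rest of this notebook calls. They would only matter when calling the local server directly over HTTP. A minimal sketch of the request body for that route, assuming the Ollama server's `/api/chat` endpoint and a hypothetical helper named `build_payload` (not part of the original notebook):

```python
import json

ollama_api = "http://localhost:11434/api/chat"
headers = {"Content-Type": "application/json"}
model = "llama3.2"

def build_payload(messages, stream=False):
    """Build the JSON body for a chat request (hypothetical helper)."""
    return {"model": model, "messages": messages, "stream": stream}

payload = build_payload([{"role": "user", "content": "Say hello."}])
print(json.dumps(payload, indent=2))

# Sending the request needs a running Ollama server, so it is left commented out:
# import requests
# response = requests.post(ollama_api, json=payload, headers=headers)
# print(response.json()["message"]["content"])
```

Setting `stream` to `False` asks the server for a single JSON response rather than a stream of partial chunks.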

# 3 Webpage scraping class

In [3]:
class Website:
    """
    A utility class to represent a Website that we have scraped
    """
    url: str
    title: str
    text: str

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """

        self.url = url
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"

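        # Some pages have no <body>; fall back to empty text instead of raising
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""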
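The cleaning step in `Website` can be tried offline on a small HTML string, which makes it easier to see what `decompose()` removes. This is a sketch only: `extract_text` and `sample_html` are hypothetical names introduced for illustration, and the parsing logic simply mirrors the class above without the network request.

```python
from bs4 import BeautifulSoup

def extract_text(html):
    """Return (title, visible body text) for an HTML string (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title and soup.title.string else "No title found"
    if soup.body is None:
        return title, ""
    # Drop elements whose content is not readable page text
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    return title, soup.body.get_text(separator="\n", strip=True)

sample_html = """
<html><head><title>Demo page</title></head>
<body><h1>Hello</h1><script>var x = 1;</script><p>World</p></body></html>
"""

title, text = extract_text(sample_html)
print(title)
print(text)
```

The script contents do not appear in `text`, because the `<script>` element is removed from the tree before the text is collected.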

In [4]:
website = Website("https://edwarddonner.com")
print(website.title)
print(website.text)

Home - Edward Donner
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connec

# 4 Prompt Configuration

In [5]:
# Define system prompt

system_prompt = "You are an assistant that analyzes the contents of a website \
    and provides a short summary, ignoring text that might be navigation related. \
    Respond in markdown."

In [6]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}."
    user_prompt += """ The contents of this website are as follows; \
        please provide a short summary of this website in markdown. \
        If it includes news or announcements, then summarize these too.\n\n """
    user_prompt += website.text
    return user_prompt

In [7]:
# Ollama Message API structure

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
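The message list can be inspected without scraping a page or calling the model. The sketch below uses condensed copies of the prompt helpers (their wording is shortened here) and a `SimpleNamespace` as a hypothetical stand-in for a `Website` object; `fake_site` is an illustrative name only.

```python
from types import SimpleNamespace

# Condensed stand-ins for the notebook's prompt helpers (wording shortened here)
system_prompt = "You are an assistant that summarizes websites. Respond in markdown."

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}.\n\n"
    user_prompt += website.text
    return user_prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)},
    ]

# Hypothetical stand-in for a Website object, so no network request is made
fake_site = SimpleNamespace(title="Demo page", text="Hello\nWorld")
messages = messages_for(fake_site)

for message in messages:
    print(message["role"], "->", len(message["content"]), "characters")
```

Printing the length of the user message is a quick way to see how much scraped text is being sent along with the instructions.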

# 5 Ollama API Call

## 5.1 Summary of Edward Donner Website

In [8]:
def summarize(url):
    website = Website(url)
    messages = messages_for(website)
    response = ollama.chat(model=model, messages=messages)
    return response['message']['content']

In [9]:
# A function to display this nicely in Jupyter output, using Markdown
def display_summary(text):
    display(Markdown(text))

In [10]:
display_summary(summarize("https://edwarddonner.com"))

**Summary of Website**
========================

### Overview

This website appears to be a personal blog or portfolio of Edward Donner, the CTO and co-founder of Nebula.io. The website showcases Donner's interests in LLMs (Large Language Models), AI, and technology.

### Featured Content

* **About**: A brief introduction to Donner, his background, and his work at Nebula.io.
* **News and Announcements**:
	+ **January 23, 2025: LLM Workshop – Hands-on with Agents – resources**: A workshop announcement for a hands-on experience with agents in LLMs.
	+ **December 21, 2024: Welcome, SuperDataScientists!**: An invitation to a community of data scientists and AI enthusiasts.
	+ **November 13, 2024: Mastering AI and LLM Engineering – Resources**: A resource page for learning about AI and LLM engineering.
	+ **October 16, 2024: From Software Engineer to AI Data Scientist – resources**: Another resource page for career development in AI.

### Other Sections

* **Connect Four**: A game section that is currently empty.
* **Outsmart**: An arena where LLMs compete against each other in a battle of diplomacy and deviousness (not yet fully functional).

Note: The website appears to be under construction, with some sections still in development.

## 5.2 Summary of CNN Website

In [11]:
display_summary(summarize("http://cnn.com"))

The provided text is a sample of CNN's website, displaying a wide range of news articles and features on various topics such as politics, technology, health, entertainment, and more.

Here are some observations about the content:

1. **Diverse topics**: The article covers a broad spectrum of subjects, indicating that CNN aims to provide comprehensive coverage of current events.
2. **Global perspective**: Stories from around the world are featured, suggesting CNN's commitment to global journalism.
3. **In-depth reporting**: Many articles include quotes, statistics, and expert opinions, indicating in-depth research and investigation.
4. **Breaking news**: The article includes up-to-date information on recent developments, such as the Ukraine-Russia War and Israel-Hamas conflict.
5. **Lifestyle and entertainment**: Segments on topics like fitness, food, sleep, mindfulness, relationships, movies, television, celebrity, and technology suggest CNN's interest in providing more holistic content for its audience.

If you were to engage with this article or explore the full range of articles on CNN's website, some potential questions to consider might be:

* How does CNN balance its coverage of global news with local reporting?
* What role do opinion pieces play in the overall editorial strategy of CNN?
* How does CNN approach investigative journalism, and what notable examples can you find in this sample?

Feel free to ask me any specific questions about this article or explore other topics related to CNN and current events!

## 5.3 Summary of InfoMoney Website

In [12]:
display_summary(summarize("http://infomoney.com.br"))

Olá! Estou aqui para ajudar.

Parece que você está procurando informações sobre investimentos, finanças e economia. Aqui estão algumas sugestões de como posso ajudá-lo:

1. **Investimentos**: Você pode querer saber mais sobre como investir em ações, títulos, fundos mútuos ou outros ativos financeiros.
2. **Finanças pessoais**: Você pode estar procurando conselhos sobre como gerenciar seus débitos, criar um orçamento, economizar dinheiro ou planejar sua situação financeira.
3. **Economia**: Você pode querer saber mais sobre tendências econômicas atuais, impacto da política econômica, crises financeiras e outras questões relacionadas à economia.
4. **Dúvidas gerais**: Você pode ter alguma dúvida específica que não encontra resposta em nenhum dos tópicos acima.

O que você gostaria de saber ou discutir?