In [1]:
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
import ollama

In [2]:
MODEL = "llama3.2"

In [3]:
class Website:
    """
    A utility class to represent a Website that we have scraped
    """
    url: str
    title: str
    text: str

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
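        # A timeout keeps the cell from hanging on an unresponsive server
        response = requests.get(url, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        # soup.title.string is None when <title> is empty or contains nested tags
        self.title = soup.title.string if soup.title and soup.title.string else "No title found"
        # Some pages have no <body>; use empty text instead of raising AttributeError
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""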

In [4]:
ed = Website("https://edwarddonner.com")
print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connect
with me for

In [5]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [6]:
def user_prompt_for(website):
    # The trailing newline keeps the title from running into the next sentence
    user_prompt = f"You are looking at a website titled {website.title}\n"
    user_prompt += "The contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt
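
A page with a lot of text can produce a user prompt longer than a small local model handles well, since every model has a finite context window. The sketch below shows one possible way to cap the scraped text before it goes into the prompt. The helper name `truncate_text` and the default of 8000 characters are assumptions made for illustration; they are not part of the notebook.

```python
def truncate_text(text: str, max_chars: int = 8000) -> str:
    """Cap text at max_chars, cutting at the last newline before the limit when there is one.

    Hypothetical helper; the 8000-character default is an arbitrary illustration value.
    """
    if len(text) <= max_chars:
        return text
    cut = text.rfind("\n", 0, max_chars)
    # Fall back to a hard cut if there is no usable newline before the limit
    if cut <= 0:
        cut = max_chars
    return text[:cut] + "\n[truncated]"


sample = "line one\nline two\nline three"
print(truncate_text(sample, max_chars=12))
```

If this were adopted, `user_prompt_for` could append `truncate_text(website.text)` instead of `website.text`.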

In [7]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
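
Before sending anything to the model, it can help to look at what the message list actually contains. The following is a minimal sketch of a preview helper; `preview_messages` and the hand-written `example_messages` are hypothetical names used only for illustration, standing in for the list that `messages_for(website)` returns.

```python
def preview_messages(messages, width=60):
    """Return one 'role: content' line per message, with content cut to width characters."""
    lines = []
    for message in messages:
        # Flatten newlines so each message fits on a single preview line
        content = message["content"].replace("\n", " ")
        if len(content) > width:
            content = content[:width] + "..."
        lines.append(f"{message['role']}: {content}")
    return lines


# Hand-written stand-in for the output of messages_for(website)
example_messages = [
    {"role": "system", "content": "You are an assistant that analyzes the contents of a website."},
    {"role": "user", "content": "You are looking at a website titled Example\nHello\nWorld"},
]

for line in preview_messages(example_messages, width=40):
    print(line)
```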

In [8]:
def summarize(url):
    website = Website(url)
    messages = messages_for(website)
    response = ollama.chat(model=MODEL, messages=messages)
    return response['message']['content']
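
`summarize` waits for the whole reply before returning. The `ollama` Python client also accepts `stream=True`, in which case `ollama.chat` yields partial responses that each carry a fragment of the reply. The sketch below assumes that behaviour and a running local Ollama server; `join_stream` and `summarize_streaming` are hypothetical names, not functions from the notebook.

```python
def join_stream(chunks):
    """Concatenate the content fragments from an iterable of streamed chat chunks."""
    return "".join(chunk["message"]["content"] for chunk in chunks)


def summarize_streaming(messages, model="llama3.2"):
    """Request a streamed reply and return the assembled text (needs a running Ollama server)."""
    import ollama  # imported here so join_stream can be tried without the client installed

    return join_stream(ollama.chat(model=model, messages=messages, stream=True))


# Offline check with hand-written chunks shaped like streamed responses
fake_chunks = [
    {"message": {"content": "# Summ"}},
    {"message": {"content": "ary"}},
]
print(join_stream(fake_chunks))
```

To show text as it arrives rather than at the end, the loop inside `join_stream` could print each fragment as it is received.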

In [9]:
summarize("https://edwarddonner.com")



In [10]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [11]:
display_summary("https://edwarddonner.com")

# Summary of Edward Donner's Website

### Overview
Edward Donner is a writer, DJ, and entrepreneur who is passionate about Artificial Intelligence (AI) and Language Models (LLMs). He co-founded Nebula.io, an AI startup that applies AI to help people discover their potential.

### Key Points

* **About**: Ed shares his interests in writing code, DJing, and experimenting with LLMs. He also mentions his experience as the founder and CEO of AI startup untapt.
* **Work**: He currently leads Nebula.io, applying AI to talent management and sourcing.
* **Expertise**: He has patented their matching model and developed a platform using groundbreaking proprietary LLMs.

### Recent Announcements

* November 13, 2024: Mastering AI and LLM Engineering – Resources
* October 16, 2024: From Software Engineer to AI Data Scientist – resources
* August 6, 2024: Outsmart LLM Arena – a battle of diplomacy and deviousness
* June 26, 2024: Choosing the Right LLM: Toolkit and Resources

### Contact Information
Edward Donner can be reached at ed [at] edwarddonner [dot] com or on his social media profiles:

* LinkedIn: 
* Twitter:
* Facebook:

In [12]:
display_summary("https://cnn.com")

This appears to be a sample CNN website, and it includes various news articles, videos, and features on a wide range of topics, including politics, business, sports, entertainment, and more.

If you're interested in finding specific information or topics, here are some suggestions:

1. **Search the site**: You can use the search bar at the top of the page to find specific articles, videos, or features.
2. **Browse categories**: Click on the "Categories" dropdown menu to explore news articles and features organized by topic, such as politics, business, sports, entertainment, and more.
3. **Watch live TV**: CNN offers 24/7 live coverage of various news events, including politics, business, sports, and more. You can watch live TV on the "Live" tab or click on specific channels (e.g., "CNN News").
4. **Listen to podcasts**: CNN has a range of audio podcasts covering topics such as science, space, entertainment, and more.
5. **Explore sections**: Click on individual sections, such as "World," "Business," "Sports," or "Entertainment," to find relevant articles, videos, and features.

Some specific article titles and links included in the sample website are:

* "A price too high to pay? Pollution from Ghana’s gold rush sparks maternal health fears"
* "Nearly 1,000 endangered animals repatriated to Madagascar in anti-trafficking landmark"
* "The most delicious Turkish dishes"
* "Przewalski’s horse: Back from the brink"

Feel free to explore and find more content of interest!

In [13]:
display_summary("https://anthropic.com")

**Summary of Anthropic Website**
==========================

### Mission and Values

* Anthropic is an AI safety and research company based in San Francisco.
* The company aims to create reliable, beneficial AI systems through interdisciplinary research.

### Products and Research

* Claude: An intelligent AI model with various products, including Claude 3.5 Sonnet and Haiku.
* Constitutional AI: Harmlessness from AI Feedback (research article).
* Core Views on AI Safety: When, Why, What, and How (announcements).

### Company Information

* Anthropic is a pioneer in AI safety research and development.

### News and Announcements

* **Introducing computer use**: A new update to Claude 3.5 Sonnet and Haiku.
* **Alignment**: Research on Constitutional AI: Harmlessness from AI Feedback was published (December 15, 2022).
* **Core Views on AI Safety**: The company shared its core views on AI safety in March 2023.

### About Anthropic

* The company is led by an interdisciplinary team with experience across ML, physics, policy, and product.
* See open roles for career opportunities.

In [14]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown in Spanish."

display_summary("https://edwarddonner.com")

**Resumen del sitio web**
=====================================

El sitio web de Edward Donner se centra en la aplicación de inteligencia artificial (IA) y lenguaje natural (LLM), con un énfasis en el desarrollo de tecnologías para ayudar a las personas a descubrir su potencial.

**Noticias e información reciente**
---------------------------------

*   Se lanzó "Outsmart LLM Arena", una arena donde los modelos de lenguaje artificiales compiten en un duelo de diplomacia y astucia.
*   En octubre del 2024 se publicaron recursos para "Maestros de AI y Ingeniería de LLM".
*   En agosto del 2024 se lanzó una serie de recursos para los ingenieros de software que buscan convertirse en expertos en datos de IA.
*   El sitio web tiene un blog donde se comparten actualizaciones sobre las últimas noticias e innovaciones relacionadas con la IA y LLM.

**Información general**
----------------------

El sitio web es propiedad de Edward Donner, quien es el co-fundador y CTO de Nebula.io. La empresa está trabajando en aplicar la IA para ayudar a las personas a descubrir su potencial y pursuing sus razones de ser.
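
The Spanish variant above works by reassigning the global `system_prompt`, which every later cell then inherits. One alternative, sketched here under assumptions, is to build the system prompt from a language argument so that the English default stays untouched. `BASE_PROMPT`, `system_prompt_for`, and `messages_for_text` are hypothetical names introduced only for this example.

```python
BASE_PROMPT = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related. "
)


def system_prompt_for(language=None):
    """Return the system prompt, optionally asking for a reply in the given language."""
    if language:
        return BASE_PROMPT + f"Respond in markdown in {language}."
    return BASE_PROMPT + "Respond in markdown."


def messages_for_text(user_prompt, language=None):
    """Build the chat message list from an already prepared user prompt."""
    return [
        {"role": "system", "content": system_prompt_for(language)},
        {"role": "user", "content": user_prompt},
    ]


print(system_prompt_for("Spanish"))
```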