## Summarize website

In [1]:
# model = "gpt-4o-mini"
# response = openai.chat.completions.create()

# class Website:
#   def __init__(self, url):

# def user_prompt_for(website):
# def messages_for(website):
# def summarize(url):
# def display_summary(url):

In [2]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

In [3]:
load_dotenv(override = True)
api_key = os.getenv('OPENAI_API_KEY')
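
The key loaded above is never inspected before the first API call, so a missing or malformed key only shows up later as an authentication error. A minimal sketch of an early check follows. The helper name `check_api_key` is hypothetical, and the `sk-` prefix test is an assumption based on the usual shape of OpenAI keys, not a guarantee.

```python
def check_api_key(api_key):
    """Return a short diagnostic string for an API key read from the environment."""
    if not api_key:
        return "No API key found; check that .env defines OPENAI_API_KEY"
    if api_key != api_key.strip():
        return "API key has leading or trailing whitespace"
    # Assumption: OpenAI keys usually start with "sk-"; adjust if yours differ
    if not api_key.startswith("sk-"):
        return "API key found, but it does not start with 'sk-'"
    return "API key looks plausible"


# Stand-in values for illustration; in the notebook you would pass api_key
print(check_api_key(None))
print(check_api_key("sk-example-not-a-real-key"))
```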

In [4]:
model = "gpt-4o-mini"
url_1 = "https://edwarddonner.com"
url_2 = "https://cnn.com"
url_3 = "https://anthropic.com"

In [5]:
openai = OpenAI()

In [6]:
message = "Hello, GPT! This is my first ever message to you! Hi!"
response = openai.chat.completions.create(
    model = model,
    messages = [{
        "role": "user",
        "content": message
    }]
)

print(response.choices[0].message.content)

Hello! Welcome! I'm glad to hear from you. How can I assist you today?


In [7]:
messages = [
    {
        "role": "system",
        "content": "You are a snarky assistant"
    },
    {
        "role": "user",
        "content": "What is 2 + 2?"
    }
]
response = openai.chat.completions.create(
    model = model,
    messages = messages
)

print(response.choices[0].message.content)

Oh, I'm sorry, but that's a trick question, right? Because obviously, it's 22! Just kidding—it's 4. Welcome to basic math!


In [8]:
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
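        # A timeout keeps the cell from hanging on an unresponsive server
        response = requests.get(url, headers = headers, timeout = 10)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        # Some responses have no <body>; guard so the constructor does not raise
        if soup.body:
            # Drop elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator = "\n", strip = True)
        else:
            self.text = ""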
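
To see what the cleaning step does without hitting the network, the same BeautifulSoup calls can be run on an inline HTML string. This is a sketch only: `sample_html` and the helper `extract_title_and_text` are made up for illustration and are not part of the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for this demonstration
sample_html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1>Hello</h1>
    <script>console.log("ignored");</script>
    <style>h1 { color: red; }</style>
    <p>Visible text</p>
  </body>
</html>
"""


def extract_title_and_text(html):
    """Mirror the Website constructor's parsing on an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        return title, ""
    # Calling a tag with a list of names is shorthand for find_all
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    return title, soup.body.get_text(separator="\n", strip=True)


title, text = extract_title_and_text(sample_html)
print(title)
print(text)
```

The script and style contents are removed, and only the heading and paragraph text remain, one string per line.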

In [9]:
website = Website(url_1)
print(website.title)
print(website.text)

Home - Edward Donner
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connec

In [10]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [11]:
def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt
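
The prompt above appends the full page text, which can get long for a busy site. One option, sketched here under assumptions, is to cap the text before building the prompt. The helper `truncate_text`, the 5000-character default, and the `[truncated]` marker are all hypothetical choices, not something the notebook uses.

```python
def truncate_text(text, max_chars=5000):
    """Cap text at max_chars characters, cutting at the last newline when possible."""
    if len(text) <= max_chars:
        return text
    # Prefer to cut on a line boundary so no line is split in half
    cut = text.rfind("\n", 0, max_chars)
    if cut <= 0:
        cut = max_chars
    return text[:cut] + "\n[truncated]"


print(truncate_text("line one\nline two\nline three", max_chars=12))
```

In `user_prompt_for`, this would replace `website.text` with `truncate_text(website.text)`.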

In [12]:
print(user_prompt_for(website))

You are looking at a website titled Home - Edward Donner
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acqui

In [13]:
def messages_for(website):
    return [
        {
            "role": "system",
            "content": system_prompt
        },
        {
            "role": "user",
            "content": user_prompt_for(website)
        }
    ]
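
The chat API expects a list of dicts, each with a `role` and a `content` key, as built above. A small structural check can catch typos before a request is sent. This is a sketch: `check_messages` is a hypothetical helper, and it only accepts the roles used in this notebook plus `assistant`, with string content.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_messages(messages):
    """Return a list of problems found in a chat messages list (empty if none)."""
    problems = []
    for i, m in enumerate(messages):
        if not isinstance(m, dict):
            problems.append(f"message {i} is not a dict")
            continue
        if m.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i} has unexpected role {m.get('role')!r}")
        if not isinstance(m.get("content"), str):
            problems.append(f"message {i} content is not a string")
    return problems


example = [
    {"role": "system", "content": "You are a snarky assistant"},
    {"role": "user", "content": "What is 2 + 2?"},
]
print(check_messages(example))
```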

In [14]:
messages_for(website)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a

In [15]:
def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = model,
        messages = messages_for(website)
    )
    return response.choices[0].message.content

In [16]:
summarize(url_1)

'# Summary of Edward Donner\'s Website\n\nEdward Donner\'s website serves as a platform where he shares his interests and projects related to coding, large language models (LLMs), and AI. He is the co-founder and CTO of Nebula.io, a company focused on leveraging AI to assist individuals in talent discovery and engagement. The site includes various resources and posts related to AI and LLM engineering.\n\n## News and Announcements\n- **January 23, 2025**: Released resources for an LLM Workshop focused on hands-on experience with agents.\n- **December 21, 2024**: Welcomed the community "SuperDataScientists!"\n- **November 13, 2024**: Provided resources for "Mastering AI and LLM Engineering."\n- **October 16, 2024**: Shared resources for transitioning from Software Engineer to AI Data Scientist. \n\nOverall, the website reflects Edward\'s passion for technology and his contributions to the AI industry.'

In [17]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [18]:
display_summary(url_1)

# Summary of Edward Donner's Website

Edward Donner's website serves as a personal and professional platform where he shares his passion for coding and experimentation with Large Language Models (LLMs). Ed is the co-founder and CTO of **Nebula.io**, a company focused on leveraging AI to help individuals discover their potential and engage effectively in talent management.

## Key Highlights:
- **About Ed**: He enjoys coding, DJing, and is an enthusiast of electronic music and tech discussions. Previously, he founded **untapt**, an AI startup that was acquired in 2021.
- **LLMs and Their Use**: The site mentions the use of proprietary LLMs tailored for talent acquisition, showcasing a patented matching model that has drawn positive media attention.

## News and Announcements:
- **January 23, 2025**: Launch of LLM Workshop – Hands-on with Agents, providing new resources.
- **December 21, 2024**: Welcoming new members from **SuperDataScientists** to the community.
- **November 13, 2024**: Sharing resources for "Mastering AI and LLM Engineering".
- **October 16, 2024**: Resources provided for transitioning from a Software Engineer to an AI Data Scientist.

The site emphasizes Ed's commitment to the intersection of technology and human potential, encouraging visitors to connect with him for more insights and collaborations.

In [19]:
display_summary(url_2)

# CNN Website Summary

CNN is a news website that provides comprehensive coverage of current events, breaking news, and various topics including politics, business, health, entertainment, and international affairs. The site features articles, videos, and live broadcasts on major global issues and events, such as the Ukraine-Russia conflict and the Israel-Hamas war.

## Latest News Highlights
- **US Stocks Decline**: The stock market experiences a downturn.
- **Military Aid to Ukraine**: Trump has ordered a pause on all military aid to Ukraine, pending further commitments from President Zelensky regarding peace talks.
- **Gaza Truce**: There are developments indicating a fragile situation regarding a truce in Gaza.
- **Various International News**: Reports include Japan dealing with a significant forest fire affecting thousands, and topics around the Vatican and recent injuries from crowd incidents in Germany.

The website also includes lifestyle content, analyses, and features ranging from health and wellness trends to entertainment and technology news, as well as galleries of significant events such as the Academy Awards. Additionally, CNN offers various interactive sections and games for readers.

In [20]:
display_summary(url_3)

# Anthropic Overview

Anthropic is an AI safety and research company based in San Francisco, focusing on developing AI products that prioritize safety. The company is known for its interdisciplinary approach, combining expertise in machine learning, physics, policy, and product development to create reliable and beneficial AI systems.

## Key Offerings

- **Claude 3.7 Sonnet**: This is Anthropic's most advanced AI model, recognized as the first hybrid reasoning model.
- **Claude Code**: A new tool introduced alongside Claude 3.7 for coding capabilities, enabling users to build AI-powered applications.

## Recent Announcements

- **Claude 3.7 Sonnet Launch**: Announced as the most intelligent AI model available, enhancing the performance and capabilities of AI interactions.
- **Research**: Anthropic emphasizes ongoing research in AI safety, with past publications discussing AI feedback mechanisms and core views on AI safety.

The company also actively seeks new talent, with open career opportunities in various fields.