In [6]:

%pip install requests python-dotenv beautifulsoup4 ipython openai


Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting requests
  Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Collecting charset-normalizer<4,>=2 (from requests)
  Downloading charset_normalizer-3.4.1-cp39-cp39-win_amd64.whl.metadata (36 kB)
Collecting urllib3<3,>=1.21.1 (from requests)
  Downloading urllib3-2.3.0-py3-none-any.whl.metadata (6.5 kB)
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Downloading charset_normalizer-3.4.1-cp39-cp39-win_amd64.whl (102 kB)
Downloading urllib3-2.3.0-py3-none-any.whl (128 kB)
Installing collected packages: urllib3, python-dotenv, charset-normalizer, requests
Successfully installed charset-normalizer-3.4.1 python-dotenv-1.0.1 requests-2.32.3 urllib3-2.3.0
Note: you may need to restart the kernel to use updated packages.


In [7]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

In [54]:

# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

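# Check the key. Whitespace is checked before the prefix so that a key with a stray leading space gets the more specific message.
# Note: the sk-proj- prefix applies to OpenAI project keys; a key issued by another provider (such as the OpenRouter endpoint used below) will trigger the prefix warning.

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")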

An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook
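
The warning above comes from a check that accepts only one key prefix. As a minimal sketch, the same checks can be wrapped in a helper that accepts several prefixes. The helper name `describe_api_key` and the `sk-or-` prefix are assumptions made for illustration (the latter is a guess at the OpenRouter key format, not something this notebook confirms), so adjust the tuple to match your provider.

```python
def describe_api_key(key, expected_prefixes=("sk-proj-", "sk-or-")):
    """Return a short verdict on an API key string (hypothetical helper).

    The default prefixes are assumptions; pass your own tuple if they differ.
    """
    if not key:
        return "missing"
    # Check whitespace first, otherwise a leading space would be reported as a bad prefix
    if key.strip() != key:
        return "has surrounding whitespace"
    if not key.startswith(tuple(expected_prefixes)):
        return "unexpected prefix"
    return "looks ok"


# Made-up key values, used only to exercise each branch
for candidate in [None, " sk-proj-abc", "abc", "sk-or-example"]:
    print(repr(candidate), "->", describe_api_key(candidate))
```

Returning a verdict instead of printing keeps the check reusable, for example to stop the notebook early when the key is missing.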


In [66]:


# Point the OpenAI client at OpenRouter's OpenAI-compatible endpoint

openai = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=api_key,
)
model = "mistralai/mistral-7b-instruct:free"



In [58]:
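# Some websites reject requests that lack a browser-like User-Agent header

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """A web page fetched with requests and reduced to its title and visible text."""

    def __init__(self, url):
        self.url = url
        # headers must be passed by keyword: the second positional argument of requests.get is params
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        # Drop elements that carry no readable text before extracting it
        for irrelevant in soup.body(["script", "style", "img", "input"]):
            irrelevant.decompose()
        self.text = soup.body.get_text(separator="\n", strip=True)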
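
To see the idea behind the text extraction without making a network call, here is a small offline sketch. It is not the BeautifulSoup code used above: it swaps in the standard library's `html.parser.HTMLParser` as a dependency-free stand-in, and the names `VisibleTextParser`, `visible_text` and `sample_html` are invented for illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the readable text of an HTML string, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample_html = "<html><body><h1>Hello</h1><script>var x = 1;</script><p>World</p></body></html>"
print(visible_text(sample_html))
```

The script body is dropped and the remaining text nodes are joined with newlines, which mirrors what `decompose()` followed by `get_text(separator="\n", strip=True)` aims for in the class above.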

In [59]:
ed = Website("https://edwarddonner.com")
print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connect
with me for

In [60]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [61]:
def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [62]:
print(user_prompt_for(ed))

You are looking at a website titled Home - Edward Donner
The contents of this website are as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
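
The user prompt embeds the full page text, which can grow large on busy pages. As a hedged sketch, the text could be capped before it is appended to the prompt. The helper name `truncate_text`, the `[truncated]` marker and the 8000-character default are all assumptions chosen for illustration; the right limit depends on the model's context window.

```python
def truncate_text(text, max_chars=8000):
    """Keep at most max_chars of the original text, then append a marker.

    Prefers to cut at the last newline before the limit so a line is not
    split in the middle. max_chars is an illustrative default.
    """
    if len(text) <= max_chars:
        return text
    cut = text.rfind("\n", 0, max_chars)
    if cut <= 0:
        # No usable line break before the limit: fall back to a hard cut
        cut = max_chars
    return text[:cut] + "\n[truncated]"


sample = "line one\nline two\nline three"
print(truncate_text(sample, max_chars=20))
```

Inside `user_prompt_for`, this would be applied as `truncate_text(website.text)` in place of `website.text`.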

In [71]:
messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "What is 2 + 2?"}
]

In [72]:
# To give you a preview: calling the chat completions API (here through OpenRouter) with system and user messages:

response = openai.chat.completions.create(model=model, messages=messages)
print(response.choices[0].message.content)

 The sum of 2 and 2 is 4.
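
The call above sends an empty system message. A minimal sketch of a builder that omits the system entry when there is nothing to say is shown below. The name `build_messages` is hypothetical, and whether a particular model treats an empty system message differently from a missing one is not something this notebook establishes.

```python
def build_messages(user_content, system_content=None):
    """Build a chat messages list, skipping the system entry when it is empty."""
    messages = []
    if system_content:
        messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": user_content})
    return messages


print(build_messages("What is 2 + 2?"))
print(build_messages("What is 2 + 2?", system_content="You are a terse assistant."))
```

The result has the same list-of-dicts shape as `messages`, so it can be passed directly as the `messages` argument.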


In [67]:
# And now: call the chat completions API through the OpenAI client. You will get very familiar with this!
# See how messages_for creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]


def summarize(url):
    # Fetch the page, build the two-message prompt, and return the model's reply text
    website = Website(url)
    response = openai.chat.completions.create(
        model=model,
        messages=messages_for(website)
    )
    return response.choices[0].message.content
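
Both the page fetch and the model call in `summarize` go over the network, so either can fail transiently. The following is a sketch of a generic retry wrapper with exponential backoff, not part of the original notebook. The name `call_with_retries`, its parameters and the `flaky` demo function are assumptions for illustration, and catching every `Exception` is deliberately broad; real code would narrow it to the errors worth retrying.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() up to `attempts` times, doubling the wait after each failure.

    `sleep` is injectable so the wrapper can be exercised without real waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Demo with a stand-in function that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

delays = []
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

With the notebook's function, usage would look like `call_with_retries(lambda: summarize("https://cnn.com"))`.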

In [68]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [69]:
display_summary("https://cnn.com")

 The website is CNN's homepage, featuring breaking news, live updates, videos, and articles on various topics such as US, World, Politics, Business, Health, Entertainment, Style, Travel, Sports, Science, Climate, and Weather. The website also offers live TV, podcasts, and newsletters.

Currently, the website is covering the sentencing of former US President Donald Trump in a criminal hush money case, the ongoing wildfires in Los Angeles, and the situation in Ukraine-Russia War and Israel-Hamas War. Other featured stories include a man arrested at the Delhi airport with a crocodile skull in his luggage, a flip of a coin determining a woman's future, and a US swimmer losing his Olympic medals in the Palisades wildfire.

The website also provides sections for Space and Science, Global Travel, Global Business, Style, and Sports. It offers exclusive content, photos, and videos, and allows users to sign in to their CNN account, follow specific topics, and provide feedback on ads they encounter on the website.

In [74]:
display_summary("https://anthropic.com")

 This website, titled "Home \ Anthropic", is dedicated to Anthropic, an AI safety and research company based in San Francisco. The main focus of the website is their AI model, Claude 3.5 Sonnet, which is their most intelligent AI model and is now available. The company offers an API for building AI-powered applications and custom experiences using Claude.

The website also features research and announcements. One of the recent announcements is the introduction of computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku. The company has also published work on topics such as Alignment, Constitutional AI: Harmlessness from AI Feedback, and Core Views on AI Safety.

The website provides information about the team, pricing, and careers at Anthropic. Interested parties can find open roles on the website. The company also has a section for customers, press inquiries, support, and a status page for availability.

The website includes links to the company's social media profiles on Twitter, LinkedIn, and YouTube, as well as terms of service, privacy policy, usage policy, responsible disclosure policy, compliance, and privacy choices.