## Imports

In [1]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

## OpenAI Initial Setup

In [2]:
# Load environment variables from the .env file

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check key (whitespace is checked before the prefix, so a padded key gets the right message)
if not api_key:
    print("No API key was found.")
elif api_key.strip() != api_key:
    print("An API key was found, but it has a space or tab at the start or end.")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start with sk-proj-.")
else:
    print("API key found and looks good so far!")

API key found and looks good so far!


In [3]:
# create openai object
openai = OpenAI()

### Call to a Frontier model as preview/test

In [4]:
message = "Hello, GPT! This is my first ever message to you. Hi!"
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": message
        }
    ]
)

print(response.choices[0].message.content)

Hello! It's great to hear from you! How can I assist you today?


## First Project: Webpage Summarizer

### Webpage Scraper Class

In [6]:
# A Class to represent a webpage

# Some websites need you to use proper headers when fetching them
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using bs4
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        # Some pages have no <body>; guard against soup.body being None
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
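
The text-extraction step in the class above can be tried offline, without a network call. The following is a minimal sketch that swaps BeautifulSoup for the standard library's `html.parser`, so it is not the notebook's method; `BodyTextExtractor`, `extract_body_text`, and the sample HTML are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Hypothetical helper: collect visible text inside <body>, skipping script/style."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is inside <body> and outside script/style
        text = data.strip()
        if text and self.in_body and self.skip_depth == 0:
            self.chunks.append(text)


def extract_body_text(html):
    """Return the body text with one stripped text fragment per line."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample = (
    "<html><head><title>Demo</title></head>"
    "<body><script>var x = 1;</script><h1>Hello</h1><p>World</p></body></html>"
)
print(extract_body_text(sample))
```

Images and inputs carry no text of their own, so this sketch only needs to skip `script` and `style` content to produce similar output on simple pages.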

In [7]:
# Now to try fetching a website
ed = Website("https://edwarddonner.com")
print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connec

### Types of Prompts

- **system prompt**: tells the model what task it is performing and what tone it should use

- **user prompt**: the conversation starter that the model replies to

### System Prompt

In [18]:
# Define system prompt - can experiment with this

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation-related. \
Respond in markdown."

### User Prompt from Function

In [19]:
# A function that writes a user prompt asking for a summary of a website:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website are as follows: \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [20]:
system_prompt

'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation-related. Respond in markdown.'

In [21]:
print(user_prompt_for(ed))

You are looking at a website titled Home - Edward Donner
The contents of this website are as follows: please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acqu

## Messages


```
[
    {
        "role": "system",
        "content": "system message goes here"
    },
    {
        "role": "user",
        "content": "user message goes here"
    }
]
```
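
A small check can catch shape mistakes in a messages list before it is sent. This is a sketch under stated assumptions: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the allowed set covers only the two roles shown above plus `assistant`, not every role the API may accept.

```python
# Assumption: only these roles are expected in this notebook's messages
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not a dict")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i} has unexpected role {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i} content is not a string")
    return problems


example = [
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"},
]
print(validate_messages(example))
```

Returning a list of problems rather than raising lets a notebook cell print every issue at once.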


### Passing both messages

In [22]:
messages = [
    {
        "role": "system",
        "content": "You are a snarky assistant"
    },
    {
        "role": "user",
        "content": "What is 2 + 2?"
    }
]

In [23]:
# Preview a call that passes both the system and user messages

response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)

Oh, we're starting with the easy ones, huh? Well, 2 + 2 equals 4. Shocking, I know!


### Build useful messages using function

In [24]:
# See how this function creates the exact format above

def messages_for(website):
    return [
        {
            "role": "system",
            "content": system_prompt
        },
        {
            "role": "user",
            "content": user_prompt_for(website)
        }
    ]

In [25]:
# Test out with ed website

messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation-related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website are as follows: please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make 

## Bring it Together

### Summarize Function

In [26]:
# And now: call the OpenAI API

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = "gpt-4o-mini",
        messages = messages_for(website)
    )
    return response.choices[0].message.content

In [27]:
summarize("https://edwarddonner.com")

'# Summary of the Edward Donner Website\n\nThe website is a personal space for Ed Donner, who is passionate about coding and experimenting with large language models (LLMs). He shares insights about his professional journey, which includes being the co-founder and CTO of Nebula.io, a company focused on applying AI to enhance talent discovery and management. Ed also has a background as the founder and CEO of an AI startup, untapt, which was acquired in 2021.\n\n## Recent Announcements\n- **January 23, 2025**: Resources for the "LLM Workshop – Hands-on with Agents" were shared.\n- **December 21, 2024**: A welcome message was posted for new visitors, dubbed "SuperDataScientists".\n- **November 13, 2024**: Resources for "Mastering AI and LLM Engineering" were made available.\n- **October 16, 2024**: Resources related to transitioning "From Software Engineer to AI Data Scientist" were released.\n\nIn addition to LLM exploration, Ed enjoys DJing and electronic music production, and he active

### Display Format Function

In [28]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [29]:
display_summary("https://edwarddonner.com")

# Summary of Edward Donner's Website

Edward Donner's website serves as a personal and professional platform outlining his interests and endeavors in programming, AI, and electronic music production. He is the co-founder and CTO of Nebula.io, a company focused on applying AI to enhance talent discovery and management. The site highlights his previous work as the founder and CEO of the AI startup untapt, which was acquired in 2021.

## Recent News and Announcements
- **January 23, 2025**: Post about an LLM Workshop titled "Hands-on with Agents" with resources available.
- **December 21, 2024**: Welcoming message aimed at "SuperDataScientists."
- **November 13, 2024**: Announcement of "Mastering AI and LLM Engineering" resources.
- **October 16, 2024**: Resources shared for transitioning from Software Engineer to AI Data Scientist. 

Overall, the website connects with individuals interested in AI and software engineering while providing resources and insights from his experiences in the tech industry.

### Summarize More Websites

In [30]:
display_summary("https://cnn.com")

# CNN Overview

CNN is a major news outlet providing comprehensive coverage of global and national events. The website features various sections including:

- **Breaking News**: Real-time updates on important occurrences such as the Ukraine-Russia war and the Israel-Hamas conflict.
- **Politics**: Coverage of significant political events, including updates on the Trump administration and contentious issues like tariffs and immigration policies.
- **Business and Economy**: Reports on market activities and economic trends, including major drops in stock indices and corporate financial news.
- **Health**: Articles addressing public health concerns like vaccination rates and healthcare system challenges.
- **Entertainment and Lifestyle**: Updates on celebrity news, cultural trends, and lifestyle content.
- **Science and Technology**: Insights into scientific discoveries, technological advancements, and space exploration.

### Recent Highlights:
- Major stock market decline following trade tensions with China.
- Charge against UK comedian Russell Brand for sexual assault.
- Discussions surrounding the impacts of Trump's administration on the economy.
- Concerns over healthcare policies impacting senior citizens and increasing deportation rates.

### Featured Articles:
- **Analysis of Economic Policies**: Discussions on the ramifications of Trump's tariffs and their effects on American consumers and the global market.
- **Weather Reports**: Severe weather warnings for central US states, predicting significant flooding.
- **Health Warnings**: Reports on the declining vaccination rates for measles, raising alarms among health officials.

This summary encapsulates the diverse content CNN offers, providing readers with a window into current events across various sectors.

In [31]:
display_summary("https://anthropic.com")

# Summary of "Just a moment..."

The website prompts users to enable JavaScript and cookies to access its content. There are no specific news or announcements provided on the page.
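
The summary above describes a "Just a moment..." page that asks for JavaScript and cookies, which suggests the scraper received an interstitial page and not the site's real content. A rough heuristic can flag such pages before they are sent for summarizing. This is only a sketch: `looks_like_challenge_page` and the marker phrases are assumptions for illustration, and plain substring matching can produce false positives.

```python
# Assumption: phrases that often appear on JavaScript or cookie interstitial pages
CHALLENGE_MARKERS = ("just a moment", "enable javascript", "checking your browser")


def looks_like_challenge_page(title, text):
    """Return True if the title or text contains a known interstitial phrase."""
    haystack = f"{title}\n{text}".lower()
    return any(marker in haystack for marker in CHALLENGE_MARKERS)


print(looks_like_challenge_page("Just a moment...", "Enable JavaScript and cookies to continue"))
print(looks_like_challenge_page("Home - Edward Donner", "Well, hi there."))
```

A caller could check `looks_like_challenge_page(website.title, website.text)` and skip the API call when it returns `True`.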

## Try Yourself

### Email Subject Line

Write something that takes the contents of an email and suggests an appropriate short subject line for it.

In [32]:
# Just going to create some misc email samples

order_question = "Thank you for your purchase! \nHi Susan, we're getting your order ready to be shipped. " \
"We will notify you when it has been sent. \nOrder Summary\nOrder 8675309\nArduino Starter Kit x 1"


course_info = "Hello! It's Angela here, your instructor on the 100 Days of Python course. " \
"I have something really exciting to show you. For this entire year, I've been hard at work " \
"creating the most interactive programming course in the world. And thanks to the new " \
"features on Auditorium, I'm getting closer to my vision of creating fully immersive courses " \
"that are all about learning through doing. \nAnd as an existing user of Auditorium, you get to " \
"have early access to this incredible course - Learn Javascript in 1 Month. \nWith over 30 hours " \
"of interactive content, you'll feel more like you're playing a puzzle game than learning a " \
"programming language. There are over 200 coding exercises, 20+ quizzes, 15+ code visualisations, " \
"20+ projects and much much more waiting for you to discover.\nUse my coupon PRETENDCODE and get " \
"this entire course for $9.99!"



In [33]:
# function to create User Prompt from email content samples

def email_user_prompt(email):
    user_prompt = "You are looking at the body of an email"
    # Join both pieces with a line continuation; a bare string on its own line is silently discarded
    user_prompt += "\nThe contents of the email are as follows: " \
        "please suggest an appropriate and short subject line for this email.\n\n"
    user_prompt += email
    return user_prompt

In [34]:
# Step 1: Create your prompts

system_prompt_em = "You are an assistant that reads the contents of an email, " \
"and suggests an appropriate short subject line for that email. Respond in markdown"

# Build the messages list for an email, using the email-specific user prompt function
def make_msg(email):
    return [
        {
            "role": "system",
            "content": system_prompt_em
        },
        {
            "role": "user",
            "content": email_user_prompt(email)
        }
    ]


In [35]:
make_msg(order_question)

[{'role': 'system',
  'content': 'You are an assistant that reads the contents of an email, and suggests an appropriate short subject line for that email. Respond in markdown'},
 {'role': 'user',
  'content': "You are looking at the body of an email\nThe contents of the email are as follows: please suggest an appropriate and short subject line for this email.\n\nThank you for your purchase! \nHi Susan, we're getting your order ready to be shipped. We will notify you when it has been sent. \nOrder Summary\nOrder 8675309\nArduino Starter Kit x 1"}]

In [36]:
def subject_getter(email):
    response = openai.chat.completions.create(
        model = "gpt-4o-mini",
        messages = make_msg(email)
    )
    return response.choices[0].message.content

In [37]:
subject_getter(order_question)

'**Subject: Order Confirmation for Arduino Starter Kit**'

In [38]:
subject_getter(course_info)

'```markdown\nSubject: Early Access to Exciting New Course: Learn Javascript for $9.99!\n```'
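
The two responses above come back in different wrappers: one in bold with a `Subject:` label, the other inside a markdown code fence. If only the plain subject text is wanted, a small post-processing step can normalize both. The sketch below assumes those two shapes are the only ones to handle; `clean_subject` is a hypothetical helper, not part of the OpenAI library.

```python
import re


def clean_subject(raw):
    """Strip a surrounding code fence, bold markers, and a leading 'Subject:' label."""
    text = raw.strip()
    # Remove a surrounding fence with an optional language tag
    fence = re.match(r"^```[a-zA-Z]*\n(.*?)\n?```$", text, flags=re.DOTALL)
    if fence:
        text = fence.group(1).strip()
    # Remove emphasis markers wrapped around the whole line
    text = text.strip("*_ ")
    # Remove the label itself, case-insensitively
    text = re.sub(r"^subject:\s*", "", text, flags=re.IGNORECASE)
    return text.strip("*_ ")


bold_reply = "**Subject: Order Confirmation for Arduino Starter Kit**"
fenced_reply = "```markdown\nSubject: Early Access to Exciting New Course: Learn Javascript for $9.99!\n```"
print(clean_subject(bold_reply))
print(clean_subject(fenced_reply))
```

Another option is to drop "Respond in markdown" from the system prompt, since a subject line is plain text; the model's output format may still vary between runs.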