In [1]:
# imports

import os
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

# Connecting to OpenAI (or Ollama)

The next cell is where we load in the environment variables in your `.env` file and connect to OpenAI.  

We will use Ollama in another notebook.

## Troubleshooting if you have problems:

If you get a `NameError` - have you run all cells from the top down? Head over to the Python Foundations guide for a bulletproof way to find and fix every `NameError`.
Any concerns about API costs? See my notes in the README - costs should be minimal, and you can control it at every point. You can also use Ollama as a free alternative, which we discuss soon.

In [3]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


# Let's make a quick call to a Frontier model to get started, as a preview!

In [4]:
# To give you a preview -- calling OpenAI with these messages is this easy. Any problems, head over to the Troubleshooting notebook.

message = "Hello, GPT! This is my first ever message to you! Hi!"

messages = [{"role": "user", "content": message}]

messages


[{'role': 'user',
  'content': 'Hello, GPT! This is my first ever message to you! Hi!'}]

In [5]:
openai = OpenAI()

response = openai.chat.completions.create(model="gpt-5-nano", messages=messages)
response.choices[0].message.content

'Hi there! Nice to meet you—welcome to the chat.\n\nI’m here to help with questions, explanations, writing and editing, coding, planning, brainstorming, translations, and more. What would you like to do today? If you’re not sure, tell me a bit about your interests or a task you have in mind, and I’ll tailor my help.'

In [10]:
# Reuse the `openai` client created in the previous cell, this time adding a system message

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short poem about Boston in 3 lines."}
    ]
)

print(response.choices[0].message.content)


In Boston's streets where history breathes,  
Brick pathways whisper tales of old,  
As harbor winds share secrets bold.


## OK onwards with our first project

In [25]:
url = "https://www.salesforce.com/in/?ir=1"

In [31]:
import requests
from bs4 import BeautifulSoup

def fetch_website_contents(url):
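    # Fetch the page; the timeout stops the cell hanging forever on a stalled connection
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, "html.parser")

    # Extract the text of every <p> element
    text = " ".join(p.get_text() for p in soup.find_all("p"))
    website = text[:1000]  # keep only the first 1,000 characters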
    return website
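
The function above keeps only the text inside `<p>` tags, so headings, list items and the page title are dropped. Here is a minimal sketch of a slightly more thorough extraction, assuming you want all visible body text. The names `extract_visible_text` and `sample_html` are made up for illustration and are not part of this notebook; the sketch works on an HTML string so you can try it without a network call.

```python
from bs4 import BeautifulSoup

def extract_visible_text(html, max_chars=1000):
    """Return the page title plus visible body text, truncated to max_chars."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title and soup.title.string else "No title found"
    if soup.body is None:
        return title[:max_chars]
    # Remove elements whose contents are never shown as readable text
    for tag in soup.body.find_all(["script", "style", "img", "input"]):
        tag.decompose()
    text = soup.body.get_text(separator="\n", strip=True)
    return (title + "\n\n" + text)[:max_chars]

# Hypothetical sample page, used only to demonstrate the function
sample_html = """
<html><head><title>Demo</title></head>
<body><h1>Welcome</h1><script>var x = 1;</script><p>Hello there.</p></body></html>
"""
print(extract_visible_text(sample_html))
```

To use this idea with a live page, you could pass `response.text` from `requests.get` into the function instead of `sample_html`.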

## Types of prompts

You may know this already - but if not, you will get very familiar with it!

Models like GPT have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to

In [19]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = """
You are a snarky assistant that analyzes the contents of a website,
and provides a short, snarky, humorous summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

In [20]:
# Define our user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.

"""

## Messages

The API from OpenAI expects to receive messages in a particular structure.
Many of the other APIs share this structure:

```python
[
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"}
]
```
To give you a preview, the next cell makes a rather simple call - we won't stretch the mighty GPT (yet!)

In [None]:
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

response = openai.chat.completions.create(model="gpt-4.1-nano", messages=messages)
response.choices[0].message.content
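
A typo in a role name or a missing `content` key is an easy mistake when building these lists by hand. Here is a rough sketch of a sanity check you could run before calling the API. `check_messages` and `ALLOWED_ROLES` are hypothetical names invented for this example, and the check assumes plain text messages with only the three common roles; the real API accepts more than this.

```python
# Assumption: only the common text-chat roles; the API itself supports others
ALLOWED_ROLES = {"system", "user", "assistant"}

def check_messages(messages):
    """Return a list of problems found in a chat messages list (empty if none)."""
    if not isinstance(messages, list):
        return ["messages should be a list of dicts"]
    problems = []
    for i, m in enumerate(messages):
        if not isinstance(m, dict):
            problems.append(f"message {i} is not a dict")
            continue
        if m.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i} has an unexpected role: {m.get('role')!r}")
        if not isinstance(m.get("content"), str):
            problems.append(f"message {i} has no text content")
    return problems

good = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is 2 + 2?"},
]
bad = [{"role": "usr", "content": "What is 2 + 2?"}, {"role": "user"}]

print(check_messages(good))
print(check_messages(bad))
```

An empty list means the structure looks fine; anything else tells you which message to fix.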

## And now let's build useful messages for GPT-4.1-mini, using a function

In [40]:
# See how this function creates exactly the format above

def messages_for(site):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + site}
    ]

In [41]:
# Try this out, and then try for a few more websites
site = fetch_website_contents(url)
messages_for(site)

[{'role': 'system',
  'content': '\nYou are a snarky assistant that analyzes the contents of a website,\nand provides a short, snarky, humorous summary, ignoring text that might be navigation related.\nRespond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n'},
 {'role': 'user',
  'content': '\nHere are the contents of a website.\nProvide a short summary of this website.\nIf it includes news or announcements, then summarize these too.\n\n                                      \nLimited time offer for Indian SMBs - 40% off Sales Cloud, Marketing Cloud & Service Cloud.                            \n \n                    Agentforce transforms Sales, Service, Commerce, Marketing, IT, and more by uniting apps, data,\n                    and\n                    agents on one trusted platform. Now every department is an engine for growing customer success — from\n                    Sales following up on every lead instantly to Service delivering 24/7 e

## Time to bring it together - the API for OpenAI is very simple!

In [42]:
# And now: call the OpenAI API. You will get very familiar with this!

def summarize(url):
    website = fetch_website_contents(url)
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(website),
    )
    return response.choices[0].message.content

In [43]:
summarize("https://edwarddonner.com")

'# Ed’s Nerdy Playground\n\nWelcome to Ed’s corner of the internet, where coding, LLM obsessions, and *very* amateur electronic music collide. Ed claims he’s the co-founder and CTO of Nebula.io, which is basically using AI to help people find their life’s purpose — no big deal. Previously, he was busy selling his AI startup, untapt, in 2021.\n\nIf you’re lucky enough to have survived his endless LLM monologues, you might’ve ended up taking his surprisingly popular Udemy courses (yeah, 400k students, no biggie). He promises barely to spam you if you’re visiting from there. Basically, it’s like hanging out with that one friend who knows too much about AI and won’t stop talking—but hey, now he’s charging you for it.'

In [44]:
# A function to display this nicely in the output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [45]:
display_summary("https://edwarddonner.com")

# Summary of Ed's Site

Welcome to Ed's digital playground where code meets chaos and LLMs dominate the conversation—whether you asked for it or not. Ed, the co-founder and CTO of Nebula.io, is on a mission to use AI for discovering human potential, which sounds way more noble than just chasing VC dollars. Previously, he founded an AI startup called untapt, which got snatched up in 2021—so yeah, he knows his stuff.

When he's not geeking out over large language models or drowning in Hacker News threads he barely comprehends, Ed makes amateur electronic music (brace yourself) and unexpectedly created Udemy courses that have somehow amassed 400,000 students worldwide. So if you stumbled here after one of those courses, congrats, you’re in the right place to get some occasional, genuinely valuable emails instead of spam.

No breaking news here, just a solid blend of tech expertise, AI evangelism, and a hint of humblebrag.

# Let's try more websites

Note that this will only work on websites that can be scraped using this simplistic approach.

Websites that are rendered with JavaScript, like React apps, won't show up. See the community-contributions folder for a Selenium implementation that gets around this. You'll need to read up on installing Selenium (ask ChatGPT!).

Also, websites protected with CloudFront (and similar) may give 403 errors - many thanks to Andy J for pointing this out.

But many websites will work just fine!
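
If a summary comes back empty or odd, it helps to know which of these cases you hit. Here is a minimal sketch of a helper that guesses the cause from the HTTP status code and the text you extracted. `diagnose_fetch` is a hypothetical name invented for this example, and the rules are simple heuristics rather than a reliable detector.

```python
def diagnose_fetch(status_code, extracted_text):
    """Guess why a simple requests + BeautifulSoup scrape returned little or nothing."""
    if status_code in (401, 403):
        return "blocked: the server refused the request (often bot protection)"
    if status_code >= 400:
        return f"failed: HTTP {status_code}"
    if not extracted_text.strip():
        # A successful response with no readable text often means the
        # content is filled in later by JavaScript running in the browser
        return "empty: the page may be rendered with JavaScript"
    return "ok"

print(diagnose_fetch(403, ""))
print(diagnose_fetch(200, ""))
print(diagnose_fetch(200, "Welcome to my site"))
```

With `requests`, the status code is available as `response.status_code`, so you could call this right after fetching a page and before sending anything to the model.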

In [46]:
display_summary("https://cnn.com")

### CNN Website Summary:  
Ah yes, CNN, the place where news never sleeps and scrolling through headlines is a national pastime. This site basically offers you a 24/7 news buffet with a sprinkle of QR codes to addict you to their app—because who doesn't want breaking news in their pocket at all hours? Classic corporate footer vibes included, reminding you that they own this news kingdom till 2026 and beyond. No major announcements here, just the usual "Here’s our brand, and yeah, download our app!" pitch. Stay informed or mildly overwhelmed, your choice.

In [47]:
display_summary("https://anthropic.com")

### Anthropic: AI with a Conscience

So, Anthropic is like the *Mother Teresa* of AI companies—aiming to make sure AI doesn't go all "Skynet" on us. They’ve cooked up “Claude,” apparently the best AI model for coding, agents, and enterprise workflows, because why settle for average when you can have AI that's basically your overachieving office buddy?

Their vibe? Bold innovation *plus* thoughtful pauses to avoid turning humanity into a cautionary tale. They fancy themselves as champions of responsible AI development, juggling research, policy, and product design like a tech-savvy circus act.

Oh, and yes, cookies—they do those too, mostly to stalk you for a better user experience (or market you stuff). Because what's the internet without some digital crumbs?

In [None]:
# Step 1: Create your prompts

system_prompt = "something here"
user_prompt = """
    Lots of text
    Can be pasted here
"""

# Step 2: Make the messages list

messages = [] # fill this in

# Step 3: Call OpenAI
# response =

# Step 4: print the result
# print(