In [5]:
# imports

import os
from dotenv import load_dotenv
from web_scraper import fetch_website_contents
from IPython.display import Markdown, display
from openai import OpenAI

In [6]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Checking the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
# Check for stray whitespace first; a leading space would otherwise be reported as a wrong prefix
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start with sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!
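
For reuse outside the notebook, the same checks can be wrapped in a small function that returns the message instead of printing it. This is only a sketch: the name `describe_api_key` and the shortened messages are hypothetical, and the `sk-proj-` prefix is simply the one the cell above expects.

```python
from typing import Optional


def describe_api_key(api_key: Optional[str]) -> str:
    """Return a short verdict on an API key string (hypothetical helper)."""
    if not api_key:
        return "No API key was found"
    # Whitespace is checked before the prefix, so a leading space
    # is not misreported as a wrong prefix.
    if api_key.strip() != api_key:
        return "API key has leading or trailing whitespace"
    if not api_key.startswith("sk-proj-"):
        return "API key does not start with sk-proj-"
    return "API key found and looks good so far!"


# Made-up key, used only to exercise the function
print(describe_api_key("sk-proj-example"))
```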


In [11]:
# To give you a preview -- calling OpenAI with these messages is this easy.

message = "Hello, GPT! This is my first ever message to you! Hi!"

messages = [{"role": "user", "content": message}]


In [8]:
messages

[{'role': 'user',
  'content': 'Hello, GPT! This is my first ever message to you! Hi!'}]

In [9]:
# Initialize the OpenAI client (it reads OPENAI_API_KEY from the environment)
openai = OpenAI()

In [None]:
response = openai.chat.completions.create(model="gpt-5-nano", messages=messages)
response

In [7]:
response.choices[0].message.content

'Hi there! Welcome — great to meet you. I’m here to help with questions, writing, learning new topics, planning, coding, and lots of other things.\n\nWhat would you like to do today? If you’re not sure, here are some ideas:\n- Explain something simple or hard (in plain language or with examples)\n- Help draft an email, essay, or message\n- Brainstorm ideas or plan a project\n- Solve a math problem or explain concepts\n- Learn about a topic (science, history, tech, etc.)\n- Get coding help or debugging tips\n- Translate or summarize text\n- Play a quick game or tell a joke\n\nTell me your goal or just say “surprise me,” and I’ll tailor my reply.'

Extract details from a website

In [16]:
celest = fetch_website_contents("https://www.ibm.com/think/topics/large-language-models/")
print(celest)

What Are Large Language Models (LLMs)? | IBM 

What are large language models (LLMs)?
Machine learning
Welcome
Caret right
Introduction
Overview
Machine learning types
Machine learning algorithms
Caret right
Data science for machine learning
Statistical machine learning
Linear algebra for machine learning
Uncertainty quantification
Bias variance tradeoff
Bayesian Statistics
Caret right
Feature Engineering
Overview
Feature selection
Feature extraction
Vector embedding
Latent space
Caret right
Dimensionality reduction
Principal component analysis
Linear discriminant analysis
Upsampling
Downsampling
Synthetic data
Data leakage
Caret right
Supervised learning
Overview
Caret right
Regression
Linear regression
Lasso regression
Ridge regression
State space model
Time series
Autoregressive model
Caret right
Classification
Overview
Decision trees
K-nearest neighbors (KNNs)
Naive bayes
Random forest
Support vector machine
Logistic regression
Caret right
Ensemble learning
Overview
Boosting
Baggin

## Prompt types explained simply

* **System prompt**: defines who the model is and the rules it must follow. Highest priority.
* **Developer prompt**: adds app-level behavior and formatting rules. Second-highest priority.
* **User prompt**: the actual question or task. Can be constrained by the system and developer prompts.
* **Assistant message**: the model’s previous replies. Used only for conversation context.
* **Tool message**: carries the result of a tool or function call (for example math or an API call) back to the model. Treated as information, not as instructions.

**Priority order**
System → Developer → User. Assistant and tool messages supply context and carry no instruction priority of their own.

In [18]:
# Define our system prompt
system_prompt = """
You are a snarky assistant that analyzes the contents of a website,
and provides a short, snarky, humorous summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
"""

In [19]:
# Define our user prompt

user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.

"""

**Messages (what they are)**

`messages` is the **ordered list of conversation turns** you send to the model.
Each item has a **role** and **content**.

**Roles**

* `system` sets identity and rules
* `developer` sets app-level behavior
* `user` asks the task
* `assistant` stores prior model replies
* `tool` returns tool results

**Why order matters**
The model reads messages **top to bottom** and follows priority rules.

**Minimal structure**

```python
messages = [
    {"role": "system", "content": "..."},
    {"role": "developer", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
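    # A tool message answers a tool call from the preceding assistant turn and must carry its id
    {"role": "tool", "tool_call_id": "...", "content": "..."}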
]
```

**Mental model**
Messages = conversation + rules + memory in one list.
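
To make the "memory" part concrete, here is a minimal sketch of how a second turn is built: the model's earlier reply is appended as an `assistant` message, followed by the new `user` message. The helper name `next_turn` and the follow-up question are made up for illustration, and no API call is made.

```python
def next_turn(messages, assistant_reply, user_message):
    """Return a new list with the prior reply and the next user message appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": user_message},
    ]


history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is 2 + 2?"},
]

# Pretend the model has answered; in the notebook this text would come from
# response.choices[0].message.content.
history = next_turn(history, "2 + 2 equals 4.", "And if you double that?")

for m in history:
    print(m["role"], "->", m["content"])
```

With Chat Completions the whole list is sent again on every call, so the list itself is the only conversation state the model sees.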


In [20]:
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

response = openai.chat.completions.create(model="gpt-4.1-nano", messages=messages)
response.choices[0].message.content

'2 + 2 equals 4.'

Use the web-scraping function to give ChatGPT a task.

In [21]:
# See how this function creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]

In [22]:
# Try it out on the previously scraped website
messages_for(celest)

[{'role': 'system',
  'content': '\nYou are a snarky assistant that analyzes the contents of a website,\nand provides a short, snarky, humorous summary, ignoring text that might be navigation related.\nRespond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n'},
 {'role': 'user',
  'content': '\nHere are the contents of a website.\nProvide a short summary of this website.\nIf it includes news or announcements, then summarize these too.\n\nWhat Are Large Language Models (LLMs)? | IBM \n\nWhat are large language models (LLMs)?\nMachine learning\nWelcome\nCaret right\nIntroduction\nOverview\nMachine learning types\nMachine learning algorithms\nCaret right\nData science for machine learning\nStatistical machine learning\nLinear algebra for machine learning\nUncertainty quantification\nBias variance tradeoff\nBayesian Statistics\nCaret right\nFeature Engineering\nOverview\nFeature selection\nFeature extraction\nVector embedding\nLatent space\nCaret ri
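
The structure is easier to see with a short stand-in string instead of a full scraped page. The sketch below uses shortened, hypothetical copies of the prompts (`demo_system_prompt`, `demo_user_prefix`) and a hypothetical `demo_messages_for` so that it runs on its own and does not overwrite the notebook's variables; the sample page text is invented.

```python
demo_system_prompt = "You are a snarky assistant that summarizes websites."
demo_user_prefix = (
    "Here are the contents of a website.\n"
    "Provide a short summary of this website.\n\n"
)


def demo_messages_for(website):
    # Same shape as messages_for: one system message, then one user message
    return [
        {"role": "system", "content": demo_system_prompt},
        {"role": "user", "content": demo_user_prefix + website},
    ]


demo = demo_messages_for("Example Domain\nThis domain is for use in examples.")
for m in demo:
    print(f"{m['role']:>6}: {len(m['content'])} characters")
```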

Merge it all together

In [23]:
# And now: call the OpenAI API.
def summarize(url):
    website = fetch_website_contents(url)
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(website)
    )
    return response.choices[0].message.content

In [24]:
summarize("https://www.ibm.com/think/topics/large-language-models/")

'Ah, the IBM guide to Large Language Models (LLMs), or as I like to call it, “Machine Learning for the Overachievers.” This site is basically a cornucopia of machine learning goodness, from baby steps like linear regression all the way to the rockstars: transformers, attention mechanisms, and generative AI. It’s like the ultimate nerd handbook—complete with every algorithm, trick, and buzzword you never knew you needed to memorize.\n\nNo exciting news or earth-shattering announcements here—just a meticulously organized laundry list of ML topics that quietly says, “If you’re not overwhelmed yet, just keep scrolling.” Perfect for folks who want to wander through the entire jungle of machine learning without ever leaving the page. In other words: textbook-level info served with an IBM polish and zero chill.'
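
One way to exercise this flow without network access or API cost is to pass the client and the fetch function in as arguments and substitute stand-ins. The following is a sketch under stated assumptions: `summarize_with`, `fake_create`, and `fake_client` are hypothetical names, the prompt text is shortened, and the fake objects only imitate the `choices[0].message.content` path used above, not the full response object.

```python
from types import SimpleNamespace


def summarize_with(client, fetch, url, model="gpt-4.1-mini"):
    """Same flow as summarize, with the client and fetcher passed in."""
    website = fetch(url)
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Summarize the website briefly."},
            {"role": "user", "content": website},
        ],
    )
    return response.choices[0].message.content


def fake_create(model, messages):
    # Stand-in for the real API call: reports the size of the user content
    text = f"[{model}] summary of {len(messages[-1]['content'])} characters"
    message = SimpleNamespace(role="assistant", content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Mimics the attribute path client.chat.completions.create
fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=fake_create))
)

result = summarize_with(
    fake_client,
    lambda url: "Hello from " + url,  # stand-in for fetch_website_contents
    "https://example.com",
)
print(result)
```

In the notebook, the same function could be called with the real `openai` client and `fetch_website_contents` in place of the stand-ins.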

In [25]:
# A function to display this nicely in the output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [27]:
display_summary("https://www.ibm.com/think/topics/large-language-models/")

# IBM’s “What Are Large Language Models (LLMs)?” – The Ultimate Machine Learning Buffet

Welcome to IBM’s sprawling encyclopedia where every corner is crammed with machine learning goodies—from the basics of regression to the wizardry of transformer models and generative AI. They basically threw every ML buzzword into one menu and called it a day.

If you were looking for a juicy news or announcement, tough luck—this is purely an educational feast. No plot twists or scandalous AI gossip here, just a textbook-level deep dive so granular it probably gives your brain a workout.

In short: If you want to know **everything** about machine learning and LLMs (like you’re planning to build the next ChatGPT in your basement), IBM has your back with chapters for days. If you want snappy news or dramatic AI updates... maybe check Twitter.