### Import libraries

In [1]:
import os                                 # Standard library to access system environment variables
from dotenv import load_dotenv            # Loads environment variables from a .env file
from scraper import fetch_website_contents # Custom module to scrape and extract website content
from IPython.display import Markdown, display # Utilities for rich text rendering in Jupyter Notebooks
from openai import OpenAI                 # Official OpenAI client for API interactions

# Initialize environment variables and the OpenAI client before proceeding.

### Load environment variables

In [2]:
# Load environment variables from a file called .env

load_dotenv(override=True)                 # Load .env file and overwrite existing environment variables
api_key = os.getenv('OPENAI_API_KEY')      # Retrieve the OpenAI API key from environment variables
openai = OpenAI(api_key=api_key)           # Create the client that summarize() uses to call the API
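
A quick sanity check on the key can save a confusing authentication error later. The sketch below is one possible check, not part of the original notebook; `describe_api_key` is a hypothetical helper name, and the rules it applies (key present, no stray whitespace) are assumptions about the most common `.env` mistakes.

```python
import os


def describe_api_key(key):
    """Return a short diagnostic message for an API key value (hypothetical helper)."""
    if not key:
        return "No API key found; check that .env defines OPENAI_API_KEY"
    if key != key.strip():
        return "API key has leading or trailing whitespace; remove it from .env"
    return "API key found"


# Reads the same variable the notebook loads from .env
print(describe_api_key(os.getenv("OPENAI_API_KEY")))
```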

### Types of prompts

Models like GPT have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to
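
In the OpenAI chat format, these two prompts are passed as a list of dictionaries, each with a `role` and a `content` key. A minimal sketch with placeholder text (the example strings are illustrative assumptions, not prompts from this notebook):

```python
# Placeholder prompts, for illustration only
example_system = "You are a helpful assistant."
example_user = "What is 2 + 2?"

# One dictionary per message: the system prompt first, then the user prompt
example_messages = [
    {"role": "system", "content": example_system},
    {"role": "user", "content": example_user},
]

for message in example_messages:
    print(message["role"], "->", message["content"])
```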

In [None]:
system_prompt = """
You are a detail-oriented assistant that analyzes the contents of a website,
and provides a summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.

"""

In [None]:
user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.

"""

### And now let's build useful messages for GPT-4.1-mini, using a function

In [15]:
# This function builds the system and user messages in the list-of-dictionaries format the OpenAI API expects

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]

In [17]:
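# Try this out, and then try for a few more websites

roy = fetch_website_contents("https://www.roy-lee-ai.com")  # Scrape the site first; same URL as the summarize() call further down
messages_for(roy)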

[{'role': 'system',
  'content': '\nYou are a kind assistant that analyzes the contents of a website,\nand provides a kind and detailed summary, ignoring text that might be navigation related.\nRespond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n'},
 {'role': 'user',
  'content': "\nHere are the contents of a website.\nProvide a short summary of this website.\nIf it includes news or announcements, then summarize these too.\nRoy Lee\n\nRoy Lee\nRoy Lee's playground\nHOME\nABOUT\nCATEGORIES\nTAGS\nARCHIVES\nHome\nRoy Lee\nCancel\nIntro to Quantization\nComplete Guide to LLM Quantization & Compression: Summary 1. Overview & Key Statistics Objective: Enable large-scale models (70B+) to run on standard gaming laptops or small GPU environ...\nDec 30, 2025\nLLM\nIntro to Multiprocessing\n1. Overview of Multiprocessing Key Concepts Definition: A technique that replicates processes to run independently within separate memory spaces. Structure: Each p

In [21]:
# And now: call the OpenAI API. You will get very familiar with this!

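def summarize(url):
    website = fetch_website_contents(url)      # Scrape the page text
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(website)         # System prompt + user prompt with the page text
    )
    return response.choices[0].message.content # Text of the first (and only) completion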
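
Because the system prompt asks for Markdown, the string returned by `summarize` reads better when rendered than when echoed as a raw string full of `\n` escapes. A minimal sketch of a rendering helper follows; `display_summary` and its `render` parameter are hypothetical names introduced here, and the summarizing function is passed in as an argument so the helper does not depend on notebook state.

```python
def display_summary(url, summarize_fn, render=None):
    """Summarize `url` with `summarize_fn` and render the Markdown result.

    `render` is a hypothetical hook for swapping the renderer; by default
    the text is shown with IPython's Markdown display.
    """
    if render is None:
        # Imported lazily so the helper can also be used outside Jupyter
        from IPython.display import Markdown, display

        def render(text):
            display(Markdown(text))

    render(summarize_fn(url))
```

In this notebook it could be called as `display_summary("https://www.roy-lee-ai.com", summarize)`.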

In [24]:
summarize("https://www.roy-lee-ai.com")

'# Website Summary: Roy Lee\'s Playground\n\nThis website serves as a personal blog and knowledge hub curated by Roy Lee, primarily focused on advanced topics related to machine learning (ML), large language models (LLMs), and programming concepts.\n\n## Main Content\n\n- **Tutorials and Guides:** The site hosts detailed articles like "Complete Guide to LLM Quantization & Compression," "Intro to Multiprocessing," "Introduction to Asynchronous Programming," and advice on "How to Choose the Right LLM."\n- **Educational Projects:** There is content aimed at beginners, such as "My First ML Project," which guides readers through their initial machine learning algorithm based on a noted Korean textbook.\n- **Focus Areas:** Key themes include LLM operations (LLMOps), asynchronous programming with asyncio, multiprocessing, quantization, and general machine learning principles.\n\n## News and Announcements\n\n- The site notes when content has been recently updated, highlighting articles such as