# Import the libraries

This section imports the website scraping helper and the Markdown rendering utilities for Jupyter Notebook.

Environment variable loading and the OpenAI client are imported in later cells, where they are first used.

In [None]:
from scraper_v1 import get_website_contents
from IPython.display import Markdown, display

# Connecting to OpenAI

This section loads environment variables from the `.env` file and connects to OpenAI using the `OPENAI_API_KEY` variable.

**Instructions**

1. Create a `.env` file in your project directory.

2. Add the following line to the file, replacing the value with your actual OpenAI API key: `OPENAI_API_KEY=sk-proj-xxxxxxx`

3. Run the cell below to load and verify the API key.

In [None]:
import os
from dotenv import load_dotenv

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

def validate_api_key(key: str) -> None:
    """Raise ValueError if the key is missing or malformed."""
    if not key:
        raise ValueError(
            "❌ No API key was found, please be sure to add your key to the .env file"
        )
    # Check whitespace first so a padded key gets the right error message
    if key.strip() != key:
        raise ValueError(
            "⚠️ API key has leading or trailing whitespace, please remove it."
        )
    if not key.startswith("sk-proj-"):
        raise ValueError(
            "⚠️ API key was found, but it doesn't start with sk-proj-"
        )
    print("✅ API key found and looks good!")

validate_api_key(api_key)
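
When checking that a key loaded correctly, it helps to avoid printing the full secret into the notebook output. The sketch below is a hypothetical helper (`mask_key` is not part of the original notebook) that shows only the first and last few characters.

```python
def mask_key(key: str, visible: int = 4) -> str:
    """Hide everything except the first and last `visible` characters."""
    # Very short values are fully masked so nothing meaningful leaks
    if len(key) <= visible * 2:
        return "*" * len(key)
    return f"{key[:visible]}...{key[-visible:]}"


# Illustrative value only, not a real key
print(mask_key("sk-proj-abcdefghijklmnop"))
```

Calling `mask_key(api_key)` after validation confirms which key is in use without exposing it in a saved notebook.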


# Quick Preview

In [None]:
# Prepare the message
message = "Hello, ChatGPT! This is my first ever message to you! Hi!"
messages = [
    {
        "role": "user",
        "content": message
    }
]

In [None]:
from openai import OpenAI

# Initialize the OpenAI client
openai = OpenAI()

# Call the model to generate a response
response = openai.chat.completions.create(
    model="gpt-5-nano",
    messages=messages
)

# Extract the model's reply
reply = response.choices[0].message.content

In [None]:
# User message
message

In [None]:
# Model message
reply

# Collect the content

This section retrieves the raw text content from the specified website using the `get_website_contents` function.

It sends a request to the target URL, collects the returned data, and prints it so that you can inspect what the model will analyze in later steps.

In [None]:
content = get_website_contents("https://example.com/")

In [None]:
content
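
The implementation of `get_website_contents` lives in `scraper_v1` and is not shown in this notebook. As a rough illustration of what "raw text content" means, the sketch below extracts visible text from an HTML string using only the standard library. The names `TextExtractor` and `html_to_text` are assumptions made for this example, and the real scraper may work differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> blocks."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is outside script/style blocks
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Hello</h1><p>World</p><script>var x = 1;</script></body></html>"
)
print(html_to_text(sample))
```

Only the heading and paragraph text survive, which is the kind of input the model is expected to summarize.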

# Types of prompts

Large language models like ChatGPT are trained to process requests that follow a standard prompt structure, which helps keep their responses consistent and accurate.

There are two main types of prompts:

- **System prompt**: instructions that define the task scope, the operating context, the model's role, and the response style it should maintain throughout the interaction.

- **User prompt**: content provided directly by the user. This is the primary input that the model analyzes to generate a response.

Choosing the right prompts and clearly defining both system and user instructions is crucial for obtaining accurate, relevant, and context-aware responses from the model.

## Define system prompt

In [None]:
# Define our system prompt

system_prompt = """
You are a highly technical assistant that analyzes the content of a website.
Identify the main topics, summarize key information, and highlight any important updates or announcements.
Ignore navigation menus, ads, or unrelated boilerplate text.
Provide your response clearly in markdown. Do not wrap the markdown in a code block, respond directly with markdown.
"""

## Define user prompt

In [None]:
# Define our user prompt

user_prompt = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news, announcements, or updates, summarize these as well.
"""

# Messages

When using the OpenAI API, the input must be a list of messages, each with a `role` and a `content` field:

```json
[
    {"role": "system", "content": "system message"},
    {"role": "user", "content": "user message"}
]
```

This structure allows the model to distinguish between system context and user instructions, enabling it to generate accurate and context-aware responses.
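
A small check can catch malformed message lists before they are sent. The sketch below is a hypothetical helper (`check_messages` is not part of the OpenAI library) and is deliberately strict: it accepts only the `system`, `user`, and `assistant` roles with string content, although the API itself accepts more than this.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_messages(messages):
    """Raise ValueError if the list does not match the basic role/content shape."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    for index, item in enumerate(messages):
        if not isinstance(item, dict):
            raise ValueError(f"message {index} is not a dict")
        if item.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has an unexpected role: {item.get('role')!r}")
        if not isinstance(item.get("content"), str):
            raise ValueError(f"message {index} needs a string 'content'")
    return True


example = [
    {"role": "system", "content": "system message"},
    {"role": "user", "content": "user message"},
]
print(check_messages(example))
```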

In [None]:
# Define the conversation messages
messages = [
    {"role": "system", "content": "You are a snarky assistant."},
    {"role": "user",   "content": "What is 2 + 2?"}
]

In [None]:
# Send the request to the OpenAI API
response = openai.chat.completions.create(
    model="gpt-4.1-nano",
    messages=messages
)

In [None]:
# Extract the model's reply
reply = response.choices[0].message.content
reply

# Build messages using a function

This function automatically constructs the messages structure in the format required by the API.

Simply pass the content to the function, which then combines it with the predefined system and user prompts to generate a complete message payload ready for the OpenAI API.

In [None]:
# Build messages using a function
def messages_for(raw_content):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt + raw_content}
    ]

At this point, we can preview how the message looks when generated using sample content.

In [None]:
messages_for(content)
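
Website content can be long, so the raw output of `messages_for` may be hard to scan. The sketch below is a hypothetical helper (`preview_messages` is an assumption, not part of the notebook) that prints one shortened line per message.

```python
def preview_messages(messages, limit=60):
    """Return one line per message: the role plus a shortened content."""
    lines = []
    for item in messages:
        # Collapse newlines and repeated spaces so each message fits on one line
        content = " ".join(item["content"].split())
        if len(content) > limit:
            content = content[:limit] + "..."
        lines.append(f"{item['role']}: {content}")
    return "\n".join(lines)


sample_messages = [
    {"role": "system", "content": "You are a careful assistant."},
    {"role": "user", "content": "Summarize this page:\n" + "word " * 50},
]
print(preview_messages(sample_messages, limit=40))
```

In the notebook, `print(preview_messages(messages_for(content)))` would give the same compact view of the real payload.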

# The API for OpenAI is very simple

The OpenAI API is designed to be straightforward.

By providing a structured list of messages, you can send user input and system instructions to the model, and receive context-aware, high-quality responses.

## Call the OpenAI API

This function fetches the website content and sends it to the OpenAI API, using the `messages_for` function to build the request.

The model processes the input and returns a concise, context-aware summary.

In [None]:
# Get the content and send it to the OpenAI API
def summarize(url):
    raw_content = get_website_contents(url)
    raw_response = openai.chat.completions.create(
        model = "gpt-4.1-mini",
        messages = messages_for(raw_content)
    )
    return raw_response.choices[0].message.content

In [None]:
summarize("https://example.com/")
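
To follow the data flow without spending API credits, the same call pattern can be run against a stand-in client. The sketch below is an assumption-laden illustration: `FakeCompletions`, `fake_openai`, and `summarize_offline` are hypothetical names, and the fake only mimics the `chat.completions.create(...)` call and the `choices[0].message.content` shape used in this notebook.

```python
from types import SimpleNamespace


class FakeCompletions:
    """Stand-in for the completions endpoint; it never touches the network."""

    def create(self, model, messages):
        user_text = messages[-1]["content"]
        reply = f"[{model}] received {len(user_text)} characters"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Mirrors the attribute path client.chat.completions.create(...)
fake_openai = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))


def summarize_offline(text, client=fake_openai):
    """Build the messages, call the client, and unpack the reply."""
    messages = [
        {"role": "system", "content": "You summarize website content."},
        {"role": "user", "content": "Summarize this:\n" + text},
    ]
    response = client.chat.completions.create(model="fake-model", messages=messages)
    return response.choices[0].message.content


print(summarize_offline("Example Domain"))
```

Passing a real `OpenAI()` client and a real model name as `client` and `model` would follow the same path as `summarize`.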

## Display the response in a nice format

The `display_result` function wraps the response in Markdown formatting, making it easier to read directly in Jupyter Notebook.

In [None]:
# Output
def display_result(url):
    summary = summarize(url)
    display(Markdown(summary))

In [None]:
display_result("https://example.com")

In [None]:
display_result("https://cnn.com")

You may notice that if you try `display_result("https://openai.com")`, it doesn't work! That's because the site builds its content with JavaScript, so the page has to be rendered in a browser before its text can be read.

So you need to use **Selenium**, a hugely popular library that runs a browser behind the scenes, renders the page, and allows you to query it.

A version that uses **Selenium** is available in **scraper_v2.py**. Switch to it, then try `display_result("https://openai.com")`.
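
The contents of `scraper_v2.py` are not shown here. As a minimal sketch of the Selenium approach, the function below loads a page in headless Chrome and returns the text of its body. It assumes the `selenium` package and a local Chrome installation are available. The name `fetch_rendered_text` and the fixed pause are choices made for this example, not the author's implementation.

```python
def fetch_rendered_text(url: str, wait_seconds: float = 2.0) -> str:
    """Load a page in headless Chrome and return the text of its <body>."""
    # Imported lazily so this cell can be defined before Selenium is installed
    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(wait_seconds)  # crude pause to let scripts populate the page
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        driver.quit()  # always release the browser, even on errors
```

A fixed sleep is the simplest option. Selenium's explicit waits are more reliable when a specific element is known to appear once the page has loaded.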

In [None]:
display_result("https://openai.com")

# Summarization with LLMs

In this exercise, we explore how to call a **Frontier Model** through the OpenAI API to perform a classic **summarization task**.

Summarization is widely applicable in the real world, for example:

- Summarizing news articles
- Summarizing financial performance reports

You can prototype your own solution to see how AI can help your real-world workflows.

In [None]:
# Step 1: Create your prompts

system_prompt = "system prompt"
user_prompt = """
    user prompt
"""

In [None]:
# Step 2: Build the messages list

messages = []

In [None]:
# Step 3: Call OpenAI API

# Initialize the OpenAI client
openai = OpenAI()

# Call the model to generate a response
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages
)

In [None]:
# Step 4: Print the result

# Extract the model's reply
reply = response.choices[0].message.content

print(reply)

# Summarization of Emails

In this section, we will use the same approach to summarize email content and generate a short, appropriate subject line for the email. This is a common commercial application in email tools.

## Create your prompts

In [None]:
system_prompt = "You are a helpful assistant that summarizes emails and suggests concise subject lines."
user_prompt = """
Here is the content of an email:
[Paste the email content here]

Please provide a short and clear subject line for this email.
"""

## Build the function

In [None]:
def build_email_prompts(email_text: str) -> tuple[str, str]:
    """
    Build system and user prompts for summarizing an email.

    Args:
        email_text (str): The content of the email.

    Returns:
        tuple[str, str]: (system_prompt, user_prompt)
    """
    system_prompt = """
        You are a helpful assistant that summarizes emails and suggests concise subject lines.
        Provide your response clearly in markdown. Do not wrap the markdown in a code block, respond directly with markdown.
    """

    user_prompt = f"""
        Here is the content of an email:
        {email_text}

        Please provide a short and clear subject line for this email.
    """
    return system_prompt, user_prompt
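
The triple-quoted prompts in `build_email_prompts` carry the indentation of the source code into the text sent to the model. That is usually harmless, but the standard library's `textwrap.dedent` can remove it. The sketch below uses a hypothetical function name (`build_user_prompt`) and dedents the template before inserting the email, so that multi-line email text does not interfere with the dedent.

```python
from textwrap import dedent


def build_user_prompt(email_text: str) -> str:
    """Return the user prompt with source-code indentation removed."""
    template = dedent("""
        Here is the content of an email:
        {email_text}

        Please provide a short and clear subject line for this email.
    """).strip()
    # Substitute after dedenting so the email's own line breaks are preserved
    return template.format(email_text=email_text)


print(build_user_prompt("Hi team,\nThe Q3 report is attached."))
```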

## Build the messages list

In [None]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

## Call OpenAI API

In [None]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages
)

## See the output

In [None]:
subject_line = response.choices[0].message.content
print(subject_line)

## Summarization of emails

In [None]:
from openai import OpenAI

openai = OpenAI()

def build_email_prompts(email_text: str) -> tuple[str, str]:
    """
    Build system and user prompts for summarizing an email.

    Args:
        email_text (str): The content of the email.

    Returns:
        tuple[str, str]: (system_prompt, user_prompt)
    """
    system_prompt = """
    You are a helpful assistant that summarizes emails.
    Provide your response in Markdown format, including:
        - Email Content section with the full email
        - Suggested Subject Line section with the short, clear subject
    Do not wrap the Markdown in a code block.
    """

    user_prompt = f"""
        Here is the content of an email:
        {email_text}

        Please provide a short and clear subject line for this email.
    """
    return system_prompt, user_prompt


def summarize_email(email_content: str):
    """
    Summarize the content of an email and suggest a short subject line.

    Args:
        email_content (str): The full text of the email.

    Returns:
        Markdown: The model's Markdown response, including the suggested subject line.
    """
    system_prompt, user_prompt = build_email_prompts(email_content)

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]

    response = openai.chat.completions.create(
        model="gpt-5-mini",
        messages=messages
    )

    subject_line = response.choices[0].message.content

    return Markdown(subject_line)

## Testing with a sample email

In [None]:
email_text = """
Hi team,

I hope this message finds you well. As part of our quarterly review, I've compiled the Q3 financial report and attached it to this email. The report includes detailed information on revenue streams, expenditure breakdowns, and key performance indicators across all departments.

Please pay special attention to the marketing and R&D sections as they highlight ongoing projects and budget allocations for next quarter. We have also included an analysis of market trends and competitor activities that may impact our strategic planning.

Feel free to review the summary charts and financial tables at the beginning of each section, which provide a quick overview of the main points. If you have any questions, suggestions, or need additional clarifications, do not hesitate to reach out to me or the finance team directly.

Thank you for your attention and your continued hard work. Looking forward to discussing these results in our upcoming team meeting next week.

Best regards,
Le Tuan Binh
"""

summarize_email(email_text)