# Lab 1: Summarize a website on the server

### Import libraries

In [1]:
import os                                 # Standard library to access system environment variables
from dotenv import load_dotenv            # Loads environment variables from a .env file
from scraper import fetch_website_contents # Custom module to scrape and extract website content
from IPython.display import Markdown, display # Utilities for rich text rendering in Jupyter Notebooks
from openai import OpenAI                 # Official OpenAI client for API interactions

# Initialize environment variables and the OpenAI client before proceeding.
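The `scraper` module imported above is a custom module and its source is not shown in this notebook. As a rough sketch of the kind of work a helper like `fetch_website_contents` might do after downloading a page, the block below extracts visible text from an HTML string using only the standard library. The names `TextExtractor`, `extract_visible_text`, and `SKIPPED_TAGS` are hypothetical, and the real module may work quite differently (for example, it may use a third-party HTML parser).

```python
from html.parser import HTMLParser

# Assumption: text inside these tags is not useful for a summary.
SKIPPED_TAGS = {"script", "style", "nav", "noscript"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_visible_text(html):
    """Hypothetical helper: return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample_html = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    "<body><nav>Home | About</nav><p>Hello, world.</p></body></html>"
)
print(extract_visible_text(sample_html))
```

Dropping `nav` content here mirrors the system prompt's instruction to ignore navigation text, although in this notebook that filtering is ultimately left to the model.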

### Load environment variables

In [2]:
# Load environment variables from a file called .env

load_dotenv(override=True)                 # Load .env file and overwrite existing environment variables
api_key = os.getenv('OPENAI_API_KEY')      # Retrieve the OpenAI API key from environment variables
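The cell above retrieves the key but does not check it. A minimal sketch of a sanity check follows; the function name `describe_api_key` is hypothetical, and the `sk-` prefix test is a heuristic based on the usual format of OpenAI keys, not a guarantee that the key is valid.

```python
import os


def describe_api_key(api_key):
    """Return a short diagnostic message for an API key value (heuristics only)."""
    if not api_key:
        return "No API key was found; check that .env defines OPENAI_API_KEY."
    if api_key != api_key.strip():
        return "The API key has leading or trailing whitespace; remove it in .env."
    if not api_key.startswith("sk-"):
        # Assumption: OpenAI keys normally begin with "sk-".
        return "An API key was found, but it does not start with 'sk-'; check that it is the right key."
    return "API key found and it looks plausible."


print(describe_api_key(os.getenv("OPENAI_API_KEY")))
```

Only a real API call can confirm that the key is accepted; this check catches the common setup mistakes early.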

In [3]:
# With no arguments, the client reads OPENAI_API_KEY from the environment
openai = OpenAI()

### Types of prompts

Models like GPT have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to

In [4]:
system_prompt = """
You are a detail-oriented assistant that analyzes the contents of a website,
and provides a summary, ignoring text that might be navigation related.
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.

"""

In [5]:
user_prompt_prefix = """
Here are the contents of a website.
Provide a short summary of this website.
If it includes news or announcements, then summarize these too.

"""

### And now let's build useful messages for GPT-4.1-mini, using a function

In [6]:
# Build the list of messages the API expects: a system message followed by a user message

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website}
    ]
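To see the structure this produces, here is a small self-contained sketch. It uses shortened stand-in prompts (`demo_system_prompt`, `demo_user_prompt_prefix`) and a copy of the function under the name `demo_messages_for`, all hypothetical, so that it does not overwrite the notebook's own variables.

```python
# Shortened stand-ins for the notebook's prompts, so this cell runs on its own.
demo_system_prompt = "You summarize website contents in markdown."
demo_user_prompt_prefix = "Here are the contents of a website.\n"


def demo_messages_for(website):
    """Same shape as messages_for: one system message, then one user message."""
    return [
        {"role": "system", "content": demo_system_prompt},
        {"role": "user", "content": demo_user_prompt_prefix + website},
    ]


demo_messages = demo_messages_for("Example page text goes here.")
for message in demo_messages:
    print(f"{message['role']}: {message['content']!r}")
```

The website text is simply appended to the user prompt prefix, so the model sees the instructions first and the page contents after them.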

### Call API

In [7]:
def summarize(url):
    """
    Fetches content from a given URL and generates a summary using a GPT model.
    """
    # Scrape or pull the text content from the specified website URL
    website = fetch_website_contents(url)
    
    # Request a summary from the OpenAI Chat Completions API
    # "gpt-4.1-mini" is an OpenAI model name; substitute another chat model (e.g., "gpt-4o-mini") if you prefer
    response = openai.chat.completions.create(
        model = "gpt-4.1-mini",
        messages = messages_for(website) # Construct the prompt with the fetched content
    )
    
    # Extract and return the final text content from the model's response
    return response.choices[0].message.content
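The last line of `summarize` walks a nested response object: a list of choices, each holding a message whose `content` is the text. In the OpenAI Python SDK that `content` field can be `None`, which `Markdown(...)` would not render usefully. The sketch below uses `types.SimpleNamespace` to mimic the shape of a response without calling the API; the helper name `response_text` and the fallback string are assumptions for illustration.

```python
from types import SimpleNamespace


def response_text(response, fallback="(The model returned no text.)"):
    """Return the first choice's message content, or a fallback if it is empty or None."""
    content = response.choices[0].message.content
    return content if content else fallback


def fake_response(content):
    """Build a stand-in object with the same attribute path as a chat completion."""
    message = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


print(response_text(fake_response("## Summary\nA short page.")))
print(response_text(fake_response(None)))
```

In the notebook, the same guard could be applied to the value returned by `openai.chat.completions.create(...)` before passing it to `display(Markdown(...))`.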

In [8]:
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [10]:
display_summary("https://www.roy-lee-ai.com/about")

# Summary of Roy Lee's Website

This website serves as a personal and professional introduction to Roy Lee, detailing his educational background, career experiences, and interests.

## Key Information:

- **Background**: Roy Lee holds a B.S. in Data Analytics from Ohio State University and is currently pursuing an M.S. in Computer Science specializing in AI at Georgia Institute of Technology. He has work authorization in the U.S. and is seeking a full-time position or internship in AI/LLM.

- **Education at Ohio State**: Took extensive coursework in math, statistics, and computer science. Participated in research and internships that helped build a foundation for data science and AI engineering. Emphasized the value of teamwork learned from these experiences.

- **Career Experiences**:
  - **Turing** - LLM Trainer: Created and managed complex training instructions and scenarios for foundational Large Language Models, including client-specific model training focused on instruction-following and reasoning.
  - **Uber AI Solutions** - AI Trainer: Annotated data for Reinforcement Learning with Human Feedback (RLHF), performed quality assurance, and translated data into actionable business insights.
  - **Honda America Inc.** - Student ML Researcher: Built machine learning models to forecast profits, handled datasets with high missing data rates, and presented findings to senior managers.
  - **Fisher College of Business at Ohio State** - Research Assistant: Worked on a machine learning project involving human name recognition using Python and the spaCy NLP library.

- **Personal Interests**: Working out, creating digital content, reading, and watching Netflix.

## News or Announcements:
No specific news or announcements are provided on the website. The content is primarily a professional portfolio and background summary.