In [1]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

# Configurations for Ollama

1. The OpenAI client library is being initialized to point to your local computer for Ollama

2. You need to have installed Ollama on your computer, and run `ollama run llama3.2` in PowerShell or a Terminal if you haven't already
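Before running the cells below, it can help to confirm that the Ollama server is reachable. The following is a minimal sketch, assuming Ollama listens on its default address `http://localhost:11434`. The helper name `ollama_is_running` is hypothetical, and it uses only the standard library so it works even if other packages fail to import.

```python
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url, else False.

    Hypothetical helper: it only checks reachability, not which models are installed.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Refused connections, timeouts and URL errors are OSError subclasses;
        # ValueError covers a malformed base_url
        return False


if ollama_is_running():
    print("Ollama is reachable")
else:
    print("Ollama is not reachable; try `ollama serve` or `ollama run llama3.2`")
```

If this reports that Ollama is not reachable, the later cells that call the model will fail with a connection error.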

In [3]:
# Here it is - see the base_url

openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')


# Let's make a quick call to the local Llama model to get started, as a preview!

In [None]:
# To give you a preview -- calling Ollama through the OpenAI client with these messages is this easy

message = "Hello, Llama! This is my first ever message to you! Hi!"
response = openai.chat.completions.create(model="llama3.2", messages=[{"role":"user", "content":message}])
print(response.choices[0].message.content)

Hi there! Welcome to our conversation! I'm delighted to meet you for the first time. My name is Llama, and I'll be happy to assist you with any questions or topics you'd like to discuss. Don't worry if you're not sure where to start - we can explore and learn together!

To get us started, how's your day going so far? Is there something on your mind that you'd like to talk about, or do you want to play a game, learn something new, or just have a fun conversation? I'm all ears (or rather, all text)!


In [None]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
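        # A timeout stops the cell from hanging forever on an unresponsive site
        response = requests.get(url, headers=headers, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title and soup.title.string else "No title found"
        # Some pages have no <body>; fall back to empty text instead of raising
        if soup.body:
            # Remove elements that carry no readable text
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""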
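To see what stripping `script` and `style` elements achieves, here is a small sketch run on an inline HTML string instead of a live site. Note that it swaps in the standard library's `html.parser.HTMLParser` rather than BeautifulSoup, so it is an illustration of the idea and not the `Website` class itself. The names `VisibleTextParser` and `visible_text` are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the readable text of an HTML string, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample_html = "<html><body><h1>Hello</h1><script>var x = 1;</script><p>World</p></body></html>"
print(visible_text(sample_html))
```

The script body never reaches the output, which is the same effect `decompose()` has in the class above.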

In [7]:
# Let's try one out. Change the website and add print statements to follow along.

ed = Website("https://edition.cnn.com/")
print(ed.title)
print(ed.text)

Breaking News, Latest News and Videos | CNN
CNN values your feedback
1. How relevant is this ad to you?
2. Did you encounter any technical issues?
Video player was slow to load content
Video content never loaded
Ad froze or did not finish loading
Video content did not start after ad
Audio on ad was too loud
Other issues
Ad never loaded
Ad prevented/slowed the page from loading
Content moved around while ad loaded
Ad was repetitive to ads I've seen previously
Other issues
Cancel
Submit
Thank You!
Your effort and contribution in providing this feedback is much
                                        appreciated.
Close
Ad Feedback
Close icon
US
World
Politics
Business
Health
Entertainment
Style
Travel
Sports
Science
Climate
Weather
Ukraine-Russia War
Israel-Hamas War
Games
More
US
World
Politics
Business
Health
Entertainment
Style
Travel
Sports
Science
Climate
Weather
Ukraine-Russia War
Israel-Hamas War
Games
Watch
Listen
Live TV
Subscribe
Sign in
My Account
Settings
Newsletters
Topics yo

## Types of prompts

**A system prompt** that tells the model what task it is performing and what tone it should use

**A user prompt** -- the conversation starter that the model should reply to

In [9]:
# Define our system prompt - you can experiment with this later, changing the last sentence to 'Respond in markdown in Spanish.'

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [10]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt
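A news home page can produce a very long `website.text`, and a small local model may handle an oversized prompt poorly. One option is to clip the text before appending it to the user prompt. This is a sketch only: the helper name `truncate_text` and the default limit of 8000 characters are assumptions chosen for illustration, not values taken from Ollama or llama3.2.

```python
def truncate_text(text, max_chars=8000):
    """Keep at most max_chars characters, cutting at the last line break if possible."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    last_newline = clipped.rfind("\n")
    if last_newline > 0:
        # Avoid ending in the middle of a line
        clipped = clipped[:last_newline]
    return clipped + "\n[truncated]"


sample = "line one\nline two\nline three"
print(truncate_text(sample, max_chars=20))
```

Inside `user_prompt_for`, this could be applied as `user_prompt += truncate_text(website.text)`.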

In [11]:
print(user_prompt_for(ed))

You are looking at a website titled Breaking News, Latest News and Videos | CNN
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

CNN values your feedback
1. How relevant is this ad to you?
2. Did you encounter any technical issues?
Video player was slow to load content
Video content never loaded
Ad froze or did not finish loading
Video content did not start after ad
Audio on ad was too loud
Other issues
Ad never loaded
Ad prevented/slowed the page from loading
Content moved around while ad loaded
Ad was repetitive to ads I've seen previously
Other issues
Cancel
Submit
Thank You!
Your effort and contribution in providing this feedback is much
                                        appreciated.
Close
Ad Feedback
Close icon
US
World
Politics
Business
Health
Entertainment
Style
Travel
Sports
Science
Climate
Weather
Ukraine-Russia War
Israel-Hamas War
Games
More
US
World


## Messages

The API from OpenAI expects to receive messages in a particular structure.
Many of the other APIs share this structure:

```
[
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"}
]
```

To give you a preview, the next 2 cells make a rather simple call to the local Llama model - we won't stretch it (yet!)
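A quick way to catch mistakes in this structure before sending it is a small checker. This is a simplified sketch: the function name `check_messages` is hypothetical, and it only accepts the three most common roles with string content, whereas real chat APIs may allow more than that.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_messages(messages):
    """Return a list of problems found in a chat message list (empty if none)."""
    problems = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {index} is not a dict")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {index} has unexpected role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {index} has non-string content")
    return problems


example_messages = [
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"},
]
print(check_messages(example_messages))  # prints []
```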

In [13]:
messages = [
    {"role": "system", "content": "You are a snarky assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

In [14]:
# To give you a preview -- calling the local Llama model with system and user messages:

response = openai.chat.completions.create(model="llama3.2", messages=messages)
print(response.choices[0].message.content)

*sigh* Oh, wow. I hadn't seen that one before. You're really pushing the boundaries of math here. 

Anyway, if I must condescend to calculate such a ridiculously simple equation for you... The answer is: 4. Happy now? Can I go back to more complex and tedious calculations?


## And now let's build useful messages for llama3.2, using a function

In [16]:
# See how this function creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [17]:
# Try this out, and then try for a few more websites

messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': "You are looking at a website titled Breaking News, Latest News and Videos | CNN\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nCNN values your feedback\n1. How relevant is this ad to you?\n2. Did you encounter any technical issues?\nVideo player was slow to load content\nVideo content never loaded\nAd froze or did not finish loading\nVideo content did not start after ad\nAudio on ad was too loud\nOther issues\nAd never loaded\nAd prevented/slowed the page from loading\nContent moved around while ad loaded\nAd was repetitive to ads I've seen previously\nOther issues\nCancel\nSubmit\nThank You!\nYour effort and contribution in providing thi

## Time to bring it together - the API for OpenAI is very simple!

In [19]:
# And now: call the model through the OpenAI-compatible API served by Ollama. You will get very familiar with this!

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = "llama3.2",
        messages = messages_for(website)
    )
    return response.choices[0].message.content
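The function above hardcodes both the client and the model name. A possible variation passes them in as parameters, which makes it easier to try another model or to exercise the code without a running server. In this sketch, `ask_model` and `FakeCompletions` are hypothetical names, and the fake client only imitates the `chat.completions.create` call shape used in this notebook.

```python
from types import SimpleNamespace


def ask_model(client, model, messages):
    """Send messages to a chat model and return the reply text."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


class FakeCompletions:
    """Stand-in used only for this illustration; it echoes the last message."""

    def create(self, model, messages):
        reply = f"[{model}] summary of: {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Build an object with the same attribute path as the real client
fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
demo_messages = [{"role": "user", "content": "Some page text"}]
print(ask_model(fake_client, "llama3.2", demo_messages))
```

With the real client from earlier in the notebook, the call would be `ask_model(openai, "llama3.2", messages_for(website))`.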

In [20]:
summarize("https://edition.cnn.com/")



In [21]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [22]:
display_summary("https://edition.cnn.com/")

**Website Summary**

The website `Breaking News, Latest News and Videos | CNN` is a news and information hub that provides live updates on various global events, including politics, business, sports, and entertainment.

**News Announcements and Summaries**

* **Trump administration's hostility towards trans soccer fans**: The Trump administration's actions have been condemned by trans soccer fans ahead of the World Cup.
* **US strikes on Iran**: The final damage assessment of US strikes on Iran will be key in the US push for a nuclear deal with Iran.
* **Israel-Hamas War**: Intel suggests that Israeli soldiers were told to fire at Palestinians waiting for food.
* **Ukraine-Russia War**: Ukraine reports that Russia has amassed 110,000 troops near the strategic Ukrainian city of Kyiv.
* **Jeff Bezos and Lauren Sanchez's wedding**: The couple tied the knot on June 27th, with an estimate of a $500 million sale of his luxury home.
* **Climate Change**: Death rates from heart attacks have decreased, while deep belly fat triggers inflammation.
* **Deepfakers**: Denmark plans to thwart deepfakes by giving everyone copyright over their own features.

**Featured Sections**

The website has sections for popular topics such as:

*   **Space and Science**
*   **Global Travel**
*   **Business**
*   **Technology**
*   **Sports**
*   **Health and Wellness**
*   **Arts**

And various subtopics, such as Climate Change, Space, and Deepfakers.

# Let's try more websites

Note that this will only work on websites that can be scraped using this simplistic approach.

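Websites that are rendered with JavaScript, like React apps, won't show up, because `requests` only fetches the initial HTML and does not run scripts. To scrape those, you'll need to read up on a browser automation tool such as Selenium.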

But many websites will work just fine!
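One rough way to spot a page that probably needs JavaScript is to look at how little text the scrape returned. This is only a heuristic sketch: the function name `looks_javascript_rendered` and the 200-character threshold are assumptions, and a genuinely short page would be flagged too.

```python
def looks_javascript_rendered(page_text, min_chars=200):
    """Heuristic: very little extracted text suggests the content is built by scripts."""
    return len(page_text.strip()) < min_chars


print(looks_javascript_rendered("You need to enable JavaScript to run this app."))
print(looks_javascript_rendered("word " * 100))
```

It could be called on `Website(url).text` before summarizing, to skip pages where the model would have almost nothing to work with.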

In [24]:
display_summary("https://engg.dypvp.edu.in/")

**Dr. D.Y. Patil Institute of Technology (DIT) Pune, India - A Summary**
===========================================================

### Overview

Dr. D.Y. Patil Institute of Technology (DIT) is one of the top engineering colleges in Pune, India. The institution is affiliated with Savitribai Phule Pune University and approved by AICTE New Delhi and DTE Maharashtra.

### Key Facts

*   Established in 1983
*   Accredited by NAAC with a CGPA of 3.74 on a four-point scale at 'A++' Grade
*   Ranked among the top 200 engineering colleges in India under NIRF Ranking
*   Has well-maintained lawns, trees, and handy plantations for a pleasant environment

### News & Announcements

*   **One Week Online Short Term Training Programmes (STTPs)**: A click away [Click Here](link.to.sttps)
*   **Admission FAQs for Year 2025-26**: View more information about the admissions process [View More](link.to.faq)
*   **Mechanical Alumni Mr. Mohit Landge cracked Air force Grade A officer exam**: Latest news update on Mechanical Engineering department [Learn More](link.to.new)
*   Upcoming Events:
    *   Workshop: NSS- Participation in Pune Book Festival 2024
    *   Guest Lectures: Case Analysis Lectures: Balaji Raghavan v. Union of India & S Gopal Reddy v. State of Andhra Pradesh [Learn More](link.to.event)
    *   CHEMIAD COMPETITION 2024 [Read More About The Event](link.to.event)

### Features

*   **Electronics and Telecommunication, Computer Engineering, Information Technology, Instrumentation Engineering, Mechanical Engineering, Civil Engineering** are some of the departments.
*   **MBA, Automation and Robotics, Artificial Intelligence & Data Science are offered in Autonomous Programs**

This summary highlights key information at a glance about DIT Pune.

In [25]:
display_summary("https://anthropic.com")

**Summary of Anthropic's Website**
==============================
### Overview
Anthropic is a company that builds AI models and provides tools for developers to create AI-powered applications. Their flagship model, Claude, can perform hours of work in minutes.

### News and Announcements
* **ISO 42001 Certification**: Anthropic has received certification for its Responsible Scaling Policy.
* [Read announcement](link)
* **Claude Opus 4 and Sonnet 4**: The latest powerful models are now available, enabling Claude to handle complex tasks with improved coding capabilities.

### Solutions
Anthropic offers solutions such as:

* Collaborate with Claude
* AI agents
* Customer support

### Research
The company conducts research on various topics, including:

* Economic Index
* Claude's extended thinking
* Model details (Opus 4 and Sonnet 4)

### Product
Products include:

* Claude Code: a tool for building custom experiences using Claude.
* Anthropic Academy: a learning platform to build with Claude.

### Company
Anthropic is founded on the principles of building AI that serves humanity's long-term well-being. The company emphasizes responsible development and has a Responsible Scaling Policy in place.

### About Us
Learn more about Anthropic's vision, mission, and values.