In [1]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

In [3]:
%pip install openai

Collecting openai
  Using cached openai-1.75.0-py3-none-any.whl.metadata (25 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Downloading jiter-0.9.0-cp312-cp312-win_amd64.whl.metadata (5.3 kB)
Using cached openai-1.75.0-py3-none-any.whl (646 kB)
Downloading jiter-0.9.0-cp312-cp312-win_amd64.whl (207 kB)
Installing collected packages: jiter, openai
Successfully installed jiter-0.9.0 openai-1.75.0
Note: you may need to restart the kernel to use updated packages.


In [8]:
openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

In [10]:
message = "Hello, Llama! This is my first ever message to you! Hi!"
response = openai.chat.completions.create(model="llama3.2", messages=[{"role":"user", "content":message}])
print(response.choices[0].message.content)

Hello! It's great to meet you for the first time. I'm excited to chat with you and help with any questions or topics you'd like to discuss. How's your day going so far? Is there something in particular you'd like to talk about, or do you just want to get to know me a bit better?


In [12]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
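        # A timeout keeps the cell from hanging forever on an unresponsive site
        response = requests.get(url, headers=headers, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Drop elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Pages without a <body> would otherwise raise an AttributeError
            self.text = ""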
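
To see what the clean-up step in the class above does without fetching a real page, here is a minimal sketch that uses only the standard library's html.parser instead of BeautifulSoup. It is not the class's actual implementation: TextExtractor, extract_text and sample_html are hypothetical names made up for this illustration. It only mimics the idea of skipping script and style content and joining the remaining text with newlines.

```python
from html.parser import HTMLParser

# Tags whose contents should not end up in the extracted text.
SKIPPED_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_depth = 0  # greater than 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up sample page, used only for this illustration.
sample_html = """
<html><head><style>p { color: red; }</style></head>
<body><script>console.log("hi");</script><p>Hello <b>world</b></p><img src="x.png"></body></html>
"""
extracted = extract_text(sample_html)
print(extracted)
```

Each text fragment lands on its own line, which is why inline elements such as links and bold text appear on separate lines in the scraped output.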

In [14]:
# Let's try one out. Change the website and add print statements to follow along.

ed = Website("https://edwarddonner.com")
print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connec

In [16]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."
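
One way to run the experiment suggested in the comment above is to build the prompt from a function rather than editing the string by hand. This is only a sketch: BASE_SYSTEM_PROMPT and build_system_prompt are hypothetical names that the notebook itself does not define.

```python
# The fixed part of the system prompt, copied from the cell above.
BASE_SYSTEM_PROMPT = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related."
)


def build_system_prompt(language=None):
    """Hypothetical helper: append the output-format sentence, optionally naming a language."""
    if language:
        return f"{BASE_SYSTEM_PROMPT} Respond in markdown in {language}."
    return f"{BASE_SYSTEM_PROMPT} Respond in markdown."


print(build_system_prompt())
print(build_system_prompt("Spanish"))
```

Calling build_system_prompt() with no argument reproduces the original prompt, so the rest of the notebook would behave the same if system_prompt were assigned from it.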

In [18]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt
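
The function above sends the full page text, which can get long for a small local model. The following is a minimal sketch of trimming the text before it goes into the prompt. The notebook does not do this: truncate_text is a hypothetical helper, and the 5000-character default is an arbitrary assumption, not a documented limit.

```python
def truncate_text(text, max_chars=5000):
    """Hypothetical helper: keep at most max_chars characters, cutting at a line break when possible."""
    if len(text) <= max_chars:
        return text
    # Prefer to cut at the last line break inside the limit
    cut = text.rfind("\n", 0, max_chars)
    if cut <= 0:
        cut = max_chars  # no usable line break; fall back to a hard cut
    return text[:cut] + "\n[content truncated]"


print(truncate_text("line one\nline two\nline three", max_chars=20))
```

With such a helper, the last line of the prompt builder could append truncate_text(website.text) instead of website.text.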

In [20]:
print(user_prompt_for(ed))

You are looking at a website titled Home - Edward Donner
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acqui

In [22]:
messages = [
    {"role": "system", "content": "You are a snarky assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

In [24]:
# To give you a preview -- calling the local Llama model through the OpenAI client with system and user messages:

response = openai.chat.completions.create(model="llama3.2", messages=messages)
print(response.choices[0].message.content)

Ugh, really? You need me to explain basic math to you?

Fine. It's not like I have anything better to do... The answer is 4. Got it? Next thing you know, you'll be asking me to hold your hand through solving quadratic equations too.


In [26]:
def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [28]:
messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nConnect Four\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a

In [30]:
def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model="llama3.2",
        messages=messages_for(website)
    )
    return response.choices[0].message.content
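
The function above depends on a running Ollama server and a live website. To illustrate the shape of the response it reads (response.choices[0].message.content), here is a sketch that passes the client in as an argument and swaps in a stub. summarize_messages and make_stub_client are hypothetical names, and the stub only imitates the one attribute path used here, not the real client.

```python
from types import SimpleNamespace


def summarize_messages(client, messages, model="llama3.2"):
    """Hypothetical variant of summarize() that takes the client and messages as arguments."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


def make_stub_client(reply):
    """Build a stand-in object exposing chat.completions.create, returning a fixed reply."""
    def create(model, messages):
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


# Made-up reply and messages, used only to exercise the code path.
stub = make_stub_client("# Summary\nA stubbed reply.")
demo_messages = [
    {"role": "system", "content": "You are a test assistant"},
    {"role": "user", "content": "Summarize this page."},
]
result = summarize_messages(stub, demo_messages)
print(result)
```

Passing the real client object from earlier in the notebook in place of the stub would send the same messages to the local model.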

In [32]:
summarize("https://edwarddonner.com")

'# Summary\nThe website belongs to Edward Donner, the co-founder and CTO of Nebula.io, an AI startup that applies machine learning (LLMs) to help people discover their potential. The site features personal writing, news/announcements about courses, workshops, and events related to LLMs.\n\n## News/Announcements\n- **April 21, 2025**: Release date for "The Complete Agentic AI Engineering Course"\n- **January 23, 2025**: Launch of the LLM Workshop – Hands-on with Agents – resources course\n- **December 21, 2024**: Release announcement for a new workshop on mastering AI and LLM engineering – resources\n- **November 13, 2024:** Launch of content targeting "SuperDataScientists"'

In [34]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [36]:
display_summary("https://edwarddonner.com")

### Website Summary and News

#### Overview

The website belongs to Edward Donner, indicating that there may be a personal blog or portfolio site. The content reveals Ed to be an individual who enjoys writing code, DJing, and experimenting with Language Model Learning (LLL) and artificial intelligence.

#### Projects

*   **Nebula.io**: Co-founded by Edward Donner as the CTO, Nebula.io is an AI application that helps people discover their potential. The website mentions a patented matching model and provides press coverage.
*   **Untapt**: Formerly founded and led by Ed, untapt was acquired in 2021.

#### Upcoming Courses/Workshops

*   **Complete Agentic AI Engineering Course**
*   **LLM Workshop – Hands-on with Agents – resources**

These courses/workshops suggest an initiative aimed at teaching or enhancing skills related to AI and LLMs. This effort may further demonstrate Ed's expertise in this area.

#### Newsletters

The website invites readers to subscribe for future updates, indicating that the page may offer additional content like news, promotions, webinars, etc.

In [38]:
display_summary("https://en.wikipedia.org/wiki/MS_Dhoni")

I can help you with information about MS Dhoni.

Mahendra Singh Dhoni, commonly known as M.S. Dhoni, is a former Indian cricketer and former captain of the India national team. He was born on July 7, 1975, in Ranchi, Jharkhand. Dhoni is widely regarded as one of the greatest cricket players of all time.

Dhoni made his Test debut for India in 2001 against Pakistan but it was not until 2006 that he became an integral part of the team. He captained India to victory in the inaugural ICC T20 World Cup in 2007, which marked a significant milestone in the history of Indian cricket.

His leadership and skill behind the wickets earned him international recognition. Dhoni led India to various victories, including in the 2013 ICC Champions Trophy and the 2016 Asia Cup. In the year 2021, he gave up his position as captain of the team following a loss against new-gen cricketer Rohit Sharma in the test series against England.

Throughout his career, Dhoni has scored over 7,000 runs in Test cricket and more than 18,000 in One Day Internationals. He is also known for his ability to finish matches with sixes.

MS Dhoni is married to Sakshi Malik's father, Virender Singh Dhoni Sr. The couple met while Mahendra Singh was a cricketer on the Rajasthan Ranji State Cricket Team.

Throughout his career, Dhoni has been involved in several controversies but still he remains one of the greatest cricket players of India

In [40]:
display_summary("https://huggingface.co/")

**Summary**
=====================================

Hugging Face is a platform that enables the machine learning community to collaborate on models, datasets, and applications. The platform provides various tools and features, including:

### Models

* 1 million+ available models for text, image, video, audio, or 3D applications
* New models discovered every week
* Filtering by popularity, latest updates, and specific libraries (e.g., PyTorch, TensorFlow)

### Datasets

* 250k+ datasets available for access and sharing
* Data storage and distribution solutions (e.g., Safetensors)
* Tools for serving language models with TGI optimized toolkit

### Spaces

* 4,000+ applications running on the platform
* Generative models for image customization (e.g., UniFLUX), text generation (e.g., EasyControl Ghibli)

### Collaboration and Community

* Platform is home to over 50k organizations and thousands of collaborators
* GitHub repository with open-source stack

### Enterprise Solutions

* Paid compute solutions, starting at $0.60/hour for GPU nodes
* Dedicated support, security, access controls, and more

### Notable Organizations Using Hugging Face

* Meta, AI at Meta, Amazon, Google, Intel, Microsoft, and Grammarly have all integrated the platform.

**News and Announcements**
=========================

* Recent news includes welcomes of new partners Cohere, Hyperbolic, Nebius AI Studio, and Novita on the Hub.
* Other updates include newly discovered models from contributors Nari-Labs and Shakker-Labs.