In [1]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

In [2]:
# Load environment variables from a file called .env

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')
openai = OpenAI()


In [3]:
# A class to represent a Webpage

class Website:
    """
    A utility class to represent a Website that we have scraped
    """
    url: str
    title: str
    text: str

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
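        # A timeout stops the call from hanging forever on an unresponsive site
        response = requests.get(url, timeout=30)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body is None:
            # Some responses have no <body>; avoid calling methods on None
            self.text = ""
            return
        # Remove elements that carry no readable text before extracting it
        for irrelevant in soup.body(["script", "style", "img", "input"]):
            irrelevant.decompose()
        self.text = soup.body.get_text(separator="\n", strip=True)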
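
To see what the cleanup step does without making a network request, here is a minimal sketch that applies the same BeautifulSoup calls to an inline HTML string. The `html` snippet and the `extract_text` helper are hypothetical and exist only for illustration; they are not part of the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, used instead of a real HTTP response
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1>Hello</h1>
    <script>console.log("ignored");</script>
    <style>h1 { color: red; }</style>
    <p>Some <b>visible</b> text.</p>
  </body>
</html>
"""

def extract_text(html):
    """Return (title, text) using the same steps as Website.__init__."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        return title, ""
    # Calling a tag with a list of names is shorthand for find_all
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()
    # Each text fragment is stripped and the fragments are joined with newlines
    return title, soup.body.get_text(separator="\n", strip=True)

title, text = extract_text(html)
print(title)
print(text)
```

Because `get_text` joins every text fragment with the separator, inline elements such as `<b>` end up on their own lines, which is why scraped output often looks like a column of short phrases.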

In [15]:
# Let's try one out

hf = Website("https://huggingface.co")
print(hf.title)
print(hf.text)

Hugging Face – The AI community building the future.
Hugging Face
Models
Datasets
Spaces
Posts
Docs
Solutions
Pricing
Log In
Sign Up
NEW
Use Ollama with GGUF Models from the HF Hub
The AI community building the future.
The platform where the machine learning community collaborates on models, datasets, and applications.
Trending on
this week
Models
Collov-Labs/Monetico
Updated
6 days ago
•
1.56k
•
521
microsoft/OmniParser
Updated
1 day ago
•
4.12k
•
926
stabilityai/stable-diffusion-3.5-large
Updated
12 days ago
•
153k
•
979
genmo/mochi-1-preview
Updated
2 days ago
•
807
stabilityai/stable-diffusion-3.5-medium
Updated
3 days ago
•
16.7k
•
244
Browse 400k+ models
Spaces
Running
on
Zero
433
📈
IC Light V2
Running
on
Zero
782
🏃
Stable Diffusion 3.5 Large
Generate images with SD3.5
Running
on
CPU Upgrade
4.76k
👕
Kolors Virtual Try-On
Running
on
Zero
1.06k
🗣️
F5-TTS
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Running
on
Zero
4.82k
🖥️
FLUX.1 [dev]
Browse 150k+ applications
Datase

In [16]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [17]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    # End the first sentence with a period and newline so it does not run into the next one
    user_prompt = f"You are looking at a website titled {website.title}.\n"
    user_prompt += "The contents of this website are as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [18]:
# Build the list of messages in the format the OpenAI chat API expects: a system message followed by a user message

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]
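
The shape of the message list can be inspected without scraping anything. The sketch below restates the notebook's helper functions (with a separator after the title) and feeds them a stand-in object; `fake_site` and its contents are hypothetical, and `SimpleNamespace` is used only because the helpers need nothing more than `.title` and `.text`.

```python
from types import SimpleNamespace

system_prompt = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related. "
    "Respond in markdown."
)

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}.\n"
    user_prompt += (
        "The contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += website.text
    return user_prompt

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)},
    ]

# Hypothetical stand-in for a scraped Website object
fake_site = SimpleNamespace(title="Example Page", text="Welcome to the example page.")

messages = messages_for(fake_site)
for message in messages:
    print(message["role"], "->", message["content"][:60])
```

Checking the list this way makes it easy to confirm that the system prompt and the page text land in the roles you intended before spending an API call.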


In [19]:
# And now: call the OpenAI API. You will get very familiar with this!

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = "gpt-4o-mini",
        messages = messages_for(website)
    )
    return response.choices[0].message.content

In [20]:
summarize("https://huggingface.co")

'# Hugging Face Website Summary\n\nHugging Face is a collaborative platform centered on the machine learning community, offering a wide range of resources to build and share machine learning models, datasets, and applications. It serves as a hub for creators to explore and collaborate on over 400,000 models and 100,000 datasets across various modalities including text, image, video, audio, and 3D.\n\n## Key Features:\n- **Models and Datasets**: Users can access and contribute to a vast library of machine learning models and datasets.\n- **Spaces**: Interactive applications showcasing various models are available, with several running demos.\n- **Open Source Tools**: Hugging Face provides an array of open-source libraries including Transformers, Diffusers, and Safetensors among others, supporting state-of-the-art ML applications.\n- **Enterprise Solutions**: Paid compute and enterprise offerings allow organizations to deploy optimized inference endpoints and access advanced features sui

In [21]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))
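
The raw return value of `summarize` appears with literal `\n` sequences because a notebook echoes the `repr` of a bare string expression. A small sketch of the difference, using a hypothetical `summary` string rather than real model output:

```python
# Hypothetical markdown text standing in for a model response
summary = "# Title\n\n- first point\n- second point"

# What a notebook shows when a string is the last expression in a cell
echoed = repr(summary)
print(echoed)

# Printing interprets the newlines; display(Markdown(...)) goes one step
# further and renders the headings and bullets
print(summary)
```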


In [22]:
display_summary("https://huggingface.co")

# Hugging Face Summary

The Hugging Face website serves as a collaborative platform dedicated to the machine learning community, focusing on models, datasets, and various applications. 

## Key Features
- **Models:** Offers access to over 400,000 machine learning models, trending examples include:
  - **Collov-Labs/Monetico** (1.56k updates)
  - **microsoft/OmniParser** (4.12k updates)
  - **stabilityai/stable-diffusion-3.5-large** (153k updates)

- **Spaces:** This section allows users to run numerous applications, including those for image generation and voice cloning, with some popular examples listed:
  - **Stable Diffusion 3.5 Large**
  - **F5-TTS**

- **Datasets:** Users can browse and utilize over 100,000 datasets tailored for various tasks; recent updates include:
  - **fka/awesome-chatgpt-prompts** (7.93k updates)
  - **BAAI/Infinity-MM** (31.5k updates)

## New Announcement
- **Use Ollama with GGUF Models:** Highlights recent improvements regarding model compatibility and usage.

## Community Engagement
- The platform encourages users to share their work and collaborate, aiming to build a robust machine learning portfolio and enhance collaborative tools.

## Pricing
- Offers a range of compute solutions starting at $0.60/hour for GPU, and enterprise solutions beginning at $20/user/month, indicating a focus on both individual and organizational needs in machine learning.

Overall, Hugging Face positions itself as a central hub for machine learning practitioners to accelerate development, deployment, and collaboration in AI.

In [23]:
display_summary("https://docs.vllm.ai/en/latest")

# Summary of vLLM Website

The vLLM website serves as the official documentation and resource hub for the vLLM library, which is designed for efficient and easy serving of large language models (LLMs). It emphasizes high-performance capabilities such as state-of-the-art serving throughput, efficient memory management, and support for various hardware platforms, including NVIDIA, AMD, and Intel.

## Key Features:

- **Fast and Efficient**: Implements PagedAttention for memory management and supports continuous batching for improved throughput.
- **Flexible Deployment**: Compatible with popular APIs and various deployment options like Docker, Kubernetes, and multiple cloud services.
- **Model Support**: Offers seamless integration with HuggingFace models and supports multiple quantization protocols.
- **Community Engagement**: Hosts events like vLLM Meetups to foster community participation.

## Documentation Highlights:

- Installation guides for multiple platforms (ROCm, OpenVINO, CPU, etc.)
- Quickstart tutorials and examples to help users get up and running quickly.
- Detailed usage documentation for different deployment scenarios and model integrations.

The website also links to external resources like a blog post introducing PagedAttention and a research paper discussing their throughput enhancements.

For further information, users can explore the community-oriented features and follow updates related to the library’s development.