# Booter Website Classification using AI

Last updated: 11/02/2025

Several academic papers have explored the classification of websites that offer DDoS-as-a-Service attacks, commonly known as Booter Websites. Notable examples include:
1. J. J. Chromik, J. J. Santanna, A. Sperotto, and A. Pras. **Booter Websites Characterization: Towards a List of Threats.** Brazilian Symposium on Computer Networks and Distributed Systems (SBRC), 2015.
2. J. J. Santanna, R. de O. Schmidt, D. Tuncer, J. de Vries, L. Granville, and A. Pras. **Booter Blacklist: Unveiling DDoS-for-Hire Websites.** International Conference on Network and Service Management (CNSM), 2016.
3. J. J. Santanna, R. de O. Schmidt, D. Tuncer, J. de Vries, L. Zambenedetti Granville, and A. Pras. **Booter List Generation: The Basis for Investigating DDoS-for-Hire Websites.** International Journal of Network Management (IJNM), 2017.

These studies were consolidated into **Chapter 2** of:
- J. J. Santanna. **DDoS-as-a-Service: Investigating Booter Websites.** PhD Thesis, University of Twente.
  - Download here: https://bit.ly/jjsantanna_thesis
  - Source code: https://github.com/jjsantanna/Booter-black-List/tree/master/Classifier

### Opportunity
#### With advancements in Large Language Models (LLMs), we aim to explore whether LLMs provide a more efficient and accurate approach for the automated classification of Booter Websites.

### Methodology

1. Get a list of URLs to classify as booter or not:
    1. Offline.
        1. Get an offline list of URLs (e.g. https://github.com/jjsantanna/booters_ecosystem_analysis/blob/master/booterblacklist.csv)
        2. Get the latest snapshot of each URL from the Web Archive (https://webcf.waybackmachine.org/); check for status 301 or 302 (redirect); get the redirect target URL if one exists
    2. Online.
        1. Connect through a VPN from different locations (from a list of countries)
        2. Get a list of URLs/websites using Google searches for 'booter' and 'stresser'

2. Classify whether each URL is a booter or not:
    1. Visual approach.
        1. Take a screenshot of the landing page of a URL
        2. Use a visual (or multimodal) LLM to classify the image as a Booter webpage or not
    2. Text approach.
        1. Scrape the URL
        2. Use a text LLM to classify the content as a Booter webpage or not

## 1.A.a. Get an offline list of booter websites from the https://bit.ly/jjsantanna_thesis

In [30]:
# !pip install pandas
import pandas as pd

url = "https://githubraw.com/jjsantanna/booters_ecosystem_analysis/master/booterblacklist.csv"
df = pd.read_csv(url, storage_options={"User-Agent": "Mozilla/5.0"},index_col=0, names=["id", "booter_url"], header=None).reset_index(drop=True)
df.head()

Unnamed: 0,booter_url
0,0x-booter.pw
1,123boot.pro
2,1606-stresser.net
3,9yrbrfyd.esy.es
4,absolut-stresser.net
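
The rows above are bare domains without a scheme, while Selenium's `driver.get` (used in step 2.A.a) expects a fully qualified URL. A minimal sketch of a normalization step follows; the helper name `to_url` and the default `http` scheme are assumptions for illustration, not part of the original notebook.

```python
def to_url(domain_or_url, scheme="http"):
    """Return an absolute URL, adding a scheme when the input is a bare domain.

    Hypothetical helper: the default scheme is an assumption, since many
    of the listed sites may not serve HTTPS.
    """
    candidate = domain_or_url.strip()
    if "://" not in candidate:
        candidate = f"{scheme}://{candidate}"
    return candidate

print(to_url("0x-booter.pw"))             # bare domain gets a scheme
print(to_url("https://stressers.zone/"))  # full URL is left unchanged
```

It could be applied to the whole column with `df["booter_url"].map(to_url)`.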


## 1.A.b. Get the latest snapshot of a URL from the Web Archive

In [192]:
import requests
from waybackpy import WaybackMachineCDXServerAPI
from urllib.parse import unquote

def get_latest_archived_info(url):
    # Initialize the CDX Server API
    cdx = WaybackMachineCDXServerAPI(url)
    
    # Retrieve the list of snapshots
    snapshots = list(cdx.snapshots())
    
    if snapshots:
        # Get the latest snapshot
        latest_snapshot = snapshots[-1]
        
        # Extract the archive URL
        archive_url = latest_snapshot.archive_url
        
        # Extract the HTTP status code
        status_code = latest_snapshot.statuscode
        
        # Check if the status code indicates a redirect (3xx)
        if status_code.startswith('3'):
            try:
                # Make a GET request to follow all redirects
                response = requests.get(archive_url, 
                                     allow_redirects=True,
                                     timeout=10)  # Added timeout
                
                # Get the final URL after all redirects
                final_url = response.url
                
                # If it's a Wayback Machine URL, try to extract the original URL
                if 'web.archive.org' in final_url:
                    # Extract the original URL from the Wayback Machine URL
                    parts = final_url.split('web.archive.org/web/')
                    if len(parts) > 1:
                        timestamp_and_url = parts[1]
                        # Drop the timestamp segment (everything up to the first '/') to get the original URL
                        redirect_url = unquote(timestamp_and_url.split('/', 1)[-1])
                    else:
                        redirect_url = final_url
                else:
                    redirect_url = final_url
                
                return {
                    'archive_url': archive_url,
                    'status_code': status_code,
                    'redirect_url': redirect_url
                }
            
            except requests.exceptions.RequestException as e:
                return {
                    'archive_url': archive_url,
                    'status_code': status_code,
                    'redirect_url': None,
                    'error': str(e)
                }
        else:
            return {
                'archive_url': archive_url,
                'status_code': status_code,
                'redirect_url': None
            }
    else:
        return "No archived version found."
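
The function above recovers the original URL from a Wayback Machine address by string manipulation. As an alternative, here is a minimal sketch that does the same with a regular expression. The pattern assumes the layout `web.archive.org/web/<timestamp>[modifier]/<original URL>`, where the optional modifier is two letters plus an underscore (such as `id_` or `if_`); the name `split_wayback_url` is hypothetical.

```python
import re
from urllib.parse import unquote

# Assumed layout: /web/<up to 14 digits><optional modifier like "id_">/<original URL>
WAYBACK_RE = re.compile(r"web\.archive\.org/web/(\d{1,14})([a-z]{2}_)?/(.+)$")

def split_wayback_url(archive_url):
    """Return (timestamp, original_url), or None if the URL does not match."""
    match = WAYBACK_RE.search(archive_url)
    if match is None:
        return None
    timestamp, _modifier, original = match.groups()
    return timestamp, unquote(original)

print(split_wayback_url(
    "https://web.archive.org/web/20241225182607/https://stressers.zone/"
))
```

Returning `None` for non-matching input lets the caller fall back to the unmodified final URL.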

Example:

In [193]:
url = df['booter_url'][50]
url

'beststresser.com'

In [196]:
# url = 'https://stressers.zone/'

In [195]:
archived_url = get_latest_archived_info(url)
archived_url

{'archive_url': 'https://web.archive.org/web/20241225182607/https://stressers.zone/',
 'status_code': '-',
 'redirect_url': None}

## 2.A.a Take a screenshot of the landing page of a URL

In [188]:
# !pip install selenium pillow webdriver-manager

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from PIL import Image
import time
import datetime
import re

def format_filename_from_url(url):
    # Replace '://' with '_', drop every '/', and replace '.' with '_'
    formatted_url = url.replace('://', '_')
    formatted_url = formatted_url.replace('/', '')
    formatted_url = formatted_url.replace('.', '_')  # Replace all '.' with '_'
    # Get current date in YYMMDD format
    date_str = datetime.datetime.now().strftime('%y%m%d')
    return f"{formatted_url}_{date_str}.png"

def capture_screenshot(url, width=1920, height=1080):
    output_filename = format_filename_from_url(url)
    
    # Configure Selenium WebDriver
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument(f"--window-size={width},{height}")
    chrome_options.add_argument("--hide-scrollbars")

    # Automatically install the correct ChromeDriver
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=chrome_options)

    try:
        driver.get(url)
        time.sleep(3)  # Allow the page to load

        # Get total page height and resize window
        total_height = driver.execute_script("return document.body.scrollHeight")
        driver.set_window_size(width, total_height)
        time.sleep(2)

        # Capture and save screenshot
        driver.save_screenshot(output_filename)

        # Re-save the image with Pillow (note: `quality` has no effect on PNG output)
        img = Image.open(output_filename)
        img.save(output_filename, quality=100)
        return output_filename

    except Exception as e:
        print("Error:", e)
    
    finally:
        driver.quit()

Example:

In [197]:
url = archived_url['archive_url']
print(url)

screenshot_path = capture_screenshot(url)
print(screenshot_path)

!open $screenshot_path

https://web.archive.org/web/20241225182607/https://stressers.zone/
https_web_archive_orgweb20241225182607https_stressers_zone_250211.png
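
The file name above comes from a few fixed string replacements, so characters such as `?`, `&`, `=` or `:` in a URL would pass through into the name. A minimal sketch of a stricter variant follows; `safe_filename_from_url` and its `today` parameter are hypothetical names introduced here, and it produces a slightly different name than the original function (it keeps an underscore where `/` appears).

```python
import datetime
import re

def safe_filename_from_url(url, today=None):
    """Collapse every run of non-alphanumeric characters into one underscore."""
    today = today or datetime.date.today()
    stem = re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")
    # Same YYMMDD date suffix as the original helper
    return f"{stem}_{today.strftime('%y%m%d')}.png"

print(safe_filename_from_url("https://stressers.zone/?a=1&b=2",
                             today=datetime.date(2025, 2, 11)))
# https_stressers_zone_a_1_b_2_250211.png
```

Passing the date explicitly keeps the function deterministic, which helps when re-running the pipeline on archived snapshots.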


## 2.A.b Use Ollama + LLaVA to classify the image as a Booter Webpage

In [198]:
OLLAMA_API_URL = "http://localhost:11434/api/generate"

In [199]:
system_prompt1 = """You are an AI assistant trained to analyze webpage images and classify them as either promoting booter (DDoS-for-hire) services or not. 

Some characteristics of Booter websites are:
- The page explicitly offers DDoS-for-hire services (e.g., 'Booter', 'DDoS attack', 'IP stresser').
- There is a login page (booters always require accounts).
- The page has subscription plans (e.g., pricing tiers like 'Basic', 'Premium', 'VIP' for attack durations).
- Subscription-based attack services with unclear ethical use.
- Existence of 'Registration' button
- Payment options for attacks (cryptocurrency, PayPal, etc.).
- Marketing suggests malicious use (e.g., ‘Take down your enemies!’).
- The page may present itself as a ‘stress tester’ but lacks strong disclaimers about legal use.

Anything different from a Booter website should get a 'False' classification.

Output Format (JSON). Respond strictly in JSON format as follows:
{
  "booter": true | false,
  "confidence": "high" | "low",
  "reason": "describe based on what you've classified this website"
}

Ensure your response is accurate and concise, avoiding unnecessary speculation."""

In [201]:
system_prompt3 = """You are an AI assistant trained to analyze screenshots of webpages, paying close attention to the following characteristics.
- Does the page offer or promote DDoS, DDoS attack, IP stresser, Booter (instead of only describing what it is)?
- Is there a login?
- Is there a registration or sign up?
- Does the page contain subscription plans, e.g. pricing tiers like 'Basic', 'Premium', 'VIP'?
- Does the page contain attack or stress duration?
- Does the page describe attack power, e.g. in Gbps, Gb/s, or Tbps?
- Does the page contain a concurrency number, e.g. 2, 4, 10?
- Does the page contain payment options, e.g. cryptocurrency, PayPal?
- Does the page contain network protocol names, e.g. TCP, UDP?
- Does the page contain methods of attack, e.g. TCP, UDP?
- Does the page contain a link to the terms of service page or similar?

Output Format (JSON). Respond strictly in JSON format as follows:
{
  "promotes_ddos": true | false,
  "login": true | false,
  "registration": true | false,
  "subscription_plans": true | false,
  "attack_duration": true | false,
  "attack_power": true | false,
  "payment_options": true | false,
  "network_protocols": true | false,
  "attack_methods": true | false,
  "tos": true | false
}

Ensure your response is accurate and concise, avoiding unnecessary speculation."""

In [202]:
import base64
from pathlib import Path
import requests
import json
import time
import os

def encode_image_to_base64(image_path):
    # Keep the base64 encoding pure and simple
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

def classify_image(image_path, prompt):
    # Encode image to base64
    base64_image = encode_image_to_base64(image_path)
    
    # Prepare the request with cache control headers
    url = f"{OLLAMA_API_URL}?t={int(time.time())}"
    headers = {
        'Cache-Control': 'no-cache, no-store, must-revalidate',
        'Pragma': 'no-cache',
        'Expires': '0'
    }
    
    payload = {
        "model": "llava",
        "prompt": prompt,
        "images": [base64_image],
        "stream": False
    }
    
    try:
        # Send request to Ollama with cache control headers
        response = requests.post(url, json=payload, headers=headers)
        response.raise_for_status()
        
        # Parse the response
        result = response.json()
        return result['response']
        
    except requests.exceptions.RequestException as e:
        return f"Error occurred: {str(e)}"

### Example

In [203]:
url = archived_url['archive_url']
# url = 'https://www.akamai.com/glossary/what-is-a-ddos-booter'
print(url)

screenshot_path = capture_screenshot(url)
print(screenshot_path)

!open $screenshot_path

image_analysis = classify_image(screenshot_path, system_prompt3)
print(image_analysis)

https://web.archive.org/web/20241225182607/https://stressers.zone/
https_web_archive_orgweb20241225182607https_stressers_zone_250211.png
 ```json
{
  "promotes_ddos": false,
  "login": true,
  "registration": true,
  "subscription_plans": false,
  "attack_duration": false,
  "attack_power": false,
  "payment_options": false,
  "network_protocols": false,
  "attack_methods": false,
  "tos": true
}
``` 
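
The LLaVA response above arrives as text wrapped in a Markdown code fence, so it cannot be passed to `json.loads` directly. A minimal sketch of one way to recover the object follows, assuming the response contains a single JSON object; the helper name `extract_json` is hypothetical.

```python
import json
import re

def extract_json(response_text):
    """Parse the outermost {...} span in a model response; None on failure."""
    match = re.search(r"\{.*\}", response_text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

# The fence is built programmatically so this example holds no literal fence.
fence = "`" * 3
raw = f' {fence}json\n{{"login": true, "tos": true}}\n{fence} '
print(extract_json(raw))  # {'login': True, 'tos': True}
```

Returning `None` instead of raising keeps a batch run going when the model produces malformed output for one page.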


## 2.B.a. Scraping URLs

In [206]:
# !pip install crawl4ai
# !crawl4ai-setup
import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig, CacheMode

async def crawl4ai_crawl(url: str):
    browser_conf = BrowserConfig(headless=True)  # Run in headless mode
    run_conf = CrawlerRunConfig(cache_mode=CacheMode.BYPASS)

    async with AsyncWebCrawler(config=browser_conf) as crawler:
        result = await crawler.arun(url=url, config=run_conf)

        if result.success:
            return result.markdown_v2.raw_markdown  # Return extracted content
        else:
            return f"Error: {result.error_message}"  # Handle errors gracefully

Example:

In [212]:
import nest_asyncio
nest_asyncio.apply()

url = "https://stressers.zone/"
scrapped_text = asyncio.run(crawl4ai_crawl(url))
print(scrapped_text)

[INIT].... → Crawl4AI 0.4.248
[FETCH]... ↓ https://stressers.zone/... | Status: True | Time: 0.60s
[SCRAPE].. ◆ Processed https://stressers.zone/... | Time: 12ms
[COMPLETE] ● https://stressers.zone/... | Status: True | Total: 0.61s
## [🚀 STRESSER.ZONE](https://stressers.zone/</> "back-to-index")
  * [](https://stressers.zone/<#>)
  * [](https://stressers.zone/<#>)
  * [](https://stressers.zone/<#>)
  * [](https://stressers.zone/<#>)
  * [ Login](https://stressers.zone/<login> "login")
  * [ Sign Up](https://stressers.zone/<register> "register")


# DDoS IP Stresser / IP Booter
## `STRESSERS.ZONE` is the best free IP Stresser / DDoS Booter service in 2025.
[ Register now](https://stressers.zone/<register> "register") [ Learn more](https://stressers.zone/<#features> "features") [ Preview](https://stressers.zone/<#preview> "preview")
## **🔰 IMPORTANT NOTICE 🔰️**
Our previous domain, STRESSER.ZONE, is now dead. We are now operating under new domains: **[STRESSERS.ZONE](https://stressers.zo

## 2.B.b. Use Ollama + Llama 3.2 to classify the content as a Booter Webpage

In [217]:
# !pip install ollama 
import ollama

def ollama(model, system_prompt, user_prompt):
    import ollama  # https://pypi.org/project/ollama/
    import time
    import json

    try:
        # Start interaction with the model
        start_time = time.time()

        response = ollama.chat(
            model=model,
            messages=[
                {'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': user_prompt}
            ],
            format='json'
        )

        execution_time = time.time() - start_time

        # Parse and append execution time
        response_content = json.loads(response['message']['content'])
        response_content['execution_time'] = execution_time

        return response_content

    except ollama.ResponseError as e:
        print('Error:', e.error)
        if e.status_code == 404:
            ollama.pull(model)
            print("Re-run this and it will work! We pulled the model for you!") 
            return None

In [218]:
system_prompt4 = """You are an AI assistant trained to analyze webpages formatted in markdown (input), paying close attention to the following characteristics.
- Does the page offer or promote DDoS, DDoS attack, IP stresser, Booter (instead of only describing what it is)?
- Is there a login?
- Is there a registration or sign up?
- Does the page contain subscription plans, e.g. pricing tiers like 'Basic', 'Premium', 'VIP'?
- Does the page contain attack or stress duration?
- Does the page describe attack power, e.g. in Gbps, Gb/s, or Tbps?
- Does the page contain a concurrency number, e.g. 2, 4, 10?
- Does the page contain payment options, e.g. cryptocurrency, PayPal?
- Does the page contain network protocol names, e.g. TCP, UDP?
- Does the page contain methods of attack, e.g. TCP, UDP?
- Does the page contain a link to the terms of service page or similar?

Based on the previous analysis, can you conclude that this is a Booter Website offering DDoS attacks as a service ('booter_conclusion')? 
Please describe your reasoning for this conclusion ('booter_reason') and describe your confidence level as 'high' or 'low'.

Output Format (JSON). Respond strictly in JSON format as follows:
{
  "promotes_ddos": true | false,
  "login": true | false,
  "registration": true | false,
  "subscription_plans": true | false,
  "attack_duration": true | false,
  "attack_power": true | false,
  "payment_options": true | false,
  "network_protocols": true | false,
  "attack_methods": true | false,
  "tos": true | false,
  "booter_conclusion": true | false,
  "booter_reason": "describe your reasons",
  "confidence level": "high" | "low"
}

Ensure your response is accurate and concise, avoiding unnecessary speculation."""

In [219]:
model = 'llama3.2'
system_prompt = system_prompt4
user_prompt = scrapped_text
ollama(model, system_prompt, user_prompt)

{'promotes_ddos': True,
 'login': True,
 'registration': True,
 'subscription_plans': True,
 'attack_duration': False,
 'attack_power': True,
 'payment_options': True,
 'network_protocols': False,
 'attack_methods': False,
 'tos': True,
 'booter_conclusion': True,
 'booter_reason': "This website promotes DDoS IP Stressing services with various subscription plans, including a 'Free Plan' that offers limited concurrents and attack time. The website also provides information on TCP bypass methods, UDP amplified methods, and JS challenge bypass methods. Additionally, it lists payment options and networks protocols are not described.",
 'confidence level': 'high',
 'execution_time': 12.473520994186401}
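
Since the model is only asked, not forced, to follow the schema in the prompt, it may help to check a result before storing it. A minimal sketch follows; the key list mirrors the JSON format requested in `system_prompt4`, while `EXPECTED_BOOL_KEYS` and `validate_result` are hypothetical names introduced here.

```python
# Boolean fields requested by system_prompt4
EXPECTED_BOOL_KEYS = [
    "promotes_ddos", "login", "registration", "subscription_plans",
    "attack_duration", "attack_power", "payment_options",
    "network_protocols", "attack_methods", "tos", "booter_conclusion",
]

def validate_result(result):
    """Return a list of problems in a classification dict (empty list = OK)."""
    problems = []
    for key in EXPECTED_BOOL_KEYS:
        if key not in result:
            problems.append(f"missing key: {key}")
        elif not isinstance(result[key], bool):
            problems.append(f"not a boolean: {key}")
    if result.get("confidence level") not in ("high", "low"):
        problems.append("confidence level must be 'high' or 'low'")
    return problems

print(validate_result({"login": "yes"}))  # lists every missing or mistyped field
```

An empty list means the dictionary has the expected shape; it says nothing about whether the classification itself is correct.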