# Benchmarking the latest version of subwiz

Version 1.0.1 of subwiz has been released with the following changes:

⊕ **New model weights**  
The model has been retrained on a dataset 5x larger than before. It now uses the entire apex domain, including the TLD, during prediction for improved accuracy.

⊕ **Recursive resolution**  
Enumeration is now recursive: subdomains found at each iteration are automatically fed into the next. This continues until no new subdomains are discovered or a user-specified maximum recursion depth (`max-recursion`) is reached, which enables deeper and more exhaustive enumeration.

⊕ **Multi-apex support**  
Inference can now be performed with multiple apex domains simultaneously using the `--multi-apex` flag.

⊕ **Quality of life improvements**  
New silent (`-s`) and quiet (`-q`) flags reduce verbosity for less cluttered output.
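
The recursive resolution described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not subwiz's actual implementation: `enumerate_recursively`, `predict`, and `resolves` are hypothetical names, with the two callables standing in for the model's candidate generation and the DNS check.

```python
from typing import Callable, Iterable


def enumerate_recursively(
    seeds: Iterable[str],
    predict: Callable[[set[str]], set[str]],  # hypothetical: known names -> candidates
    resolves: Callable[[str], bool],          # hypothetical: does a candidate resolve?
    max_recursion: int = 5,
) -> set[str]:
    """Return the subdomains discovered beyond the seeds."""
    known = set(seeds)
    discovered: set[str] = set()
    for _ in range(max_recursion):
        # Only candidates that are not already known can count as new
        candidates = predict(known) - known
        new = {c for c in candidates if resolves(c)}
        if not new:
            break  # nothing new was found, so further passes cannot help
        discovered |= new
        known |= new  # feed this pass's discoveries into the next pass
    return discovered


# Toy stand-ins: the "model" prepends "dev." and the "resolver" is a fixed zone
toy_zone = {"api.example.com", "dev.api.example.com", "dev.dev.api.example.com"}


def toy_predict(known: set[str]) -> set[str]:
    return {f"dev.{name}" for name in known}


single_pass = enumerate_recursively(["api.example.com"], toy_predict, toy_zone.__contains__, max_recursion=1)
multi_pass = enumerate_recursively(["api.example.com"], toy_predict, toy_zone.__contains__, max_recursion=5)
print(sorted(single_pass))  # ['dev.api.example.com']
print(sorted(multi_pass))   # ['dev.api.example.com', 'dev.dev.api.example.com']
```

In the toy run, the second-level name is only reachable because the first pass's discovery is fed back in, which is the effect the recursion depth controls.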


We start by defining some utility functions and installing both versions of subwiz for the benchmark:

## Utils

In [None]:
import os
import sys
import importlib
import subprocess
import random
import asyncio

import aiodns
import nest_asyncio
from rich.console import Console
from rich.table import Table

# Allow nested event loops for Jupyter
nest_asyncio.apply()

console = Console()

def install_subwiz_dependencies(repo_dir: str):
    """Installs the dependencies from the requirements.txt file in the given repository directory."""
    # Install dependencies from repo_dir/requirements.txt with the current interpreter's pip
    requirements_path = os.path.join(repo_dir, "requirements.txt")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", requirements_path])

def print_statistics(statistics: dict):
    """Prints a summary of the benchmark results."""
    # Compute average statistics
    total_apex = len(statistics)
    n_subfinder_subs = sum(stats["subfinder"] for stats in statistics.values())
    n_subfinder_plus_v1_subs = sum(stats["subfinder_plus_v1"] for stats in statistics.values())
    n_subfinder_plus_v2_subs = sum(stats["subfinder_plus_v2"] for stats in statistics.values())
    n_subfinder_plus_v2_maxrec_subs = sum(stats["subfinder_plus_v2_maxrec"] for stats in statistics.values())
    
    avg_subfinder_subs = n_subfinder_subs / total_apex
    avg_subfinder_plus_v1_subs = (n_subfinder_subs + n_subfinder_plus_v1_subs) / total_apex
    avg_subfinder_plus_v2_subs = (n_subfinder_subs + n_subfinder_plus_v2_subs) / total_apex
    avg_subfinder_plus_v2_maxrec_subs = (n_subfinder_subs + n_subfinder_plus_v2_maxrec_subs) / total_apex

    inc_subfinder_plus_v1 = 100 * (avg_subfinder_plus_v1_subs / avg_subfinder_subs - 1)
    inc_subfinder_plus_v2 = 100 * (avg_subfinder_plus_v2_subs / avg_subfinder_subs - 1)
    inc_subfinder_plus_v2_maxrec = 100 * (avg_subfinder_plus_v2_maxrec_subs / avg_subfinder_subs - 1)

    # Create summary table
    table = Table(title="Benchmark Results Summary", show_header=True, header_style="white")
    table.add_column("Metric", style="bright_cyan", no_wrap=True)
    table.add_column("Value", justify="right", style="bright_cyan")
    table.add_column("Percentage increase", justify="right", style="bright_cyan")

    table.add_row("Total apex domains", str(total_apex), "")
    table.add_row(
        "Average subfinder subdomains per apex", f"{avg_subfinder_subs:.2f}", f"{0:.2f}",
    )
    table.add_row(
        "Average subfinder + subwiz v1 subdomains per apex", f"{avg_subfinder_plus_v1_subs:.2f}", f"{inc_subfinder_plus_v1:.2f}",
        )
    table.add_row(
        "Average subfinder + subwiz v2 subdomains per apex", f"{avg_subfinder_plus_v2_subs:.2f}", f"{inc_subfinder_plus_v2:.2f}",
        )
    table.add_row(
        "Average subfinder + subwiz v2 + max recursion subdomains per apex", f"{avg_subfinder_plus_v2_maxrec_subs:.2f}", f"{inc_subfinder_plus_v2_maxrec:.2f}",
        )
    
    console.print(table)

async def _check_wildcard(parent_domain: str, resolver: aiodns.DNSResolver) -> bool:
    """Check if a parent domain has wildcard DNS by testing a gibberish subdomain.
    
    Args:
        parent_domain: The parent domain to test (e.g., "subdomain.example.com")
        resolver: DNS resolver instance
        
    Returns:
        True if a gibberish subdomain resolves (indicating wildcard), False otherwise
    """
    random_sub = "kasjdjxcxmwmrehwidfjksnmdfsf"
    test_subdomain = f"{random_sub}.{parent_domain}"
    
    try:
        await resolver.query(test_subdomain, "A")
        return True  # Gibberish subdomain resolved, so wildcard exists
    except Exception:
        # Covers aiodns.error.DNSError as well as any other failure:
        # the gibberish subdomain didn't resolve, so we assume no wildcard
        return False

def filter_wildcards(subdomains: list[str]) -> list[str]:
    """Filter out subdomains that are under wildcard DNS records.
    
    For each subdomain, removes the innermost (leftmost) subdomain level and tests
    if a gibberish subdomain at that parent level resolves. If it does, the original
    subdomain is filtered out as it's likely just a wildcard redirect.
    
    Args:
        subdomains: List of subdomain strings to filter
        
    Returns:
        List of subdomains that are not under wildcard DNS records
        
    Examples:
        >>> filter_wildcards(["test.example.com", "api.example.com"])
        # If "kasjdjxcxmwmrehwidfjksnmdfsf.example.com" resolves, both are filtered
        # If it doesn't resolve, both are kept
    """
    if not subdomains:
        return []
    
    # Group subdomains by their parent domain to avoid duplicate DNS queries
    parent_domains = {}
    for subdomain in subdomains:
        parts = subdomain.split(".")
        if len(parts) <= 2:
            # This is already an apex domain or has no subdomain, skip
            continue
        
        # Remove innermost (leftmost) subdomain level
        parent_domain = ".".join(parts[1:])
        if parent_domain not in parent_domains:
            parent_domains[parent_domain] = []
        parent_domains[parent_domain].append(subdomain)
    
    # Check each parent domain for wildcards
    async def check_all():
        resolver = aiodns.DNSResolver(
            nameservers=["1.1.1.1", "1.0.0.1", "8.8.8.8"],
            timeout=3,
            tries=1
        )
        
        wildcard_parents = set()
        tasks = [
            _check_wildcard(parent, resolver) 
            for parent in parent_domains.keys()
        ]
        results = await asyncio.gather(*tasks)
        
        for parent, has_wildcard in zip(parent_domains.keys(), results):
            if has_wildcard:
                wildcard_parents.add(parent)
        
        return wildcard_parents
    
    # Run the async checks
    wildcard_parents = asyncio.run(check_all())
    
    # Filter out subdomains whose parent domain has a wildcard
    filtered = []
    for subdomain in subdomains:
        parts = subdomain.split(".")
        if len(parts) <= 2:
            # Keep apex domains
            filtered.append(subdomain)
            continue
        
        parent_domain = ".".join(parts[1:])
        if parent_domain not in wildcard_parents:
            filtered.append(subdomain)
    
    return filtered
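
To illustrate the grouping step used by `filter_wildcards` without touching the network, the sketch below reproduces only the parent-domain grouping. `group_by_parent` is a hypothetical helper written for this example; it is not part of the notebook above or of subwiz.

```python
from collections import defaultdict


def group_by_parent(subdomains: list[str]) -> dict[str, list[str]]:
    """Group subdomains by the name left after dropping the leftmost label."""
    groups: dict[str, list[str]] = defaultdict(list)
    for subdomain in subdomains:
        parts = subdomain.split(".")
        if len(parts) <= 2:
            continue  # apex-like names have no parent level to probe
        groups[".".join(parts[1:])].append(subdomain)
    return dict(groups)


example_groups = group_by_parent(
    ["a.dev.example.com", "b.dev.example.com", "api.example.com", "example.com"]
)
for parent, children in example_groups.items():
    print(parent, children)
# dev.example.com ['a.dev.example.com', 'b.dev.example.com']
# example.com ['api.example.com']
```

Each parent needs only one gibberish probe, so four input names lead to two DNS queries in this example.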

## Install the packages

In [32]:
def pip_install(package, version):
    subprocess.check_call([sys.executable, "-m", "pip", "install", f"{package}=={version}"])

pip_install("subwiz", "0.4.1")
SUBWIZ_V1_PKG = importlib.import_module("subwiz")

pip_install("subwiz", "1.0.1")
SUBWIZ_V2_PKG = importlib.import_module("subwiz")

Collecting subwiz==0.4.1
  Using cached subwiz-0.4.1-py3-none-any.whl.metadata (4.9 kB)
Using cached subwiz-0.4.1-py3-none-any.whl (16 kB)
Installing collected packages: subwiz
  Attempting uninstall: subwiz
    Found existing installation: subwiz 1.0.1
    Uninstalling subwiz-1.0.1:
      Successfully uninstalled subwiz-1.0.1
Successfully installed subwiz-0.4.1
Collecting subwiz==1.0.1
  Using cached subwiz-1.0.1-py3-none-any.whl.metadata (5.0 kB)
Using cached subwiz-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: subwiz
  Attempting uninstall: subwiz
    Found existing installation: subwiz 0.4.1
    Uninstalling subwiz-0.4.1:
      Successfully uninstalled subwiz-0.4.1
Successfully installed subwiz-1.0.1
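
One point worth checking when two versions of a package are loaded in the same interpreter: `importlib.import_module` returns the entry cached in `sys.modules` if the module has already been imported, so a second call after reinstalling may hand back the same module object. The sketch below demonstrates this behaviour with a throwaway module named `demo_pkg` (a hypothetical name, created in temporary directories). It does not touch subwiz, and it makes no claim about how the run above behaved.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_module(directory: str, version: str) -> None:
    """Create a one-line demo_pkg.py that only records its version."""
    Path(directory, "demo_pkg.py").write_text(f'__version__ = "{version}"\n')


with tempfile.TemporaryDirectory() as old_dir, tempfile.TemporaryDirectory() as new_dir:
    write_module(old_dir, "0.4.1")
    write_module(new_dir, "1.0.1")

    sys.path.insert(0, old_dir)
    importlib.invalidate_caches()
    first = importlib.import_module("demo_pkg")

    # Simulate installing the new version: it now comes first on sys.path
    sys.path.insert(0, new_dir)
    importlib.invalidate_caches()
    second = importlib.import_module("demo_pkg")  # served from sys.modules
    cached_is_same = second is first

    # Dropping the cached entry forces a fresh import from the new location
    del sys.modules["demo_pkg"]
    importlib.invalidate_caches()
    third = importlib.import_module("demo_pkg")

    versions = (first.__version__, second.__version__, third.__version__)

    # Clean up the interpreter state touched by the demo
    sys.path.remove(old_dir)
    sys.path.remove(new_dir)
    del sys.modules["demo_pkg"]

print(cached_is_same, versions)  # True ('0.4.1', '0.4.1', '1.0.1')
```

For a real package with submodules, every `sys.modules` entry under the package name would need the same treatment; running each version in its own process or virtual environment avoids the question entirely.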


## Benchmark overview

We compare four subdomain discovery pipelines per apex domain. They are designed to test the subwiz model in a realistic setting, where a user first discovers a set of subdomains using traditional tools (e.g. Subfinder, Amass, Gobuster) and then uses that output as input to the subwiz model. The problem is a difficult one, since these tools might have already discovered most of the subdomains for a given apex domain; the setting nevertheless remains useful for exhaustive subdomain discovery.
Here are the specific quantities compared:

- **Subfinder subdomains**: This baseline captures the unique resolved subdomains identified by running the `subfinder` tool for each apex domain. Since running subfinder across many domains can be very time-consuming, we have already performed this step and stored the results in `benchmark_dataset.json`, but feel free to regenerate them yourself. The data was collected in May 2025.

- **Subfinder subdomains --> subwiz v0**: Starting with the subdomains discovered by Subfinder as seed inputs, subwiz v0 (version 0.4.1) generates additional candidate subdomains. This version uses only the input subdomains themselves as context for generation, without incorporating the apex domain information. We use `max_recursion=1`, since v0 originally had no recursion; recursion was introduced later, with v1.

- **Subfinder subdomains --> subwiz v1**: Starting with the Subfinder results as seed inputs, subwiz v1 generates candidates using the improved model with apex domain context. Additionally, this pipeline uses the default recursive generation (`max_recursion=5`): newly discovered subdomains from each iteration are automatically fed back as inputs for the next iteration, allowing the model to discover deeper nested subdomains that might only be found by building upon previously generated candidates.

- **Subfinder subdomains --> subwiz v1 + maximum recursion**: Starting with the Subfinder results as seed inputs, subwiz v1 generates candidates using the improved model with apex domain context. Additionally, this pipeline uses the maximum allowed recursion (`max_recursion=50`) to show the model's full potential.
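
The percentage increase reported for each pipeline compares the average of baseline plus newly found subdomains against the average baseline, as computed in `print_statistics`. A minimal worked example follows, using made-up counts for two hypothetical apex domains; the numbers are illustrative only and are not benchmark results, and `percentage_increase` is a helper written for this example.

```python
# Made-up counts: Subfinder baseline and *new* subdomains found by one pipeline
toy_counts = {
    "example.com": {"subfinder": 40, "new": 2},
    "example.org": {"subfinder": 20, "new": 4},
}


def percentage_increase(counts: dict) -> float:
    """Percentage gain of (baseline + new) over baseline, averaged per apex."""
    n_apex = len(counts)
    baseline = sum(c["subfinder"] for c in counts.values())
    new = sum(c["new"] for c in counts.values())
    avg_baseline = baseline / n_apex
    avg_total = (baseline + new) / n_apex
    return 100 * (avg_total / avg_baseline - 1)


toy_increase = percentage_increase(toy_counts)
print(f"{toy_increase:.2f}")  # 10.00
```

Because the averages share the same denominator, the result equals total new subdomains divided by total baseline subdomains (6 / 60 here), so apex domains with many subdomains weigh more than small ones.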

In [None]:
import json
from tqdm import tqdm

random.seed(42)

BENCHMARK_DATASET_JSON = "benchmark_dataset.json"
SUBWIZ_EXECUTION_PARAMS = {
    "temperature": 0.1,
    "no_resolve": False,
    "force_download": False,
    "device": "cuda",
}

with open(BENCHMARK_DATASET_JSON, "r") as f:
    data = json.load(f)
# Evaluate on a reproducible random sample of 100 apex domains (seeded above)
data = dict(random.sample(list(data.items()), 100))


statistics = {}
pbar = tqdm(data.items(), desc="Processing")
for apex_domain, subfinder_subdomains in pbar:
    pbar.set_description(f"Processing {apex_domain}")
    # Benchmark v1
    new_subdomains_v1 = SUBWIZ_V1_PKG.run(
        input_domains=subfinder_subdomains,
        **SUBWIZ_EXECUTION_PARAMS,
    )
    new_subdomains_v1 = filter_wildcards(new_subdomains_v1)

    # Benchmark v2
    new_subdomains_v2 = SUBWIZ_V2_PKG.run(
        input_domains=subfinder_subdomains,
        **SUBWIZ_EXECUTION_PARAMS,
    )
    new_subdomains_v2 = filter_wildcards(new_subdomains_v2)

    # Benchmark v2 + maximum recursion
    new_subdomains_v2_maxrec = SUBWIZ_V2_PKG.run(
        input_domains=subfinder_subdomains,
        **SUBWIZ_EXECUTION_PARAMS,
        max_recursion=50,
    )
    new_subdomains_v2_maxrec = filter_wildcards(new_subdomains_v2_maxrec)

    # Store statistics. The "subfinder_plus_*" entries hold the number of *new*
    # subdomains only; print_statistics adds the Subfinder baseline to them.
    statistics[apex_domain] = {
        "subfinder": len(subfinder_subdomains),
        "subfinder_plus_v1": len(new_subdomains_v1),
        "subfinder_plus_v2": len(new_subdomains_v2),
        "subfinder_plus_v2_maxrec": len(new_subdomains_v2_maxrec),
    }

print_statistics(statistics)

Processing channelpartnersconference.com: 100%|██████████| 100/100 [52:11<00:00, 31.32s/it]    


NOTE: In the code, we refer to version 0.4.1 as v1 and to version 1.0.1 as v2.
In the text, we now refer to them as version 0 and version 1, respectively; the names
in the code are left unchanged to avoid re-executing the benchmark that produces the table.

## Analysis of Results

The benchmark results reveal several important insights about the evolution of subwiz and the impact of its key features:

### **Improvement from version 0 to version 1**
The transition from the original subwiz (version 0) to the retrained subwiz (version 1) yields a modest improvement in the number of subdomains found.
Version 0 achieves a `5.38`% increase over the Subfinder baseline, while version 1 achieves a `7.58`% increase.

The `2.20` percentage point improvement suggests that incorporating the full apex domain (including TLD) as context provides a small but measurable benefit, although the version 1 pipeline also uses the larger training set and the default recursion depth, so these effects are not isolated here. The gain is particularly notable given that Subfinder has already discovered most of the "low-hanging fruit" subdomains, which makes incremental improvements more challenging.

### **Maximum performance of subwiz**
Maximum recursion is particularly useful when targeting a single apex domain or a small group of them, where a heavier compute load per apex domain is acceptable.
The aim here is to give a sense of the maximum capability of the newer version of subwiz.

Maximum recursive generation improves on both single-pass generation and the default of five recursive passes.
With the default recursion depth, version 1 achieves 37.17 subdomains per apex (`7.58`% increase over baseline), while with `max_recursion=50` it discovers 49.20 subdomains per apex, a `42.40`% increase over the baseline.
By iteratively using newly discovered subdomains as inputs for subsequent passes, the new model can uncover deeper nested structures and subdomains that are only reachable through intermediate discoveries.
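
As a quick sanity check on the figures quoted above, the Subfinder baseline average can be recovered from each reported pair of average and percentage increase. The baseline itself is not stated in the text, so the value computed here is derived from the reported numbers, not reported by the benchmark; `implied_baseline` is a helper written for this example.

```python
# (average subdomains per apex, percentage increase over baseline), as quoted in the text
reported = {
    "version 1, default recursion": (37.17, 7.58),
    "version 1, max_recursion=50": (49.20, 42.40),
}


def implied_baseline(avg_total: float, pct_increase: float) -> float:
    """Invert pct = 100 * (avg_total / baseline - 1) to recover the baseline."""
    return avg_total / (1 + pct_increase / 100)


baselines = {name: implied_baseline(*pair) for name, pair in reported.items()}
for name, baseline in baselines.items():
    print(f"{name}: implied baseline = {baseline:.2f} subdomains per apex")
```

Both pairs imply a baseline of roughly 34.55 subdomains per apex, so the two reported results are mutually consistent. On that derived baseline, the default recursion depth adds about 2.62 subdomains per apex, and `max_recursion=50` adds about 14.65.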