MikeH1021/vulnerability-scanner

ZAP Batch Scan Extension

High-throughput passive vulnerability scanning service built on OWASP ZAP. Designed for scanning thousands of domains efficiently with rotating proxies and reliable alert capture.

Note: This is a specialized fork of OWASP ZAP configured specifically for high-volume batch scanning. For general-purpose web security testing, use the official ZAP release.


Key Features

  • High Throughput: ~7.8 jobs/second, 11K+ domains in 23 minutes
  • 100% Alert Capture: Verified at scale across 10,746 completed jobs (150 running concurrently)
  • 150 Concurrent Workers: Parallel scanning with per-worker proxy isolation
  • Rotating Proxies: Built-in proxy rotation to avoid WAF blocking
  • REST API: Simple HTTP API for job submission, status, and results
  • Webhook Support: Automatic result delivery to external endpoints
  • Domain-Based Queuing: Prevents overwhelming single targets

Performance Benchmarks

| Metric | Value |
|---|---|
| Throughput | ~7.8 jobs/second |
| 4-hour capacity | ~112,000 domains |
| Daily capacity | ~674,000 domains |
| Alert capture rate | 100% |
| Avg job duration | 17.2 seconds |
| Concurrent workers | 150 |

Test Results (11K Stress Test):

Completed: 10,746 jobs
Failed: 863 (network errors - expected)
Alert capture: 100% throughout
Duration: 23 minutes
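
The capacity figures follow directly from the stress-test numbers. The short check below uses only values quoted in this section; the extrapolation assumes throughput stays constant over longer runs, which the 23-minute test does not by itself prove.

```python
# Derive the throughput and capacity figures from the 11K stress test.
completed_jobs = 10_746        # "Completed" in the stress test
failed_jobs = 863              # "Failed" in the stress test
duration_s = 23 * 60           # 23 minutes

throughput = completed_jobs / duration_s   # completed jobs per second
rounded = round(throughput, 1)             # the "~7.8 jobs/second" figure

four_hour_capacity = rounded * 4 * 3600    # assumes constant throughput
daily_capacity = rounded * 24 * 3600
failed_share = failed_jobs / (completed_jobs + failed_jobs)

print(f"throughput: {throughput:.2f} jobs/s")
print(f"4-hour capacity: {four_hour_capacity:,.0f} domains")
print(f"daily capacity: {daily_capacity:,.0f} domains")
print(f"failed share: {failed_share:.1%}")
```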

Quick Start

Prerequisites

  • Java 17+
  • 150 rotating proxies (e.g., Webshare.io - $0.30/IP/month)

1. Clone and Build

git clone -b batchscan-extension https://github.com/MikeH1021/zaproxy.git
cd zaproxy

# Build
./gradlew :zap:compileJava
./gradlew :zap:distCore

# Extract
cd zap/build/distributions
jar xf ZAP_2.18.0-SNAPSHOT_Core.zip
cp ../../mainAddOns/*.zap ZAP_2.18.0-SNAPSHOT/plugin/
chmod +x ZAP_2.18.0-SNAPSHOT/zap.sh
cd ../../..

2. Configure Proxies

Create config/proxies.json:

{
  "proxies": [
    {
      "host": "proxy1.example.com",
      "port": 8080,
      "username": "user",
      "password": "pass"
    },
    {
      "host": "proxy2.example.com",
      "port": 8080,
      "username": "user",
      "password": "pass"
    }
  ]
}

Or generate from a proxy provider's list:

python3 scripts/generate-proxy-config.py --file proxies-raw.txt --output config/proxies.json
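
For readers without access to the script, the conversion can be sketched roughly as follows. This is not the contents of generate-proxy-config.py: the input format (one `host:port:username:password` entry per line) and the function names are assumptions made for illustration.

```python
import json

def parse_proxy_line(line: str) -> dict:
    """Parse one 'host:port:username:password' line (assumed input format)."""
    host, port, username, password = line.strip().split(":", 3)
    return {
        "host": host,
        "port": int(port),
        "username": username,
        "password": password,
    }

def build_proxy_config(lines) -> dict:
    """Build the {"proxies": [...]} structure, skipping blanks and # comments."""
    proxies = [
        parse_proxy_line(line)
        for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return {"proxies": proxies}

raw = """\
# hypothetical provider export
proxy1.example.com:8080:user:pass
proxy2.example.com:8080:user:pass
"""
config = build_proxy_config(raw.splitlines())
print(json.dumps(config, indent=2))
```

Writing the result with `json.dump(config, open("config/proxies.json", "w"), indent=2)` would give a file in the shape shown above.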

3. Start the Service

./start-zap-with-proxies.sh

Output:

Starting ZAP daemon...
  - Workers: 150
  - Passive scan threads: 150
  - Proxies: /home/mike/zaproxy/config/proxies.json
ZAP started with PID 12345
Waiting for ZAP to be ready...
✓ ZAP is ready!
✓ Slow rules disabled (90003, 10017, 10096, 10116)

API URL: http://localhost:8080

4. Submit Scans

# Submit a single job
curl "http://localhost:8080/JSON/batchscan/action/submit/?apikey=test-api-key&url=https://example.com&mode=spider&timeout=30"

# Response
{"jobId":"abc-123","state":"queued","etaSeconds":"5"}

5. Get Results

# Check status
curl "http://localhost:8080/JSON/batchscan/view/status/?apikey=test-api-key&jobId=abc-123"

# Get full results
curl "http://localhost:8080/JSON/batchscan/view/result/?apikey=test-api-key&jobId=abc-123"
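
The same submit, poll, fetch flow can be scripted. The sketch below uses only the Python standard library and the endpoints shown in the curl examples. The helper names are hypothetical. Treating `failed` as a terminal state is an assumption (only `queued` and `completed` appear as `state` values in this README).

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080/JSON/batchscan/"
API_KEY = "test-api-key"  # same placeholder key as the curl examples

def build_url(kind: str, name: str, **params: str) -> str:
    """Build e.g. .../action/submit/?apikey=...&url=... with URL-encoded values."""
    query = urllib.parse.urlencode({"apikey": API_KEY, **params})
    return f"{BASE_URL}{kind}/{name}/?{query}"

def call(kind: str, name: str, **params: str) -> dict:
    """GET one endpoint and decode its JSON body."""
    with urllib.request.urlopen(build_url(kind, name, **params), timeout=10) as resp:
        return json.load(resp)

def scan(target: str, poll_interval: float = 2.0, max_wait: float = 60.0) -> dict:
    """Submit one job, poll its status until it finishes, then fetch the result."""
    job = call("action", "submit", url=target, mode="spider", timeout="30")
    job_id = job["jobId"]
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        status = call("view", "status", jobId=job_id)
        # "failed" as a terminal state is an assumption, see the note above.
        if status.get("state") in ("completed", "failed"):
            break
        time.sleep(poll_interval)
    return call("view", "result", jobId=job_id)

# Building a URL needs no running service; note the encoded target URL.
print(build_url("action", "submit", url="https://example.com", mode="spider", timeout="30"))
```

With the service running, `scan("https://example.com")` would return the decoded result document. Encoding the `url` parameter matters once targets contain their own query strings.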

API Reference

Base URL: http://localhost:8080/JSON/batchscan/

Actions (Submit/Control)

| Endpoint | Parameters | Description |
|---|---|---|
| action/submit/ | url, mode, timeout | Submit a scan job |
| action/cancel/ | jobId | Cancel a running job |
| action/clearCompleted/ | - | Remove old completed jobs |
| action/resetStats/ | - | Reset queue statistics |

Views (Query)

| Endpoint | Parameters | Description |
|---|---|---|
| view/status/ | jobId | Get job status and metadata |
| view/result/ | jobId | Get scan results with alerts |
| view/jobs/ | - | List all jobs |
| view/queueStats/ | - | Queue statistics (running, queued, completed, failed) |
| view/workerStats/ | - | Worker pool statistics |

Response Examples

Submit Response:

{
  "jobId": "4b206d80-fcf1-49d8-ae7d-c4be24f0c3ab",
  "state": "queued",
  "etaSeconds": "5"
}

Status Response:

{
  "jobId": "4b206d80-fcf1-49d8-ae7d-c4be24f0c3ab",
  "state": "completed",
  "domain": "example.com",
  "proxy": "proxy-042 (192.168.1.42:8080)",
  "workerId": "worker-42",
  "totalAlerts": "6"
}

Result Response:

{
  "result": [
    {
      "jobId": "4b206d80-fcf1-49d8-ae7d-c4be24f0c3ab",
      "domain": "example.com",
      "state": "completed",
      "urlsDiscovered": "15",
      "alertsHigh": "0",
      "alertsMedium": "2",
      "alertsLow": "3",
      "alertsInfo": "1",
      "totalDurationMs": "12500"
    },
    {
      "topFindings": [
        {
          "name": "Content Security Policy (CSP) Header Not Set",
          "risk": "Medium",
          "confidence": "High",
          "url": "https://example.com",
          "ruleId": "10038",
          "solution": "Ensure that your web server, application server, load balancer, etc. is configured to set the Content-Security-Policy header."
        },
        {
          "name": "Strict-Transport-Security Header Not Set",
          "risk": "Low",
          "confidence": "High",
          "url": "https://example.com",
          "ruleId": "10035"
        }
      ]
    }
  ]
}
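
Two details of this response are easy to miss: `result` is a list whose entries carry the job summary and the `topFindings` separately, and numeric fields arrive as strings. A minimal parsing sketch, using an abbreviated copy of the sample above (the `summarize` helper is hypothetical):

```python
# Abbreviated copy of the sample result response.
response = {
    "result": [
        {
            "jobId": "4b206d80-fcf1-49d8-ae7d-c4be24f0c3ab",
            "domain": "example.com",
            "state": "completed",
            "alertsHigh": "0",
            "alertsMedium": "2",
            "alertsLow": "3",
            "alertsInfo": "1",
        },
        {
            "topFindings": [
                {"name": "Content Security Policy (CSP) Header Not Set",
                 "risk": "Medium", "ruleId": "10038"},
                {"name": "Strict-Transport-Security Header Not Set",
                 "risk": "Low", "ruleId": "10035"},
            ]
        },
    ]
}

def summarize(response: dict) -> dict:
    """Split the result list into summary and findings; convert counts to int."""
    summary, findings = {}, []
    for part in response["result"]:
        if "topFindings" in part:
            findings = part["topFindings"]
        else:
            summary = part
    counts = {
        level: int(summary.get(f"alerts{level}", "0"))
        for level in ("High", "Medium", "Low", "Info")
    }
    return {
        "domain": summary.get("domain"),
        "counts": counts,
        "total": sum(counts.values()),
        "findings": [(f["risk"], f["ruleId"], f["name"]) for f in findings],
    }

report = summarize(response)
print(report["domain"], report["counts"], "total:", report["total"])
for risk, rule_id, name in report["findings"]:
    print(f"  [{risk}] {rule_id} {name}")
```

The per-severity counts in the sample sum to 6, which matches `totalAlerts` in the status response.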

Queue Stats Response:

{
  "running": "150",
  "queued": "5432",
  "completed": "3500",
  "failed": "125",
  "avgDurationMs": "17206"
}
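
These counters are enough for a basic health check. The sketch below computes the failure rate and a rough backlog estimate from the sample. It assumes `completed` and `failed` are disjoint counts, as in the stress-test summary, and that throughput holds at the benchmarked ~7.8 jobs/second. The function names are hypothetical.

```python
FAILURE_THRESHOLD = 0.15  # the ">15%" level named under Troubleshooting

stats = {
    "running": "150",
    "queued": "5432",
    "completed": "3500",
    "failed": "125",
    "avgDurationMs": "17206",
}

def failure_rate(stats: dict) -> float:
    """Share of finished jobs that failed (values arrive as strings)."""
    completed, failed = int(stats["completed"]), int(stats["failed"])
    finished = completed + failed
    return failed / finished if finished else 0.0

def backlog_eta_seconds(stats: dict, jobs_per_second: float = 7.8) -> float:
    """Rough time to drain queued + running jobs at an assumed throughput."""
    return (int(stats["queued"]) + int(stats["running"])) / jobs_per_second

rate = failure_rate(stats)
eta = backlog_eta_seconds(stats)
print(f"failure rate: {rate:.1%}")
print(f"backlog ETA: {eta / 60:.0f} min")
if rate > FAILURE_THRESHOLD:
    print("failure rate is high; check proxy health")
```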

Configuration

Startup Options

./zap.sh -daemon -port 8080 \
  -config api.key=your-api-key \
  -config batchscan.workerPoolSize=150 \
  -config batchscan.proxyConfigPath=/path/to/proxies.json \
  -config pscans.threads=150

All Parameters

| Parameter | Default | Description |
|---|---|---|
| batchscan.workerPoolSize | 150 | Number of concurrent scan workers |
| batchscan.proxyConfigPath | "" | Path to proxies.json |
| batchscan.spiderMaxDepth | 2 | Spider crawl depth |
| batchscan.spiderMaxDurationSec | 10 | Spider timeout per job |
| batchscan.maxJobDurationSec | 30 | Overall job timeout |
| batchscan.httpTimeoutSec | 5 | HTTP connection timeout |
| batchscan.pscanWaitTimeoutSec | 15 | Wait for passive scan completion |
| batchscan.alertCap | 100 | Max alerts to collect per job |
| batchscan.topFindingsLimit | 20 | Top findings in result |
| pscans.threads | 150 | Passive scan threads (CRITICAL) |

Important: The key is pscans.threads (with an 's'), not pscan.threads. With the wrong key, ZAP runs only 13 passive scan threads and alert capture fails at scale.
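
Because the misspelled key fails silently, it can help to build the command line programmatically and reject it up front. A small sketch, using only the flags from Startup Options; the `build_zap_args` helper is hypothetical and not part of this repository.

```python
def build_zap_args(config: dict, port: int = 8080) -> list:
    """Turn a {key: value} mapping into a zap.sh argument list."""
    if "pscan.threads" in config:
        # Guard against the misspelling described above.
        raise ValueError("use 'pscans.threads' (with an 's'), not 'pscan.threads'")
    args = ["./zap.sh", "-daemon", "-port", str(port)]
    for key, value in config.items():
        args += ["-config", f"{key}={value}"]
    return args

args = build_zap_args({
    "api.key": "your-api-key",
    "batchscan.workerPoolSize": 150,
    "batchscan.proxyConfigPath": "/path/to/proxies.json",
    "pscans.threads": 150,
})
print(" ".join(args))
```

The resulting list can be passed to `subprocess.Popen(args)` from the directory that contains zap.sh.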


Architecture

┌─────────────────────────────────────────────────────────────┐
│                    ExtensionBatchScan                        │
├─────────────────────────────────────────────────────────────┤
│  BatchScanAPI          REST endpoints (/JSON/batchscan/*)   │
│  BatchScanParam        Configuration management              │
├─────────────────────────────────────────────────────────────┤
│  WorkerPool            150 concurrent worker threads         │
│    └─ ScanWorker       Job execution (seed, spider, alerts) │
├─────────────────────────────────────────────────────────────┤
│  JobScheduler          Job dispatch + timeout watchdog       │
│    ├─ JobQueue         Thread-safe priority queue            │
│    └─ DomainLockManager  One job per domain at a time       │
├─────────────────────────────────────────────────────────────┤
│  ProxyManager          Proxy rotation and health tracking    │
│  WebhookManager        Async result delivery                 │
└─────────────────────────────────────────────────────────────┘
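
The "one job per domain at a time" rule enforced by DomainLockManager can be illustrated with a small sketch. The actual implementation is Java and is not reproduced here; the class and function names below are hypothetical, and the dispatch policy (skip jobs whose domain is busy, take the next runnable one) is an assumption about how the scheduler and lock manager cooperate.

```python
import threading
from urllib.parse import urlsplit

class DomainLocks:
    """Tracks which domains currently have a running job."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._busy = set()

    def try_acquire(self, domain: str) -> bool:
        """Claim the domain without blocking; False if a job already holds it."""
        with self._guard:
            if domain in self._busy:
                return False
            self._busy.add(domain)
            return True

    def release(self, domain: str) -> None:
        with self._guard:
            self._busy.discard(domain)

def next_runnable(queue: list, locks: DomainLocks):
    """Pop the first queued URL whose domain is free, or return None."""
    for i, url in enumerate(queue):
        if locks.try_acquire(urlsplit(url).hostname):
            return queue.pop(i)
    return None

locks = DomainLocks()
queue = ["https://a.example/1", "https://a.example/2", "https://b.example/"]
first = next_runnable(queue, locks)    # a.example is free
second = next_runnable(queue, locks)   # a.example is busy, so b.example runs
third = next_runnable(queue, locks)    # only a.example/2 is left and it is blocked
locks.release("a.example")             # first job finished
fourth = next_runnable(queue, locks)
print(first, second, third, fourth)
```

A non-blocking `try_acquire` keeps workers from idling behind a busy domain while other domains are waiting.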

Key Design Decisions

  1. Per-Worker Proxy Isolation: Each worker has its own ConnectionParam, eliminating global locking during HTTP requests.

  2. Domain-Based Alert Querying: Alerts are queried from the database by domain after the passive scan completes, rather than collected through events. This is more reliable at scale.

  3. Alert ID Windowing: Each job records the starting alert ID and only collects alerts generated after that point, preventing cross-job contamination.

  4. Parameter Scanner Disabled: The core ParamScanner is disabled in daemon mode as it takes 39-97 seconds per page, causing queue backup.
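
Decisions 2 and 3 combine into a simple filter: keep alerts for the job's domain whose ID is above the ID recorded at job start. The sketch below shows the idea on in-memory data. The `Alert` type and function name are hypothetical, the real code queries ZAP's alert database from Java, and treating the recorded ID as exclusive is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    alert_id: int
    domain: str
    name: str

def collect_job_alerts(alerts, start_id: int, domain: str, cap: int = 100):
    """Return alerts for `domain` raised after `start_id`, capped at `cap`.

    `cap` mirrors the batchscan.alertCap default of 100.
    """
    matched = [a for a in alerts if a.alert_id > start_id and a.domain == domain]
    return matched[:cap]

all_alerts = [
    Alert(40, "example.com", "stale alert from an earlier job"),
    Alert(41, "other.example", "alert from a concurrent job"),
    Alert(42, "example.com", "CSP Header Not Set"),
    Alert(43, "example.com", "Strict-Transport-Security Header Not Set"),
]

# The job on example.com recorded alert ID 40 as its starting point.
job_alerts = collect_job_alerts(all_alerts, start_id=40, domain="example.com")
print([a.alert_id for a in job_alerts])
```

The ID window excludes alerts left by earlier scans of the same domain, and the domain filter excludes alerts raised concurrently by other workers.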


Webhook Integration

Configure webhooks in ~/.ZAP/webhook-settings.json:

{
  "webhooks": [
    {
      "url": "https://your-endpoint.com/results",
      "events": ["job.completed", "job.failed"],
      "headers": {
        "Authorization": "Bearer your-token"
      }
    }
  ]
}

Webhook payload:

{
  "event": "job.completed",
  "jobId": "abc-123",
  "domain": "example.com",
  "alertsHigh": 0,
  "alertsMedium": 2,
  "alertsLow": 3,
  "alertsInfo": 1,
  "topFindings": [...]
}
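
On the receiving side, a minimal endpoint might look like the sketch below. It assumes deliveries are HTTP POST requests with the JSON payload as the body, which this README does not state explicitly. The handler and helper names are hypothetical, and the alert count fields are read defensively in case `job.failed` payloads omit them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

EXPECTED_AUTH = "Bearer your-token"  # must match the configured header

def summarize_payload(body: bytes) -> str:
    """Reduce one webhook payload to a single log line."""
    payload = json.loads(body)
    total = sum(
        payload.get(key, 0)
        for key in ("alertsHigh", "alertsMedium", "alertsLow", "alertsInfo")
    )
    return f'{payload["event"]} {payload["jobId"]} {payload["domain"]}: {total} alerts'

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:  # assumes POST delivery
        if self.headers.get("Authorization") != EXPECTED_AUTH:
            self.send_response(401)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        print(summarize_payload(self.rfile.read(length)))
        self.send_response(204)
        self.end_headers()

def serve(port: int = 9000) -> None:
    """Run the receiver until interrupted (port 9000 is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()

sample = (b'{"event": "job.completed", "jobId": "abc-123", "domain": "example.com",'
          b' "alertsHigh": 0, "alertsMedium": 2, "alertsLow": 3, "alertsInfo": 1}')
print(summarize_payload(sample))
```

Calling `serve()` starts the listener; the webhook `url` in webhook-settings.json would then point at it.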

Proxy Setup

Recommended Provider

  • Webshare.io: $0.30/IP/month, reliable US datacenter proxies
  • Get 150 IPs for ~$45/month

Rate Limiting Safety

With 150 proxies at ~7.8 jobs/second:

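  • Per-proxy rate: ~0.05 jobs/second (7.8 ÷ 150), i.e. each proxy starts a new job roughly every 19 seconds
  • Job starts per proxy are well under typical WAF limits (1-10 req/sec); each job issues several requests, so the actual request rate per proxy is higher than this figure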
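
The per-proxy figure is the overall job rate divided across the proxy pool, so it counts jobs rather than individual HTTP requests. The check below reproduces it and adds a rough per-job request rate. The `requests_per_job` value is a hypothetical input, not a measured number.

```python
throughput = 7.8            # jobs/second, from the benchmarks
proxies = 150
avg_job_duration_s = 17.2   # from the benchmarks

jobs_per_proxy_per_s = throughput / proxies
seconds_between_jobs = 1 / jobs_per_proxy_per_s

def active_request_rate(requests_per_job: int) -> float:
    """Average requests/second a proxy sends while one job is running."""
    return requests_per_job / avg_job_duration_s

print(f"per-proxy job rate: {jobs_per_proxy_per_s:.3f} jobs/s")
print(f"one job per proxy every {seconds_between_jobs:.1f} s")
# Hypothetical crawl of 15 requests, spread over an average-length job.
print(f"request rate during a job: {active_request_rate(15):.2f} req/s")
```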

Test Proxy Health

./scripts/test-proxies.sh config/proxies.json

Troubleshooting

Alert capture dropping below 80%

  1. Check pscan threads config: Must be pscans.threads=150 (with 's')
  2. Check logs for slow rules: grep "took.*seconds" /tmp/zap-proxy.log
  3. Rollback if needed: git checkout batchscan-v1.0-stable

High failure rate (>15%)

  • Check proxy health with ./scripts/test-proxies.sh
  • Some failures are normal (DNS, timeouts, blocked IPs)

"No seeds available for Spider" warnings

Normal for ~5-10% of jobs due to network issues. Not a bug if most jobs succeed.

Jobs stuck in running state

The timeout watchdog should cancel them after 30s. If not, check JobScheduler logs.


Project Structure

zaproxy/
├── config/
│   └── proxies.json              # Your proxy configuration
├── scripts/
│   ├── generate-proxy-config.py  # Convert proxy list to JSON
│   └── test-proxies.sh           # Test proxy connectivity
├── start-zap-with-proxies.sh     # Main startup script
├── CLAUDE.md                     # Detailed technical documentation
├── Rollback.md                   # Recovery instructions
└── zap/src/main/java/org/zaproxy/zap/extension/batchscan/
    ├── ExtensionBatchScan.java   # Main extension entry point
    ├── BatchScanAPI.java         # REST API implementation
    ├── BatchScanParam.java       # Configuration parameters
    ├── model/
    │   ├── ScanJob.java          # Job state and metadata
    │   ├── ScanJobStatus.java    # Status enum
    │   ├── ScanJobResult.java    # Result with alerts
    │   └── ProxyConfig.java      # Proxy configuration model
    ├── queue/
    │   ├── JobQueue.java         # Thread-safe job queue
    │   ├── JobScheduler.java     # Dispatch and timeouts
    │   └── DomainLockManager.java # Per-domain locking
    ├── worker/
    │   ├── ScanWorker.java       # Job execution logic
    │   └── WorkerPool.java       # Worker thread management
    ├── proxy/
    │   └── ProxyManager.java     # Proxy rotation
    └── webhook/
        ├── WebhookManager.java   # Delivery management
        ├── WebhookConfig.java    # Webhook settings
        └── WebhookDelivery.java  # HTTP delivery

Stable Version & Rollback

Current Stable: batchscan-v1.0-stable

# Rollback to verified stable version
git checkout batchscan-v1.0-stable

# Rebuild
./gradlew :zap:compileJava && ./gradlew :zap:distCore
cd zap/build/distributions
rm -rf ZAP_2.18.0-SNAPSHOT
jar xf ZAP_2.18.0-SNAPSHOT_Core.zip
cp ../../mainAddOns/*.zap ZAP_2.18.0-SNAPSHOT/plugin/
chmod +x ZAP_2.18.0-SNAPSHOT/zap.sh   # jar xf does not restore the executable bit
cd ../../..

See Rollback.md for detailed recovery instructions.


License

Apache License 2.0 (same as OWASP ZAP)


Acknowledgments

Built on OWASP ZAP - the world's most widely used web app scanner.

This fork is maintained for high-volume batch scanning use cases. For general web security testing, use the official ZAP release.
