Windows Triage MCP Server

A Model Context Protocol (MCP) server providing offline forensic file and artifact triage capabilities for Claude Code and other MCP-compatible AI assistants.

Installation Options

Option A: As Part of Claude-IR (Recommended)

This MCP is designed as a component of the Claude-IR AI-assisted incident response workstation.

git clone https://github.com/scriptedstatement/claude-ir.git
cd claude-ir
./setup.sh
claude

Benefits of Claude-IR installation:

  • Guided setup with component selection
  • Pre-configured MCP integration
  • Works alongside forensic-rag-mcp (knowledge search) and opencti-mcp (threat intel)
  • Forensic discipline rules and investigation workflows

Option B: Standalone Installation

Use standalone when you only need file/process triage without the full IR workstation.

See the Quick Start section below for standalone setup.

For detailed setup guidance: See SETUP.md


Overview

This server enables AI assistants to instantly validate files, processes, and persistence mechanisms against curated Windows baselines - all running locally without external API dependencies.

Important: These tools assist triage — they do not replace forensic analysis. An EXPECTED verdict means the file exists in a clean Windows baseline, not that it is safe. An UNKNOWN verdict means the file is not in our database, not that it is suspicious. Always corroborate findings with additional evidence sources.

Key Capabilities:

  • File Baseline Validation - Check files against Windows baseline records with hash verification
  • Protected Process Detection - Flag system process names (svchost, lsass) in wrong locations
  • LOLBin Detection - Identify living-off-the-land binaries with abuse techniques
  • Process Tree Analysis - Validate parent-child relationships with injection detection
  • Filename Heuristics - Detect double extensions, high entropy, control characters, space padding
  • Unicode Evasion Detection - Catch RLO attacks, homoglyphs, leet speak, typosquatting
  • Vulnerable Driver Detection - Check against known-vulnerable driver samples (LOLDrivers)
  • DLL Hijacking Detection - Identify hijackable DLLs with vulnerable executables

For threat intelligence (hash/IOC reputation), use opencti-mcp separately.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Claude Code / AI Assistant               │
│                                │                                │
│                           MCP Protocol                          │
│                                ▼                                │
│    ┌────────────────────────────────────────────────────────┐   │
│    │              Windows Triage MCP Server                │   │
│    │                     (OFFLINE ONLY)                     │   │
│    │                                                        │   │
│    │   ┌──────────────┐  ┌──────────────┐                   │   │
│    │   │known_good.db │  │  context.db  │                   │   │
│    │   │  (baseline)  │  │(risk context)│                   │   │
│    │   └──────────────┘  └──────────────┘                   │   │
│    └────────────────────────────────────────────────────────┘   │
│                                                                  │
│    For threat intelligence → use opencti-mcp separately          │
└─────────────────────────────────────────────────────────────────┘

Database Design

| Database | Size | Purpose | Data Sources |
|---|---|---|---|
| known_good.db | ~5.6GB | File baselines, services, tasks, autoruns | VanillaWindowsReference |
| known_good_registry.db | ~12GB | Full registry baseline (6M+ entries) | VanillaWindowsRegistryHives (optional) |
| context.db | ~2.4MB | Risk enrichment (LOLBins, drivers, patterns) | LOLBAS, LOLDrivers, HijackLibs, YAML configs |

Quick Start (Standalone)

For Claude-IR users, follow the guided setup instead (see Installation Options above).

1. Clone Repository and Data Sources

git clone https://github.com/scriptedstatement/windows-triage-mcp.git
cd windows-triage-mcp
mkdir -p data/sources && cd data/sources

# Required: Windows file baselines
git clone https://github.com/AndrewRathbun/VanillaWindowsReference.git

# Required: Security context databases
git clone https://github.com/LOLBAS-Project/LOLBAS.git
git clone https://github.com/magicsword-io/LOLDrivers.git
git clone https://github.com/wietze/HijackLibs.git

# Optional: Full registry baseline (creates 12GB database)
git clone https://github.com/AndrewRathbun/VanillaWindowsRegistryHives.git

2. Install and Initialize

cd ../..  # Back to windows-triage-mcp root

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install the package
pip install -e .

# Initialize database schemas
python scripts/init_databases.py

# Import all data sources
python scripts/import_all.py

3. Configure MCP Server

Add to Claude Code's MCP settings (~/.claude/mcp.json):

{
    "mcpServers": {
        "windows-triage": {
            "command": "python",
            "args": ["-m", "windows_triage.server"],
            "cwd": "/path/to/windows-triage-mcp"
        }
    }
}

Note: The "command": "python" above assumes python resolves to your virtual environment. For reliability, use the full venv path instead: "/path/to/windows-triage-mcp/.venv/bin/python"
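
If you would rather generate that entry than hand-edit the paths, a short script can print it. This is only a sketch: `mcp_entry` is a hypothetical helper, not part of the package, and it assumes the POSIX venv layout (.venv/bin/python) created in step 2.

```python
import json
from pathlib import PurePosixPath


def mcp_entry(repo_root: str) -> dict:
    """Build the mcpServers entry using the venv interpreter's full path."""
    root = PurePosixPath(repo_root)
    return {
        "mcpServers": {
            "windows-triage": {
                "command": str(root / ".venv" / "bin" / "python"),
                "args": ["-m", "windows_triage.server"],
                "cwd": str(root),
            }
        }
    }


if __name__ == "__main__":
    # Replace the placeholder with the absolute path to your clone.
    print(json.dumps(mcp_entry("/path/to/windows-triage-mcp"), indent=4))
```

The script only prints the JSON; merging it into an existing settings file is left to you so that other configured servers are not overwritten.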

4. Verify Installation

# Run the test suite
pytest tests/ -v

# Check database statistics
python -c "
from windows_triage.db import KnownGoodDB, ContextDB
kg = KnownGoodDB('data/known_good.db')
ctx = ContextDB('data/context.db')
print('known_good.db:', kg.get_stats())
print('context.db:', ctx.get_stats())
"

MCP Tools Reference (13 tools)

Primary Triage Tool

| Tool | Description | Use Case |
|---|---|---|
| check_file | Comprehensive file triage - validates path/filename/hash against baselines, detects protected process masquerading, LOLBins, Unicode evasion, known tools | Windows system file validation, detecting trojanized binaries, protected process spoofing |

Process Analysis

| Tool | Description | Use Case |
|---|---|---|
| check_process_tree | Validate parent-child relationships using 3 approaches: injection detection (never-spawns), suspicious parent blacklist (80 patterns), valid parent whitelist | Detecting Office/browser spawning shells, process injection via lsass/dwm, DCOM lateral movement |
| analyze_filename | Detect double extensions, high entropy (>4.5), short names (≤2 chars), space padding, control characters, Unicode evasion, known tools | Finding evasion attempts (e.g., invoice.pdf.exe, svch0st.exe) |

Threat Detection

| Tool | Description | Use Case |
|---|---|---|
| check_lolbin | Check if file is a living-off-the-land binary | Identifying abuse potential for legitimate Windows tools |
| check_hijackable_dll | Check if DLL is vulnerable to hijacking (T1574.001) | DLL hijacking detection in incident response |
| check_pipe | Check if named pipe is suspicious or known Windows pipe | C2 detection via named pipe analysis |
| check_hash | Check hash against vulnerable driver database (LOLDrivers) | BYOVD attack detection |

Persistence Validation

| Tool | Description | Use Case |
|---|---|---|
| check_service | Validate Windows service against baseline | Service persistence detection |
| check_scheduled_task | Validate scheduled task against baseline | Task scheduler persistence detection |
| check_autorun | Validate registry autorun against baseline | Registry persistence detection |
| check_registry | Validate arbitrary registry key/value against full baseline (optional 12GB db) | Deep registry forensics |

Utility

| Tool | Description | Use Case |
|---|---|---|
| get_db_stats | Return database statistics | Verify data population and coverage |
| get_health | Return server health status including uptime, database connectivity, and cache stats | Monitoring and debugging |

Verdict System

The server returns structured verdicts with confidence levels:

| Verdict | Meaning | Example |
|---|---|---|
| SUSPICIOUS | Pattern match or anomaly detected | svchost.exe in wrong directory, vulnerable driver |
| EXPECTED_LOLBIN | Known Windows file AND is a LOLBin | certutil.exe (LOLBin) |
| EXPECTED | Filename found in Windows baseline | notepad.exe in System32 |
| UNKNOWN | Not in any database (neutral) | Third-party application |

Note: For MALICIOUS verdicts (threat intel matches), use opencti-mcp, which can query threat intelligence databases.

Verdict Priority

When multiple signals are present, verdicts follow this priority (highest to lowest):

  1. SUSPICIOUS - Critical findings:
    • Hash mismatch on known path (trojanized binary)
    • Unicode evasion (RLO, homoglyphs, zero-width chars)
    • Double extension (invoice.pdf.exe)
    • Known attack tool pattern (mimikatz.exe)
    • Protected process name in wrong location (svchost.exe outside System32)
    • Process injection detected (lsass/dwm spawning children)
    • Suspicious parent process (Word/Excel spawning cmd.exe)
    • Vulnerable driver detected
  2. EXPECTED_LOLBIN - Baseline match + LOLBin (legitimate but abusable)
  3. EXPECTED - Baseline match, no risk factors
  4. UNKNOWN - No database matches (neutral - may be legitimate software)
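
The priority order above can be sketched as a small resolver. This is an illustrative reading of the list, not the logic in verdicts.py; `Verdict`, `resolve_verdict`, and its parameters are hypothetical names chosen for the example.

```python
from enum import IntEnum
from typing import List


class Verdict(IntEnum):
    # Higher value = higher priority, mirroring the order listed above.
    UNKNOWN = 0
    EXPECTED = 1
    EXPECTED_LOLBIN = 2
    SUSPICIOUS = 3


def resolve_verdict(in_baseline: bool, is_lolbin: bool, findings: List[str]) -> Verdict:
    """Collapse several signals into the single highest-priority verdict."""
    if findings:
        # Any concrete indicator wins, even when the file is in the baseline.
        return Verdict.SUSPICIOUS
    if in_baseline and is_lolbin:
        return Verdict.EXPECTED_LOLBIN
    if in_baseline:
        return Verdict.EXPECTED
    # No matches at all: neutral, not suspicious.
    return Verdict.UNKNOWN
```

For example, `resolve_verdict(True, True, ["hash_mismatch"])` yields `Verdict.SUSPICIOUS`, while the same file with no findings yields `Verdict.EXPECTED_LOLBIN`.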

Key Design: UNKNOWN is Neutral

UNKNOWN does NOT mean suspicious. Our baseline cannot cover all legitimate software. Only flag as SUSPICIOUS when actual indicators are present.

Detection Capabilities

Process Tree Validation (check_process_tree)

Uses three complementary detection approaches:

| Approach | Description | Example |
|---|---|---|
| Injection Detection | Processes marked never_spawns_children spawning ANY child = critical | lsass.exe, dwm.exe, audiodg.exe spawning anything |
| Suspicious Parent Blacklist | 80 known-bad parents across 12 categories | winword.exe → cmd.exe, chrome.exe → powershell.exe |
| Valid Parent Whitelist | System processes must have specific parents | svchost.exe must have services.exe as parent |

Suspicious Parent Categories (80 total):

  • Microsoft Office (10): winword, excel, powerpnt, outlook, etc.
  • Browsers (9): chrome, firefox, msedge, iexplore, brave, etc.
  • PDF Readers (6): acrord32, acrobat, foxitreader, etc.
  • Java (3): java, javaw, javaws (Log4j, deserialization)
  • Collaboration (7): teams, slack, zoom, discord, etc.
  • Media Players (4): wmplayer, vlc, mpc-hc
  • Archive Tools (5): winrar, 7zfm, winzip, peazip
  • Text Editors (3): notepad, notepad++, wordpad
  • Image Viewers (4): photos, irfanview, xnview
  • LOLBins (7): regsvr32, rundll32, mshta, certutil, etc.
  • DCOM Abuse (9): mmc, dllhost, wmiprvse, scrcons, etc. (T1021.003)
  • System Services (13): lsass, csrss, smss, spoolsv, etc. (injection targets)

Never-Spawns-Children Processes: If these spawn ANY child process, it indicates process injection:

  • lsass.exe - Credential theft target
  • dwm.exe - Desktop Window Manager
  • audiodg.exe - Audio Device Graph Isolation
  • fontdrvhost.exe - Font Driver Host
  • lsaiso.exe - Credential Guard LSA
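
A minimal sketch of how the three approaches might combine, assuming tiny illustrative rule sets rather than the rules shipped in process_expectations.yaml. The function name, the finding strings, and the `SHELL_CHILDREN` set are assumptions made for this example.

```python
from typing import List

# Taken from the never-spawns-children list above.
NEVER_SPAWNS = {"lsass.exe", "dwm.exe", "audiodg.exe", "fontdrvhost.exe", "lsaiso.exe"}
# Small subset of the 80 suspicious parents, for illustration only.
SUSPICIOUS_PARENTS = {"winword.exe", "excel.exe", "chrome.exe"}
# Assumption: the blacklist is applied when the child is a shell.
SHELL_CHILDREN = {"cmd.exe", "powershell.exe"}
# Whitelist: child name -> parents it is allowed to have.
VALID_PARENTS = {"svchost.exe": {"services.exe"}}


def check_parent_child(parent: str, child: str) -> List[str]:
    """Return the findings raised by one parent -> child relationship."""
    parent, child = parent.lower(), child.lower()
    findings = []
    # 1. Injection detection: these parents should never spawn anything.
    if parent in NEVER_SPAWNS:
        findings.append("never_spawns_children")
    # 2. Suspicious parent blacklist.
    if parent in SUSPICIOUS_PARENTS and child in SHELL_CHILDREN:
        findings.append("suspicious_parent")
    # 3. Valid parent whitelist for system processes.
    allowed = VALID_PARENTS.get(child)
    if allowed is not None and parent not in allowed:
        findings.append("unexpected_parent")
    return findings
```

An empty list means none of the three checks fired, which by itself says nothing about whether the process is benign.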

Filename Analysis (analyze_filename)

| Detection | Threshold | Severity | Example |
|---|---|---|---|
| Double Extension | .doc/.pdf/.jpg + .exe/.ps1/.bat | Critical | invoice.pdf.exe |
| High Entropy | >4.5 for names >6 chars | Medium | aX7kL9mQ.exe |
| Short Name | ≤2 characters for executables | Medium | a.exe, x.dll |
| Space Padding | 8+ consecutive spaces | High | doc.pdf        .exe |
| Trailing Spaces | 3+ spaces before extension | High | file   .exe |
| Control Characters | \x00-\x1F, \x7F | Critical | Invisible chars |
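
Several of these checks are simple string tests. The sketch below is one possible implementation and not the code in filename.py; the function name and finding labels are hypothetical, treating .dll as an executable for the short-name check is an assumption drawn from the x.dll example, and the entropy check is left out because the exact entropy calculation is not described here.

```python
import re
from typing import List

DECOY_EXTS = {".doc", ".pdf", ".jpg"}
EXEC_EXTS = {".exe", ".ps1", ".bat"}
SHORT_NAME_EXTS = EXEC_EXTS | {".dll"}  # assumption, see lead-in
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def analyze_filename(name: str) -> List[str]:
    """Return heuristic findings for a bare filename (no directory part)."""
    findings = []
    if CONTROL_CHARS.search(name):
        findings.append("control_characters")
    if " " * 8 in name:
        findings.append("space_padding")
    stem, dot, suffix = name.rpartition(".")
    if not dot:
        return findings  # no extension, remaining checks do not apply
    ext = "." + suffix.lower()
    if stem.endswith(" " * 3):
        findings.append("trailing_spaces")
    # Look for a decoy extension hiding in front of the real one.
    inner = stem.rstrip(" ")
    _, inner_dot, inner_suffix = inner.rpartition(".")
    if ext in EXEC_EXTS and inner_dot and "." + inner_suffix.lower() in DECOY_EXTS:
        findings.append("double_extension")
    if ext in SHORT_NAME_EXTS and 0 < len(stem) <= 2:
        findings.append("short_name")
    return findings
```

A name can trigger several findings at once: a decoy extension padded with eight spaces before .exe reports space padding, trailing spaces, and a double extension together.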

Unicode Evasion Detection

| Type | Characters Detected | Example |
|---|---|---|
| Bidirectional Override | RLO (U+202E), LRO (U+202D), plus 7 other bidi controls | exe.fdp displayed as pdf.exe |
| Zero-Width Characters | ZWSP (U+200B), ZWNJ (U+200C), ZWJ (U+200D), BOM (U+FEFF), Soft Hyphen (U+00AD), Word Joiner (U+2060) | sv​chost.exe (invisible char) |
| Homoglyphs | 30+ Cyrillic/Greek letters resembling Latin | svсhost.exe (Cyrillic 'с') |
| Mixed Scripts | Latin + Cyrillic/Greek/Armenian/Hebrew/Arabic | svchost.exe with mixed alphabets |
| Leet Speak | 0→o, 1→i/l, 3→e, 4→a, 5→s, 7→t, 8→b, @→a, $→s, !→i | svch0st.exe, 1sass.exe |
| Typosquatting | Levenshtein distance ≤2 from protected names | svchots.exe, scvhost.exe |
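
The character-level checks in this table can be approximated with the standard library. The sketch below is an assumption-laden illustration, not the code in unicode.py: the exact set of nine bidi controls is a guess (embeddings, overrides, and isolates), script detection relies on the first word of each character's Unicode name, and the function and finding names are hypothetical.

```python
import unicodedata
from typing import List, Set

# Assumption: LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")
# ZWSP, ZWNJ, ZWJ, BOM, soft hyphen, word joiner (from the table above).
ZERO_WIDTH = set("\u200b\u200c\u200d\ufeff\u00ad\u2060")
SCRIPTS = {"LATIN", "CYRILLIC", "GREEK", "ARMENIAN", "HEBREW", "ARABIC"}


def scripts_used(name: str) -> Set[str]:
    """Collect the scripts of all letters, based on Unicode character names."""
    found = set()
    for ch in name:
        if ch.isalpha():
            first_word = unicodedata.name(ch, "").split(" ")[0]
            if first_word in SCRIPTS:
                found.add(first_word)
    return found


def unicode_findings(name: str) -> List[str]:
    """Flag bidi controls, zero-width characters, and Latin mixed with other scripts."""
    findings = []
    if any(ch in BIDI_CONTROLS for ch in name):
        findings.append("bidi_control")
    if any(ch in ZERO_WIDTH for ch in name):
        findings.append("zero_width")
    scripts = scripts_used(name)
    if "LATIN" in scripts and len(scripts) > 1:
        findings.append("mixed_scripts")
    return findings
```

A homoglyph such as Cyrillic 'с' (U+0441) inside an otherwise Latin name shows up here as a mixed-script finding; a dedicated homoglyph table would additionally say which Latin letter it imitates.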

Protected Process Names

These process names trigger additional validation (must be in system paths): csrss.exe, dwm.exe, lsass.exe, lsaiso.exe, services.exe, smss.exe, svchost.exe, wininit.exe, winlogon.exe
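
To make the protected-name checks concrete, here is a rough sketch of location validation plus the leet-speak and typosquatting comparisons from the previous table. It is not the project's implementation: `is_misplaced` and `lookalike_of` are hypothetical names, and treating C:\Windows\System32 as the only system path is a simplifying assumption.

```python
import ntpath
from typing import Optional

PROTECTED = {
    "csrss.exe", "dwm.exe", "lsass.exe", "lsaiso.exe", "services.exe",
    "smss.exe", "svchost.exe", "wininit.exe", "winlogon.exe",
}
SYSTEM_DIRS = {r"c:\windows\system32"}  # assumption: single system path
# "1" is handled separately because it can stand for "i" or "l".
LEET = str.maketrans({"0": "o", "3": "e", "4": "a", "5": "s", "7": "t",
                      "8": "b", "@": "a", "$": "s", "!": "i"})


def is_misplaced(path: str) -> bool:
    """True when a protected process name sits outside the system path."""
    directory, filename = ntpath.split(path.lower())
    return filename in PROTECTED and directory not in SYSTEM_DIRS


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def lookalike_of(filename: str) -> Optional[str]:
    """Return the protected name that `filename` imitates, if any."""
    name = filename.lower()
    if name in PROTECTED:
        return None  # exact match is not a lookalike
    decoded = name.translate(LEET)
    for one in ("l", "i"):
        candidate = decoded.replace("1", one)
        if candidate in PROTECTED:
            return candidate
    for target in sorted(PROTECTED):
        if levenshtein(name, target) <= 2:
            return target
    return None
```

`ntpath` is used so that Windows paths are split the same way regardless of the operating system running the analysis.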

Testing

# Run all tests (5500+ tests)
pytest tests/ -v

# Run with coverage report
pytest tests/ --cov=windows_triage --cov-report=term-missing

# Run specific test categories
pytest tests/test_analysis_*.py -v          # Analysis module tests
pytest tests/test_db_*.py -v                # Database operation tests
pytest tests/test_server.py -v              # MCP server handler tests
pytest tests/test_config.py -v              # Configuration tests

# Quick check - stop on first failure
pytest tests/ -x -q

Project Structure

windows-triage-mcp/
├── pyproject.toml                 # Package configuration
├── requirements.txt               # Python dependencies
├── README.md                      # This file
│
├── src/windows_triage/           # Main package
│   ├── server.py                  # MCP server (13 tools)
│   ├── config.py                  # Centralized configuration with env var support
│   ├── exceptions.py              # Custom exception hierarchy
│   ├── db/                        # Database operations
│   │   ├── known_good.py          # File/service/task/autorun lookups (LRU cached)
│   │   ├── context.py             # LOLBin/DLL/pipe/driver lookups (LRU cached)
│   │   └── schemas.py             # SQL schema definitions
│   ├── analysis/                  # Analysis modules
│   │   ├── paths.py               # Path normalization
│   │   ├── hashes.py              # Hash detection (MD5/SHA1/SHA256)
│   │   ├── unicode.py             # Unicode evasion detection
│   │   ├── filename.py            # Filename heuristics
│   │   └── verdicts.py            # Verdict calculation logic
│   └── importers/                 # Data source importers
│       ├── lolbas.py              # LOLBAS YAML importer
│       ├── loldrivers.py          # LOLDrivers JSON importer
│       ├── hijacklibs.py          # HijackLibs CSV importer
│       └── process_expectations.py # Process tree rules
│
├── scripts/                       # Setup and import scripts
│   ├── init_databases.py          # Create database schemas
│   ├── import_all.py              # Run all importers
│   ├── import_files.py            # Import file baselines
│   ├── import_context.py          # Import LOLBins/drivers/DLLs
│   ├── import_registry_full.py    # Import full registry (optional)
│   ├── import_registry_extractions.py # Import services/tasks/autoruns
│   └── update_sources.py          # Update git data sources
│
├── data/                          # Data files (gitignored except YAML)
│   ├── process_expectations.yaml  # Process tree validation rules
│   ├── sources/                   # Cloned git repositories
│   ├── known_good.db              # File baselines (~5.6GB)
│   ├── known_good_registry.db     # Registry baselines (~12GB, optional)
│   └── context.db                 # Risk context (~2.4MB)
│
└── tests/                         # Unit tests (5500+ tests)
    ├── test_analysis_*.py         # Analysis module tests
    ├── test_db_*.py               # Database operation tests
    ├── test_server.py             # MCP server handler tests
    ├── test_config.py             # Configuration validation tests
    ├── test_importers*.py         # Data importer tests
    └── test_*.py                  # Integration and stress tests

Data Sources

Baseline Data (known_good.db)

  • VanillaWindowsReference - File paths and hashes from clean Windows installations across 200+ OS versions (Windows 10/11, Server 2016-2022)

Registry Baseline (known_good_registry.db) - Optional

  • VanillaWindowsRegistryHives - Full registry exports (NTUSER, SYSTEM, SOFTWARE) from clean Windows installations. Creates a 12GB database with 6M+ entries for deep registry forensics.

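Security Context (context.db)

  • LOLBAS - Living-off-the-land binaries and their abuse techniques
  • LOLDrivers - Known-vulnerable driver samples
  • HijackLibs - Hijackable DLLs and the executables vulnerable to them
  • YAML configs - Local rule files such as process_expectations.yaml (process tree validation rules)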

Integration with Other MCPs

This server is designed to work alongside other MCP servers for comprehensive triage:

| windows-triage-mcp (this server) | forensic-rag-mcp | opencti-mcp |
|---|---|---|
| Baseline validation | IR knowledge search | Hash reputation |
| LOLBin detection | Detection rules | IOC lookups |
| Process tree analysis | Forensic artifacts | Threat actor context |
| Vulnerable driver detection | MITRE ATT&CK | Malware family info |
| Unicode evasion detection | Tool references | CVE details |

Claude-IR provides this integration out of the box. For standalone users, you can manually configure multiple MCP servers in your mcp.json.

Recommended workflow:

  1. Use windows-triage-mcp for baseline/anomaly detection
  2. Use forensic-rag-mcp for investigation guidance
  3. Use opencti-mcp for threat intelligence lookups (requires OpenCTI instance)
  4. Combine results for comprehensive verdict

Design Decisions

  1. Offline only - No external API dependencies; use opencti-mcp for threat intel

  2. Two core local databases - known_good.db (baselines) + context.db (risk enrichment); known_good_registry.db is an optional third for full registry lookups

  3. Path and hash validation - VanillaWindowsReference provides both path coverage (2.7M files) and hash coverage (8M hashes across MD5/SHA256); check_file validates path first, then optionally verifies hash for trojanized binary detection

  4. Verdict hierarchy - SUSPICIOUS > EXPECTED_LOLBIN > EXPECTED > UNKNOWN, so a baseline match can never mask a suspicious indicator found on the same file

  5. Protected process detection - System process names (svchost.exe, lsass.exe) in non-system paths trigger SUSPICIOUS regardless of baseline status

  6. UNKNOWN is neutral - Not being in our baseline is not suspicious; many legitimate applications won't be covered
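
Design decision 3 (path first, then optional hash) can be illustrated with an in-memory stand-in for known_good.db. Everything here is hypothetical: the `BASELINE` dict, the placeholder hash, and the function name exist only for the example, and the real lookup runs against the SQLite database.

```python
from typing import Optional

# Stand-in for known_good.db: normalized path -> set of known SHA256 values.
# A path can carry several hashes because it appears in many OS versions.
PLACEHOLDER_HASH = "a" * 64  # dummy value, not a real file hash
BASELINE = {
    r"c:\windows\system32\notepad.exe": {PLACEHOLDER_HASH},
}


def check_path_and_hash(path: str, sha256: Optional[str] = None) -> str:
    """Validate the path first, then verify the hash when one is supplied."""
    normalized = path.replace("/", "\\").lower()
    known_hashes = BASELINE.get(normalized)
    if known_hashes is None:
        return "UNKNOWN"  # not in the baseline: neutral
    if sha256 is not None and sha256.lower() not in known_hashes:
        return "SUSPICIOUS"  # known path, unexpected content
    return "EXPECTED"
```

Omitting the hash still gives a useful answer for the path alone, which is why hash verification is optional rather than required.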

Configuration

All settings are loaded via the Config dataclass (config.py), which validates values in __post_init__. Environment variables use the WT_ prefix and are read by get_config(). Set WT_LOG_FORMAT=json for structured JSON logging.

| Variable | Default | Description |
|---|---|---|
| WT_DATA_DIR | ./data | Base data directory |
| WT_KNOWN_GOOD_DB | {data_dir}/known_good.db | Path to baseline database |
| WT_CONTEXT_DB | {data_dir}/context.db | Path to context database |
| WT_REGISTRY_DB | {data_dir}/known_good_registry.db | Path to optional registry baseline DB |
| WT_LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| WT_LOG_FORMAT | text | Log format: "text" or "json" |
| WT_CACHE_SIZE | 10000 | LRU cache size for lookups (0 to disable) |
| WT_SKIP_DB_VALIDATION | false | Skip DB validation on startup |
| WT_MAX_PATH_LENGTH | 4096 | Max path input length |
| WT_MAX_HASH_LENGTH | 128 | Max hash input length |

Example configuration:

export WT_LOG_LEVEL=DEBUG
export WT_CACHE_SIZE=50000
python -m windows_triage.server
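
For readers who want to see the pattern, the following is a simplified sketch of a dataclass validated in `__post_init__` and populated from WT_ variables. It is not a copy of config.py: only four settings are modelled, the field names are guesses, and the `env` parameter on `get_config` is a hypothetical addition that makes the example testable.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


@dataclass
class Config:
    data_dir: Path = Path("./data")
    log_level: str = "INFO"
    log_format: str = "text"
    cache_size: int = 10000

    def __post_init__(self) -> None:
        # Reject bad values early instead of failing later at lookup time.
        if self.log_level not in LOG_LEVELS:
            raise ValueError(f"invalid log level: {self.log_level}")
        if self.log_format not in {"text", "json"}:
            raise ValueError(f"invalid log format: {self.log_format}")
        if self.cache_size < 0:
            raise ValueError("cache size must be >= 0 (0 disables caching)")


def get_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Read WT_-prefixed variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        data_dir=Path(env.get("WT_DATA_DIR", "./data")),
        log_level=env.get("WT_LOG_LEVEL", "INFO").upper(),
        log_format=env.get("WT_LOG_FORMAT", "text"),
        cache_size=int(env.get("WT_CACHE_SIZE", "10000")),
    )
```

Validating in `__post_init__` means a misconfigured variable surfaces as a clear error when the server starts rather than as odd behaviour during a triage session.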

Security

This MCP server handles untrusted input from AI assistants and applies the following safeguards:

  • Input validation - Length limits and null byte checks on all arguments
  • Custom exceptions - Separate exception types for validation vs internal errors
  • Parameterized SQL - No string concatenation in queries
  • No shell execution - No user input passed to shell
  • Read-only databases - Production mode opens databases in read-only mode
  • LRU caching - Configurable cache size for performance optimization
  • Sanitized errors - ValidationError messages safe to return; internal errors logged only
  • Startup validation - Database connectivity verified at startup
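
Three of these safeguards (input validation, parameterized SQL, read-only databases) fit in a few lines of standard-library Python. The sketch below is illustrative only: the `files` table, its columns, and the helper names are assumptions, and this `ValidationError` is a stand-in for the project's own exception type.

```python
import sqlite3
from pathlib import Path
from typing import Optional

MAX_PATH_LENGTH = 4096  # matches the WT_MAX_PATH_LENGTH default


class ValidationError(ValueError):
    """Raised for bad caller input; the message is safe to return."""


def validate_path(value: object, max_length: int = MAX_PATH_LENGTH) -> str:
    if not isinstance(value, str) or not value:
        raise ValidationError("path must be a non-empty string")
    if len(value) > max_length:
        raise ValidationError("path exceeds maximum length")
    if "\x00" in value:
        raise ValidationError("path contains a null byte")
    return value


def open_read_only(db_path: str) -> sqlite3.Connection:
    # mode=ro makes SQLite reject every write on this connection.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def lookup_path(conn: sqlite3.Connection, path: str) -> Optional[str]:
    """Return the stored hash for a path, or None when it is not present."""
    # The value is bound as a parameter, never formatted into the SQL text.
    row = conn.execute(
        "SELECT sha256 FROM files WHERE path = ?",
        (validate_path(path).lower(),),
    ).fetchone()
    return row[0] if row else None
```

Because the input is bound as a parameter, a string such as `x' OR '1'='1` is treated as an ordinary (non-matching) path rather than as SQL.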

Acknowledgments

Architecture and direction by Steve Anson. Implementation by Claude Code (Anthropic).

License

MIT
