Version: 2.4.0
Autonomous multi-agent cyber threat intelligence system with professional analytical tradecraft, 17 MCP server integrations, and built-in anti-hallucination verification.
CTI Agent is a prompt-orchestrated, multi-agent threat intelligence system that:
- Collects threat intelligence from 17 MCP servers across 6 categories
- Enriches IOCs with multi-source data and dynamic routing
- Analyzes threats using Diamond Model and ACH frameworks
- Challenges assessments through adversarial Devil's Advocate debate
- Verifies every claim independently before reporting (anti-hallucination)
- Produces calibrated intelligence reports, STIX 2.1 bundles, and ATT&CK layers
- Improves itself through quantitative evaluation and prompt optimization
┌──────────────────────────────────────────────────────────────────────┐
│ CTI Agent v2.4.0 │
│ Multi-Agent Intelligence with Adversarial Review │
└──────────────────────────────────────────────────────────────────────┘
Collector → Analyst → [Devil's Advocate ↔ Analyst] → Verifier → Reporter
- Collector: monitor-feeds, enrich-iocs
- Analyst: diamond-model, ACH analysis
- Devil's Advocate: challenge high-confidence assessments, propose alternative hypotheses
- Verifier: re-query source APIs, detect hallucination
- Reporter: generate report, STIX 2.1, ATT&CK layers
cp config/.env.template config/.env
# Edit config/.env with your API keys
11 of 17 MCP servers require no API keys and work immediately (vulnerability intel, abuse feeds, DNS recon, and more).
Recommended Keys (free tier available):
- VT_API_KEY: VirusTotal / Google Threat Intelligence
- OTX_API_KEY: AlienVault OTX
- ABUSEIPDB_API_KEY: AbuseIPDB
- SHODAN_API_KEY: Shodan
Optional Keys (free tier):
- TI_MINDMAP_API_KEY: TI Mindmap HUB (free from ti-mindmap-hub.com)
Most servers auto-install via uvx when Claude Code loads .mcp.json. For manual installation:
uvx gti_mcp
pip install fastmcp-threatintel
See config/mcp_server_registry.json for the full 17-server registry.
In Claude Code:
Read AGENT.md and begin a threat intelligence session.
Focus on [YOUR PRIORITY - e.g., "APT activity", "ransomware trends", "CVE-2025-XXXX"]
For the full multi-agent pipeline:
Read AGENT.md and run the orchestrate-team skill for a complete intelligence cycle.
Five specialized agents with structural separation of concerns:
| Agent | Role | Skills |
|---|---|---|
| Collector | Gather and enrich raw intelligence | monitor-feeds, enrich-iocs |
| Analyst | Produce calibrated assessments | diamond-model, ACH |
| Devil's Advocate | Challenge high-confidence judgments | ACH (adversarial) |
| Verifier | Independently re-validate all claims | verify-claims |
| Reporter | Assemble final deliverables | generate-report, STIX, ATT&CK |
This pipeline addresses two critical LLM failure modes:
- Confirmation bias — The Devil's Advocate has a structural mandate to challenge all assessments rated "highly likely" or above, argue for alternative hypotheses, and document dissenting views per ICD 203
- Hallucination — The Verifier independently re-queries source APIs for every IOC, TTP, and attribution claim. Refuted claims are quarantined and never reach the final report
collection_bundle (Collector) → assessment_package (Analyst) → debate_record (Devil's Advocate ↔ Analyst) → verification_report (Verifier) → final deliverables (Reporter)
CTI_agent/
├── AGENT.md # Master orchestrator (start here)
├── agents/definitions/ # Multi-agent team definitions
│ ├── collector.md # Feed monitoring + IOC enrichment
│ ├── analyst.md # Diamond Model + ACH analysis
│ ├── devils_advocate.md # Adversarial challenge protocol
│ ├── verifier.md # Independent claim validation
│ └── reporter.md # Final product assembly
├── skills/ # 13 skill prompt files
│ ├── orchestrate-team/ # 5-agent pipeline coordinator
│ ├── monitor-feeds/ # Intelligence collection
│ ├── enrich-iocs/ # Multi-source IOC enrichment
│ ├── verify-claims/ # Claim validation + hallucination detection
│ ├── diamond-model-analysis/ # Structured intrusion analysis
│ ├── analysis-competing-hypotheses/ # Attribution hypothesis testing
│ ├── generate-report/ # ICD 203 intelligence reports
│ ├── produce-stix-bundle/ # STIX 2.1 output
│ ├── produce-attack-layers/ # ATT&CK Navigator layers
│ ├── recall-intelligence/ # Pinecone vector memory
│ ├── check-server-health/ # MCP server availability
│ ├── plan-session/ # Session planning
│ ├── self-evolving-loop/ # Meta-prompt optimization
│ └── external/ # Third-party skills
├── lib/ # Python deterministic logic (12 modules)
│ ├── team_data.py # Inter-agent data schemas
│ ├── debate.py # Devil's Advocate debate engine
│ ├── verification_pipeline.py # Claim extraction + routing
│ ├── stix_builder.py # Diamond Model → STIX 2.1
│ ├── attack_layers.py # Diamond Model → ATT&CK Navigator
│ ├── confidence_decay.py # IOC freshness half-life calculations
│ ├── actor_profiles.py # Persistent threat actor profiles
│ ├── pinecone_memory.py # Vector memory for historical context
│ ├── health_check.py # MCP server health checks
│ ├── config.py # Registry loader + routing tables
│ ├── metrics.py # Observability metrics
│ └── logging_schema.py # Structured JSONL logging
├── evaluation/ # Quality assessment graders
├── config/ # Configuration
│ ├── mcp_server_registry.json # 17 MCP servers (source of truth)
│ ├── mcp_config.json # Claude Code MCP config (generated)
│ ├── team_config.json # Multi-agent pipeline config
│ ├── skill_ownership.json # Skill conflict resolution
│ └── feeds.json # RSS fallback feeds
├── tests/ # 182 tests
├── demo/ # Demo dataset + mock MCP responses
├── docs/ # Architecture + development docs
├── state/ # Runtime state files
└── reports/ # Generated intelligence products
17 servers across 6 categories with dynamic routing and graceful degradation:
| Category | Servers | Purpose |
|---|---|---|
| Intelligence | OTX, TI Mindmap HUB, Mallory | Threat feed collection |
| Enrichment | GTI/VirusTotal, Shodan, fastmcp-threatintel, mcp-threatintel | IOC enrichment |
| Vulnerability | KEV, Vulnerability Intelligence, Nuclei | CVE intelligence |
| Malware Analysis | Ghidra, YARA, Capa, Radare2, Binwalk | Binary analysis |
| OSINT | DNSTwist | DNS recon |
| Utility | CyberChef | Data transformation |
IOC routing is configured per type (IP, domain, hash, URL, CVE) with primary/secondary/fallback chains. The check-server-health skill adjusts routing at session start based on available API keys and server status.
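The fallback behaviour described above can be sketched in a few lines. This is a minimal illustration, not the project's `lib/config.py`: the `ROUTING` table, the server identifiers, and `resolve_route` are hypothetical names, and the real chains are defined in config/mcp_server_registry.json.

```python
from typing import Dict, List, Optional, Set

# Hypothetical routing table: IOC type -> ordered chain (primary, secondary, fallback).
# Server identifiers are illustrative only.
ROUTING: Dict[str, List[str]] = {
    "ip": ["gti", "shodan", "fastmcp-threatintel"],
    "hash": ["gti", "otx", "mcp-threatintel"],
    "cve": ["kev", "vulnerability-intelligence", "nuclei"],
}


def resolve_route(ioc_type: str, healthy: Set[str]) -> Optional[str]:
    """Return the first server in the chain that is currently healthy, else None."""
    for server in ROUTING.get(ioc_type, []):
        if server in healthy:
            return server
    return None  # graceful degradation: the caller can record an intelligence gap


# Assumed health-check result, e.g. when no VT_API_KEY is configured.
healthy_servers = {"shodan", "otx", "kev"}
print(resolve_route("ip", healthy_servers))    # shodan
print(resolve_route("hash", healthy_servers))  # otx
print(resolve_route("url", healthy_servers))   # None
```

Keeping the chain as an ordered list means a health check only needs to supply the set of reachable servers; the routing table itself does not change between sessions.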
The Diamond Model structures intrusion analysis into four vertices:
- Adversary — Who conducted the attack
- Infrastructure — Systems used (C2, delivery)
- Capability — Tools and techniques (mapped to ATT&CK)
- Victim — Target of the attack
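As a rough illustration, the four vertices map naturally onto a small record type. The `DiamondEvent` class and its `missing_vertices` helper are hypothetical and are not taken from the project's `lib/` modules; the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiamondEvent:
    """One intrusion event described by the four Diamond Model vertices."""

    adversary: str = ""                                       # who conducted the attack
    victim: str = ""                                          # target of the attack
    infrastructure: List[str] = field(default_factory=list)  # systems used (C2, delivery)
    capability: List[str] = field(default_factory=list)      # ATT&CK technique IDs

    def missing_vertices(self) -> List[str]:
        """Return the vertices with no data; these are candidate intelligence gaps."""
        vertices = {
            "adversary": self.adversary,
            "infrastructure": self.infrastructure,
            "capability": self.capability,
            "victim": self.victim,
        }
        return [name for name, value in vertices.items() if not value]


event = DiamondEvent(
    victim="example-corp",
    infrastructure=["203.0.113.10"],  # documentation-range IP, not a real IOC
    capability=["T1566.001"],         # Spearphishing Attachment
)
print(event.missing_vertices())  # ['adversary']
```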
Analysis of Competing Hypotheses (ACH) is a seven-step process for rigorous attribution:
1. Generate all plausible hypotheses (including deception and null hypotheses)
2. List all evidence from the Diamond Model
3. Create a diagnosticity matrix
4. Refine hypotheses
5. Assess diagnostic evidence (focus on refuting, not confirming)
6. Calculate likelihood
7. Report with calibrated confidence
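The matrix and refutation steps can be sketched as follows. This is a simplified illustration of the general ACH technique, assuming a three-value rating scale (`C` consistent, `I` inconsistent, `N` neutral); the evidence and hypothesis labels are placeholders, and the ranking is by inconsistency count rather than a calculated probability.

```python
from typing import Dict, List

Matrix = Dict[str, Dict[str, str]]  # evidence -> hypothesis -> rating

# Placeholder matrix: H1 = named actor, H2 = false flag, H3 = null hypothesis.
matrix: Matrix = {
    "E1": {"H1": "C", "H2": "C", "H3": "C"},
    "E2": {"H1": "C", "H2": "I", "H3": "I"},
    "E3": {"H1": "C", "H2": "C", "H3": "I"},
}


def diagnostic_evidence(m: Matrix) -> List[str]:
    """Evidence is diagnostic only if it rates the hypotheses differently."""
    return [e for e, row in m.items() if len(set(row.values())) > 1]


def inconsistency_scores(m: Matrix) -> Dict[str, int]:
    """Count inconsistent ratings per hypothesis; fewer means harder to refute."""
    scores: Dict[str, int] = {}
    for row in m.values():
        for hypothesis, rating in row.items():
            scores[hypothesis] = scores.get(hypothesis, 0) + (1 if rating == "I" else 0)
    return scores


scores = inconsistency_scores(matrix)
ranked = sorted(scores, key=lambda h: scores[h])

print(diagnostic_evidence(matrix))  # ['E2', 'E3']
print(scores)                       # {'H1': 0, 'H2': 1, 'H3': 2}
print(ranked)                       # ['H1', 'H2', 'H3']
```

E1 is consistent with every hypothesis, so it carries no diagnostic weight even though it may look like strong supporting evidence for H1.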
| Term | Probability |
|---|---|
| Almost certain | >95% |
| Highly likely | 80-95% |
| Likely | 60-80% |
| Roughly even chance | 40-60% |
| Unlikely | 20-40% |
| Highly unlikely | 5-20% |
| Remote possibility | <5% |
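A small helper can translate a numeric probability into the terms in the table. The function names are hypothetical, and because adjacent bands in the table share a boundary value (for example 80%), this sketch assumes a shared boundary belongs to the higher band, while "Almost certain" and "Remote possibility" stay strictly above 95% and below 5%.

```python
def estimative_term(p: float) -> str:
    """Map a probability in [0, 1] to the estimative term from the table."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p > 0.95:
        return "Almost certain"
    # Lower bounds are inclusive (assumption: boundary values go to the higher band).
    bands = [
        (0.80, "Highly likely"),
        (0.60, "Likely"),
        (0.40, "Roughly even chance"),
        (0.20, "Unlikely"),
        (0.05, "Highly unlikely"),
    ]
    for lower, term in bands:
        if p >= lower:
            return term
    return "Remote possibility"


def triggers_devils_advocate(p: float) -> bool:
    """Assessments rated 'Highly likely' or above are sent to the Devil's Advocate."""
    return p >= 0.80


print(estimative_term(0.85), triggers_devils_advocate(0.85))  # Highly likely True
print(estimative_term(0.50), triggers_devils_advocate(0.50))  # Roughly even chance False
```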
All claims pass through a 5-tier verification system before reaching the final report:
| Status | Confidence Multiplier | Handling |
|---|---|---|
| VERIFIED_HIGH | 1.0x | Include as stated |
| VERIFIED_MEDIUM | 0.75x | Include with caveat |
| VERIFIED_LOW | 0.5x | Include with strong caveat |
| UNVERIFIED | 0.25x | Prefix with [UNVERIFIED] |
| REFUTED | 0.0x | Suppress entirely |
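One way to read the table is as a lookup applied to each claim before reporting. The sketch below assumes the multiplier scales a claim's numeric confidence score; `apply_verification` is a hypothetical helper, not the interface of `lib/verification_pipeline.py`.

```python
from typing import Optional, Tuple

MULTIPLIERS = {
    "VERIFIED_HIGH": 1.0,
    "VERIFIED_MEDIUM": 0.75,
    "VERIFIED_LOW": 0.5,
    "UNVERIFIED": 0.25,
    "REFUTED": 0.0,
}


def apply_verification(text: str, confidence: float, status: str) -> Optional[Tuple[str, float]]:
    """Return (report text, adjusted confidence), or None if the claim is suppressed."""
    if status == "REFUTED":
        return None  # refuted claims never reach the final report
    adjusted = round(confidence * MULTIPLIERS[status], 4)
    if status == "UNVERIFIED":
        text = "[UNVERIFIED] " + text
    return text, adjusted


print(apply_verification("C2 at 203.0.113.10", 0.8, "VERIFIED_MEDIUM"))
print(apply_verification("C2 at 203.0.113.10", 0.8, "UNVERIFIED"))
print(apply_verification("C2 at 203.0.113.10", 0.8, "REFUTED"))  # None
```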
Hallucination detection flags:
- Claims with no source verification possible
- Refutation rate >=50% across all claims
- Attribution claims supported by only 1 source
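The three flags can be expressed as simple checks over a list of claims. The `Claim` fields and `hallucination_flags` function below are hypothetical; in particular, treating "no source verification possible" as `source_count == 0` is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    text: str
    kind: str          # "ioc", "ttp", or "attribution"
    status: str        # one of the five verification statuses
    source_count: int  # independent sources available to check the claim


def hallucination_flags(claims: List[Claim]) -> List[str]:
    """Return human-readable flags for the three conditions listed above."""
    flags: List[str] = []
    for c in claims:
        if c.source_count == 0:
            flags.append(f"no verifiable source: {c.text}")
        if c.kind == "attribution" and c.source_count == 1:
            flags.append(f"single-source attribution: {c.text}")
    if claims:
        refuted = sum(1 for c in claims if c.status == "REFUTED")
        if refuted / len(claims) >= 0.5:
            flags.append("refutation rate >= 50%")
    return flags


claims = [
    Claim("hash seen in campaign", "ioc", "REFUTED", 2),
    Claim("activity attributed to Actor A", "attribution", "VERIFIED_LOW", 1),
]
for flag in hallucination_flags(claims):
    print(flag)
```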
| Product | Format | Location |
|---|---|---|
| Intelligence Reports | Markdown (ICD 203) | reports/{guid}.md |
| STIX 2.1 Bundles | JSON | reports/{guid}_stix_bundle.json |
| ATT&CK Navigator Layers | JSON (v4.5) | reports/{guid}_attack_layer.json |
| Detection Rules | Sigma / YARA | reports/{guid}_detections/ |
| IOC Packages | STIX 2.1 JSON | reports/{guid}_iocs.json |
Reports include: executive summary, key judgments with calibrated confidence, Diamond Model summary, ATT&CK mapping, ACH summary, Alternative Analysis (from Devil's Advocate debate), verified IOC tables, defensive recommendations, assumptions, intelligence gaps, and reassessment triggers.
Four graders assess output quality:
| Grader | Measures | Threshold |
|---|---|---|
| ttp_coverage | TTP extraction completeness | 0.80 |
| ioc_fidelity | IOC accuracy (0 if hallucinated) | 0.90 |
| framework_compliance | Required report sections present | 0.85 |
| analytical_quality | Reasoning quality (LLM judge) | 0.70 |
The self-evolving loop evaluates outputs, identifies underperforming skills, and triggers meta-prompt optimization with automatic rollback on regression.
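The threshold check and rollback decision might look like the sketch below. The function names are hypothetical, and "regression" is assumed here to mean that any grader score dropped after optimization; the project's actual criterion may differ.

```python
from typing import Dict, List

THRESHOLDS: Dict[str, float] = {
    "ttp_coverage": 0.80,
    "ioc_fidelity": 0.90,
    "framework_compliance": 0.85,
    "analytical_quality": 0.70,
}


def underperforming(scores: Dict[str, float]) -> List[str]:
    """Graders whose score is below threshold (a missing score counts as 0.0)."""
    return sorted(name for name, t in THRESHOLDS.items() if scores.get(name, 0.0) < t)


def should_rollback(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """Assumed regression rule: roll back if any grader score decreased."""
    return any(after.get(n, 0.0) < before.get(n, 0.0) for n in THRESHOLDS)


before = {"ttp_coverage": 0.75, "ioc_fidelity": 0.95,
          "framework_compliance": 0.90, "analytical_quality": 0.72}
after = {"ttp_coverage": 0.85, "ioc_fidelity": 0.88,
         "framework_compliance": 0.90, "analytical_quality": 0.72}

print(underperforming(before))         # ['ttp_coverage']
print(should_rollback(before, after))  # True
```

In this example the optimization fixed `ttp_coverage` but lowered `ioc_fidelity`, so the assumed rule would revert the prompt change.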
# Run tests (182 passing)
uv run --with pytest python -m pytest -q
# Lint + format
ruff check . --fix && ruff format .
See docs/DEVELOPMENT.md for the full development guide and docs/ARCHITECTURE.md for system architecture.
| Source | Skills | Purpose |
|---|---|---|
| gl0bal01/malware-analysis | malware-triage, dynamic-analysis, detection-engineer | Malware analysis pipeline |
| YARAHQ/yara-rule-skill | yara-rule-skill | YARA detection rule authoring |
| trailofbits/skills | variant-analysis, semgrep, static-analysis | Security analysis |
| Version | Focus |
|---|---|
| v2.4.0 | Multi-agent team (5 agents), debate engine, verification pipeline |
| v2.3.0 | Confidence decay, actor profiles, Pinecone memory, metrics |
| v2.2.0 | STIX 2.1 builder, ATT&CK layers, external skills, demo dataset |
| v2.1.0 | MCP server integrations, dynamic routing, health checks, evaluation graders |
- Core architecture + skills
- 17 MCP server integrations with dynamic routing
- STIX 2.1 + ATT&CK Navigator output
- External skill integration (malware analysis, YARA, Trail of Bits)
- Confidence decay + threat actor profiles
- Pinecone vector memory for historical context
- Multi-agent team with adversarial review
- Independent verification pipeline + hallucination detection
- SecOps SIEM integration
- SOAR playbook generation
- Web dashboard
- Historical trend analysis
- Diamond Model of Intrusion Analysis
- Psychology of Intelligence Analysis (Heuer)
- MITRE ATT&CK
- ICD 203 Analytic Standards
MIT License
CTI Agent v2.4.0 — Multi-agent intelligence team with adversarial review and independent verification.