Defense-Only Blue-Team Vulnerability Analysis Pipeline
The CVE Matter-Analysis OS is a pipeline for analyzing vulnerabilities from the National Vulnerability Database (NVD). It uses machine learning techniques, including positional alignment, stacked arbitration, epsilon-refractors for distributional shift detection, and Bayesian evidence calculation, to prioritize and analyze security vulnerabilities.
Mission: Defense-only CVE analysis supporting blue-team security operations.
The pipeline consists of five main stages:
- NVD Ingest → Fetch and normalize CVE data from NVD API v2.0
- Positional Alignment → Align vulnerability embeddings across different spaces
- Stacked Arbiter → Ensemble learning for severity prediction with Pareto optimization
- Epsilon-Refractors → Detect distributional shifts in vulnerability patterns
- Bayesian Evidence → Calculate evidence scores using BIC/WAIC for prioritization
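The Bayesian Evidence stage above is described as scoring with BIC/WAIC. As a rough illustration of the BIC half only, the sketch below computes BIC = k ln(n) - 2 ln(L) for candidate models and ranks them, lowest first. The function names and the example numbers are hypothetical and are not taken from the repository's implementation.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def rank_by_bic(
    models: dict[str, tuple[float, int]], n_obs: int
) -> list[tuple[str, float]]:
    """Rank models given {name: (max log-likelihood, parameter count)}."""
    scored = [(name, bic(ll, k, n_obs)) for name, (ll, k) in models.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical candidates fitted to the same 200 observations.
candidates = {"simple": (-120.0, 3), "complex": (-118.5, 8)}
ranking = rank_by_bic(candidates, n_obs=200)
best_name, best_score = ranking[0]
print(best_name, round(best_score, 2))
```

Here the extra five parameters of the "complex" model buy only a small likelihood gain, so the BIC penalty makes the "simple" model rank first. WAIC needs posterior samples and is not covered by this sketch.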
- Language: Python 3.11+
- ML/Data: NumPy, SciPy, scikit-learn
- Optional: CUDA for GPU acceleration
- Container: Docker with gVisor isolation
- Orchestration: Kubernetes (GKE), Argo Workflows
- Infrastructure: Terraform
- CI/CD: GitHub Actions
- Python 3.11 or higher
- Docker (optional, for containerized deployment)
- kubectl and access to GKE cluster (for production deployment)
- NVD API key (optional but recommended for higher rate limits)
# Clone repository
git clone https://github.com/igor-holt/Instinct.git
cd Instinct
# Create virtual environment
python3.11 -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
# Install dependencies
pip install -r requirements.txt
# Run pipeline scaffold
python -m src.main
# Quick CLI demo (ingest + analyze)
python -m src.cli ingest samples/cve_demo.json
python -m src.cli analyze --mode arbiter-demo
# A2A ingestion layer demo (exports CelestialBody JSONL)
python -m src.cli ingest-a2a . ./aios-layer --output artifacts/celestial_bodies.jsonl
The repository includes a Flask web application for viewing files generated by the A.U.R.O.R.A. orchestrator:
# Run the orchestrator with web interface
python flask_app.py
Then open your browser to http://localhost:5000 to view:
- Project dashboard with real-time statistics
- Generated frontend/backend files
- Design system specifications
- Message bus activity
- RESTful API endpoints
See Flask App Guide for detailed documentation.
# Build GPU-first image (override CUDA version via --build-arg CUDA_IMAGE_TAG=...)
docker build -t cve-matter:cuda .
# Run container (requires NVIDIA Container Toolkit and a GPU-enabled host)
docker run -it --rm \
  --gpus all \
  -e NVD_API_KEY="$NVD_API_KEY" \
  -v "$(pwd)/samples:/app/samples:ro" \
  cve-matter:cuda \
  python -m src.cli analyze --mode arbiter-demo
# A2A ingestion layer demo (exports CelestialBody JSONL)
python -m src.cli ingest-a2a . ./aios-layer --output artifacts/celestial_bodies.jsonl
Environment variables:
- `NVD_API_KEY` - API key for NVD access (optional but recommended)
- `ALIGNMENT_R2_THRESHOLD` - Minimum R² for alignment quality (default: 0.8)
- `EPSILON_THRESHOLD` - Threshold for refractor alerts (default: 0.05)
- `EVIDENCE_METHOD` - Bayesian evidence method: BIC or WAIC (default: WAIC)
- `LOG_LEVEL` - Logging verbosity: DEBUG, INFO, WARN, ERROR (default: INFO)
- `CUDA_VISIBLE_DEVICES` - GPU device selection for CUDA operations (optional)
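One way these settings could be read at startup is sketched below. The variable names and defaults come from the list above; the `PipelineConfig` class and `load_config` function are hypothetical names used for illustration, and the repository may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_EVIDENCE_METHODS = {"BIC", "WAIC"}
_LOG_LEVELS = {"DEBUG", "INFO", "WARN", "ERROR"}


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical container for the documented environment variables."""
    nvd_api_key: Optional[str]
    alignment_r2_threshold: float
    epsilon_threshold: float
    evidence_method: str
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> PipelineConfig:
    """Read settings from a mapping, applying the documented defaults."""
    method = env.get("EVIDENCE_METHOD", "WAIC").upper()
    if method not in _EVIDENCE_METHODS:
        raise ValueError(f"EVIDENCE_METHOD must be BIC or WAIC, got {method!r}")
    level = env.get("LOG_LEVEL", "INFO").upper()
    if level not in _LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {level!r}")
    return PipelineConfig(
        nvd_api_key=env.get("NVD_API_KEY") or None,  # treat empty as unset
        alignment_r2_threshold=float(env.get("ALIGNMENT_R2_THRESHOLD", "0.8")),
        epsilon_threshold=float(env.get("EPSILON_THRESHOLD", "0.05")),
        evidence_method=method,
        log_level=level,
    )


config = load_config({"EVIDENCE_METHOD": "bic", "EPSILON_THRESHOLD": "0.1"})
print(config)
```

Passing the environment as a mapping keeps the loader easy to test without mutating `os.environ`, and validating early means a typo in `EVIDENCE_METHOD` fails at startup instead of partway through a run.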
- System Description: prompts/legendary_lidlift_v14.md
- Capsule Configurations: capsules/
- Security Policy: SECURITY.md
- Code Ownership: CODEOWNERS
This repository is configured for GitHub Copilot agents to assist with development tasks.
- Read the Agent Guide: Start with `.copilot/AGENT_GUIDE.md` for comprehensive instructions
- Review Task Definitions: Tasks are defined in `.copilot/tasks/`, numbered 010-090
- Follow the Workflow: Execute tasks sequentially, one PR per task
- Reference Documentation: Use system documentation in `prompts/` and `capsules/`
- Task 010: NVD data ingestion with delta sync and ETag support
- Task 020: Positional alignment using Procrustes and CCA
- Task 030: Stacked arbiter with Pareto knee detection
- Task 040: Epsilon-refractors for distributional shift detection
- Task 050: Bayesian evidence calculation (BIC/WAIC)
- Task 060: Notion synchronization for documentation
- Task 070: Automated capsule publishing on release
- Task 080: Optional CUDA/GPU acceleration support
- Task 090: Webhook receiver and Argo Workflows integration
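Task 030 mentions Pareto knee detection. As a minimal sketch of that idea, assuming two objectives that are both minimized (for example cost and error), the code below extracts the non-dominated front and picks the point that lies furthest below the chord joining the front's endpoints. The function names, the chord-based knee definition, and the sample points are assumptions for illustration; the task definition may specify a different criterion.

```python
import math

Point = tuple[float, float]


def pareto_front(points: list[Point]) -> list[Point]:
    """Non-dominated points (both objectives minimized), sorted by the first."""
    front: list[Point] = []
    best_second = math.inf
    for p in sorted(set(points)):
        # After sorting by the first objective, a point is non-dominated
        # only if it strictly improves the best second objective seen so far.
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def knee_point(front: list[Point]) -> Point:
    """Front point with the largest normalized gain over the endpoint chord."""
    if not front:
        raise ValueError("front must not be empty")
    if len(front) < 3:
        return front[0]
    (x0, y0), (x1, y1) = front[0], front[-1]
    dx, dy = x1 - x0, y1 - y0  # dx > 0 and dy < 0 on a valid front

    def gain(p: Point) -> float:
        # Rescale both axes to [0, 1]; the chord becomes the diagonal.
        return (p[1] - y0) / dy - (p[0] - x0) / dx

    return max(front, key=gain)


# Hypothetical (cost, error) pairs for candidate arbiter configurations.
candidates = [(1.0, 10.0), (2.0, 4.0), (3.0, 3.5), (6.0, 3.0), (4.0, 8.0), (2.0, 9.0)]
front = pareto_front(candidates)
print(front, knee_point(front))
```

Normalizing both axes before measuring the gain keeps the result from depending on the units of either objective.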
- Defense-Only: All code must support defensive security purposes exclusively
- File-Anchored: Each task specifies exact files to create or modify
- One PR Per Task: Create focused, reviewable pull requests
- Security-First: Never commit secrets; scan dependencies; validate inputs
- Sequential Execution: Complete tasks in order (010 → 020 → ... → 090)
When working on a task:
## Task: [Number] - [Name]
**Rationale**: Brief explanation of changes
**Task Reference**: `.copilot/tasks/XXX_task_name.md`
### Changes Made
- Created/Modified: [file list]
- Tests added: [test files]
### Validation
- [x] Linters pass
- [x] Tests pass
- [x] Security scan clean
- [x] Acceptance criteria met
**Test Output**: [Include evidence]
# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=html
# Run specific module tests
pytest tests/ingest/ -v
# Format code
black src/ tests/
# Check linting
flake8 src/ tests/
# Type checking
mypy src/
This is a defense-only system. See SECURITY.md for:
- Security policy and guardrails
- Vulnerability reporting process
- Secure development practices
- Incident response procedures
- ❌ Exploit generation
- ❌ Offensive security operations
- ❌ Cryptographic breaking
- ❌ Malware development
- ✅ Vulnerability analysis
- ✅ Risk assessment and prioritization
- ✅ Threat detection and monitoring
- ✅ Defense planning and operations
- Review the Copilot Agent Guide
- Select a task from `.copilot/tasks/`
- Implement changes following the task definition
- Create a focused PR with validation evidence
- Address code review feedback
[License TBD - Add appropriate license]
- Issues: Use GitHub Issues for bug reports and feature requests
- Security: Report vulnerabilities privately via SECURITY.md
- Documentation: See `prompts/` and `.copilot/` directories
This system integrates concepts from:
- NVD/NIST for vulnerability data
- H-MOC (Hierarchical Multi-Objective Coordinator) for orchestration
- LID-LIFT (Layered Intelligence Defense) framework
- Academic research in Bayesian model comparison and distributional shift detection
Version: 1.0.0
Last Updated: 2025-11-13
Status: Active Development