PhalanxAI

PhalanxAI is an AI-powered Intrusion Detection System (IDS) that uses machine learning, deep learning, and explainable AI to detect and classify network security threats in real time. It maps each detection to the MITRE ATT&CK framework and explains which traffic features triggered it.

🌟 Key Features

🤖 Multi-Model Ensemble Detection

Random Forest Classifier: Supervised learning for attack classification
Isolation Forest: Unsupervised anomaly detection
Autoencoder Neural Network: Deep learning-based anomaly detection
Ensemble Predictions: Combines multiple models for higher accuracy
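
One way the three detectors could be combined is a weighted vote. The sketch below is illustrative only: the function name `ensemble_decision`, the weights, and the 0.5 cut-off are assumptions, not the logic in models/model_manager.py.

```python
from typing import Dict

# Hypothetical weights; the real ensemble may weight its models differently.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "random_forest": 0.5,
    "isolation_forest": 0.25,
    "autoencoder": 0.25,
}


def ensemble_decision(votes: Dict[str, bool],
                      weights: Dict[str, float] = DEFAULT_WEIGHTS,
                      cutoff: float = 0.5) -> bool:
    """Return True when the weighted share of 'malicious' votes reaches cutoff."""
    total = sum(weights[name] for name in votes)
    if total == 0:
        return False
    malicious = sum(weights[name] for name, flagged in votes.items() if flagged)
    return malicious / total >= cutoff


votes = {"random_forest": True, "isolation_forest": False, "autoencoder": False}
print(ensemble_decision(votes))
```

With these assumed weights the supervised classifier alone can raise a detection, while a single unsupervised detector cannot; different weights would shift that balance.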

🔍 Explainable AI

SHAP (SHapley Additive exPlanations): Global and local feature importance
LIME (Local Interpretable Model-agnostic Explanations): Instance-level explanations
Top Feature Analysis: Identifies which network features contributed to detection
Human-readable alerts: Natural language explanations for security analysts
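
As a rough illustration of how per-feature contributions (such as SHAP values for one flow) might be turned into analyst-facing text, consider the sketch below. The helper names, the feature name `syn_flag_count`, and the contribution values are made up for the example and are not taken from explainability/alert_generator.py.

```python
from typing import Dict, List, Tuple


def top_features(contributions: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


def describe_alert(attack_type: str, contributions: Dict[str, float], k: int = 3) -> str:
    """Build a one-line explanation from the strongest feature contributions."""
    parts = []
    for name, value in top_features(contributions, k):
        direction = "raised" if value > 0 else "lowered"
        parts.append(f"{name} {direction} the score by {abs(value):.2f}")
    return f"{attack_type} suspected: " + "; ".join(parts)


# Made-up contribution values for a single flow.
example = {"flow_duration": 0.31, "total_fwd_packets": -0.05, "syn_flag_count": 0.42}
print(describe_alert("DoS Hulk", example, k=2))
```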

🎯 MITRE ATT&CK Integration

Maps detected attacks to MITRE ATT&CK techniques and tactics
Provides threat context and recommended mitigations
Tracks attacker techniques across the cyber kill chain
Comprehensive threat intelligence database
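
The mapping idea can be sketched as a lookup table from detected attack labels to ATT&CK techniques. The technique IDs below are real ATT&CK entries, but the choice of which label maps to which technique is an illustrative assumption and may differ from what mitre/mapper.py does.

```python
from typing import Dict, NamedTuple, Optional


class Technique(NamedTuple):
    technique_id: str
    name: str
    tactic: str


# Illustrative subset only; the project's own mapping may differ.
ATTACK_TO_TECHNIQUE: Dict[str, Technique] = {
    "SSH-Patator": Technique("T1110", "Brute Force", "Credential Access"),
    "FTP-Patator": Technique("T1110", "Brute Force", "Credential Access"),
    "Port Scanning": Technique("T1046", "Network Service Discovery", "Discovery"),
    "DDoS": Technique("T1498", "Network Denial of Service", "Impact"),
    "SQL Injection": Technique("T1190", "Exploit Public-Facing Application", "Initial Access"),
}


def map_attack(label: str) -> Optional[Technique]:
    """Return the mapped technique, or None when the label is not in the table."""
    return ATTACK_TO_TECHNIQUE.get(label)


print(map_attack("SSH-Patator"))
```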

📊 Real-time Dashboard

Live threat monitoring and visualization
Attack distribution analytics
Severity-based alert prioritization (Critical, High, Medium, Low)
Time-series analysis of attack patterns
Model performance metrics
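
Severity-based prioritization amounts to ordering alerts by severity level first and by confidence second. The following is a minimal sketch under the assumption that each alert is a dict with `severity` and `confidence` keys, as in the prediction response shown under Usage; `prioritize` is a hypothetical helper, not a dashboard function.

```python
from typing import Dict, List

# Lower rank sorts first; levels follow the four severities listed above.
SEVERITY_RANK: Dict[str, int] = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def prioritize(alerts: List[dict]) -> List[dict]:
    """Sort alerts by severity, then by descending confidence."""
    unknown = len(SEVERITY_RANK)  # Unrecognized severities sort last
    return sorted(
        alerts,
        key=lambda a: (SEVERITY_RANK.get(a["severity"], unknown), -a["confidence"]),
    )


alerts = [
    {"severity": "Low", "confidence": 0.90},
    {"severity": "Critical", "confidence": 0.80},
    {"severity": "Critical", "confidence": 0.95},
]
for alert in prioritize(alerts):
    print(alert["severity"], alert["confidence"])
```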

🔧 RESTful API

FastAPI-powered backend for high performance
Comprehensive API documentation (Swagger/OpenAPI)
Real-time network flow analysis
Model training and management endpoints
Alert management and querying

🏗️ Architecture

PhalanxAI/
├── main.py                 # FastAPI application entry point
├── config.py               # Configuration and settings
├── database.py             # SQLAlchemy models and database setup
├── requirements.txt        # Python dependencies
│
├── api/                    # REST API endpoints
│   ├── routes.py          # API route handlers
│   └── schemas.py         # Pydantic data models
│
├── models/                 # Machine Learning models
│   ├── model_manager.py   # Model orchestration and ensemble
│   ├── random_forest.py   # Random Forest classifier
│   ├── isolation_forest.py # Isolation Forest anomaly detector
│   └── autoencoder.py     # Deep learning autoencoder
│
├── data/                   # Data processing
│   ├── loaders.py         # Dataset loading utilities
│   ├── preprocessor.py    # Feature preprocessing and scaling
│   └── feature_extractor.py # Network feature extraction
│
├── explainability/         # Explainable AI components
│   ├── shap_explainer.py  # SHAP-based explanations
│   ├── lime_explainer.py  # LIME-based explanations
│   └── alert_generator.py # Generate human-readable alerts
│
├── mitre/                  # MITRE ATT&CK framework
│   ├── attack_db.py       # MITRE technique database
│   └── mapper.py          # Map attacks to MITRE techniques
│
├── static/                 # Web dashboard
│   ├── index.html         # Dashboard interface
│   ├── css/
│   └── js/
│
├── trained_models/         # Saved model files
└── sample_data/           # Sample network traffic data

🚀 Getting Started

Prerequisites

Python 3.8 or higher
PostgreSQL 12+ (optional, for persistent storage)
4GB+ RAM recommended
Linux, macOS, or Windows

Installation

Clone the repository

git clone https://github.com/Gentwocoder/PhalanxAI.git
cd PhalanxAI

Create a virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies

pip install -r requirements.txt

Configure environment variables

cp .env.example .env
# Edit .env with your database credentials and settings

Set up the database (optional)

# Create PostgreSQL database
createdb ai_ids

# Or use SQLite by updating DATABASE_URL in .env:
# DATABASE_URL=sqlite+aiosqlite:///./ai_ids.db
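
Since PostgreSQL is optional, an application could fall back to SQLite when DATABASE_URL is unset. The sketch below shows that pattern; `resolve_database_url` is a hypothetical helper and the fallback behaviour is an assumption, so check config.py for what PhalanxAI actually does.

```python
import os
from typing import Mapping

# SQLite URL taken from the setup notes above.
SQLITE_FALLBACK = "sqlite+aiosqlite:///./ai_ids.db"


def resolve_database_url(env: Mapping[str, str] = os.environ) -> str:
    """Return DATABASE_URL from the environment, or the SQLite fallback if empty."""
    url = env.get("DATABASE_URL", "").strip()
    return url or SQLITE_FALLBACK


print(resolve_database_url({}))
```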

Quick Start

Start the application

uvicorn main:app --reload --host 0.0.0.0 --port 8000

Access the dashboard

- Open browser: http://localhost:8000
- API documentation: http://localhost:8000/docs

Train models (first-time setup)

# Using the API
curl -X POST "http://localhost:8000/api/train" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_path": "sample_data/network_traffic.csv",
    "sample_size": 10000
  }'

📖 Usage

Using the Dashboard

Monitor Threats: View real-time alerts and statistics on the main dashboard
Analyze Alerts: Click on any alert to see detailed explanations and MITRE mappings
Explore MITRE Matrix: Navigate the MITRE ATT&CK matrix to understand attack techniques
Model Management: Train new models or view model performance metrics

Using the API

Health Check

curl http://localhost:8000/api/health

Analyze Network Flow

curl -X POST "http://localhost:8000/api/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "src_ip": "192.168.1.100",
    "dst_ip": "10.0.0.50",
    "src_port": 54321,
    "dst_port": 80,
    "protocol": "TCP",
    "flow_duration": 12000,
    "total_fwd_packets": 45,
    "total_backward_packets": 38,
    ...
  }'

Get Recent Alerts

# Quote the URL so the shell does not treat & as a background operator
curl "http://localhost:8000/api/alerts?limit=10&severity=Critical"
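
When the alerts query is built from a script, encoding the parameters with the standard library avoids shell-quoting problems with `&`. This is a small sketch; `alerts_url` is a hypothetical helper, and only the `limit` and `severity` parameters shown above are assumed to exist.

```python
from urllib.parse import urlencode

API_URL = "http://localhost:8000/api"


def alerts_url(limit: int = 10, severity: str = "") -> str:
    """Build the alerts query URL, omitting severity when it is empty."""
    params = {"limit": str(limit)}
    if severity:
        params["severity"] = severity
    return f"{API_URL}/alerts?{urlencode(params)}"


print(alerts_url(limit=10, severity="Critical"))
```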

Train Models

curl -X POST "http://localhost:8000/api/train" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_path": "path/to/cicids2017.csv",
    "sample_size": 50000
  }'

Python SDK Example

import requests

API_URL = "http://localhost:8000/api"

# Analyze a network flow
flow_data = {
    "src_ip": "192.168.1.100",
    "dst_ip": "203.0.113.50",
    "dst_port": 443,
    "flow_duration": 5000,
    "total_fwd_packets": 25,
    # ... other features
}

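# Send the flow for analysis; fail early on HTTP errors or a hung connection
response = requests.post(f"{API_URL}/predict", json=flow_data, timeout=10)
response.raise_for_status()
result = response.json()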

if result["is_malicious"]:
    print(f"⚠️  ALERT: {result['attack_type']}")
    print(f"Confidence: {result['confidence']:.2%}")
    print(f"Severity: {result['severity']}")
    print(f"MITRE: {result['mitre_technique_name']}")
    print(f"Explanation: {result['explanation']}")

🎓 Supported Attack Types

PhalanxAI can detect and classify the following attack types:

Denial of Service (DoS/DDoS)
- DoS Hulk
- DoS GoldenEye
- DoS Slowhttptest
- DoS slowloris
- DDoS
Brute Force Attacks
- FTP-Patator
- SSH-Patator
- Web Attack - Brute Force
Web Attacks
- SQL Injection
- Cross-Site Scripting (XSS)
Network Reconnaissance
- Port Scanning
Command & Control
- Botnet Activity
Exploitation
- Heartbleed
- Infiltration
Anomalies
- Unknown attack patterns (zero-day detection)

📊 Model Performance

Reported results on the CICIDS2017 dataset:

Model             Accuracy  Precision  Recall  F1-Score
Random Forest     99.2%     98.8%      98.5%   98.6%
Isolation Forest  94.5%     -          -       -
Autoencoder       95.8%     -          -       -
Ensemble          99.5%     99.1%      98.9%   99.0%
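
The F1-Score column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The short check below does that for the two rows that report all three metrics; it reproduces the listed 98.6% and 99.0%.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
reported = {"Random Forest": (0.988, 0.985), "Ensemble": (0.991, 0.989)}
for model, (p, r) in reported.items():
    print(f"{model}: F1 = {f1_score(p, r):.1%}")
```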

🔧 Configuration

Key Configuration Parameters

Edit config.py or the .env file:

# Detection Thresholds
ANOMALY_THRESHOLD = -0.5          # Isolation Forest threshold
AUTOENCODER_THRESHOLD = 0.1       # Reconstruction error threshold
CONFIDENCE_THRESHOLD = 0.7        # Minimum confidence for alerts

# Model Paths
MODEL_DIR = "trained_models"
RANDOM_FOREST_PATH = "trained_models/random_forest.joblib"
ISOLATION_FOREST_PATH = "trained_models/isolation_forest.joblib"
AUTOENCODER_PATH = "trained_models/autoencoder.pt"

# Database
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost:5432/ai_ids"
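
To make the three thresholds concrete, the sketch below shows one plausible way they could gate a detection. The comparison directions are assumptions: scikit-learn's Isolation Forest assigns lower scores to more abnormal samples, and a larger reconstruction error means a flow is less like the training data. The helpers `is_anomalous` and `should_alert` are hypothetical.

```python
# Values copied from the configuration above.
ANOMALY_THRESHOLD = -0.5
AUTOENCODER_THRESHOLD = 0.1
CONFIDENCE_THRESHOLD = 0.7


def is_anomalous(isolation_score: float, reconstruction_error: float) -> bool:
    """Flag a flow if either unsupervised detector crosses its threshold."""
    return (isolation_score < ANOMALY_THRESHOLD
            or reconstruction_error > AUTOENCODER_THRESHOLD)


def should_alert(confidence: float) -> bool:
    """Raise an alert only when confidence meets the configured minimum."""
    return confidence >= CONFIDENCE_THRESHOLD


print(is_anomalous(isolation_score=-0.62, reconstruction_error=0.04))
print(should_alert(0.65))
```

Under these assumptions, lowering ANOMALY_THRESHOLD or raising AUTOENCODER_THRESHOLD makes the unsupervised detectors less sensitive, which trades missed anomalies for fewer false alarms.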

Feature Configuration

The system uses 79 network flow features from the CICIDS2017 dataset:

Flow statistics (duration, packets, bytes)
Protocol flags (SYN, ACK, FIN, RST, etc.)
Inter-arrival times (IAT)
Packet length statistics
Flow rates and ratios
Window sizes
Subflow characteristics

See config.py for the complete feature list.
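
Before training on a new CSV, it can help to confirm that the header contains every expected column. The sketch below uses a hypothetical three-column subset in place of the full 79-feature list, and `missing_columns` is an illustrative helper. Header names are stripped because the published CICIDS2017 CSVs are known to carry leading spaces in some column names.

```python
import csv
import io
from typing import Iterable, List, TextIO


def missing_columns(handle: TextIO, required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from the CSV header row."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical subset; the complete feature list lives in config.py.
REQUIRED = ["Flow Duration", "Total Fwd Packets", "Label"]
sample = io.StringIO(" Flow Duration,Total Fwd Packets\n12000,45\n")
print(missing_columns(sample, REQUIRED))
```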

🧪 Training Custom Models

Using Your Own Dataset

Prepare your dataset in CSV format with the required features
Format: Include all 79 features + 'Label' column
Train models via API or Python:

from models import ModelManager
from data import DatasetLoader, DataPreprocessor

# Load data
loader = DatasetLoader("your_data_dir")
df = loader.load_cicids2017("your_dataset.csv")

# Preprocess
preprocessor = DataPreprocessor()
X_train, X_test, y_train, y_test = preprocessor.fit_transform(df)

# Train models
mm = ModelManager("trained_models")
metrics = mm.train_all(X_train, y_train, X_val=X_test)

# Save models
mm.save_all()

Supported Datasets

CICIDS2017 ✅ (Primary)
CICIDS2018 ✅ (Compatible)
NSL-KDD ⚠️ (Requires feature mapping)
UNSW-NB15 ⚠️ (Requires feature mapping)

🐛 Troubleshooting

Models Not Loading

# Check if model files exist
ls -lh trained_models/

# Retrain models
curl -X POST http://localhost:8000/api/train -H "Content-Type: application/json" \
  -d '{"dataset_path": "sample_data/network_traffic.csv"}'

Database Connection Errors

# Check PostgreSQL service
sudo systemctl status postgresql

# Or use SQLite instead
export DATABASE_URL="sqlite+aiosqlite:///./ai_ids.db"

Memory Issues

Reduce sample_size when training
Use batch prediction for large datasets
Adjust FEATURE_COLUMNS to use fewer features
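
Batch prediction keeps memory bounded by processing a fixed number of flows at a time. The generator below is a generic chunking sketch, not part of PhalanxAI; how each batch is then scored (for example, one /api/predict call per flow) is left to the caller.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items without loading everything."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


for batch in batched(range(5), 2):
    print(batch)
```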

SHAP/LIME Installation

# If explainability features fail
pip install shap lime --force-reinstall

🛣️ Roadmap

Real-time packet capture integration (Scapy/Zeek)
Support for additional datasets (NSL-KDD, UNSW-NB15)
Advanced visualization (Grafana/Kibana integration)
Model retraining on new threats
Container orchestration (Docker, Kubernetes)
Multi-tenant support
Integration with SIEM systems
Automated response actions
Cloud deployment templates (AWS, Azure, GCP)

🤝 Contributing

Contributions are welcome! Please follow these guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Code formatting
black .
isort .

# Linting
pylint models/ api/ data/

📚 References

Datasets

CICIDS2017 - Canadian Institute for Cybersecurity
CICIDS2018

Frameworks

MITRE ATT&CK - Adversarial Tactics, Techniques & Common Knowledge
FastAPI - Modern web framework
Scikit-learn - Machine learning library
PyTorch - Deep learning framework

Explainability

SHAP - SHapley Additive exPlanations
LIME - Local Interpretable Model-agnostic Explanations

🙏 Acknowledgments

Canadian Institute for Cybersecurity for CICIDS datasets
MITRE Corporation for the ATT&CK framework
The open-source ML/AI community

📧 Contact

For questions, issues, or collaboration:

GitHub Issues: Report a bug
Email: adetoyeseoyekanmi@example.com

Built with ❤️ for cybersecurity professionals and researchers

Stay vigilant. Stay protected. Stay ahead with PhalanxAI.
