PhalanxAI is an AI-powered Intrusion Detection System (IDS) that uses machine learning, deep learning, and explainable AI to detect and classify network security threats in real time. It maps each detection to the MITRE ATT&CK framework and explains the anomalies behind it.
- Random Forest Classifier: Supervised learning for attack classification
- Isolation Forest: Unsupervised anomaly detection
- Autoencoder Neural Network: Deep learning-based anomaly detection
- Ensemble Predictions: Combines multiple models for higher accuracy
- SHAP (SHapley Additive exPlanations): Global and local feature importance
- LIME (Local Interpretable Model-agnostic Explanations): Instance-level explanations
- Top Feature Analysis: Identifies which network features contributed to detection
- Human-readable alerts: Natural language explanations for security analysts
- Maps detected attacks to MITRE ATT&CK techniques and tactics
- Provides threat context and recommended mitigations
- Tracks attacker techniques across the cyber kill chain
- Comprehensive threat intelligence database
- Live threat monitoring and visualization
- Attack distribution analytics
- Severity-based alert prioritization (Critical, High, Medium, Low)
- Time-series analysis of attack patterns
- Model performance metrics
- FastAPI-powered backend for high performance
- Comprehensive API documentation (Swagger/OpenAPI)
- Real-time network flow analysis
- Model training and management endpoints
- Alert management and querying
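As a rough illustration of how the ensemble prediction and severity levels listed above could fit together, the sketch below takes a majority vote over the three detectors and maps the result to a severity. The `combine` function, the voting rule, and the severity cut-offs are assumptions made for this example, not PhalanxAI's actual logic. The threshold values mirror the defaults shown in the configuration section.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed defaults, copied from the thresholds in the configuration section.
ANOMALY_THRESHOLD = -0.5      # Isolation Forest: lower scores are more anomalous
AUTOENCODER_THRESHOLD = 0.1   # Autoencoder: higher reconstruction error is more anomalous
CONFIDENCE_THRESHOLD = 0.7    # Minimum classifier confidence for an alert


@dataclass
class Verdict:
    is_malicious: bool
    votes: int
    severity: Optional[str]


def combine(rf_confidence: float, iso_score: float, recon_error: float) -> Verdict:
    """Majority vote over three detectors (hypothetical rule)."""
    votes = sum([
        rf_confidence >= CONFIDENCE_THRESHOLD,   # classifier is confident it is an attack
        iso_score < ANOMALY_THRESHOLD,           # Isolation Forest flags an outlier
        recon_error > AUTOENCODER_THRESHOLD,     # autoencoder reconstructs the flow poorly
    ])
    if votes == 3:
        severity = "Critical" if rf_confidence >= 0.9 else "High"
    elif votes == 2:
        severity = "Medium"
    elif votes == 1:
        severity = "Low"
    else:
        severity = None  # no detector fired, so there is nothing to prioritize
    return Verdict(is_malicious=votes >= 2, votes=votes, severity=severity)


verdict = combine(rf_confidence=0.95, iso_score=-0.8, recon_error=0.3)
print(verdict.severity)
```

A vote-based rule keeps the unsupervised detectors useful for unknown attack patterns, where the classifier alone may not be confident.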
PhalanxAI/
├── main.py # FastAPI application entry point
├── config.py # Configuration and settings
├── database.py # SQLAlchemy models and database setup
├── requirements.txt # Python dependencies
│
├── api/ # REST API endpoints
│ ├── routes.py # API route handlers
│ └── schemas.py # Pydantic data models
│
├── models/ # Machine Learning models
│ ├── model_manager.py # Model orchestration and ensemble
│ ├── random_forest.py # Random Forest classifier
│ ├── isolation_forest.py # Isolation Forest anomaly detector
│ └── autoencoder.py # Deep learning autoencoder
│
├── data/ # Data processing
│ ├── loaders.py # Dataset loading utilities
│ ├── preprocessor.py # Feature preprocessing and scaling
│ └── feature_extractor.py # Network feature extraction
│
├── explainability/ # Explainable AI components
│ ├── shap_explainer.py # SHAP-based explanations
│ ├── lime_explainer.py # LIME-based explanations
│ └── alert_generator.py # Generate human-readable alerts
│
├── mitre/ # MITRE ATT&CK framework
│ ├── attack_db.py # MITRE technique database
│ └── mapper.py # Map attacks to MITRE techniques
│
├── static/ # Web dashboard
│ ├── index.html # Dashboard interface
│ ├── css/
│ └── js/
│
├── trained_models/ # Saved model files
└── sample_data/ # Sample network traffic data
- Python 3.8 or higher
- PostgreSQL 12+ (optional, for persistent storage)
- 4GB+ RAM recommended
- Linux, macOS, or Windows
- Clone the repository
git clone https://github.com/Gentwocoder/PhalanxAI.git
cd PhalanxAI
- Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
cp .env.example .env
# Edit .env with your database credentials and settings
- Set up the database (optional)
# Create PostgreSQL database
createdb ai_ids
# Or use SQLite by updating DATABASE_URL in .env:
# DATABASE_URL=sqlite+aiosqlite:///./ai_ids.db
- Start the application
uvicorn main:app --reload --host 0.0.0.0 --port 8000
- Access the dashboard
  - Open browser: http://localhost:8000
  - API documentation: http://localhost:8000/docs
- Train models (first-time setup)
# Using the API
curl -X POST "http://localhost:8000/api/train" \
-H "Content-Type: application/json" \
-d '{
"dataset_path": "sample_data/network_traffic.csv",
"sample_size": 10000
}'
- Monitor Threats: View real-time alerts and statistics on the main dashboard
- Analyze Alerts: Click on any alert to see detailed explanations and MITRE mappings
- Explore MITRE Matrix: Navigate the MITRE ATT&CK matrix to understand attack techniques
- Model Management: Train new models or view model performance metrics
curl http://localhost:8000/api/health
curl -X POST "http://localhost:8000/api/predict" \
-H "Content-Type: application/json" \
-d '{
"src_ip": "192.168.1.100",
"dst_ip": "10.0.0.50",
"src_port": 54321,
"dst_port": 80,
"protocol": "TCP",
"flow_duration": 12000,
"total_fwd_packets": 45,
"total_backward_packets": 38,
...
}'
curl "http://localhost:8000/api/alerts?limit=10&severity=Critical"
curl -X POST "http://localhost:8000/api/train" \
-H "Content-Type: application/json" \
-d '{
"dataset_path": "path/to/cicids2017.csv",
"sample_size": 50000
}'
import requests

API_URL = "http://localhost:8000/api"

# Analyze a network flow
flow_data = {
    "src_ip": "192.168.1.100",
    "dst_ip": "203.0.113.50",
    "dst_port": 443,
    "flow_duration": 5000,
    "total_fwd_packets": 25,
    # ... other features
}
response = requests.post(f"{API_URL}/predict", json=flow_data)
response.raise_for_status()  # Stop here on HTTP errors instead of parsing an error body
result = response.json()

# Print the alert details only when the flow is classified as malicious
if result["is_malicious"]:
    print(f"⚠️ ALERT: {result['attack_type']}")
    print(f"Confidence: {result['confidence']:.2%}")
    print(f"Severity: {result['severity']}")
    print(f"MITRE: {result['mitre_technique_name']}")
    print(f"Explanation: {result['explanation']}")
PhalanxAI can detect and classify the following attack types:
- Denial of Service (DoS/DDoS)
  - DoS Hulk
  - DoS GoldenEye
  - DoS Slowhttptest
  - DoS slowloris
  - DDoS
- Brute Force Attacks
  - FTP-Patator
  - SSH-Patator
  - Web Attack - Brute Force
- Web Attacks
  - SQL Injection
  - Cross-Site Scripting (XSS)
- Network Reconnaissance
  - Port Scanning
- Command & Control
  - Botnet Activity
- Exploitation
  - Heartbleed
  - Infiltration
- Anomalies
  - Unknown attack patterns (zero-day detection)
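To make the link between these attack types and MITRE ATT&CK more concrete, here is a minimal lookup sketch. The technique IDs and names are existing ATT&CK entries, but the choice of which attack label maps to which technique is an assumption of this example, as are the names `MITRE_MAP` and `map_to_mitre`. The project's own mapper may differ.

```python
# Hypothetical lookup table: attack label -> (ATT&CK technique ID, technique name).
MITRE_MAP = {
    "DDoS": ("T1498", "Network Denial of Service"),
    "DoS Hulk": ("T1499", "Endpoint Denial of Service"),
    "FTP-Patator": ("T1110", "Brute Force"),
    "SSH-Patator": ("T1110", "Brute Force"),
    "SQL Injection": ("T1190", "Exploit Public-Facing Application"),
    "Port Scanning": ("T1046", "Network Service Discovery"),
    "Botnet Activity": ("T1071", "Application Layer Protocol"),
}


def map_to_mitre(attack_type: str) -> dict:
    """Return the ATT&CK technique for an attack label, or an unmapped marker."""
    technique = MITRE_MAP.get(attack_type)
    if technique is None:
        # Unknown patterns (for example anomalies) have no fixed technique.
        return {"attack_type": attack_type, "technique_id": None, "technique_name": "Unmapped"}
    technique_id, technique_name = technique
    return {
        "attack_type": attack_type,
        "technique_id": technique_id,
        "technique_name": technique_name,
    }


print(map_to_mitre("SSH-Patator"))
```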
The ensemble model achieves strong performance on the CICIDS2017 dataset:
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Random Forest | 99.2% | 98.8% | 98.5% | 98.6% |
| Isolation Forest | 94.5% | - | - | - |
| Autoencoder | 95.8% | - | - | - |
| Ensemble | 99.5% | 99.1% | 98.9% | 99.0% |
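For readers who want to relate the columns of this table, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall. It assumes the table reports single aggregate values; if the project averages metrics per class, the reported F1 need not match this formula exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the table above.
rf_f1 = f1_score(0.988, 0.985)
ensemble_f1 = f1_score(0.991, 0.989)

print(f"Random Forest F1: {rf_f1:.1%}")   # Random Forest F1: 98.6%
print(f"Ensemble F1: {ensemble_f1:.1%}")  # Ensemble F1: 99.0%
```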
Edit config.py or the .env file:
# Detection Thresholds
ANOMALY_THRESHOLD = -0.5 # Isolation Forest threshold
AUTOENCODER_THRESHOLD = 0.1 # Reconstruction error threshold
CONFIDENCE_THRESHOLD = 0.7 # Minimum confidence for alerts
# Model Paths
MODEL_DIR = "trained_models"
RANDOM_FOREST_PATH = "trained_models/random_forest.joblib"
ISOLATION_FOREST_PATH = "trained_models/isolation_forest.joblib"
AUTOENCODER_PATH = "trained_models/autoencoder.pt"
# Database
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost:5432/ai_ids"
The system uses 79 network flow features from the CICIDS2017 dataset:
- Flow statistics (duration, packets, bytes)
- Protocol flags (SYN, ACK, FIN, RST, etc.)
- Inter-arrival times (IAT)
- Packet length statistics
- Flow rates and ratios
- Window sizes
- Subflow characteristics
See config.py for the complete feature list.
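Before training on a custom CSV, it can help to check that the expected columns are present. The sketch below does this with the standard library only. `REQUIRED_COLUMNS` is a hypothetical three-feature subset (using the field names from the API examples) plus a label column; the real list lives in config.py. Header names are stripped because some public CICIDS2017 CSV exports carry leading spaces in column names.

```python
import csv
import io

# Hypothetical subset of the expected columns; see config.py for the real list.
REQUIRED_COLUMNS = ("flow_duration", "total_fwd_packets", "total_backward_packets", "Label")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# In-memory sample standing in for a real dataset file.
sample = io.StringIO("flow_duration, total_fwd_packets,Label\n12000,45,BENIGN\n")
print(missing_columns(sample))  # ['total_backward_packets']
```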
- Prepare your dataset in CSV format with the required features
- Format: Include all 79 features + 'Label' column
- Train models via API or Python:
from models import ModelManager
from data import DatasetLoader, DataPreprocessor
# Load data
loader = DatasetLoader("your_data_dir")
df = loader.load_cicids2017("your_dataset.csv")
# Preprocess
preprocessor = DataPreprocessor()
X_train, X_test, y_train, y_test = preprocessor.fit_transform(df)
# Train models
mm = ModelManager("trained_models")
metrics = mm.train_all(X_train, y_train, X_val=X_test)
# Save models
mm.save_all()
- CICIDS2017 ✅ (Primary)
- CICIDS2018 ✅ (Compatible)
- NSL-KDD ⚠️ (Requires feature mapping)
- UNSW-NB15 ⚠️ (Requires feature mapping)
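What "feature mapping" involves is not spelled out here, so the following is only a sketch of one plausible approach: rename the source dataset's columns to the names the models expect and fill anything that has no counterpart with a default. `COLUMN_MAP` and `remap_record` are hypothetical. The left-hand names follow NSL-KDD's column naming and the right-hand names follow the API examples in this README. Units may differ between datasets and would need converting separately.

```python
# Hypothetical mapping from NSL-KDD-style column names to the names used in this README.
COLUMN_MAP = {
    "duration": "flow_duration",
    "protocol_type": "protocol",
}


def remap_record(record, column_map, target_columns, default=0.0):
    """Rename known columns, drop unmapped extras, and fill missing targets."""
    renamed = {column_map.get(key, key): value for key, value in record.items()}
    # Keep only the columns the models expect, in a fixed order.
    return {column: renamed.get(column, default) for column in target_columns}


nsl_kdd_row = {"duration": 2, "protocol_type": "tcp", "src_bytes": 100}
targets = ["flow_duration", "protocol", "total_fwd_packets"]
print(remap_record(nsl_kdd_row, COLUMN_MAP, targets))
```

Filling missing features with a constant is crude and can distort model inputs, so treat the result as a starting point for evaluation rather than a drop-in replacement for native CICIDS features.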
# Check if model files exist
ls -lh trained_models/
# Retrain models
curl -X POST http://localhost:8000/api/train -H "Content-Type: application/json" \
-d '{"dataset_path": "sample_data/traffic.csv"}'
# Check PostgreSQL service
sudo systemctl status postgresql
# Or use SQLite instead
export DATABASE_URL="sqlite+aiosqlite:///./ai_ids.db"
- Reduce `sample_size` when training
- Use batch prediction for large datasets
- Adjust `FEATURE_COLUMNS` to use fewer features
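Batch prediction, mentioned in the tips above, can be as simple as feeding the model fixed-size chunks so only one chunk's inputs are processed at a time. The helper below is a generic sketch; `batched`, `predict_in_batches`, and the `predict_fn` callback are hypothetical names, not part of PhalanxAI's API.

```python
from itertools import islice


def batched(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def predict_in_batches(rows, predict_fn, batch_size=1000):
    """Apply predict_fn to each batch and collect the results in order."""
    results = []
    for batch in batched(rows, batch_size):
        results.extend(predict_fn(batch))
    return results


# Stand-in model: doubles each input value.
print(predict_in_batches(range(5), lambda batch: [x * 2 for x in batch], batch_size=2))
# [0, 2, 4, 6, 8]
```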
# If explainability features fail
pip install shap lime --force-reinstall
- Real-time packet capture integration (Scapy/Zeek)
- Support for additional datasets (NSL-KDD, UNSW-NB15)
- Advanced visualization (Grafana/Kibana integration)
- Model retraining on new threats
- Container orchestration (Docker, Kubernetes)
- Multi-tenant support
- Integration with SIEM systems
- Automated response actions
- Cloud deployment templates (AWS, Azure, GCP)
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Code formatting
black .
isort .
# Linting
pylint models/ api/ data/
- CICIDS2017 - Canadian Institute for Cybersecurity
- CICIDS2018
- MITRE ATT&CK - Adversarial Tactics, Techniques & Common Knowledge
- FastAPI - Modern web framework
- Scikit-learn - Machine learning library
- PyTorch - Deep learning framework
- Canadian Institute for Cybersecurity for CICIDS datasets
- MITRE Corporation for the ATT&CK framework
- The open-source ML/AI community
For questions, issues, or collaboration:
- GitHub Issues: Report a bug
- Email: adetoyeseoyekanmi@example.com
Built with ❤️ for cybersecurity professionals and researchers
Stay vigilant. Stay protected. Stay ahead with PhalanxAI.