An educational machine learning project demonstrating cybersecurity threat detection using ensemble methods.
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Random Forest | 95.1% | 93.6% | 73.1% | 82.1% |
| Neural Network | 99.2% | 98.7% | 95.9% | 97.2% |
Results on a synthetic dataset (15,000 samples).
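
The four metric columns follow the standard definitions over confusion-matrix counts. The sketch below is illustrative only: the function names and the example counts are assumptions, not code or data from this repository. It also rechecks that the Random Forest F1-score in the table agrees with its precision and recall.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from(precision, recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the function.
example = classification_metrics(tp=80, fp=20, fn=10, tn=890)
print(example)

# Consistency check against the Random Forest row (93.6% precision, 73.1% recall).
rf_f1 = round(f1_from(0.936, 0.731) * 100, 1)
print(rf_f1)  # 82.1
```

Recomputing F1 from the rounded percentages in the table can differ from the reported value in the last digit, because the reported figures were presumably computed from raw counts.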
- Dual ML Architecture: Random Forest + Deep Neural Network
- Interactive Dashboard: Real-time visualization with Streamlit
- Automated Training: Complete pipeline from data to deployment
- Model Comparison: Side-by-side performance analysis
- Export Capabilities: Save trained models and reports
- Python 3.10
- TensorFlow 2.13 - Neural network implementation
- Scikit-learn 1.3 - Traditional ML models
- Streamlit 1.28 - Web interface
- Pandas/NumPy - Data processing
AI_CyberGuard/
├── dashboard.py # Main Streamlit application
├── src/
│ ├── train_model.py # Model training pipeline
│ ├── data_generator.py # Synthetic data generation
│ └── analyzer.py # Data analysis utilities
├── models/ # Trained model files
│ ├── random_forest.pkl
│ ├── neural_network.h5
│ └── scaler.pkl
├── images/ # Visualizations and charts
├── reports/ # Generated reports
├── data/ # Dataset storage
└── notebooks/ # Jupyter notebooks for analysis
- Python 3.10 or higher
- pip package manager
- 4GB+ RAM recommended
- Clone the repository
git clone https://github.com/Nurmuhammedcoder/AI_CyberGuard.git
cd AI_CyberGuard
- Create a virtual environment
python -m venv my_cyber_env
# Windows
my_cyber_env\Scripts\activate
# Linux/macOS
source my_cyber_env/bin/activate
- Install dependencies
pip install -r requirements.txt
- Launch the dashboard
streamlit run dashboard.py
Open a browser at http://localhost:8501
cd src
python train_model.py
This will:
- Generate synthetic training data
- Train both Random Forest and Neural Network models
- Create visualizations
- Save models to the models/ directory
- Generate performance reports
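
To make the first steps of this pipeline concrete, here is a minimal NumPy sketch of generating a labelled synthetic dataset and standardizing it. It is not the code in `src/data_generator.py`: the function names, the 20% attack fraction, and the mean-shift scheme are all assumptions made for illustration. Only the sample count (15,000) and feature count (30) come from this README.

```python
import numpy as np


def generate_synthetic_traffic(n_samples=15_000, n_features=30,
                               attack_fraction=0.2, seed=42):
    """Return (X, y): Gaussian features with a mean shift on attack rows.

    attack_fraction and the shift size are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    y = (rng.random(n_samples) < attack_fraction).astype(np.int64)  # 1 = attack
    X = rng.normal(loc=0.0, scale=1.0, size=(n_samples, n_features))
    shift = np.zeros(n_features)
    shift[: n_features // 3] = 1.5  # only the first third of features carry signal
    X += y[:, None] * shift
    return X, y


def split_and_scale(X, y, test_fraction=0.2, seed=42):
    """Shuffle, split, and standardize using statistics from the training split only."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    mean = X[train_idx].mean(axis=0)
    std = X[train_idx].std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    X_train = (X[train_idx] - mean) / std
    X_test = (X[test_idx] - mean) / std
    return X_train, X_test, y[train_idx], y[test_idx]


X, y = generate_synthetic_traffic()
X_train, X_test, y_train, y_test = split_and_scale(X, y)
print(X_train.shape, X_test.shape)  # (12000, 30) (3000, 30)
```

Fitting the scaling statistics on the training split only keeps information from the test rows out of training, which is the usual reason a fitted scaler is saved alongside the models.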
- Home - System overview and status
- Train Models - Interactive model training interface
- Detect Attacks - Real-time threat analysis
- Dashboard - Performance metrics and visualizations
- 200 decision trees
- Balanced class weights
- Max depth: 15
- Training time: ~2.4 seconds
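
As a rough sketch, the Random Forest settings listed above map onto scikit-learn's `RandomForestClassifier` as shown below. The dataset here comes from `make_classification` and is only a stand-in; the sample size, class weights, split ratio, and random seeds are assumptions, and the project's actual training code in `src/train_model.py` may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data: 30 numerical features, imbalanced binary labels (assumed 80/20).
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=10,
    weights=[0.8, 0.2], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = RandomForestClassifier(
    n_estimators=200,         # 200 decision trees
    max_depth=15,             # cap on the depth of each tree
    class_weight="balanced",  # reweight classes inversely to their frequency
    n_jobs=-1,                # use all CPU cores (an assumption, not from this README)
    random_state=42,
)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Balanced class weights matter here because attack samples are typically the minority class; without them the forest can reach high accuracy while missing many attacks.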
- Architecture: 128→64→32→16→1 neurons
- Activation: ReLU + Sigmoid output
- Optimizer: Adam (lr=0.001)
- Regularization: Dropout + BatchNormalization
- Training time: ~19 seconds
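
To illustrate the layer shapes of this network, the following NumPy sketch runs a forward pass through 128→64→32→16→1 with ReLU hidden activations and a sigmoid output. It is not the project's TensorFlow model: the 30-feature input size is taken from the feature count stated elsewhere in this README, the He-style initialization is an assumption, and Dropout, BatchNormalization, and Adam training are left out for brevity.

```python
import numpy as np

# Input width of 30 is assumed from the README's feature count.
LAYER_SIZES = [30, 128, 64, 32, 16, 1]


def init_params(layer_sizes, seed=0):
    """Create (weights, bias) pairs for each dense layer with He-style initialization."""
    rng = np.random.default_rng(seed)
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params


def forward(params, X):
    """ReLU on hidden layers, sigmoid on the single output unit."""
    h = X
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU
    W, b = params[-1]
    logits = h @ W + b
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid: attack probability in (0, 1)


params = init_params(LAYER_SIZES)
n_params = sum(W.size + b.size for W, b in params)
probs = forward(params, np.random.default_rng(1).normal(size=(4, 30)))

print(n_params)     # 14849 dense-layer parameters (BatchNormalization not counted)
print(probs.shape)  # (4, 1)
```

A prediction is then usually obtained by thresholding the sigmoid output, for example treating values above 0.5 as an attack; the threshold is a tunable choice rather than something fixed by the architecture.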
This project is designed for learning and demonstration purposes. It uses synthetic data to simulate network traffic patterns and attack behaviors.
- Synthetic Data: Uses generated data, not real network captures
- Simplified Features: 30 numerical features vs. complex real-world scenarios
- Binary Classification: Normal vs. Attack (real systems need multi-class)
- No Real-time Processing: Batch processing only
- Educational Scope: Not production-ready
- ✅ Basic ML models implementation
- ✅ Streamlit dashboard
- ✅ Model training pipeline
- ✅ Visualization system
- 🔄 CICIDS2017 dataset integration
- 🔄 Multi-class attack classification
- 🔄 Statistical validation framework
- 🔄 Jupyter notebook tutorials
- 📋 Real-time PCAP file processing
- 📋 REST API for model inference
- 📋 Docker containerization
- 📋 Comparative analysis with existing IDS
The project generates several visualizations:
- Confusion matrices
- Training history plots
- Feature importance charts
- Model comparison graphs
All of these are saved to the images/ directory.
This is an educational project. Suggestions and improvements are welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
MIT License - see LICENSE file for details
Nurmukhammed
- GitHub: @Nurmuhammedcoder
- Email: nekulov@internet.ru
- Dataset inspiration: Canadian Institute for Cybersecurity
- ML frameworks: TensorFlow and Scikit-learn teams
- Streamlit for excellent visualization tools
⭐ If you find this project useful, please consider giving it a star!