Machine-Learning-Driven-Exploit-Prediction-

Machine Learning-Based Exploitability Prediction for Penetration Testing: A Data-Driven Approach to Prioritizing Vulnerabilities

IEEE TIFS | Python 3.8+ | License: MIT

# 📜 License

This project is licensed under the MIT License. See LICENSE for details.

# 📌 Overview

This repository contains the code and data pipeline for the IEEE TIFS paper:

"Machine Learning-Based Exploitability Prediction for Penetration Testing: A Data-Driven Approach"

We present a production-ready XGBoost model that predicts the likelihood of a CVE being weaponized, using features from:

- National Vulnerability Database (NVD)
- Exploit Database (ExploitDB)

Key innovations:

- ✅ 25% recall at 6% precision (optimized for high-risk triage)
- ✅ 62.5% reduction in missed exploits vs. random sampling
- ✅ FastAPI microservice for integration with pentesting tools (Metasploit/Burp Suite)

# 🚀 Quick Start

1. Install dependencies:

   ```bash
   pip install -r requirements.txt  # Python 3.8+
   ```

2. Run the Jupyter notebook:

   ```bash
   jupyter notebook exploit_prediction.ipynb  # Full pipeline: EDA → Training → Evaluation
   ```

3. Deploy the FastAPI service:

   ```bash
   uvicorn api.app:app --reload  # Access docs at http://localhost:8000/docs
   ```

# 📂 Repository Structure

```
├── data/                      # Processed datasets (NVD + ExploitDB)
│   ├── nvd_2024.json          # Sample NVD data
│   └── exploits.csv           # ExploitDB records
├── models/                    # Pretrained XGBoost + SMOTE
│   └── exploit_model.joblib
├── api/                       # FastAPI deployment
│   ├── app.py                 # REST endpoint
│   └── schemas.py             # Pydantic input validation
├── exploit_prediction.ipynb   # Main Colab notebook
├── requirements.txt           # Python dependencies
└── LICENSE                    # MIT License
```

# 🔍 Key Features

## 📊 Feature Engineering

- **CVSS Metrics**: base score, attack vector, criticality flags
- **Temporal Signals**: days since publication (the "golden hour" for exploits)
- **Class Imbalance Handling**: SMOTE oversampling (1:738 ratio)
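The CVSS and temporal features listed above could be derived from a single CVE record roughly as follows. This is a minimal sketch, not the repository's code: the function name `build_features`, the feature names `is_network_vector` and `is_critical`, and the choice of a CVSS score of 9.0 or higher as the criticality flag (the CVSS v3 "Critical" band) are all assumptions made for illustration.

```python
from datetime import date
from typing import Dict


def build_features(
    cvss_score: float,
    attack_vector: str,
    published: date,
    as_of: date,
) -> Dict[str, float]:
    """Turn one CVE record into numeric features (names are hypothetical)."""
    return {
        # CVSS base score, passed through unchanged.
        "cvss_score": cvss_score,
        # 1.0 when the vulnerability is reachable over the network.
        "is_network_vector": 1.0 if attack_vector.upper() == "NETWORK" else 0.0,
        # Assumed criticality flag: CVSS v3 "Critical" band (9.0 to 10.0).
        "is_critical": 1.0 if cvss_score >= 9.0 else 0.0,
        # Temporal signal: whole days between publication and the scoring date.
        "days_since_published": float((as_of - published).days),
    }


features = build_features(9.8, "NETWORK", date(2024, 1, 1), date(2024, 1, 31))
print(features)
```

Passing the scoring date in as `as_of` rather than reading the system clock keeps the feature reproducible when the pipeline is re-run later.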

## ⚙️ Optimized XGBoost Model

```python
from xgboost import XGBClassifier

model = XGBClassifier(
    scale_pos_weight=100,  # Weight exploited (positive) samples 100× to penalize false negatives
    max_depth=10,
    n_estimators=200,
    eval_metric='logloss',
)
```

## 🚨 Security Thresholding

Recall-optimized decision threshold (θ = 0.10):

- 25% exploit detection rate
- <1% false alarms
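To make the threshold concrete, the sketch below applies θ = 0.10 to predicted probabilities and computes the detection rate (recall), precision, and false-alarm rate. It is illustrative only: the function names are hypothetical, the labels and scores are synthetic rather than the paper's data, and the `"LOW"` label is an assumption (the README only shows `"HIGH"`).

```python
from typing import Dict, List

THRESHOLD = 0.10  # decision threshold θ quoted above


def classify(probability: float, threshold: float = THRESHOLD) -> Dict[str, object]:
    """Map a predicted exploit probability to a risk label (labels are assumed)."""
    risk_level = "HIGH" if probability >= threshold else "LOW"
    return {
        "risk_level": risk_level,
        "probability": probability,
        "threshold_used": threshold,
    }


def rates_at_threshold(
    y_true: List[int], scores: List[float], threshold: float = THRESHOLD
) -> Dict[str, float]:
    """Return recall, precision and false-alarm rate at the given threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return {
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Synthetic example: 2 exploited CVEs among 10 (not the paper's data).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.40, 0.05, 0.12, 0.02, 0.01, 0.03, 0.08, 0.04, 0.02, 0.01]
rates = rates_at_threshold(y_true, scores)
print(rates)
print(classify(0.87))
```

Because exploited CVEs are rare, the false-alarm rate (false positives over all non-exploited CVEs) can stay very low even when precision (true positives over all flagged CVEs) is modest, which is why the two figures are reported separately.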

# 🌐 API Endpoints

| Endpoint | Description | Example Request |
|----------|-------------|-----------------|
| `/predict` | Predict exploit probability | `{"cve_id": "CVE-2024-1234", "cvss_score": 9.8, "days_since_published": 30}` |
| `/docs` | Interactive OpenAPI 3.0 docs | - |

# 📈 Performance

Comparison with baselines (test set, n = 5,979 CVEs):

| Model | Recall | Precision | F0.7-Score |
|-------|--------|-----------|------------|
| CVSS ≥ 7.0 | 8% | 0.5% | 0.03 |
| Random Forest | 7% | 0.3% | 0.02 |
| Our XGBoost | 25% | 6% | 0.18 |

SHAP Analysis: SHAP summary plot
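For reference, the standard F-beta score is F_β = (1 + β²)·P·R / (β²·P + R), where P is precision and R is recall. The helper below is a minimal sketch written for this README, not code from the repository. Note that plugging the table's rounded XGBoost values (P = 0.06, R = 0.25) into this definition gives about 0.08 rather than 0.18, so the reported F0.7 column may follow a convention that the README does not specify.

```python
def f_beta(precision: float, recall: float, beta: float = 0.7) -> float:
    """Standard F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded XGBoost precision and recall from the table above.
print(round(f_beta(0.06, 0.25), 3))  # 0.08
```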

# 🛠️ Integration with Pentesting Tools

```python
import requests

response = requests.post(
    "http://localhost:8000/predict",
    json={"cve_id": "CVE-2024-1234", "cvss_score": 9.2, "days_since_published": 15},
)
print(response.json())
# {"risk_level": "HIGH", "probability": 0.87, "threshold_used": 0.10}
```

# 📜 Citation

If you use this work, please cite:

```bibtex
@article{your_tifs_paper,
  title={Machine Learning-Based Exploitability Prediction for Penetration Testing},
  author={Your Name et al.},
  journal={IEEE Transactions on Information Forensics and Security},
  year={2024}
}
```

# 📮 Contact

For questions or collaborations:

- 📧 Email: your.email@example.com
- 💻 GitHub Issues: Open an issue

# 🚨 Disclaimer

This tool is designed for defensive security only. Always comply with ethical hacking guidelines.

