CVE Hunter is an advanced vulnerability intelligence platform that combines machine learning with a comprehensive CVE database to help security professionals, developers, and system administrators quickly identify and understand security vulnerabilities relevant to their infrastructure.
The system uses Natural Language Processing (NLP) and machine learning classification to predict vulnerability types, and provides keyword search across the National Vulnerability Database (NVD) CVE dataset covering 2011 to 2026.
- ML-Powered Classification: Automatically predicts vulnerability types with confidence scoring
- Fast Search: Keyword-based search across 15+ years of CVE data
- Advanced Filtering: Filter by severity, vulnerability type, and CVSS score
- Detailed CVE Information:
  - Full CVE descriptions
  - CVSS scores and severity ratings
  - Common Weakness Enumeration (CWE) mapping
  - Publication and modification dates
  - References and external links
- Comprehensive Database: 15+ years of standardized CVE data from NVD
- Cyberpunk UI: Modern, fast, responsive web interface with dark theme
- Statistics Dashboard: Overview of database coverage and vulnerability trends
- One-Click Expansion: View full details with expandable CVE cards
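To make the keyword search and filtering features concrete, here is a minimal sketch of how such a filter could work over in-memory records. It is not taken from app.py. The field names follow the API response example in this README; the sample records, the function name `search_cves`, and the ranking by CVSS score are assumptions for illustration.

```python
# Made-up sample records (placeholder IDs); field names follow the README's API example.
SAMPLE_CVES = [
    {"cve_id": "CVE-2099-0001", "severity": "CRITICAL", "cvss_score": 9.8,
     "vuln_type": "SQL Injection", "description": "SQL injection in example-shop login form"},
    {"cve_id": "CVE-2099-0002", "severity": "HIGH", "cvss_score": 7.5,
     "vuln_type": "Path Traversal", "description": "Path traversal in example-httpd static handler"},
    {"cve_id": "CVE-2099-0003", "severity": "MEDIUM", "cvss_score": 5.4,
     "vuln_type": "Cross-site Scripting", "description": "Stored XSS in example-shop review page"},
]


def search_cves(records, keyword, severity=None, vuln_type=None, min_cvss=None):
    """Return records whose description contains keyword and that pass every filter given."""
    needle = keyword.strip().lower()
    matches = []
    for rec in records:
        if needle not in rec.get("description", "").lower():
            continue
        if severity is not None and rec.get("severity") != severity.upper():
            continue
        if vuln_type is not None and rec.get("vuln_type") != vuln_type:
            continue
        if min_cvss is not None and (rec.get("cvss_score") or 0.0) < min_cvss:
            continue
        matches.append(rec)
    # Assumed ranking: highest CVSS score first (the app may rank differently).
    return sorted(matches, key=lambda r: r.get("cvss_score") or 0.0, reverse=True)


shop_hits = search_cves(SAMPLE_CVES, "example-shop")
severe_hits = search_cves(SAMPLE_CVES, "example", min_cvss=7.0)
```

Filters left as `None` are skipped, so the same function covers plain keyword search and any combination of severity, type, and score filters.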
- Flask - Python web framework
- scikit-learn - Machine learning and vectorization
- joblib - Model serialization
- Flask-CORS - Cross-origin resource sharing (CORS) support
- HTML5 / CSS3 - Modern semantic markup and styling
- JavaScript (Vanilla) - Dynamic interactions
- Responsive Design - Works on desktop and mobile
- JSON - CVE dataset storage
- TF-IDF Vectorizer - Text feature extraction
- ML Classification Model - Vulnerability type prediction
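For readers unfamiliar with TF-IDF, the following pure-Python sketch shows the basic idea: a term gets a high weight when it is frequent in one description but rare across the dataset. It uses the classic formula tf × ln(N / df) on made-up sentences. scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ from these.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute classic TF-IDF weights per document: tf(t, d) * ln(N / df(t))."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up descriptions, for illustration only.
docs = [
    "sql injection in login form",
    "buffer overflow in image parser",
    "sql injection in search endpoint",
]
weights = tf_idf(docs)
```

In this toy corpus the word "in" appears in every document and gets weight zero, while "login" (unique to the first document) outweighs "sql" (shared by two).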
project/
├── app.py                 # Flask backend & API
├── index.html             # Frontend UI
├── clean_raw.py           # Data cleaning script
├── train_model.ipynb      # ML model training notebook
├── requirements.txt       # Python dependencies
├── model/
│   ├── model.pkl          # Trained ML classifier
│   └── vectorizer.pkl     # TF-IDF vectorizer
├── README.md              # This file
├── LICENSE                # MIT License
└── BRAND.md               # Brand guidelines
- Python 3.8 or higher
- pip (Python package manager)
- Clone or navigate to the project directory:
  cd ~/Desktop/project
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  python app.py
The application will:
- Start the Flask server on http://localhost:5000
- Automatically open the web interface in your default browser
- Load the ML model and CVE dataset
- Web UI: http://localhost:5000
- API Base URL: http://localhost:5000
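With the server running at the base URL above, the search API can also be called from a script. The sketch below uses only the standard library and builds a request whose JSON body matches the `{"query": ...}` shape documented in this README. The endpoint path `/search` and the use of POST are assumptions, since this README does not state them; check the routes in app.py and adjust `SEARCH_PATH`.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # base URL given in this README
SEARCH_PATH = "/search"  # ASSUMPTION: the real route is defined in app.py


def build_search_request(query, base_url=BASE_URL, path=SEARCH_PATH):
    """Build (but do not send) a JSON request carrying the search query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # ASSUMPTION: a JSON body suggests POST
    )


def search(query, timeout=5.0):
    """Send the request and decode the JSON response (needs a running server)."""
    with request.urlopen(build_search_request(query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network.
req = build_search_request("apache 2.4.49")
```

Calling `search("apache 2.4.49")` against a running instance should return a dictionary shaped like the response example in the API section, provided the assumed path and method match app.py.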
- Enter a search query in the search box:
  - Application names: apache, nginx, openssl
  - Software versions: 2.4.49, 1.0.2
  - Known CVEs: log4j, heartbleed
  - Package names: windows, linux
- Press [SCAN] button or hit Enter
- Results display:
  - ML Prediction: Detected vulnerability type with confidence %
  - Matched CVEs: Ranked by relevance
  - Severity Badges: Visual severity indicators (CRITICAL, HIGH, MEDIUM, LOW)
- Click the ▶ [DETAILS] button on any CVE card
- View expanded information:
  - Complete description
  - All references and links
  - CWE classification
  - CVSS score
  - Publication dates
- Click again to collapse
- Click [CLR] button to reset and start a new search
- Top panel shows:
  - Total CVEs in database
  - Top vulnerability types
  - Critical vulnerability count
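The three figures in the top panel can be derived from the CVE records with simple counting. This is a sketch of one way to compute them, not the code used by app.py; the function name `summarize` and the sample records are made up, and the field names follow the API response example in this README.

```python
from collections import Counter


def summarize(records, top_n=3):
    """Compute total count, most common vulnerability types, and the CRITICAL count."""
    types = Counter(r.get("vuln_type", "Unknown") for r in records)
    severities = Counter(r.get("severity", "UNKNOWN") for r in records)
    return {
        "total": len(records),
        "top_vuln_types": types.most_common(top_n),
        "critical_count": severities["CRITICAL"],  # Counter returns 0 if absent
    }


# Made-up records for illustration.
records = [
    {"vuln_type": "SQL Injection", "severity": "CRITICAL"},
    {"vuln_type": "SQL Injection", "severity": "HIGH"},
    {"vuln_type": "Cross-site Scripting", "severity": "MEDIUM"},
    {"vuln_type": "Path Traversal", "severity": "HIGH"},
]
summary = summarize(records, top_n=1)
```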
Search for CVEs and get ML predictions.
Request:
{
  "query": "apache 2.4.49"
}
Response:
{
  "query": "apache 2.4.49",
  "predicted_type": "Improper Input Validation",
  "confidence": 92.5,
  "total_matches": 45,
  "results": [
    {
      "cve_id": "CVE-2021-41773",
      "severity": "HIGH",
      "cvss_score": 7.5,
      "description": "...",
      "vuln_type": "...",
      "references": [...],
      "published": "2021-10-05",
      "cwe": "CWE-22"
    }
  ]
}
Get database statistics.
Response:
{
  "total": 195000,
  "vuln_types": {
    "Improper Input Validation": 15234,
    "SQL Injection": 12456,
    "Cross-site Scripting": 9876,
    ...
  },
  "severities": {
    "CRITICAL": 3456,
    "HIGH": 24567,
    "MEDIUM": 89234,
    "LOW": 78234
  }
}
Get full details for a specific CVE.
Example: /cve/CVE-2021-41773
Response:
{
  "cve_id": "CVE-2021-41773",
  "description": "...",
  "severity": "HIGH",
  "cvss_score": 7.5,
  "vuln_type": "Path Traversal",
  "cwe": "CWE-22",
  "published": "2021-10-05",
  "modified": "2024-11-21",
  "references": [...]
}
To retrain the ML model with updated CVE data:
- Update the raw CVE files in raw_data/
- Run the cleaning script:
  python clean_raw.py
- Run the training notebook:
  jupyter notebook train_model.ipynb
- Replace model/model.pkl and model/vectorizer.pkl with updated versions
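The notebook itself is not reproduced in this README, so the following is only a sketch of what a training and export step of this kind might look like with scikit-learn and joblib. The choice of `LogisticRegression` is an assumption (the README says only "ML Classification Model"), the training texts are made up, and the files are written to a temporary directory rather than the real model/ folder.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny made-up training set; real training would use the cleaned CVE descriptions.
texts = [
    "sql injection in login form allows database access",
    "sql injection via search parameter",
    "buffer overflow in image parser causes crash",
    "heap buffer overflow when decoding archive",
    "cross-site scripting in comment field",
    "stored cross-site scripting in profile page",
]
labels = [
    "SQL Injection", "SQL Injection",
    "Buffer Overflow", "Buffer Overflow",
    "Cross-site Scripting", "Cross-site Scripting",
]

# Fit the vectorizer and the classifier (classifier choice is an assumption).
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
features = vectorizer.fit_transform(texts)
model = LogisticRegression(max_iter=1000)
model.fit(features, labels)

# Serialize both artifacts under the file names used in this README.
out_dir = Path(tempfile.mkdtemp()) / "model"
out_dir.mkdir(parents=True)
joblib.dump(model, str(out_dir / "model.pkl"))
joblib.dump(vectorizer, str(out_dir / "vectorizer.pkl"))

# Reload and predict, the way a server could at query time.
loaded_model = joblib.load(str(out_dir / "model.pkl"))
loaded_vectorizer = joblib.load(str(out_dir / "vectorizer.pkl"))
query_features = loaded_vectorizer.transform(["sql injection in admin panel"])
probabilities = loaded_model.predict_proba(query_features)[0]
best = probabilities.argmax()
predicted_type = loaded_model.classes_[best]
confidence = round(float(probabilities[best]) * 100, 1)  # percent, as in the API example
```

The vectorizer must be saved alongside the model because queries have to be transformed with the same vocabulary the classifier was trained on.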
To change the Flask server port, modify app.py:
if __name__ == "__main__":
    Timer(1.5, open_browser).start()
    app.run(debug=True, port=8000)  # Change port here
- National Vulnerability Database (NVD): https://nvd.nist.gov/
- CVE Dataset: JSON format from NVD API
- Time Coverage: 2011 - 2026
- Update Frequency: Latest CVE data available
Available on Kaggle: 2021-2025 All CVEs Cleaned Dataset
To use the Kaggle dataset:
- Download from the link above
- Place cve_enriched.json in the clean_data/ folder
- Run the application as normal
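Before starting the application, it can help to confirm that the dataset file exists and parses as JSON. The helper below is a hypothetical sketch: `check_dataset` is not part of the project, and because this README does not describe the file's internal layout, it only checks that the top level is a JSON array or object. The demo writes a small made-up file to a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def check_dataset(path):
    """Return the number of top-level entries; raise if the file is missing or invalid."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"Dataset not found: {file_path}")
    with file_path.open(encoding="utf-8") as handle:
        data = json.load(handle)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, (list, dict)):
        raise ValueError("Expected a JSON array or object at the top level")
    return len(data)


# Demo with a made-up two-entry file; for the real file, pass
# "clean_data/cve_enriched.json" instead.
demo_dir = Path(tempfile.mkdtemp())
demo_file = demo_dir / "cve_enriched.json"
demo_file.write_text(
    json.dumps([{"cve_id": "CVE-2099-0001"}, {"cve_id": "CVE-2099-0002"}]),
    encoding="utf-8",
)
entry_count = check_dataset(demo_file)
```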
- Desktop optimized (960px max-width)
- Mobile-friendly layout
- Touch-friendly buttons and controls
- Severity Colors: Critical (Red), High (Pink), Medium (Orange), Low (Green)
- ML Confidence Bar: Visual representation of model confidence
- Scanline Effect: Cyberpunk aesthetic with animated scanlines
- Grid Background: Tech-themed grid pattern
- Low-light cyberpunk aesthetic
- Easy on the eyes for extended use
- High contrast for accessibility
- Color-coded information (Cyan for primary, Pink for danger, Green for success)
See requirements.txt:
- Flask
- Flask-CORS
- scikit-learn
- joblib
Install all with:
pip install -r requirements.txt
Solution: Ensure Flask is running
python app.py
Solution: Check paths in app.py match your folder structure:
model = joblib.load("model/model.pkl")
vectorizer = joblib.load("model/vectorizer.pkl")
Solution: Verify clean_data/cve_enriched.json exists and is valid JSON
Solution: Change port in app.py:
app.run(debug=True, port=8000)
- Backend CORS enabled for frontend communication
- No authentication required (for local/trusted networks)
- For production: Add authentication, HTTPS, and security headers
- Validate and sanitize all user inputs
- Keep CVE database updated regularly
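As one illustration of the input validation recommended above, the sketch below normalizes a search query and checks CVE identifiers before they reach the search or detail endpoints. The helper names, the 200-character limit, and the allowed character set are assumptions, not behavior of app.py. The CVE ID pattern follows the standard `CVE-YYYY-NNNN` syntax with four or more digits in the sequence number.

```python
import re

MAX_QUERY_LENGTH = 200  # assumed limit; tune for your deployment
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)
# Anything outside word characters, whitespace, and a few punctuation marks
# common in product names and versions is dropped (assumed allow-list).
DISALLOWED_CHARS = re.compile(r"[^\w\s.\-+/:]")


def sanitize_query(raw):
    """Strip disallowed characters, collapse whitespace, and enforce a length limit."""
    if not isinstance(raw, str):
        raise ValueError("query must be a string")
    cleaned = DISALLOWED_CHARS.sub("", raw)
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        raise ValueError("query is empty after sanitization")
    if len(cleaned) > MAX_QUERY_LENGTH:
        raise ValueError("query is too long")
    return cleaned


def is_valid_cve_id(value):
    """Return True if value is a well-formed CVE identifier."""
    return bool(CVE_ID_PATTERN.fullmatch(value.strip()))


clean = sanitize_query("  apache   2.4.49 ")
```

Server-side checks like these complement, rather than replace, output escaping in the frontend.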
- Search Speed: < 100ms for 195,000+ CVEs
- ML Inference: < 50ms per query
- Memory Usage: ~500MB with full dataset loaded
- Browser Load Time: < 2 seconds
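The figures above will vary with hardware and dataset size. To measure on your own machine, a small timing harness such as the following can be used. It is a sketch: `measure_latency_ms` is a hypothetical helper, and the workload is a stand-in substring scan over synthetic strings, not the application's actual search code.

```python
import statistics
import time


def measure_latency_ms(func, *args, repeats=20):
    """Return the median wall-clock time of func(*args) in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload: substring scan over synthetic descriptions.
descriptions = [
    f"synthetic description {i} for product-{i % 50}" for i in range(50_000)
]


def keyword_scan(keyword):
    """Return every synthetic description containing the keyword."""
    return [d for d in descriptions if keyword in d]


latency_ms = measure_latency_ms(keyword_scan, "product-7")
```

Using the median rather than the mean keeps one slow run (for example, during garbage collection) from skewing the result.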
Found a bug or have a feature request?
- Review existing issues
- Provide clear description and reproduction steps
- Include environment details (Python version, OS, etc.)
This project is licensed under the MIT License.
See the LICENSE file for full license text.
MIT License
Copyright (c) 2026 Ben
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions...
For branding guidelines, color schemes, and design specifications, see BRAND.md.
Ben
Security Researcher & Software Developer
- Portfolio: https://www.benslates.xyz
- Year: 2026
- License: MIT
For questions, issues, or feedback:
- Check this README
- Review the troubleshooting section
- Contact via portfolio: https://www.benslates.xyz
- National Vulnerability Database (NVD) - Data source
- NIST - CVE standards and classifications
- scikit-learn - ML toolkit
- Flask - Web framework
- Open Source Community - Tools and inspiration
- Advanced filtering and faceted search
- Custom dashboards and reports
- CVE trend analysis and predictions
- Integration with security tools (Nessus, Qualys)
- Real-time CVE feed alerts
- Multi-language support
- Dark/Light theme toggle
- Export results (PDF, CSV, JSON)
- User accounts and saved searches
Last Updated: April 2026
Status: Active Development
Maintained By: Ben
Made with ❤️ by Ben
Empowering security professionals with AI-powered vulnerability intelligence