ben-slates/CVE-FINDER

CVE HUNTER

ML-Powered Vulnerability Intelligence System


📋 Overview

CVE Hunter is an advanced vulnerability intelligence platform that combines machine learning with a comprehensive CVE database to help security professionals, developers, and system administrators quickly identify and understand security vulnerabilities relevant to their infrastructure.

The system uses Natural Language Processing (NLP) and machine learning classification to predict vulnerability types, alongside keyword search across the National Vulnerability Database (NVD) CVE dataset covering 2011 to 2026.


✨ Key Features

  • 🤖 ML-Powered Classification: Automatically predicts vulnerability types with confidence scoring
  • ⚡ Fast Search: Keyword-based search across 15+ years of CVE data
  • 📊 Advanced Filtering: Filter by severity, vulnerability type, and CVSS score
  • 🔍 Detailed CVE Information:
    • Full CVE descriptions
    • CVSS scores and severity ratings
    • Common Weakness Enumeration (CWE) mapping
    • Publication and modification dates
    • References and external links
  • 💾 Comprehensive Database: 15+ years of standardized CVE data from NVD
  • 🎨 Cyberpunk UI: Modern, fast, responsive web interface with dark theme
  • 📈 Statistics Dashboard: Overview of database coverage and vulnerability trends
  • 🔗 One-Click Expansion: View full details with expandable CVE cards
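The filters listed above can be sketched as one function over the cleaned records. This is an illustration, not the code in app.py: the function name `filter_cves` and its parameters are hypothetical, the field names (`severity`, `vuln_type`, `cvss_score`) are borrowed from the API responses documented in this README, and the sample IDs are placeholders.

```python
from typing import Iterable, Optional


def filter_cves(
    records: Iterable[dict],
    severity: Optional[str] = None,
    vuln_type: Optional[str] = None,
    min_cvss: Optional[float] = None,
) -> list:
    """Return the records that satisfy every filter that was supplied."""
    matches = []
    for rec in records:
        if severity is not None and (rec.get("severity") or "").upper() != severity.upper():
            continue
        if vuln_type is not None and rec.get("vuln_type") != vuln_type:
            continue
        score = rec.get("cvss_score")
        # Records without a score are dropped once a minimum score is requested.
        if min_cvss is not None and (score is None or score < min_cvss):
            continue
        matches.append(rec)
    return matches


# Placeholder records, not real CVE entries.
sample = [
    {"cve_id": "EXAMPLE-0001", "severity": "HIGH", "vuln_type": "Path Traversal", "cvss_score": 7.5},
    {"cve_id": "EXAMPLE-0002", "severity": "LOW", "vuln_type": "Cross-site Scripting", "cvss_score": 3.1},
]
high_only = filter_cves(sample, severity="high", min_cvss=7.0)
print([rec["cve_id"] for rec in high_only])
```

Filters left as `None` are skipped, so calling `filter_cves(sample)` returns every record unchanged.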

🛠️ Tech Stack

Backend

  • Flask - Python web framework
  • scikit-learn - Machine learning and vectorization
  • joblib - Model serialization
  • Flask-CORS - Cross-origin resource sharing (CORS) support for the API

Frontend

  • HTML5 / CSS3 - Modern semantic markup and styling
  • JavaScript (Vanilla) - Dynamic interactions
  • Responsive Design - Works on desktop and mobile

Data & ML

  • JSON - CVE dataset storage
  • TF-IDF Vectorizer - Text feature extraction
  • ML Classification Model - Vulnerability type prediction
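As a rough sketch of how the TF-IDF vectorizer and classifier fit together: the README does not name the classifier, so `LogisticRegression` is an assumption here, and the six training sentences are toy data written for this example rather than NVD descriptions. The helper name `predict_type` is also hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data; the real project trains on NVD descriptions.
descriptions = [
    "SQL injection in login form allows attackers to read the database",
    "SQL injection via the id parameter in search query",
    "Cross-site scripting in comment field allows script injection in the browser",
    "Reflected cross-site scripting via the name parameter",
    "Path traversal allows reading arbitrary files outside the web root",
    "Directory path traversal using ../ sequences in file download",
]
labels = [
    "SQL Injection", "SQL Injection",
    "Cross-site Scripting", "Cross-site Scripting",
    "Path Traversal", "Path Traversal",
]

# Turn each description into a sparse TF-IDF feature vector.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
features = vectorizer.fit_transform(descriptions)

# Assumed model choice; any scikit-learn classifier with predict_proba would fit.
model = LogisticRegression(max_iter=1000)
model.fit(features, labels)


def predict_type(query: str):
    """Return (predicted label, confidence in percent) for a free-text query."""
    probabilities = model.predict_proba(vectorizer.transform([query]))[0]
    best = probabilities.argmax()
    return str(model.classes_[best]), round(float(probabilities[best]) * 100, 1)


label, confidence = predict_type("sql injection in search parameter")
print(label, confidence)
```

The query must pass through the same fitted vectorizer as the training text, which is why the project stores the vectorizer next to the model.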

📦 Project Structure

project/
├── app.py                    # Flask backend & API
├── index.html                # Frontend UI
├── clean_raw.py              # Data cleaning script
├── train_model.ipynb         # ML model training notebook
├── requirements.txt          # Python dependencies
├── model/
│   ├── model.pkl             # Trained ML classifier
│   └── vectorizer.pkl        # TF-IDF vectorizer
├── README.md                 # This file
├── LICENSE                   # MIT License
└── BRAND.md                  # Brand guidelines

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • pip (Python package manager)

Installation

  1. Clone or navigate to the project directory:

    cd ~/Desktop/project
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the application:

    python app.py

    The application will:

    • Start the Flask server on http://localhost:5000
    • Automatically open the web interface in your default browser
    • Load the ML model and CVE dataset

Access the Application

Open http://localhost:5000 in your browser if it does not open automatically.

📖 How to Use

Search for Vulnerabilities

  1. Enter a search query in the search box:

    • Application names: apache, nginx, openssl
    • Software versions: 2.4.49, 1.0.2
    • Well-known vulnerability names: log4j, heartbleed
    • Platforms: windows, linux
  2. Click the [SCAN] button or press Enter

  3. Results display:

    • ML Prediction: Detected vulnerability type with confidence %
    • Matched CVEs: Ranked by relevance
    • Severity Badges: Visual severity indicators (CRITICAL, HIGH, MEDIUM, LOW)
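The README does not say how relevance is computed. One simple possibility, sketched below, is to count how many distinct query terms appear in each record. The function names `tokenize` and `keyword_search` are hypothetical, the description of CVE-2021-41773 is a paraphrase, and the other two records are placeholders.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase, split on anything that is not a letter, digit, or dot."""
    pieces = (tok.strip(".") for tok in re.split(r"[^a-z0-9.]+", text.lower()))
    return [tok for tok in pieces if tok]


def keyword_search(query: str, records: list, limit: int = 50) -> list:
    """Rank records by how many distinct query terms appear in their text."""
    terms = set(tokenize(query))
    scored = []
    for rec in records:
        haystack = set(tokenize(rec.get("cve_id", "") + " " + rec.get("description", "")))
        score = len(terms & haystack)
        if score:
            scored.append((score, rec))
    # Highest score first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:limit]]


records = [
    {"cve_id": "CVE-2021-41773", "description": "Path traversal flaw in Apache HTTP Server 2.4.49."},
    {"cve_id": "EXAMPLE-0001", "description": "Buffer overflow in an nginx module."},
    {"cve_id": "EXAMPLE-0002", "description": "Denial of service in Apache Tomcat."},
]
results = keyword_search("apache 2.4.49", records)
print([rec["cve_id"] for rec in results])
```

Keeping dots inside tokens lets a version string such as 2.4.49 match as a single term instead of three separate numbers.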

View Full Details

  1. Click the ▶ [DETAILS] button on any CVE card

  2. View expanded information:

    • Complete description
    • All references and links
    • CWE classification
    • CVSS score
    • Publication dates
  3. Click again to collapse

Clear Search

  • Click [CLR] button to reset and start a new search

View Statistics

  • Top panel shows:
    • Total CVEs in database
    • Top vulnerability types
    • Critical vulnerability count

🔌 API Endpoints

POST /search

Search for CVEs and get ML predictions.

Request:

{
  "query": "apache 2.4.49"
}

Response:

{
  "query": "apache 2.4.49",
  "predicted_type": "Improper Input Validation",
  "confidence": 92.5,
  "total_matches": 45,
  "results": [
    {
      "cve_id": "CVE-2021-41773",
      "severity": "HIGH",
      "cvss_score": 7.5,
      "description": "...",
      "vuln_type": "...",
      "references": [...],
      "published": "2021-10-05",
      "cwe": "CWE-22"
    }
  ]
}
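A minimal client for this endpoint, using only the standard library. It assumes the server runs at the default address from the Quick Start section and accepts a JSON body with a `Content-Type: application/json` header; the helper names `build_search_request` and `search` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from the Quick Start section


def build_search_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /search request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_search_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_search_request("apache 2.4.49")
print(request.get_method(), request.full_url)
```

With the backend running, `search("apache 2.4.49")["results"]` would give the list of matched CVEs in the shape shown above.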

GET /stats

Get database statistics.

Response:

{
  "total": 195000,
  "vuln_types": {
    "Improper Input Validation": 15234,
    "SQL Injection": 12456,
    "Cross-site Scripting": 9876,
    ...
  },
  "severities": {
    "CRITICAL": 3456,
    "HIGH": 24567,
    "MEDIUM": 89234,
    "LOW": 78234
  }
}
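A response of this shape can be produced from the loaded records with `collections.Counter`. The sketch below is an assumption about how such a summary might be built, not the actual handler in app.py; `build_stats` and its `top_n` parameter are hypothetical.

```python
from collections import Counter


def build_stats(records: list, top_n: int = 10) -> dict:
    """Summarize records into the total / vuln_types / severities shape."""
    severities = Counter((rec.get("severity") or "UNKNOWN").upper() for rec in records)
    vuln_types = Counter(rec.get("vuln_type") or "Unknown" for rec in records)
    return {
        "total": len(records),
        # Keep only the most frequent types so the payload stays small.
        "vuln_types": dict(vuln_types.most_common(top_n)),
        "severities": dict(severities),
    }


sample = [
    {"severity": "HIGH", "vuln_type": "Path Traversal"},
    {"severity": "HIGH", "vuln_type": "SQL Injection"},
    {"severity": "low", "vuln_type": "SQL Injection"},
]
stats = build_stats(sample)
print(stats)
```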

GET /cve/<cve_id>

Get full details for a specific CVE.

Example: /cve/CVE-2021-41773

Response:

{
  "cve_id": "CVE-2021-41773",
  "description": "...",
  "severity": "HIGH",
  "cvss_score": 7.5,
  "vuln_type": "Path Traversal",
  "cwe": "CWE-22",
  "published": "2021-10-05",
  "modified": "2024-11-21",
  "references": [...]
}
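Because the CVE ID arrives as part of the URL, it is worth checking its format before looking it up. The sketch below assumes records are held in memory and indexed by ID; `normalize_cve_id`, `build_index`, and `get_cve` are hypothetical names. The pattern follows the CVE ID syntax: `CVE`, a four-digit year, and a sequence number of four or more digits.

```python
import re
from typing import Optional

CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def normalize_cve_id(raw: str) -> Optional[str]:
    """Return the upper-cased ID if it is well formed, otherwise None."""
    candidate = raw.strip().upper()
    return candidate if CVE_ID_PATTERN.fullmatch(candidate) else None


def build_index(records: list) -> dict:
    """Map each CVE ID to its record for constant-time lookups."""
    return {rec["cve_id"].upper(): rec for rec in records}


def get_cve(raw_id: str, index: dict) -> Optional[dict]:
    """Look up a record; None covers both malformed and unknown IDs."""
    cve_id = normalize_cve_id(raw_id)
    return index.get(cve_id) if cve_id else None


index = build_index([{"cve_id": "CVE-2021-41773", "severity": "HIGH"}])
print(get_cve(" cve-2021-41773 ", index))
```

A web handler built on this would typically turn a `None` result into a 404 response.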

🔧 Configuration

Model Retraining

To retrain the ML model with updated CVE data:

  1. Update the raw CVE files in raw_data/
  2. Run the cleaning script:
    python clean_raw.py
  3. Run the training notebook:
    jupyter notebook train_model.ipynb
  4. Replace model/model.pkl and model/vectorizer.pkl with updated versions
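Step 4 replaces two files because the classifier only understands features produced by the vectorizer it was trained with. A minimal sketch of saving and reloading the pair with joblib follows; the toy data and the `LogisticRegression` model are assumptions, and a temporary directory stands in for the project's model/ folder.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data for illustration only.
texts = ["sql injection in login form", "cross-site scripting in comment field"]
labels = ["SQL Injection", "Cross-site Scripting"]

vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(texts), labels)

# A temporary directory stands in for the project's model/ folder.
model_dir = Path(tempfile.mkdtemp()) / "model"
model_dir.mkdir()
joblib.dump(model, model_dir / "model.pkl")
joblib.dump(vectorizer, model_dir / "vectorizer.pkl")

# Reload both artifacts with joblib.load, as app.py does.
reloaded_model = joblib.load(model_dir / "model.pkl")
reloaded_vectorizer = joblib.load(model_dir / "vectorizer.pkl")
prediction = reloaded_model.predict(reloaded_vectorizer.transform(["sql injection"]))[0]
print(prediction)
```

Swapping in a new model.pkl while keeping an old vectorizer.pkl would usually fail or mispredict, since the feature columns no longer line up.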

Port Configuration

To change the Flask server port, modify app.py:

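from threading import Timer  # needed for the delayed browser launch

if __name__ == "__main__":
    Timer(1.5, open_browser).start()  # open_browser is defined in app.py
    app.run(debug=True, port=8000)  # Change port here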
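An alternative to editing the source each time is to read the port from an environment variable. This is a sketch under assumptions: the variable name `CVE_HUNTER_PORT` and the helper `resolve_port` are hypothetical and not part of the project.

```python
import os

DEFAULT_PORT = 5000  # the port used in the Quick Start section


def resolve_port(environ=None, variable: str = "CVE_HUNTER_PORT") -> int:
    """Read the port from an environment variable, falling back to the default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(variable, "")
    if not raw:
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# In app.py this could replace the hard-coded value:
#     app.run(debug=True, port=resolve_port())
print(resolve_port({"CVE_HUNTER_PORT": "8000"}))
```

If the browser is opened automatically, the URL it opens would need to use the same port.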

📊 Data Sources

  • National Vulnerability Database (NVD): https://nvd.nist.gov/
  • CVE Dataset: JSON format from NVD API
  • Time Coverage: 2011 - 2026
  • Update Frequency: Latest CVE data available
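To illustrate what cleaning NVD JSON can involve, the sketch below flattens one item into the fields used by this project's API. It assumes the NVD CVE API 2.0 layout (`vulnerabilities[].cve` with `descriptions`, `metrics.cvssMetricV31`, `weaknesses`, `references`); whether clean_raw.py works this way is not stated in the README. The helper names are hypothetical, and the sample item is trimmed and uses placeholder timestamps and a placeholder URL.

```python
def english_text(entries: list) -> str:
    """Pick the English entry from a list of {"lang": ..., "value": ...} items."""
    for entry in entries:
        if entry.get("lang") == "en":
            return entry.get("value", "")
    return ""


def flatten_nvd_item(item: dict) -> dict:
    """Reduce one NVD API 2.0 item to a flat record."""
    cve = item["cve"]
    # Only CVSS v3.1 metrics are read here; other versions are ignored.
    metrics = cve.get("metrics", {}).get("cvssMetricV31", [])
    cvss = metrics[0]["cvssData"] if metrics else {}
    weaknesses = cve.get("weaknesses", [])
    cwe = english_text(weaknesses[0].get("description", [])) if weaknesses else ""
    return {
        "cve_id": cve["id"],
        "description": english_text(cve.get("descriptions", [])),
        "severity": cvss.get("baseSeverity"),
        "cvss_score": cvss.get("baseScore"),
        "cwe": cwe or None,
        "published": cve.get("published", "")[:10],  # keep the date part only
        "modified": cve.get("lastModified", "")[:10],
        "references": [ref["url"] for ref in cve.get("references", []) if "url" in ref],
    }


# Trimmed, illustrative item; timestamps and URL are placeholders.
raw_item = {
    "cve": {
        "id": "CVE-2021-41773",
        "published": "2021-10-05T00:00:00.000",
        "lastModified": "2024-11-21T00:00:00.000",
        "descriptions": [{"lang": "en", "value": "..."}],
        "metrics": {"cvssMetricV31": [{"cvssData": {"baseScore": 7.5, "baseSeverity": "HIGH"}}]},
        "weaknesses": [{"description": [{"lang": "en", "value": "CWE-22"}]}],
        "references": [{"url": "https://example.org/advisory"}],
    }
}
record = flatten_nvd_item(raw_item)
print(record["cve_id"], record["severity"], record["cwe"])
```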

🗂️ Download Pre-processed Dataset

Available on Kaggle: 2021-2025 All CVEs Cleaned Dataset

To use the Kaggle dataset:

  1. Download from the link above
  2. Place cve_enriched.json in the clean_data/ folder
  3. Run the application as normal

🎨 UI Features

Responsive Design

  • Desktop optimized (960px max-width)
  • Mobile-friendly layout
  • Touch-friendly buttons and controls

Visual Indicators

  • Severity Colors: Critical (Red), High (Pink), Medium (Orange), Low (Green)
  • ML Confidence Bar: Visual representation of model confidence
  • Scanline Effect: Cyberpunk aesthetic with animated scanlines
  • Grid Background: Tech-themed grid pattern
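The severity labels correspond to the CVSS v3.x qualitative rating scale (Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0). The sketch below pairs that scale with the badge colors listed above. It is only an illustration: the project may read the severity straight from NVD instead of deriving it, the function names are hypothetical, and the `grey` fallback for a score of 0.0 is an assumption.

```python
# Colors as listed in the Visual Indicators section.
SEVERITY_COLORS = {"CRITICAL": "red", "HIGH": "pink", "MEDIUM": "orange", "LOW": "green"}


def severity_from_cvss(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


def badge(score: float) -> tuple:
    """Return (severity, color) for a score; grey is an assumed fallback."""
    severity = severity_from_cvss(score)
    return severity, SEVERITY_COLORS.get(severity, "grey")


print(badge(7.5))
```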

Dark Theme

  • Low-light cyberpunk aesthetic
  • Easy on the eyes for extended use
  • High contrast for accessibility
  • Color-coded information (Cyan for primary, Pink for danger, Green for success)

⚙️ Dependencies

See requirements.txt:

  • Flask
  • Flask-CORS
  • scikit-learn
  • joblib

Install all with:

pip install -r requirements.txt

🐛 Troubleshooting

Issue: "Backend not running" error

Solution: Ensure Flask is running

python app.py

Issue: Model files not found

Solution: Check paths in app.py match your folder structure:

model      = joblib.load("model/model.pkl")
vectorizer = joblib.load("model/vectorizer.pkl")

Issue: CVE data not loading

Solution: Verify clean_data/cve_enriched.json exists and is valid JSON
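The two checks above (model files present, dataset is valid JSON) can be combined into a small preflight script. The paths come from this README; the function name `preflight` is hypothetical, and the demo builds a throwaway folder with a deliberately broken dataset so the output is visible without touching the real project.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_FILES = [
    "model/model.pkl",
    "model/vectorizer.pkl",
    "clean_data/cve_enriched.json",
]


def preflight(project_root) -> list:
    """Return human-readable problems; an empty list means nothing was found."""
    root = Path(project_root)
    problems = [f"missing file: {rel}" for rel in REQUIRED_FILES if not (root / rel).is_file()]
    dataset = root / "clean_data" / "cve_enriched.json"
    if dataset.is_file():
        try:
            with dataset.open(encoding="utf-8") as handle:
                json.load(handle)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON in {dataset.name}: {exc.msg} (line {exc.lineno})")
    return problems


# Demo: a throwaway folder with no model files and a broken dataset.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "clean_data").mkdir()
(demo_root / "clean_data" / "cve_enriched.json").write_text("{not valid json", encoding="utf-8")
problems = preflight(demo_root)
for problem in problems:
    print(problem)
```

Running `preflight(".")` from the project directory before `python app.py` would report the same kinds of problems for the real files.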

Issue: Port 5000 already in use

Solution: Change port in app.py:

app.run(debug=True, port=8000)
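To see whether a port is taken before changing app.py, a bind attempt is a quick test. The helper names below are hypothetical and the candidate ports are only examples. On recent macOS versions, the AirPlay Receiver service is a common holder of port 5000.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failed bind means something else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates=(5000, 8000, 8080)):
    """Return the first candidate that can be bound, or None if all are busy."""
    for port in candidates:
        if port_is_free(port):
            return port
    return None


# Demo: occupy a port chosen by the OS, then probe it while it is held.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))  # port 0 lets the OS pick any free port
holder.listen(1)
busy_port = holder.getsockname()[1]
free_while_held = port_is_free(busy_port)
holder.close()
print(busy_port, free_while_held)
```

The result only describes the moment of the check; another process can still claim the port before Flask starts.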

🔐 Security Considerations

  • CORS is enabled on the backend so the frontend can call the API
  • No authentication is implemented; run the app only on local or trusted networks
  • For production: add authentication, HTTPS, and security headers, and turn off Flask debug mode (app.py runs with debug=True)
  • Validate and sanitize all user inputs
  • Keep the CVE database updated regularly
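As one example of input validation, a search query can be normalized and bounded before it reaches the search or the model. This is a sketch, not the project's code: `clean_query`, `is_valid_query`, and the 200-character limit are hypothetical choices.

```python
import unicodedata

MAX_QUERY_LENGTH = 200  # hypothetical limit, tune to taste


def clean_query(raw) -> str:
    """Normalize a search query or raise ValueError if it is unusable."""
    if not isinstance(raw, str):
        raise ValueError("query must be a string")
    # Replace control and format characters (Unicode categories C*) with spaces.
    visible = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in raw
    )
    query = " ".join(visible.split())  # collapse runs of whitespace
    if not query:
        raise ValueError("query must not be empty")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(f"query longer than {MAX_QUERY_LENGTH} characters")
    return query


def is_valid_query(raw) -> bool:
    """Convenience wrapper that reports validity instead of raising."""
    try:
        clean_query(raw)
    except ValueError:
        return False
    return True


print(repr(clean_query("  apache\t2.4.49\x00 ")))
```

Text that is echoed back into the page should also be HTML-escaped on the frontend; that part is outside this sketch.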

📈 Performance

  • Search Speed: < 100ms for 195,000+ CVEs
  • ML Inference: < 50ms per query
  • Memory Usage: ~500MB with full dataset loaded
  • Browser Load Time: < 2 seconds

🤝 Contributing

Found a bug or have a feature request?

  1. Review existing issues
  2. Provide clear description and reproduction steps
  3. Include environment details (Python version, OS, etc.)

📜 License

This project is licensed under the MIT License.

See the LICENSE file for full license text.

MIT License

Copyright (c) 2026 Ben

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions...

🎨 Brand & Design

For branding guidelines, color schemes, and design specifications, see BRAND.md.


👨‍💼 Author

Ben
Security Researcher & Software Developer


📞 Support

For questions, issues, or feedback:

  1. Check this README
  2. Review the troubleshooting section
  3. Contact via portfolio: https://www.benslates.xyz

🙏 Acknowledgments

  • National Vulnerability Database (NVD) - Data source
  • NIST - CVE standards and classifications
  • scikit-learn - ML toolkit
  • Flask - Web framework
  • Open Source Community - Tools and inspiration

🔮 Future Roadmap

  • Advanced filtering and faceted search
  • Custom dashboards and reports
  • CVE trend analysis and predictions
  • Integration with security tools (Nessus, Qualys)
  • Real-time CVE feed alerts
  • Multi-language support
  • Dark/Light theme toggle
  • Export results (PDF, CSV, JSON)
  • User accounts and saved searches

Last Updated: April 2026
Status: Active Development
Maintained By: Ben


Made with ❤️ by Ben

Empowering security professionals with AI-powered vulnerability intelligence

About

ML-powered CVE vulnerability intelligence platform with 15+ years of NVD data & real-time security analysis
