A real-time threat intelligence dashboard that fuses live NIST CVE data with your asset inventory, flags matches, and uses AI to explain the risk. Built on Pathway for streaming, powered by Gemini for analysis.
In cybersecurity, time is the enemy. Traditional vulnerability management relies on:
- Polling/Batching: Checking for updates every few hours or days.
- Manual Matching: Spreadsheets to cross-reference thousands of CVEs with internal assets.
- Slow Context: Reading technical NVD descriptions to understand if a "High" severity actually affects you.
Zero-Day Cyber Sentinel solves this by moving from static "management" to real-time intelligence. It ingests new CVEs as soon as they are published, filters them against your specific inventory, and uses AI to answer "Why does this matter?" right away.
The system operates as a continuous streaming pipeline:
graph LR
A[NIST API / Manual] -->|Stream| B(stream_generator.py)
B -->|JSONL| C{Pathway Engine}
D[Inventory.csv] -->|Static Table| C
C -->|Join & AI Filter| E(Gemini 1.5 Flash)
E -->|Analyzed Alerts| F[alerts.jsonl]
F -->|Poll| G[Streamlit Dashboard]
- Ingestion ("The Ear"): stream_generator.py continually polls the NIST NVD API for new vulnerability data. It also listens for manual overrides (Simulated Zero-Days).
- Processing ("The Brain"): logic.py (Pathway) reads this stream in real-time. It performs a streaming join with your inventory.csv. Only threats that match your software stack proceed.
- Analysis ("The Analyst"): Matched threats are sent to Google Gemini. The AI reads the technical CVE description and generates a concise, human-readable risk assessment.
- Visualization ("The Face"): app.py (Streamlit) watches the output and updates the dashboard instantly, alerting operators with toast notifications and emails.
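As an illustration of the ingestion step, here is a minimal sketch of how a poll request to the NVD CVE API 2.0 could be assembled. It is not taken from stream_generator.py: the function name, the ten-minute lookback, and the timestamp format are assumptions, so check the NVD documentation for the exact date format and rate limits before relying on it. The sketch only builds the URL and headers and does not touch the network.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Public NVD CVE API 2.0 endpoint.
NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(now, lookback_minutes=10, api_key=None):
    """Return (url, headers) asking for CVEs published in the lookback window.

    `build_nvd_request` and `lookback_minutes` are hypothetical names used
    for illustration only.
    """
    start = now - timedelta(minutes=lookback_minutes)
    fmt = "%Y-%m-%dT%H:%M:%S.000"  # assumed timestamp format, verify with NVD docs
    params = {
        "pubStartDate": start.strftime(fmt),
        "pubEndDate": now.strftime(fmt),
    }
    # The optional API key is sent as a request header and raises the rate limit.
    headers = {"apiKey": api_key} if api_key else {}
    return f"{NVD_URL}?{urlencode(params)}", headers


# Example with a fixed timestamp so the result is reproducible.
url, headers = build_nvd_request(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
```

The returned pair can be handed to any HTTP client. Keeping request construction separate from the network call makes the polling loop easy to test offline.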
Pathway is the engine that makes this "Real-Time".
- Streaming Joins: Instead of querying a database for every new CVE, Pathway maintains a state of the inventory.csv and "flows" the new CVEs through it.
- Windowing/Temporal Logic: It handles the complexities of time-series data (like deduping recent events) without complex database queries.
- Python-Native: The entire logic pipeline is written in standard Python, but executes on Pathway's Rust engine.
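To make the streaming-join idea concrete, here is a plain-Python sketch of the same pattern: the inventory is held in memory and each incoming CVE event is checked against it once. This is not Pathway's API and not the code in logic.py. The function name, the event fields (`cve_id`, `product`, `summary`), and the placeholder CVE IDs are assumptions made for illustration.

```python
def match_stream(cve_events, inventory):
    """Yield an alert for each CVE event whose product is in the inventory.

    inventory maps a lowercase product name to its inventory row.
    Each CVE id is processed at most once (a simple form of deduplication).
    """
    seen = set()
    for event in cve_events:
        cve_id = event["cve_id"]
        if cve_id in seen:
            continue  # duplicate event, already handled
        seen.add(cve_id)
        asset = inventory.get(event["product"].lower())
        if asset is not None:
            # Enrich the event with the matching asset's details.
            yield {**event, "version": asset["version"], "vendor": asset["vendor"]}


# Hypothetical sample data; the CVE ids are placeholders, not real entries.
inventory = {"nginx": {"version": "1.18.0", "vendor": "nginx"}}
events = [
    {"cve_id": "CVE-0000-0001", "product": "Nginx", "summary": "placeholder"},
    {"cve_id": "CVE-0000-0002", "product": "redis", "summary": "placeholder"},
    {"cve_id": "CVE-0000-0001", "product": "nginx", "summary": "placeholder"},
]
alerts = list(match_stream(events, inventory))
```

Because `match_stream` is a generator, it works on an unbounded event source as well as on a list. Pathway performs the equivalent bookkeeping inside its engine and updates results incrementally.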
- Hybrid Data Stream: We mix real NIST data with simulated "fake" threats. This ensures the dashboard is always alive for demos, even if NIST is quiet.
- JSONL as IPC: We use stream.jsonl and alerts.jsonl as simple, file-based queues between processes. This decouples the components, making it easy to debug (you can just tail the file) and restart individual services without crashing the whole stack.
- AI-First Analysis: We don't just show the CVSS score. We force Gemini to explain why it's a risk. This bridges the gap between "CVE-2024-1234" and "Update your Nginx server now."
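The file-based queue pattern can be sketched with the standard library alone. The helper names below are hypothetical and the project's own reader and writer may differ. The producer appends one JSON object per line, and the consumer remembers a byte offset so that each poll returns only the records added since the previous one.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Producer side: append one JSON object as a single line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_new_records(path, offset=0):
    """Consumer side: return (records, new_offset) for lines after offset."""
    records = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            offset = f.tell()
            if line.strip():
                records.append(json.loads(line))
    return records, offset


# Demo in a temporary directory so no project files are touched.
with tempfile.TemporaryDirectory() as tmp:
    queue = os.path.join(tmp, "stream.jsonl")
    append_record(queue, {"id": 1})
    first, offset = read_new_records(queue)
    append_record(queue, {"id": 2})
    second, offset = read_new_records(queue, offset)
```

Skipping a trailing partial line means a consumer that polls mid-write picks the record up on its next pass instead of failing on incomplete JSON.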
Follow these steps to get the full system running locally.
- Python 3.10+
- A Google Cloud Project with Gemini API Access
- (Optional) NIST NVD API Key for higher rate limits
Clone the repo and enter the directory:
git clone https://github.com/YourUsername/DataQuest.git
cd DataQuest
Create your environment file:
# Windows (PowerShell)
copy .env.example .env
# Linux/Mac
cp .env.example .env
Edit .env and paste your GEMINI_API_KEY.
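A quick way to confirm that the key was actually filled in is to parse the file and check for empty values. The sketch below uses only the standard library. The project itself may load .env differently (for example with a dotenv package), and `NVD_API_KEY` is a hypothetical variable name used only in the sample text.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=("GEMINI_API_KEY",)):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Sample content resembling a freshly copied, unedited file (hypothetical).
sample = "# API credentials\nGEMINI_API_KEY=\nNVD_API_KEY=abc123\n"
missing = missing_keys(parse_env(sample))
```

For a real check, read the file with `open(".env", encoding="utf-8").read()` and pass the text to `parse_env`.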
It is recommended to use a virtual environment.
python -m venv venv
# Windows
.\venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
pip install -r requirements.txt
Edit inventory.csv to match your "protected" assets.
product,version,vendor
nginx,1.18.0,nginx
tensorflow,2.4.0,google
...
Note: The system only alerts on CVEs that match these products.
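Since matching depends entirely on this file, it can help to validate it before starting the pipeline. The following is a small sketch, assuming the three-column header shown above. The function name is hypothetical, and the lowercase normalization is an assumption rather than the project's confirmed matching rule.

```python
import csv
import io

REQUIRED_COLUMNS = ("product", "version", "vendor")


def load_inventory(lines):
    """Read inventory rows into a dict keyed by lowercase product name."""
    reader = csv.DictReader(lines)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"inventory is missing columns: {missing}")
    inventory = {}
    for row in reader:
        # Normalize the product name so lookups are case-insensitive.
        inventory[row["product"].strip().lower()] = {
            "version": row["version"].strip(),
            "vendor": row["vendor"].strip(),
        }
    return inventory


# In-memory sample using the rows shown above.
sample_csv = "product,version,vendor\nnginx,1.18.0,nginx\ntensorflow,2.4.0,google\n"
inventory = load_inventory(io.StringIO(sample_csv))
```

To load the real file, pass an open handle instead: `load_inventory(open("inventory.csv", newline="", encoding="utf-8"))`.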
Because this is a decoupled microservice architecture, you need to run the three components simultaneously. Open 3 separate terminals in the project folder:
Terminal 1: The Data Stream
Fetches data from NIST and writes to stream.jsonl.
python stream_generator.py
Terminal 2: The Logic Engine
Processes the stream using Pathway and Gemini.
python logic.py
Terminal 3: The Dashboard
Launches the UI.
streamlit run app.py
Open your browser to http://localhost:8501. You should see real-time data flowing in the "Live NIST Threat Intelligence" section.
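If juggling three terminals is inconvenient, the same commands can be started from one Python script. This is an optional sketch and not part of the repository. It assumes it is run from the project folder inside the activated virtual environment, and the helper names are hypothetical.

```python
import subprocess
import sys

# The three commands from the terminals above, using the current interpreter.
COMMANDS = [
    [sys.executable, "stream_generator.py"],
    [sys.executable, "logic.py"],
    [sys.executable, "-m", "streamlit", "run", "app.py"],
]


def launch_all(commands=None):
    """Start every component and return the list of Popen handles."""
    return [subprocess.Popen(cmd) for cmd in (commands or COMMANDS)]


def stop_all(processes):
    """Ask every component to terminate, then wait for each to exit."""
    for proc in processes:
        proc.terminate()
    for proc in processes:
        proc.wait()


# Usage (not executed here):
#   procs = launch_all()
#   ...  # work with the dashboard, then:
#   stop_all(procs)
```

Running the components in separate terminals remains the easier option for debugging, because each process keeps its own log output.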
Inject a Critical Threat: To prove the real-time nature without waiting for a real hack:
- Go to the Dashboard sidebar or main area.
- Click "🔥 Inject Demo Threat".
- Watch it appear in the stream, get processed by Pathway and analyzed by Gemini, and pop up as a CRITICAL ALERT within seconds.
Manual Command Line Injection: You can also inject via the terminal:
echo "CRITICAL: Backdoor found in Mainframe!" > manual_input.txt
- stream_generator.py: Data fetcher (NIST + Sim).
- logic.py: Pathway streaming pipeline.
- app.py: Streamlit frontend.
- utils/: Helper modules for API clients.
- stream.jsonl: Raw data buffer.
- alerts.jsonl: Processed data buffer.