aniliter-cloud/CyberSecurityAssignment

Spectre Attack — Interactive Demo + LSTM Detection Backend

CVE-2017-5753 · Educational / POC Use Only

A fully interactive cybersecurity assignment that walks through the Spectre microarchitectural attack: from theory and analogy, through live browser-based exploitation demos, to a real LSTM-powered AI/ML defence backend running locally via Flask.


Intent & Purpose

This project was built to demonstrate, end-to-end, how a modern CPU side-channel attack works — and how Machine Learning can be used to detect it in real time.

The learning journey is structured deliberately:

  1. Understand the attack — theory, CPU internals, the speculative execution flaw
  2. Experience it as the attacker — three interactive browser demos (cache timing, branch training, probe scan)
  3. See the defence — a real trained LSTM model, served via Flask, classifying Hardware Performance Counter streams tick-by-tick as normal or attack

The project bridges low-level CPU microarchitecture (cache lines, branch predictors, TLB flushes) with applied AI/ML (LSTM sequence classification, HPC feature engineering, Flask REST APIs) — making both concepts tangible and hands-on.


System Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                        BROWSER (Frontend)                           │
│                                                                     │
│   index_updated.html                                                │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │   Theory │  HP Analogy │  Demo1 │  Demo2 │  Demo3   │       │
│   │   PoC Code │  AI/ML Solution Tab                         │  │
│   │                                                             │  │
│   │   JavaScript                                                │  │
│   │   ┌──────────────────────────────────────────────────────┐  │  │
│   │   │  const LSTM_API = 'http://localhost:5000'            │  │  │
│   │   │  GET  /health   → verify server alive                │  │  │
│   │   │  POST /simulate → get tick-by-tick LSTM scores       │  │  │
│   │   └───────────────────────┬──────────────────────────────┘  │  │
│   └───────────────────────────│─────────────────────────────────┘  │
└───────────────────────────────│─────────────────────────────────────┘
                                │  HTTP (localhost:5000)
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     BACKEND (Python / Flask)                        │
│                                                                     │
│   server.py                                                         │
│   ┌─────────────────────────────────────────────────────────────┐  │
│   │  GET  /health    → model status, threshold, features        │  │
│   │  POST /simulate  → generate synthetic HPC stream + predict  │  │
│   │  POST /predict   → classify custom 10×4 sequence            │  │
│   └───────────────────────────┬─────────────────────────────────┘  │
│                               │ instantiates                        │
│   lstm_loader.py              ▼                                     │
│   ┌─────────────────────────────────────────────────────────────┐  │
│   │  LSTMSpectreDetector                                        │  │
│   │  ├── joblib.load("lstm_scaler.pkl")                         │  │
│   │  │     scaler · seq_len=10 · n_features=4 · threshold       │  │
│   │  └── tf.keras.models.load_model("lstm_spectre_model.keras") │  │
│   │        LSTM(64) → BN → LSTM(32) → BN → Dense(16) → sigmoid │  │
│   └─────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
                                ▲
                                │  generated by
┌─────────────────────────────────────────────────────────────────────┐
│                     TRAINING (run once)                             │
│                                                                     │
│   lstm_train.py                                                     │
│   ┌─────────────────────────────────────────────────────────────┐  │
│   │  Synthetic HPC Data Generation                              │  │
│   │  ├── 3,000 × Normal sequences   (low uniform noise)         │  │
│   │  └── 3,000 × Attack sequences   (3-phase Spectre pattern)   │  │
│   │        Phase 1: branch_mispredict SPIKE  (ticks 0–3)        │  │
│   │        Phase 2: cache_miss EXPLODES      (ticks 3–6)        │  │
│   │        Phase 3: stride + probe PEAK      (ticks 6–9)        │  │
│   │                                                             │  │
│   │  MinMaxScaler.fit() → lstm_scaler.pkl                       │  │
│   │  model.fit()        → lstm_spectre_model.keras              │  │
│   │  ROC curve          → optimal threshold → lstm_scaler.pkl   │  │
│   └─────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Project Structure

CyberSecurityAssignment/
│
├── index.html                          # Original demo (standalone, no backend)
├── spectre_attack_enhanced.html        # Enhanced standalone demo (no backend needed)
├── README.md                           # This file
│
├── css/
│   └── style.css                       # Shared stylesheet
│
├── js/
│   ├── main.js                         # Tab switching & utilities
│   ├── cache-timing.js                 # Demo 1: Cache hit/miss timing
│   ├── branch-train.js                 # Demo 2: Branch predictor training
│   ├── probe-scan.js                   # Demo 3: Probe array scan
│   └── aiml-radar.js                   # AI/ML detection animation
│
└── LSTM_model_invoke/
    ├── index_updated.html              # Main app — connects to LSTM backend
    ├── lstm_train.py                   # Trains the LSTM — run once
    ├── generate_data.py                # Documents + regenerates lstm_scaler.pkl
    ├── lstm_loader.py                  # Model wrapper / inference interface
    ├── server.py                       # Flask REST API (port 5000)
    ├── lstm_spectre_model.keras        # Trained LSTM model artifact
    └── lstm_scaler.pkl                 # MinMaxScaler + threshold + metadata

Quick Start — Two Modes

Mode 1 — Standalone (no install, no backend)

Open either HTML file directly in any modern browser:

open spectre_attack_enhanced.html       # macOS
start spectre_attack_enhanced.html      # Windows
xdg-open spectre_attack_enhanced.html  # Linux

Or drag-and-drop into Chrome / Firefox / Edge / Safari.

All 7 tabs work. The AI/ML tab runs a hardcoded offline fallback simulation when the Flask server is not running.

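Note: Browsers expose SharedArrayBuffer only to cross-origin-isolated pages (served with COOP/COEP headers), so it is typically unavailable when the file is opened directly. The simulations are plain JavaScript and work without it; only a real in-browser exploit, which would use SharedArrayBuffer to build a high-resolution timer, would need it.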


Mode 2 — Full Live LSTM Backend

Requirements

  • Python 3.8–3.11
  • pip

Setup

# 1. Navigate to the backend folder
cd CyberSecurityAssignment/LSTM_model_invoke

# 2. Create and activate virtual environment
python -m venv venv
source venv/bin/activate          # macOS / Linux
venv\Scripts\activate             # Windows

# 3. Install dependencies
pip install flask flask-cors tensorflow==2.13 scikit-learn joblib numpy

# 4. First time only — train the LSTM model (~2–5 min)
python lstm_train.py
# Outputs: lstm_spectre_model.keras + lstm_scaler.pkl

# 5. Start the Flask server
python server.py
# Expected: [Server] LSTM ready. Listening on http://localhost:5000

Open the frontend

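open LSTM_model_invoke/index_updated.html   # from the repository root; if still inside LSTM_model_invoke after step 1, use: open index_updated.html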

Click "AI/ML Solution" → "▶ Simulate Attack + AI Detection" to see the real LSTM running live.


Tab Structure & Flow

Follow the tabs in order — attacker first, defender last.

Theory →  HP Analogy →  Demo 1 →  Demo 2 →  Demo 3 →  PoC Code →  AI/ML Solution
  (what)     (why it works)  (timing)   (predictor)  (probe scan) (exploit)      (LSTM defence)
| Tab | What it covers |
| --- | --- |
| Theory | Spectre mechanics: speculative execution, cache side-channels, what can be stolen |
| HP Analogy | Hermione & Prof. Snape SVG scenes mapping every CPU concept to Hogwarts |
| Demo 1: Cache Timing | Live cache HIT vs MISS measurement using performance.now() |
| Demo 2: Branch Training | Animated branch predictor confidence bar: training phase, then attack phase |
| Demo 3: Probe Scan | Full Flush+Reload simulation across all 256 probe array entries |
| Full PoC Code | Annotated JavaScript Spectre Variant 1 exploit with mitigation patterns |
| AI/ML Solution | Four ML approaches + interactive live LSTM detection via Flask API |
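
The training-then-attack behaviour that Demo 2 animates can be approximated with a 2-bit saturating counter, a textbook branch predictor model. The sketch below is an illustration under that assumption only; it is not the logic in branch-train.js, and the class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter: states 0-1 predict not-taken, 2-3 predict taken."""

    def __init__(self) -> None:
        self.state = 0  # start at "strongly not taken"

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def run_training_then_attack(array_size: int = 16, training_rounds: int = 5) -> bool:
    """Train with in-bounds indices, then test the prediction for an out-of-bounds index.

    Returns True when the predictor says "in bounds" although the index is not,
    which is the misprediction that opens a speculative window.
    """
    predictor = TwoBitPredictor()
    for i in range(training_rounds):
        index = i % array_size                 # always in bounds
        predictor.update(index < array_size)   # the bounds check passes every time

    malicious_index = array_size + 100         # attack phase: out of bounds
    predicted_in_bounds = predictor.predict()
    actually_in_bounds = malicious_index < array_size
    return predicted_in_bounds and not actually_in_bounds
```

With the default five training rounds the counter saturates at "strongly taken", so the out-of-bounds access is mispredicted; with a single round it is not.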

Features

  • Zero dependencies for standalone mode — single HTML file, pure HTML/CSS/JS
  • SVG cartoon characters — hand-drawn Hermione and Prof. Snape illustrate speculative execution
  • Three interactive demos — clickable, animated, real-time cache/branch/probe simulations
  • Real LSTM backend — Flask API serving a trained Keras model, tick-by-tick anomaly scoring
  • Offline fallback — AI/ML tab degrades gracefully to hardcoded simulation if server is down
  • Narrative flow — tabs ordered so you experience the attack before the defence
  • Dark theme — GitHub-inspired colour palette, Consolas monospace throughout
  • Mobile-friendly — responsive grid, scrollable nav

LSTM Backend — How It Works

Data Source

There is no real-world dataset. All training data is fully synthetic, generated by lstm_train.py to simulate Hardware Performance Counter (HPC) readings from a CPU under normal and Spectre attack conditions.

4 features per time-tick (normalised 0.0–1.0):

| Feature | Description | Normal | Attack Peak |
| --- | --- | --- | --- |
| cache_miss_rate | Fraction of L1/L2 cache misses | 2–8% | 40–85% |
| branch_mispredict_rate | Fraction of CPU branch mispredictions | 1–5% | 15–35% |
| stride_pattern_score | Regularity of 512-byte strided accesses | 0–4% | 70–98% |
| probe_array_scan | Burst of sequential probe array accesses | 0–3% | 65–95% |
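
The 0.0–1.0 normalisation is what a min-max scaler with its default output range computes: each feature is mapped to (x − min) / (max − min) using the minimum and maximum seen during fitting. The pure-Python sketch below illustrates that arithmetic; the function names are hypothetical, and the project itself relies on scikit-learn's MinMaxScaler rather than this code.

```python
from typing import List, Tuple

Row = List[float]


def fit_min_max(rows: List[Row]) -> Tuple[Row, Row]:
    """Return per-feature minimums and maximums observed in the training rows."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def transform_min_max(rows: List[Row], mins: Row, maxs: Row) -> List[Row]:
    """Scale each feature to [0, 1] using (x - min) / (max - min)."""
    scaled = []
    for row in rows:
        scaled.append([
            # A constant feature has no spread, so map it to 0.0 instead of dividing by zero.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled
```

Values outside the fitted range map outside [0, 1], which is why the same fitted scaler has to be reused at inference time.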

Attack signature — 3 phases across 10 ticks:

Ticks 0–3  │ Phase 1: Predictor Training  → branch_mispredict SPIKES  (15–30%)
Ticks 3–6  │ Phase 2: Flush + Trigger     → cache_miss EXPLODES        (55–85%)
Ticks 6–9  │ Phase 3: Probe Array Scan    → stride + probe PEAK        (70–98%)

Dataset: 6,000 sequences (3,000 normal + 3,000 attack), balanced, shape (6000, 10, 4).
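
A minimal sketch of how such a balanced synthetic dataset could be generated is shown below. It uses only the standard library and uniform noise; the value ranges come from the tables above, while the function names, the seed, and the choice to treat the phase boundaries (ticks 3 and 6) as belonging to both neighbouring phases are assumptions, not a copy of lstm_train.py.

```python
import random
from typing import List, Tuple

SEQ_LEN = 10
# Feature order: cache_miss_rate, branch_mispredict_rate,
#                stride_pattern_score, probe_array_scan
NORMAL_RANGES = [(0.02, 0.08), (0.01, 0.05), (0.00, 0.04), (0.00, 0.03)]

Sequence = List[List[float]]


def normal_sequence(rng: random.Random) -> Sequence:
    """Ten ticks of low-level baseline noise."""
    return [[rng.uniform(lo, hi) for lo, hi in NORMAL_RANGES] for _ in range(SEQ_LEN)]


def attack_sequence(rng: random.Random) -> Sequence:
    """Baseline noise with the three-phase Spectre signature layered on top."""
    seq = normal_sequence(rng)
    for tick in range(SEQ_LEN):
        if 0 <= tick <= 3:   # Phase 1: predictor training
            seq[tick][1] = rng.uniform(0.15, 0.30)
        if 3 <= tick <= 6:   # Phase 2: flush + trigger
            seq[tick][0] = rng.uniform(0.55, 0.85)
        if 6 <= tick <= 9:   # Phase 3: probe array scan
            seq[tick][2] = rng.uniform(0.70, 0.98)
            seq[tick][3] = rng.uniform(0.70, 0.98)
    return seq


def build_dataset(n_per_class: int = 3000, seed: int = 42) -> Tuple[List[Sequence], List[int]]:
    """Return (X, y) with n_per_class normal (label 0) and attack (label 1) sequences."""
    rng = random.Random(seed)
    X = [normal_sequence(rng) for _ in range(n_per_class)]
    X += [attack_sequence(rng) for _ in range(n_per_class)]
    y = [0] * n_per_class + [1] * n_per_class
    return X, y
```

Calling build_dataset() with the defaults yields 6,000 sequences of 10 ticks × 4 features, matching the shape quoted above.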

Model Architecture

Input  (10 ticks × 4 features)
  └── LSTM(64, return_sequences=True, dropout=0.2, recurrent_dropout=0.1)
        └── BatchNormalization
              └── LSTM(32, return_sequences=False, dropout=0.2, recurrent_dropout=0.1)
                    └── BatchNormalization
                          └── Dense(16, activation='relu')
                                └── Dropout(0.3)
                                      └── Dense(1, activation='sigmoid')
                                            └── anomaly score  0.0 → 1.0
                                                  │
                                                  ▼
                                         score ≥ threshold → ATTACK
                                         score <  threshold → NORMAL
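
To get a feel for the size of this network, the parameter count can be worked out by hand. The sketch below assumes default Keras layer settings (bias terms enabled, standard LSTM gates); Dropout layers add no parameters. The helper names are hypothetical.

```python
def lstm_params(input_dim: int, units: int) -> int:
    # Four gates, each with input weights, recurrent weights and a bias vector.
    return 4 * (units * (input_dim + units) + units)


def batch_norm_params(features: int) -> int:
    # gamma, beta, moving mean, moving variance (the last two are not trainable).
    return 4 * features


def dense_params(input_dim: int, units: int) -> int:
    # Weight matrix plus one bias per unit.
    return input_dim * units + units


layers = [
    ("LSTM(64)", lstm_params(4, 64)),
    ("BatchNormalization", batch_norm_params(64)),
    ("LSTM(32)", lstm_params(64, 32)),
    ("BatchNormalization", batch_norm_params(32)),
    ("Dense(16)", dense_params(32, 16)),
    ("Dense(1)", dense_params(16, 1)),
]
total = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name:<20} {count:>6}")
print(f"{'Total':<20} {total:>6}")   # 31009 under the assumptions above
```

At roughly 31k parameters the model is small, which is consistent with it being served from a local Flask process.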

pkl File Contents

lstm_scaler.pkl stores 4 keys — generated by lstm_train.py, regenerable via generate_data.py:

| Key | Type | Value | Purpose |
| --- | --- | --- | --- |
| scaler | MinMaxScaler | Fitted on 60,000 rows (6,000 sequences × 10 ticks) | Normalises raw HPC inputs before inference |
| seq_len | int | 10 | Number of ticks per sequence |
| n_features | int | 4 | HPC features per tick |
| threshold | float | ~0.69 | Optimal ROC decision boundary from trained model |
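
This README does not say which criterion picks the "optimal" point on the ROC curve. One common choice is Youden's J statistic, which maximises TPR − FPR. The sketch below assumes that criterion and uses a hypothetical function name; lstm_train.py may do it differently.

```python
from typing import List, Tuple


def youden_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = TPR - FPR.

    A sample is predicted ATTACK when score >= threshold; labels are 1 for
    attack and 0 for normal.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")

    best_threshold, best_j = 1.0, -1.0
    # Every distinct score is a candidate cut-off, tried from strictest to loosest.
    for candidate in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= candidate and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= candidate and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = candidate, j
    return best_threshold, best_j
```

Ties are resolved in favour of the stricter (higher) threshold because candidates are visited in descending order and only a strictly better J replaces the current best.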

API Endpoints

| Endpoint | Method | Body | Returns |
| --- | --- | --- | --- |
| /health | GET | (none) | Model status, threshold, seq_len, feature names |
| /simulate | POST | {"mode": "attack"} or {"mode": "normal"} | Full tick-by-tick scores + verdict |
| /predict | POST | {"sequence": [[f0,f1,f2,f3]×10]} | Verdict + anomaly score for custom input |
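
Because /predict expects exactly 10 ticks of 4 features, it can help to validate a custom sequence before sending it. The following is a client-side sketch with hypothetical helper names; how server.py itself validates input is not documented here.

```python
import json
from typing import Any, List

SEQ_LEN = 10      # ticks per sequence
N_FEATURES = 4    # HPC features per tick


def validate_sequence(sequence: Any) -> List[List[float]]:
    """Check that the input is a 10x4 list of numbers and return it as floats."""
    if not isinstance(sequence, list) or len(sequence) != SEQ_LEN:
        raise ValueError(f"expected a list of {SEQ_LEN} ticks")
    cleaned = []
    for tick, row in enumerate(sequence):
        if not isinstance(row, list) or len(row) != N_FEATURES:
            raise ValueError(f"tick {tick}: expected {N_FEATURES} features")
        for value in row:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"tick {tick}: non-numeric feature {value!r}")
        cleaned.append([float(value) for value in row])
    return cleaned


def predict_payload(sequence: Any) -> str:
    """Build the JSON body for POST /predict."""
    return json.dumps({"sequence": validate_sequence(sequence)})
```

For example, predict_payload([[0.05, 0.02, 0.01, 0.01]] * 10) produces a body that can be passed to curl with -d.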

Quick test:

# Health check
curl http://localhost:5000/health

# Simulate attack stream
curl -X POST http://localhost:5000/simulate \
  -H "Content-Type: application/json" \
  -d '{"mode": "attack"}'
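
The same call can be made from Python with only the standard library. This is a sketch under the assumption that the server is reachable at the base URL used by the frontend; the function names are hypothetical, and the shape of the JSON response is not assumed beyond it being a JSON object.

```python
import json
import urllib.request

LSTM_API = "http://localhost:5000"  # same base URL as the frontend constant


def build_simulate_request(mode: str, base_url: str = LSTM_API) -> urllib.request.Request:
    """Prepare (but do not send) a POST /simulate request."""
    if mode not in ("attack", "normal"):
        raise ValueError("mode must be 'attack' or 'normal'")
    body = json.dumps({"mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/simulate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(mode: str, base_url: str = LSTM_API, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply. Requires server.py to be running."""
    request = build_simulate_request(mode, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (only works while the Flask server is up):
# result = simulate("attack")
```

Separating request construction from sending keeps the first function usable in tests without a live server.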

The HP Analogy

| Hogwarts World | CPU / Spectre World |
| --- | --- |
| Hermione guessing ahead | Speculative Execution |
| Snape still writing the question | Slow RAM bounds-check in flight |
| Grabbing ingredient early | Speculative memory read of secret |
| Realising the guess was wrong | Branch misprediction detected |
| Putting the jar back | CPU rolls back architectural state |
| Footprints + powder on the floor | Cache pollution: the side-channel! |
| Attacker checks which cupboard has powder | Flush+Reload timing measurement |
| Deducing which ingredient she grabbed | Recovering the secret byte value |
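
The last two rows, timing every cupboard and deducing the ingredient, can be sketched as a simulated Flush+Reload loop. Nothing here touches a real cache: the latencies are hypothetical constants with random jitter, and the function name is made up for illustration.

```python
import random

PROBE_ENTRIES = 256
HIT_CYCLES = 40      # hypothetical latency of a cached probe entry
MISS_CYCLES = 200    # hypothetical latency of an uncached probe entry
MAX_JITTER = 20      # measurement noise, kept well below the hit/miss gap


def simulated_flush_reload(secret_byte: int, seed: int = 0) -> int:
    """Recover secret_byte from simulated probe-array timings."""
    if not 0 <= secret_byte < PROBE_ENTRIES:
        raise ValueError("secret_byte must be in 0..255")
    rng = random.Random(seed)

    cached = set()             # Flush: no probe entry is in the cache
    cached.add(secret_byte)    # Speculative read pulls probe[secret] into the cache

    timings = []
    for entry in range(PROBE_ENTRIES):   # Reload: time every entry
        base = HIT_CYCLES if entry in cached else MISS_CYCLES
        timings.append(base + rng.randint(0, MAX_JITTER))

    # The fastest entry is the one the speculative read touched.
    return min(range(PROBE_ENTRIES), key=timings.__getitem__)


leaked = "".join(chr(simulated_flush_reload(ord(ch))) for ch in "SPECTRE")
print(leaked)
```

Because the jitter is smaller than the gap between hit and miss latencies, the recovered byte always equals the secret in this simulation; on real hardware the measurement is noisy and is usually repeated many times.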

AI/ML Defence Approaches Covered

| Technique | Purpose | Key Metric |
| --- | --- | --- |
| LSTM Neural Network | Detect temporal HPC anomaly patterns in real time | ~97% accuracy, <1ms inference |
| NLP / CodeBERT | Static scan for Spectre gadget shapes in source/binary | Pre-runtime, 0-day variant coverage |
| Reinforcement Learning | Dynamically tune IBPB/LFENCE overhead vs. threat level | ~40% overhead reduction vs. always-on KPTI |
| Graph Neural Networks | Flag uniform 256-node probe-array scan in memory graphs | Low false-positive rate |

Mitigations Referenced

| Mitigation | Description |
| --- | --- |
| Retpoline | Compiler replaces indirect branches with return trampolines |
| IBRS / IBPB / STIBP | Intel/AMD microcode controls that restrict or flush indirect branch predictor state across privilege and context boundaries |
| KPTI | Kernel Page Table Isolation separates user/kernel memory maps (primarily a Meltdown mitigation) |
| LFENCE | Serialisation barrier inserted before bounds-checked reads |
| Index Masking | Bitwise AND forces indices within safe range regardless of speculation |
| Site Isolation | Browsers isolate origins into separate OS processes |
| Timer Jitter | performance.now() resolution reduced to ~100 µs, with jitter added, post-Spectre |
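
The index-masking arithmetic is easy to show in isolation. The sketch below assumes a power-of-two array size so that size − 1 is an all-ones bit mask; the function names are hypothetical. Python is used only to illustrate the arithmetic: the mitigation matters in compiled or JIT-compiled code, where the masked index stays in range even on a mispredicted path.

```python
def masked_index(index: int, size: int) -> int:
    """Clamp index into [0, size) with a bitwise AND; size must be a power of two."""
    if size <= 0 or (size & (size - 1)) != 0:
        raise ValueError("size must be a positive power of two")
    # No branch is involved, so there is nothing for the predictor to get wrong.
    return index & (size - 1)


def safe_read(data: bytes, index: int) -> int:
    """Read a byte through the masked index instead of relying on a bounds check alone."""
    return data[masked_index(index, len(data))]


table = bytes(range(16))
print(safe_read(table, 5))    # in-bounds index is unchanged
print(safe_read(table, 21))   # out-of-bounds index wraps to 21 & 15 = 5
```

An out-of-range index is silently wrapped to a valid one rather than rejected, so masking complements the architectural bounds check instead of replacing it.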

What Else Could Be Added

Demo Depth

  • Spectre Variant 2 demo — Branch Target Injection, separate from Variant 1 (Bounds Check Bypass)
  • Meltdown comparison tab — side-by-side with Spectre to clarify the difference (kernel vs. speculation)
  • Real hardware counter feed — hook into the Performance Observer API for actual CPU event counts
  • Multi-byte live leak animation — leak "SPECTRE" one character at a time with a typewriter effect

Gamification

  • Quiz mode — multiple-choice question after each demo locks the next tab until answered
  • "Can you evade the AI?" challenge — tweak attack parameters to stay below LSTM threshold
  • Leaderboard — score based on how quickly/accurately users identify the leaked byte

Technical

  • WebAssembly version — port timing demos to WASM for deterministic nanosecond measurements
  • Export report button — generate a PDF of CVE details, mitigations applied, AI/ML approach
  • Keyboard shortcuts: 1–7 to jump tabs, Space to trigger next demo action
  • Colour-blind mode — swap red/green palette for accessible blue/orange

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| ModuleNotFoundError: flask | Not in venv or not installed | pip install flask flask-cors |
| Address already in use | Port 5000 taken by AirPlay (Mac) | Disable AirPlay Receiver: System Settings → General → AirDrop & Handoff |
| AttributeError: 'Adam' has no attribute 'build' | Keras/TF version mismatch on M1/M2 | Use tf.keras.optimizers.legacy.Adam and load the model with compile=False in lstm_loader.py |
| NotFittedError on pipeline | sklearn check_is_fitted strict in newer versions | Add self.n_features_in_ = ... in transformer fit() |
| LSTM verdict shows NORMAL on clear attack | Threshold calibrated too high after retraining | Lower the threshold: load lstm_scaler.pkl with joblib.load, set data['threshold'] = 0.55, then joblib.dump(data, 'lstm_scaler.pkl') |
| urllib3 SSL warning on Mac | LibreSSL older than urllib3 v2 expects | Safe to ignore, or add warnings.filterwarnings("ignore") in server.py |
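
For the "Address already in use" case, a quick way to confirm that something is already listening on port 5000 before starting server.py is to attempt a local TCP connection. This is a small diagnostic sketch with a hypothetical function name; it is not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0


print("port 5000 busy:", port_in_use(5000))
```

If this reports True before the Flask server has been started, another process (such as AirPlay Receiver on macOS) is holding the port.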

Disclaimer

This project is for educational and research purposes only. The demos simulate Spectre behaviour in a browser sandbox — they do not perform actual memory exfiltration. Real Spectre exploits require native code execution, fine-grained timer access, and system-specific tuning. All mitigations described are already deployed in modern operating systems, CPUs, and browsers.


Anil Kumar Das - G25AIT2009 Cybersecurity Assignment — Educational / POC Use Only
