Jayinaksha/PRISM

📡 Project PRISM

Passive RF-based Indoor Spatial Mapping

Real-Time Wi-Fi Zonal Localization Using ESP32 Channel State Information

Python 3.11+ scikit-learn ESP32 License


High-accuracy, privacy-preserving human tracking through walls and in total darkness, using a single $4 microcontroller and invisible Wi-Fi waves.


🧠 What Is PRISM?

Project PRISM is a passive indoor localization system that detects human presence and predicts spatial position in real time without cameras, microphones, or wearable devices. It exploits Channel State Information (CSI), the fine-grained amplitude and phase data embedded in every Wi-Fi packet, to sense how human bodies perturb the electromagnetic field in a room.

By deploying a custom Digital Signal Processing (DSP) pipeline and a 135-dimensional machine learning feature engine on data from a single ESP32 antenna, PRISM divides indoor spaces into discrete zones and classifies a person's location at ~100 Hz.

Key Capabilities

| Capability | Detail |
| --- | --- |
| Zones | Up to 4 (Empty, Zone A, Zone B, Zone C) |
| Accuracy | 73.3% generalized (4-zone room), 81.4% CV (2-zone corridor) |
| Latency | Real-time (~10 ms per inference cycle) |
| Hardware | Single ESP32 NodeMCU ($4) |
| Privacy | Zero visual/audio data captured |
| Conditions | Works through walls, in complete darkness |

📡 Hardware Requirements

| Component | Purpose |
| --- | --- |
| 1× ESP32 NodeMCU | Flashed with ESP-IDF CSI extraction firmware; operates as a Wi-Fi sniffer |
| 1× Laptop/PC | Runs the Python ML backend; connected via serial USB (/dev/ttyUSB0) |
| Ambient Wi-Fi | Any standard 2.4 GHz 802.11n router or device within range |
| USB Cable | Micro-USB for ESP32 serial communication at 115200 baud |

No additional sensors, cameras, or wearable devices are required.


🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                        PRISM Architecture                           │
│                                                                     │
│  ┌──────────┐    Serial     ┌──────────────┐    ┌───────────────┐   │
│  │  ESP32   │───115200bd──→ │  CSI Parser  │──→ │  Ring Buffer  │   │
│  │ (Sniffer)│   /dev/USB0   │ (I/Q → Amp)  │    │  (100 pkts)   │   │
│  └──────────┘               └──────────────┘    └───────┬───────┘   │
│                                                         │           │
│                              ┌──────────────────────────▼────────┐  │
│                              │        DSP Pipeline               │  │
│                              │  1. Hampel Filter (outlier kill)  │  │
│                              │  2. Background Subtraction        │  │
│                              │  3. Butterworth Bandpass (0.1-3Hz)│  │
│                              └──────────────────┬────────────────┘  │
│                                                 │                   │
│                              ┌──────────────────▼────────────────┐  │
│                              │    Feature Engine (135-dim)       │  │
│                              │  • Multi-Lag Autocorrelation      │  │
│                              │  • Variance Ratios                │  │
│                              │  • Spectral Band Energy           │  │
│                              │  • Covariance Eigenvalues         │  │
│                              │  • Subcarrier Profile Gradients   │  │
│                              └──────────────────┬────────────────┘  │
│                                                 │                   │
│  ┌───────────────┐    ┌─────────────────────────▼────────────────┐  │
│  │  Live GUI     │←── │   Random Forest Classifier               │  │
│  │ (Matplotlib)  │    │   + Confidence Thresholding (>50%)       │  │
│  │  Zone Display │    │   + 3-Vote Exponential Smoothing Queue   │  │
│  └───────────────┘    └──────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

🗂️ Project Structure

wifi_localization/
│
├── README.md                      # This file
├── walkthrough.md                 # Detailed technical report
├── final_submission_materials.md  # Presentation script, slide layout, writeup
│
├── data/                          # CSI amplitude logs: corridor environment
│   ├── empty_area.csv             #   Empty corridor (3 recordings)
│   ├── zone_a.csv                 #   Zone A occupancy (3 recordings)
│   └── zone_b.csv                 #   Zone B occupancy (3 recordings)
│
├── data_room/                     # CSI amplitude logs: room environment
│   ├── empty_room.csv             #   Empty room
│   ├── zone_a.csv                 #   Zone A occupancy
│   ├── zone_b.csv                 #   Zone B occupancy
│   └── zone_c.csv                 #   Zone C occupancy
│
├── exp_data/                      # Experimental data from multiple environments
│   ├── sparkonics_lab_*.csv       #   Sparkonics Lab captures
│   ├── stc_*.csv                  #   STC building captures
│   ├── stairs_*.csv               #   Stairwell captures
│   └── tl_*.csv                   #   TL environment captures
│
├── images/                        # Generated visualizations
│   ├── heatmap_room_raw.png       #   Raw CSI amplitude heatmaps
│   ├── heatmap_room_clean.png     #   Filtered CSI heatmaps
│   ├── dsp_comparison.png         #   Raw vs cleaned signal comparison
│   ├── pca_room.png               #   PCA scatter plot (room features)
│   ├── pca_corridor.png           #   PCA scatter plot (corridor features)
│   ├── feature_importance_*.png   #   Random Forest feature importances
│   └── zone_*.png                 #   Per-zone signal plots
│
├── wifi_localization/             # Source code
│   ├── pyproject.toml             #   Python dependencies (uv managed)
│   ├── uv.lock                    #   Locked dependency versions
│   │
│   ├── prism.py                   # ⚡ Core DSP filter library
│   │                              #    Hampel, Background Sub, Butterworth
│   │
│   ├── prism_debug.py             # 🔧 ESP32 serial debug tool
│   │                              #    Raw packet inspection for 10 seconds
│   │
│   ├── csi_logger.py              # 📝 Data harvesting script
│   │                              #    Records CSI from live ESP32 to CSV
│   │
│   ├── prism_ai.py                # 🤖 v1 ML training (SVM-RBF, 4-class)
│   ├── prism_ai_prev.py           # 🤖 v0 ML training (RF, basic variance)
│   ├── prism_ai_v2.py             # 🤖 v2 ML training (corridor, 87-dim)
│   ├── prism_ai_room.py           # 🤖 v3 ML training (room, 135-dim) ← BEST
│   │
│   ├── prism_live_room.py         # 🔴 Live inference (v1 model, 4-zone)
│   ├── prism_live_room_v2.py      # 🔴 Live inference (v2 corridor, 2-zone)
│   ├── prism_live_room_room.py    # 🔴 Live inference (room model, 4-zone) ← BEST
│   │
│   ├── prism_model.pkl            # Serialized v1 model
│   ├── prism_model_v2.pkl         # Serialized corridor model
│   ├── prism_model_room.pkl       # Serialized room model ← BEST
│   │
│   ├── generate_visualizations.py # 📊 Heatmap, PCA, DSP comparison generator
│   ├── create_pptx.py             # 📑 Auto-generates presentation slides
│   │
│   ├── corridor/                  # Organized corridor environment copies
│   │   ├── prism_ai_v2.py
│   │   ├── prism_live_room_v2.py
│   │   └── prism_model_v2.pkl
│   │
│   └── room/                      # Organized room environment copies
│       ├── prism_ai_room.py
│       ├── prism_live_room_room.py
│       └── prism_model_room.pkl
│
├── PRISM_Zonal_Localization.pptx  # Generated presentation
└── RF_PRISM*.mp4                  # Demo videos

⚙️ The DSP Pipeline

The raw CSI amplitude from the ESP32 is extremely noisy. PRISM applies a three-stage Digital Signal Processing pipeline (implemented in prism.py) to isolate the human-induced perturbations:

Stage 1 β€” Hampel Filter (Outlier Removal)

Bluetooth, microwave ovens, and other RF sources cause large random spikes. Within a rolling window of 15 samples, any value that deviates from the local median by more than 3σ (with σ estimated from the Median Absolute Deviation) is replaced by that median.
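The Hampel stage can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in prism.py: the function name, the centered window, the edge padding, and the 1.4826 MAD-to-σ constant (which assumes roughly Gaussian noise) are all assumptions.

```python
import numpy as np

def hampel_filter(x, window=15, n_sigmas=3.0):
    """Replace outliers in a 1-D series with the local median.

    Hypothetical helper: a centered window with edge padding is assumed.
    """
    x = np.asarray(x, dtype=float)
    half = window // 2
    # Edge-pad so every sample has a full window around it.
    padded = np.pad(x, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    med = np.median(windows, axis=1)
    mad = np.median(np.abs(windows - med[:, None]), axis=1)
    sigma = 1.4826 * mad  # MAD scaled to a std estimate (Gaussian assumption)
    outliers = np.abs(x - med) > n_sigmas * sigma
    cleaned = np.where(outliers, med, x)
    return cleaned, outliers
```

For a 2-D amplitude matrix, the same function would be applied to each subcarrier column separately.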

Stage 2 β€” Dynamic Background Subtraction

Static room geometry (walls, desks) dominates the raw signal. A trailing 100-packet moving average is subtracted to zero out the static environment, isolating only dynamic (human-induced) changes.
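A minimal sketch of the trailing-average subtraction, assuming the amplitudes are held as a (packets × subcarriers) array. The function name and the warm-up behavior for the first 100 packets are assumptions; prism.py may handle the start of a recording differently.

```python
import numpy as np

def subtract_background(amplitudes, window=100):
    """Subtract a trailing moving average from each subcarrier column.

    amplitudes: array of shape (n_packets, n_subcarriers).
    """
    a = np.asarray(amplitudes, dtype=float)
    csum = np.cumsum(a, axis=0)
    background = np.empty_like(a)
    # Warm-up: average over however many packets have arrived so far.
    n_warm = min(window, len(a))
    counts = np.arange(1, n_warm + 1)[:, None]
    background[:n_warm] = csum[:n_warm] / counts
    # Steady state: mean of the latest `window` packets, current one included.
    background[window:] = (csum[window:] - csum[:-window]) / window
    return a - background
```

A perfectly static room yields an all-zero output, while a sudden change shows up as a residual that fades as the average catches up.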

Stage 3 β€” Butterworth Bandpass Filter

A 3rd-order Butterworth bandpass at 0.1–3.0 Hz eliminates low-frequency drift and high-frequency electronic noise, isolating the Doppler frequencies of human breathing (~0.1–0.5 Hz) and walking (~1.0–3.0 Hz).
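The bandpass can be sketched with SciPy as follows. The 100 Hz sampling rate is an assumption taken from the ~100 Hz packet rate quoted earlier, and the zero-phase `sosfiltfilt` call suits offline analysis only; whether prism.py filters causally or forward-backward is not stated here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS_HZ = 100.0  # assumed packet rate

def bandpass(x, low_hz=0.1, high_hz=3.0, order=3, fs=FS_HZ):
    """Zero-phase Butterworth bandpass along the time axis (axis 0)."""
    # Second-order sections stay numerically stable at very low cutoffs.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x, axis=0)
```

A live pipeline would need a causal filter instead (for example `scipy.signal.sosfilt` with carried-over state), at the cost of some phase delay.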

Signal Extraction

  • The ESP32 outputs a 128-element array per packet: [Real₁, Imag₁, Real₂, Imag₂, ...]
  • Amplitude is computed as: A = √(I² + Q²) (phase discarded due to single-antenna clock drift)
  • Null subcarriers 27–37 are dropped per IEEE 802.11n, leaving 53 active subcarriers
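The extraction steps above can be sketched as a small function. Two things are assumptions here: the interleaving order (real first, as the list above states) and the treatment of 27–37 as 0-based indices; the function name is hypothetical.

```python
import numpy as np

# Subcarriers 27..37 inclusive, assumed to be 0-based indices.
NULL_SUBCARRIERS = np.arange(27, 38)

def csi_to_amplitude(raw):
    """Convert one 128-value I/Q packet into 53 subcarrier amplitudes."""
    raw = np.asarray(raw, dtype=float)
    if raw.size != 128:
        raise ValueError("expected 128 interleaved I/Q values")
    i, q = raw[0::2], raw[1::2]      # assumed order: real, then imaginary
    amplitude = np.hypot(i, q)       # sqrt(I^2 + Q^2) for 64 subcarriers
    return np.delete(amplitude, NULL_SUBCARRIERS)
```

The arithmetic matches the counts in the list: 128 values give 64 complex subcarriers, and removing 11 null ones leaves 53.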

🔬 Feature Engineering (135 Dimensions)

The critical breakthrough was moving from naive per-subcarrier statistics to domain-aware time-frequency features. For each 100-packet (1-second) window, we extract:

| Feature Group | Dimensions | Scientific Justification |
| --- | --- | --- |
| Basic Statistics | 40 | Variance, std, energy, diff-variance, skewness, kurtosis, IQR, range (5-number summary each) |
| Multi-Lag Autocorrelation | 15 | Lags 1, 5, 10 capture signal persistence; separates erratic noise from rhythmic walking |
| Temporal Derivatives | 10 | 1st and 2nd order temporal diff-variance detects acceleration patterns |
| Multi-Scale Variance Ratios | 10 | Half/quarter window variance ratios detect subjects crossing zone boundaries |
| Spectral Features | 20 | FFT peak frequency, spectral centroid, spectral bandwidth, band energy ratios (breathing vs walking) |
| Subcarrier Profile Gradients | 10 | 1st and 2nd derivatives of the mean amplitude profile capture frequency-selective fading |
| Covariance Eigenvalues | 5 | Top-5 eigenvalues of the 53×53 subcarrier covariance matrix map multipath complexity |
| Correlation Statistics | 3 | Mean, std, median of upper-triangle cross-subcarrier correlations |
| Global Metrics | 2 | Total energy, subcarrier entropy |
| Top-10 Subcarrier Features | 20 | Variance and energy of the 10 most variable subcarriers |
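Two of the groups in the table, lag autocorrelation and covariance eigenvalues, can be sketched as follows. The function names are hypothetical, and how the per-subcarrier autocorrelations are condensed into 15 dimensions is not specified here, so this sketch stops at the per-subcarrier values.

```python
import numpy as np

def lag_autocorr(window, lag):
    """Per-subcarrier autocorrelation at one lag.

    window: array of shape (n_packets, n_subcarriers), e.g. (100, 53).
    """
    x = window - window.mean(axis=0)
    num = (x[:-lag] * x[lag:]).sum(axis=0)
    den = (x * x).sum(axis=0)
    # Guard against constant subcarriers (zero variance).
    return num / np.where(den == 0, 1.0, den)

def top_eigenvalues(window, k=5):
    """Largest k eigenvalues of the subcarrier covariance matrix."""
    cov = np.cov(window, rowvar=False)   # (n_subcarriers, n_subcarriers)
    eig = np.linalg.eigvalsh(cov)        # ascending order for symmetric input
    return eig[::-1][:k]
```

A rhythmic signal gives lag-1 autocorrelation near 1, while white noise sits near 0; a single dominant eigenvalue indicates that all subcarriers move together.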

🤖 Machine Learning Models

Model Evolution

| Version | Script | Model | Features | Classes | Accuracy | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| v0 | prism_ai_prev.py | Random Forest | ~53 (variance only) | 4 | ~65% | Basic per-subcarrier variance |
| v1 | prism_ai.py | SVM-RBF | ~212 (var+std+energy+diff) | 4 | Variable | Leave-One-Chunk-Out CV |
| v2 | prism_ai_v2.py | GradientBoosting | 87 | 3 (corridor) | 81.4% | Best corridor model |
| v3 | prism_ai_room.py | RandomForest | 135 | 4 (room) | 73.3% | Production model |

Why Random Forest Over SVM?

  • SVM-RBF requires standard scaling of the features, which destroys the relative magnitude physics between subcarriers
  • SVMs scale poorly in high-dimensional (135+), highly-correlated feature spaces
  • Random Forest implicitly feature-selects, carves non-linear decision boundaries, and needs no normalization

The Overfitting Trap (Temporal Data Leakage)

The Bug: Initial models reported 96.3% accuracy but failed completely in live inference.

Root Cause: A sliding window step of 10 (on a 100-packet window) created 90% overlap. K-Fold CV leaked near-identical frames across train/test splits. The model memorized local noise patterns, not physical zone signatures.

The Fix:

  1. Reduced overlap to 50% (step=50) so that adjacent windows share far fewer packets
  2. 8× Gaussian noise augmentation (scaled per-subcarrier std) simulating dynamic multipath changes
  3. Cross-validation runs only on real (non-augmented) samples
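The three steps can be sketched as below. Several details are assumptions: `noise_scale` is a made-up value, "8×" is read as eight noisy copies per real window, and all helper names are hypothetical. The sketch also keeps augmented copies of held-out windows out of the training fold, which goes beyond what the list states but follows from the same leakage argument.

```python
import numpy as np

def sliding_windows(packets, size=100, step=50):
    """Cut (n_packets, n_subcarriers) into windows of `size` packets."""
    starts = range(0, len(packets) - size + 1, step)
    return np.stack([packets[s:s + size] for s in starts])

def overlap_fraction(size, step):
    """Share of packets that two consecutive windows have in common."""
    return max(size - step, 0) / size

def augment(windows, labels, copies=8, noise_scale=0.1, seed=0):
    """Return noisy copies plus the index of the real window each came from."""
    rng = np.random.default_rng(seed)
    # Noise is scaled by each subcarrier's std within its own window.
    sub_std = windows.std(axis=1, keepdims=True)
    noisy = [windows + noise_scale * sub_std * rng.standard_normal(windows.shape)
             for _ in range(copies)]
    source = np.tile(np.arange(len(windows)), copies)
    return np.concatenate(noisy), np.tile(labels, copies), source

def training_set(windows, labels, train_idx, aug_x, aug_y, aug_source):
    """Real training windows plus only the augmented copies derived from them."""
    keep = np.isin(aug_source, train_idx)
    return (np.concatenate([windows[train_idx], aug_x[keep]]),
            np.concatenate([labels[train_idx], aug_y[keep]]))
```

With 1,500 packets per recording, a step of 10 yields 141 windows at 90% overlap, while a step of 50 yields 29 windows at 50% overlap. Each fold would then be scored on its held-out real windows only.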

Final Validated Performance (Room Model)

| Metric | Score |
| --- | --- |
| Overall Accuracy | 73.3% |
| Empty Detection Recall | >83% |
| Zone B Recall | >86% |
| False Positive Rate | Low; confusions largely between neighboring physical zones |

🔴 Live Inference Architecture

The real-time system (prism_live_room_room.py) streams from the ESP32 at 115200 baud and runs inference on every incoming packet:

Ring Buffer Processing

A circular NumPy buffer maintains the latest 100 packets. Old data rolls out as new data rolls in, so NumPy matrix operations execute without memory reallocation.
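A minimal sketch of such a buffer, assuming one row of 53 amplitudes per packet. The class and method names are hypothetical. Writes go into preallocated storage; only `window()` makes a copy, to hand the feature code a time-ordered view.

```python
import numpy as np

class RingBuffer:
    """Fixed-size circular store for the most recent CSI rows."""

    def __init__(self, capacity=100, width=53):
        self._data = np.zeros((capacity, width))
        self._next = 0    # slot the next packet will overwrite
        self._count = 0   # number of valid rows so far

    def push(self, row):
        # In-place write: no reallocation as packets stream in.
        self._data[self._next] = row
        self._next = (self._next + 1) % len(self._data)
        self._count = min(self._count + 1, len(self._data))

    @property
    def full(self):
        return self._count == len(self._data)

    def window(self):
        """Return the stored packets oldest-first (as a copy)."""
        if not self.full:
            return self._data[:self._count].copy()
        return np.roll(self._data, -self._next, axis=0)
```

Inference would typically wait until `full` is true, so that every feature window covers a complete second of data.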

Two-Layer UI Stabilization

Even an 86% accurate model will misclassify roughly 1 in 7 packets, causing UI flicker. PRISM solves this with:

  1. Confidence Thresholding: predict_proba() must exceed 50% for the dominant class. Below the threshold, the system falls back to the previous stable state.

  2. Exponential Vote Queue: A 3-vote sliding window requires unanimous agreement before switching zones. A 1.0-second release timeout prevents zone "sticking" when the target leaves.
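The two layers can be sketched as a small state machine. The class name is hypothetical, and the release rule is an assumption: the displayed zone reverts to "Empty" once more than 1.0 second passes without a confident prediction supporting it. The script's actual rule may differ.

```python
from collections import deque

class ZoneStabilizer:
    """Confidence gate plus unanimous 3-vote queue with a release timeout."""

    def __init__(self, min_confidence=0.5, votes=3, release_s=1.0,
                 empty_label="Empty"):
        self.min_confidence = min_confidence
        self.release_s = release_s
        self.empty_label = empty_label
        self.state = empty_label
        self._queue = deque(maxlen=votes)
        self._last_support = 0.0

    def update(self, label, confidence, now):
        """Feed one prediction (top label, its probability, timestamp in s)."""
        if confidence > self.min_confidence:
            self._queue.append(label)
            if label == self.state:
                self._last_support = now
            unanimous = (len(self._queue) == self._queue.maxlen
                         and len(set(self._queue)) == 1)
            if unanimous and label != self.state:
                self.state = label
                self._last_support = now
        # Assumed release rule: drop a zone that has lost confident support.
        if (self.state != self.empty_label
                and now - self._last_support > self.release_s):
            self.state = self.empty_label
            self._queue.clear()
        return self.state
```

In use, `label` and `confidence` would come from the arg-max of the classifier's `predict_proba()` output for the current window.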

Live GUI

The Matplotlib-based dashboard renders zone rectangles that light up in real-time as the model classifies human position:

┌──────────────────────────────────────────────────┐
│          STATUS: TARGET IN ZONE B                │
│  ┌──────────┐  ┌──████████┐  ┌──────────────┐    │
│  │          │  │ ████████ │  │              │    │
│  │  Zone A  │  │  ZONE B  │  │    Zone C    │    │
│  │          │  │ ████████ │  │              │    │
│  └──────────┘  └──████████┘  └──────────────┘    │
└──────────────────────────────────────────────────┘

🚀 Usage

We use uv to manage Python dependencies.

1. Install Dependencies

cd wifi_localization/
uv sync

2. Debug ESP32 Connection

Verify your ESP32 is streaming CSI data correctly:

uv run prism_debug.py

This prints raw serial lines for 10 seconds and reports packet rate.

3. Collect Training Data

Record CSI data from the live ESP32 to CSV files:

uv run csi_logger.py

4. Train the Room Model

Retrain the model with new data or modified hyperparameters:

uv run prism_ai_room.py

This automatically handles:

  • Sliding windows with 50% overlap (step=50)
  • 8× Gaussian noise augmentation
  • 5-fold Stratified CV on real data only
  • Model comparison (SVM-RBF, RandomForest, HistGBM)
  • Feature importance plots

5. Train the Corridor Model

uv run prism_ai_v2.py

6. Run Live Inference (Room Radar)

Fire up the real-time dashboard with the room model:

uv run prism_live_room_room.py

⚠️ Note: Ensure your ESP32 is plugged into /dev/ttyUSB0 and idf.py monitor is not running. If your port differs (e.g., COM3 on Windows), modify the SERIAL_PORT variable at the top of the script.

7. Run Live Inference (Corridor Radar)

uv run prism_live_room_v2.py

8. Generate Visualizations

Create heatmaps, PCA plots, and DSP comparison images:

uv run generate_visualizations.py

9. Generate Presentation Slides

Auto-generate the PowerPoint deck:

uv run --with python-pptx create_pptx.py

📊 Data Collection Environments

Environment 1: Corridor (2 Zones)

  • Classes: Empty, Zone A, Zone B
  • Challenge: Symmetric geometry created near-identical multipath signatures
  • Fisher Separability Score: 0.070 (extremely low)
  • Data: 9 CSV files across 3 recording sessions per class

Environment 2: Enclosed Room (4 Zones)

  • Classes: Empty, Zone A, Zone B, Zone C
  • Advantage: Enclosed walls create distinct multipath reflections per zone
  • Data: 1,500 continuous packets (~15 seconds steady recording) per zone
  • Result: Substantially better spatial discrimination

Experimental Data

Additional captures from diverse environments (STC building, Sparkonics Lab, stairwells) are stored in exp_data/ for extended analysis.


🧪 Dependencies

| Package | Version | Purpose |
| --- | --- | --- |
| numpy | ≥2.4.4 | Matrix operations, ring buffers |
| pandas | ≥3.0.2 | Data loading, rolling window calculations |
| scipy | ≥1.17.1 | Butterworth filter, signal processing |
| scikit-learn | ≥1.8.0 | Random Forest, SVM, PCA, cross-validation |
| matplotlib | ≥3.10.8 | Live GUI dashboard, visualization generation |
| pyserial | ≥3.5 | ESP32 serial communication |
| python-pptx | (optional) | PowerPoint slide generation |

Python: ≥ 3.11


📚 Technical References

  • CSI Extraction: ESP-IDF Wi-Fi CSI firmware for ESP32
  • Hampel Filter: Friedrich R. Hampel's robust outlier detection via MAD
  • Butterworth Filter: 3rd-order Infinite Impulse Response (IIR) bandpass
  • Feature Engineering: Inspired by radar micro-Doppler signature analysis
  • Validation: Stratified K-Fold with temporal de-correlation (sliding windows with 50% overlap)

📄 Files Reference

| Script | Role | Input | Output |
| --- | --- | --- | --- |
| prism.py | Core DSP library | Raw CSV | Cleaned signal + plots |
| prism_debug.py | ESP32 serial debugger | /dev/ttyUSB0 | Terminal diagnostics |
| prism_ai_room.py | Room model trainer | data_room/*.csv | prism_model_room.pkl |
| prism_ai_v2.py | Corridor model trainer | data/*.csv | prism_model_v2.pkl |
| prism_live_room_room.py | Live radar (room) | Serial + .pkl | Real-time GUI |
| prism_live_room_v2.py | Live radar (corridor) | Serial + .pkl | Real-time GUI |
| generate_visualizations.py | Plot generator | data/, data_room/ | images/*.png |
| create_pptx.py | Slide generator | images/ | .pptx |

Built as part of the Vinayabrhami AI OS Architecture.

Project PRISM: Seeing through walls with invisible waves. 📡
