This repository implements the Mixture of Product of Experts Variational Autoencoder (MoPoEVAE) for cross-modal reconstruction between WiFi CSI (Channel State Information) and camera images. Below is a comprehensive guide to setup, training, evaluation, and key implementation details.
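At the heart of the model, a product of experts (PoE) fuses the unimodal Gaussian posteriors by multiplying their densities, and the mixture (MoPoE) averages over the PoEs of modality subsets. A minimal NumPy sketch of the Gaussian product step (illustrative only, not the repository's implementation):

```python
import numpy as np

def poe(mus, logvars):
    """Fuse Gaussian experts: the product of Gaussians is a Gaussian
    whose precision is the sum of the experts' precisions and whose
    mean is the precision-weighted average of the experts' means."""
    mus = np.asarray(mus)                      # (n_experts, zdim)
    precisions = np.exp(-np.asarray(logvars))  # 1 / sigma^2
    var = 1.0 / precisions.sum(axis=0)
    mu = var * (precisions * mus).sum(axis=0)
    return mu, np.log(var)

# Two 1-D experts with equal unit variance: the fused mean is their
# average and the fused variance is halved.
mu, logvar = poe(mus=[[0.0], [2.0]], logvars=[[0.0], [0.0]])
```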
- Environment Setup
- Dataset Preparation
- Training the Model
- Evaluation (Testing)
- Key Arguments
- Output Files
- Troubleshooting
## Environment Setup

- Python 3.8+
- CUDA-enabled GPU (recommended for training)
- Required dependencies:

```bash
pip install torch torchvision pytorch-lightning scikit-learn albumentations pandas numpy opencv-python tqdm wandb torchmetrics scipy scikit-image pyyaml
```

For experiment logging, log in to WandB (skip if not using `--log`):

```bash
wandb login
```

## Dataset Preparation

The model uses the Wificam dataset (WiFi CSI + camera images). Arrange it in the following directory structure:
```
data/
└─ wificam/
   └─ j3/
      ├─ 640/               # Or 320/ (matches --imgsize)
      ├─ csi.csv            # CSI data (complex values)
      ├─ csiComplex.npy     # Auto-generated (CSI amplitudes)
      ├─ statistics640.csv  # Precomputed stats for normalization
      └─ statistics320.csv  # For 320px images (if used)
```

- `csi.csv`: CSI complex values and the corresponding image IDs.
- `statistics*.csv`: Precomputed mean/std for CSI and image normalization (format: `CSI_mean, CSI_std, img_mean_R, img_mean_G, img_mean_B, img_std_R, img_std_G, img_std_B`).
- `csiComplex.npy`: Auto-generated on the first run of the script (CSI amplitude values extracted from `csi.csv`).
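The amplitude extraction that produces `csiComplex.npy` can be sketched as follows (the column layout of `csi.csv` and the parsing details are assumptions; the actual logic lives in the repository's dataset code):

```python
import numpy as np

def csi_amplitudes(complex_strings):
    """Parse complex-valued CSI entries (e.g. '3+4j') and return
    their amplitudes |h| = sqrt(re^2 + im^2)."""
    values = np.array([complex(s.replace(" ", "")) for s in complex_strings])
    return np.abs(values)

# '3+4j' has amplitude 5; caching the result as .npy avoids
# re-parsing the CSV on every run:
amps = csi_amplitudes(["3+4j", "0+1j"])
# np.save("csiComplex.npy", amps)
```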
## Training the Model

Use `train.py` to train the MoPoEVAE model:
```bash
python train.py \
  --name mopoevae_run1 \
  --data data/wificam/ \
  --epochs 50 \
  --bs 32 \
  --ws 151 \
  --am concat \
  --tenc \
  --device 0 \
  --log \
  --det
```

All flags are described under Key Arguments; `--log` (WandB logging) and `--det` (deterministic training for reproducibility) are optional, and `--ws` must be odd.

During training:

- **Dataset splitting**: 80% train / 10% validation / 10% test (fixed split with `random_state=42`).
- **Model initialization**: MoPoEVAE with separate CSI and image VAEs.
- **Checkpoint saving**: the best models (by `val_loss`, `val_kl`, `val_ll`, and FID) are saved to `runs/<name>/`.
- **Logging**:
  - WandB (if `--log`): training/validation loss, KL divergence, FID, and reconstruction visualizations.
  - All `.py` files are copied to `runs/<name>/` for reproducibility.
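The fixed 80/10/10 split can be reproduced with two chained `train_test_split` calls; a sketch assuming scikit-learn with `random_state=42` (the exact call order in the repository may differ):

```python
from sklearn.model_selection import train_test_split

indices = list(range(1000))  # placeholder for dataset sample indices

# First carve off 20% for validation + test, then split that remainder
# in half, giving a final 80% train / 10% val / 10% test split.
train_idx, rest = train_test_split(indices, test_size=0.2, random_state=42)
val_idx, test_idx = train_test_split(rest, test_size=0.5, random_state=42)
```

Because `random_state` is fixed, every run sees the same three subsets, which is what makes checkpoints and evaluation comparable across runs.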
## Evaluation (Testing)

After training, run evaluation to generate reconstructions and compute metrics (FID, KID, SSIM, RMSE, PSNR):
```bash
python train.py \
  --name mopoevae_run1 \
  --data data/wificam/ \
  --epochs 50 \
  --bs 32 \
  --ws 151 \
  --am concat \
  --tenc \
  --device 0 \
  --test
```

Reuse the training run's `--name`, `--data`, and model arguments (`--ws`, `--am`, `--tenc`, `--device`); `--epochs` is unused in this mode but required for checkpoint loading, and `--test` switches to evaluation mode.

Optionally, use ffmpeg to create a video from the reconstructed images:

```bash
ffmpeg -framerate 100 -i runs/mopoevae_run1/out/combined/%d.png -c:v libx264 -pix_fmt yuv420p reconstruction_demo.mp4
```

## Key Arguments

Run `python train.py -h` to see all arguments. Below are the most critical:
| Argument | Type/Default | Description |
|---|---|---|
| `--lr` | float / `1e-3` | Learning rate for the Adam optimizer |
| `--epochs` | int / `100` | Number of training epochs |
| `--imgsize` | int / `640` | Image size (320 or 640; matches the dataset's `statistics*.csv`) |
| `--zdim` | int / `128` | Latent dimension of the VAE |
| `--bs` | int / `32` | Training batch size (validation batch size = `bs*8`) |
| `--ws` | int / `151` | CSI window size (number of WiFi packets; MUST be odd) |
| `--workers` | int / `8` | Dataloader workers (adjust based on CPU cores) |
| `--device` | str / `0` | CUDA device (e.g., `0`, `1`) or `cpu` |
| `--name` | str / `default` | Run name (saves logs/checkpoints to `runs/<name>/`) |
| `--data` | str / `data/wificam/` | Path to dataset root |
| `--test` | flag | Skip training; run evaluation (uses saved checkpoints) |
| `--am` | str / `concat` | CSI feature aggregation (`concat`, `gaussian`, `uniform`) |
| `--random` | flag | Random image sampling within the CSI window (training only) |
| `--tenc` | flag | Enable temporal encoding for CSI/image features |
| `--det` | flag | Deterministic training (fixed random seeds) |
| `--log` | flag | Enable WandB logging |
## Output Files

All outputs are saved to `runs/<name>/`:
| File/Directory | Description |
|---|---|
| `bestLoss.ckpt` | Model checkpoint with minimum validation loss |
| `bestKl.ckpt` / `bestLl.ckpt` / `bestFID.ckpt` | Checkpoints for best KL divergence, log-likelihood, and FID |
| `metrics.txt` | Aggregated evaluation metrics (FID, KID, SSIM, RMSE, PSNR) |
| `SSIM.txt` / `RMSE.txt` / `PSNR.txt` | Per-sample metrics (time series) |
| `out/combined/` | Side-by-side real + reconstructed images |
| `out/real/` | Original test images |
| `out/fake/` | Reconstructed images (from CSI) |
| `*.py` | Copy of all code files (for reproducibility) |
## Troubleshooting

**Out-of-memory errors (GPU)**
- Reduce `--bs` (e.g., to 16 or 8).
- Reduce `--workers` (e.g., to 4 or 2).
- Use a smaller `--imgsize` (320 instead of 640).

**Checkpoints not found during evaluation**
- Ensure `--name` matches the training run name.
- Verify that the model arguments (`--ws`, `--am`, `--tenc`) are identical between training and evaluation.

**WandB logging issues**
- Update the project/entity in `train.py` (line 76: `WandbLogger(project="YourProject", entity="YourEntity")`).
- Make sure you are logged in to WandB (`wandb login`).

**Notes**
- `--det` enables fixed random seeds but may slow training (it disables CuDNN benchmarking).
- `--ws` must be odd (enforced in `dataset.py`); use values like 151, 101, or 51.
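The odd-`--ws` requirement guarantees that each CSI window has a unique center packet to pair with a camera frame. A toy illustration of centered-window indexing (the repository's `dataset.py` enforces the check; the indexing details here are assumptions):

```python
def csi_window(num_packets, center, ws=151):
    """Return the [start, end) packet range of a window of odd size `ws`
    centered on `center`; an even size would have no single middle packet."""
    assert ws % 2 == 1, "--ws must be odd"
    half = ws // 2
    start = max(0, center - half)
    end = min(num_packets, center + half + 1)
    return start, end

# A window of 151 packets centered on packet 500 spans [425, 576).
start, end = csi_window(num_packets=10_000, center=500, ws=151)
```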
## License

This project is for research purposes only. See the LICENSE file for details.
## Acknowledgments

- MoPoE implementation based on thomassutter/MoPoE
- Strohmayer J., Sterzinger R., Stippel C. and Kampel M., "Through-Wall Imaging Based On WiFi Channel State Information," 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 2024, pp. 4000-4006, doi: https://doi.org/10.1109/ICIP51287.2024.10647775.