This is the official code for the paper "AE2VID: Event-based Video Reconstruction via Aperture Modulation" by Chenxu Bai*, Boyu Li*, Peiqi Duan#, Xinyu Zhou, Hanyue Lou and Boxin Shi#.
AE2VID reconstructs high-speed videos by jointly using aperture-modulation-triggered events and motion-triggered events.
The pipeline contains two subnetworks:
- AENet reconstructs dense aperture references from aperture-opening events. It uses FIR to estimate an initial reference image from first-positive-event timing (see the sketch after this list), IDN/SwinIR to denoise the reference image, and HSG to map the aperture reference into recurrent hidden states.
- MENet reconstructs intermediate frames from motion-triggered event voxels with a bidirectional E2VID-style recurrent model and a pixel-wise mixer.
The aperture-closing interval is discarded in the reconstruction pipeline and can optionally be filled with RIFE interpolation.
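To make the FIR step concrete, below is a minimal numpy sketch of one plausible reading of first-positive-event timing reconstruction, not the paper's exact formula: while the aperture opens, brighter pixels cross the contrast threshold earlier, so the reciprocal of each pixel's first positive event time gives a rough intensity estimate. The (x, y, t, p) event layout is also an assumption.

```python
import numpy as np

def fir_reference_sketch(events, height, width, eps=1e-6):
    # Illustrative only, not the paper's exact FIR formula: brighter pixels
    # fire their first positive event earlier while the aperture opens, so
    # 1 / t_first gives a rough intensity ordering.
    # `events` is assumed to be an (N, 4) float array of (x, y, t, p) rows.
    t_first = np.full((height, width), np.inf)
    for x, y, t, _ in events[events[:, 3] > 0]:          # positive events only
        yi, xi = int(y), int(x)
        if t < t_first[yi, xi]:
            t_first[yi, xi] = t
    ref = np.where(np.isfinite(t_first), 1.0 / (t_first + eps), 0.0)
    return ref / max(ref.max(), eps)                     # normalize to [0, 1]
```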
This project uses uv for dependency management.
The default pyproject.toml uses CUDA 12.1 PyTorch wheels.
```bash
uv sync
```

Please place checkpoints under pretrained/ or pass their paths via command-line arguments.
You can download the checkpoints from Google Drive:
- pretrained/biape2vid_best.pth.tar
- pretrained/v2v_weight.pth
- pretrained/swinir_idn.pth
- pretrained/flownet.pkl
Training datasets are available at Google Drive.
The AMED dataset is available at Google Drive.
Stage 1 trains the aperture/HSG adapter while the V2V-E2VID branch is frozen:
```bash
uv run python train.py adapter --config configs/train_adapter_v2v.yaml
```

Stage 2 trains the full BiApEVID pipeline:

```bash
uv run python train.py ae2vid --config configs/train_ae2vid_v2v.yaml
```

The released setting initializes stage 2 from:
```yaml
adapter_ckpt: ./logs/ae2vid_adapter/adapter_best.pth.tar
e2vid_ckpt: ./pretrained/v2v_weight.pth
```
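As a hypothetical sketch, this is roughly what that initialization amounts to; the attribute names (adapter, e2vid) and the checkpoint key are assumptions, and the actual wiring lives in train.py and the YAML configs.

```python
import torch
import torch.nn as nn

class AE2VID(nn.Module):
    # Placeholder modules standing in for the real AENet/HSG adapter and
    # the V2V-E2VID branch; the names here are assumptions for illustration.
    def __init__(self):
        super().__init__()
        self.adapter = nn.Identity()
        self.e2vid = nn.Identity()

model = AE2VID()
adapter_ckpt = torch.load("./logs/ae2vid_adapter/adapter_best.pth.tar", map_location="cpu")
model.adapter.load_state_dict(adapter_ckpt.get("state_dict", adapter_ckpt), strict=False)
model.e2vid.load_state_dict(torch.load("./pretrained/v2v_weight.pth", map_location="cpu"), strict=False)
```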
EvAid-style folder:
```bash
uv run python predict.py evaid \
  --dataset_root /path/to/EvAid \
  --sequence bear \
  --delta_frame 50 \
  --recons_ckpt ./pretrained/biape2vid_best.pth.tar \
  --denoiser_ckpt ./pretrained/swinir_idn.pth \
  --rife_ckpt ./pretrained/flownet.pkl \
  --output_dir ./outputs/evaid
```

HQF h5:
```bash
uv run python predict.py hqf \
  --input_h5 /path/to/HQF_h5/boxes.h5 \
  --delta_frame 112 \
  --recons_ckpt ./pretrained/biape2vid_best.pth.tar \
  --denoiser_ckpt ./pretrained/swinir_idn.pth \
  --rife_ckpt ./pretrained/flownet.pkl \
  --output_dir ./outputs/hqf
```

Real AMED-style sequence:
```bash
uv run python predict.py real \
  --sequence_dir /path/to/AMED/sequence_0 \
  --width 1280 \
  --height 720 \
  --recons_ckpt ./pretrained/biape2vid_best.pth.tar \
  --output_dir ./outputs/real/sequence_0
```

The real-data interface expects the following layout:
```
sequence_0/
  frames/frame_0.png
  frames/frame_1.png
  events/*.npz or events/*.txt
```
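As a rough sketch of consuming this layout (the actual loader is in predict.py; the .npz field names x/y/t/p and the use of OpenCV are assumptions):

```python
import glob
import os
import numpy as np
import cv2

def load_real_sequence(seq_dir):
    # Sketch only; predict.py defines the real interface.
    frame_paths = sorted(
        glob.glob(os.path.join(seq_dir, "frames", "frame_*.png")),
        key=lambda p: int(os.path.basename(p).split("_")[1].split(".")[0]),
    )
    frames = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in frame_paths]
    chunks = [np.load(p) for p in sorted(glob.glob(os.path.join(seq_dir, "events", "*.npz")))]
    events = [np.stack([c["x"], c["y"], c["t"], c["p"]], axis=-1) for c in chunks]
    return frames, np.concatenate(events) if events else np.empty((0, 4))
```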
For semi-real EvAid and HQF evaluation, the first and last frames in each window are degraded with the FIR simulation in utils/aperture_utils.py and then denoised by SwinIR/IDN before MENet reconstruction.
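In outline, that evaluation loop could look like the sketch below; the callables are stand-ins for the repository's actual FIR simulator, denoiser, and MENet wrappers, whose real interfaces are not shown here.

```python
from typing import Callable, Iterator, Sequence
import numpy as np

def semi_real_eval(
    frames: Sequence[np.ndarray],          # ground-truth window boundary frames
    event_windows: Sequence[np.ndarray],   # events between consecutive boundaries
    simulate_fir: Callable[[np.ndarray], np.ndarray],  # FIR degradation (utils/aperture_utils.py)
    denoise: Callable[[np.ndarray], np.ndarray],       # SwinIR/IDN wrapper
    reconstruct: Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray],  # MENet wrapper
) -> Iterator[np.ndarray]:
    # Hypothetical outline of the protocol described above, not the
    # repository's actual evaluation code.
    for i, ev in enumerate(event_windows):
        ref0 = denoise(simulate_fir(frames[i]))       # degraded first frame of window
        ref1 = denoise(simulate_fir(frames[i + 1]))   # degraded last frame of window
        yield reconstruct(ref0, ref1, ev)
```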

