Ankur Aditya1*, Diptyaroop Maji1*, Lingdong Wang1, Bhavya Ramakrishna2, Ramesh Sitaraman1,3, Prashant Shenoy1
1University of Massachusetts Amherst 2Dolby Labs 3Akamai Technologies
* Student authors with equal contribution
ReVo streams synchronized RGB + depth video between two machines in real time using WebRTC DataChannels. It supports traditional codecs (H.264, H.265) and the neural DCVC-RT codec, with forward error correction on I-frames and deadline-based frame assembly on the receiver.
```
FAQ.md
scripts/
├── generate_frame_masks.py    Builds per-video frame-corruption masks from receiver logs
└── download_checkpoints.sh    Downloads pre-trained LossRec checkpoints from Hugging Face
src/
├── signalling_server.py       WebRTC signaling server (run on any machine reachable by sender and receiver)
├── sender/
│   ├── sender-3d.py           Sender — reads video files, encodes, streams over WebRTC
│   ├── run_sender_eval.py     Batch evaluation script (iterates over traces × videos)
│   ├── run_loss_trace.py      Network emulator — applies tc/netem from a trace file
│   ├── H264_wrapper.py        H.264 codec interface
│   ├── H265_wrapper.py        H.265 / HEVC codec interface
│   ├── DCVCRT_wrapper.py      DCVC-RT neural codec interface
│   ├── trace_map.txt          Maps (category, video) → trace file for batch eval
│   ├── traces/                Network trace files organized by category (wifi/, cell/, …)
│   └── sender.md              Detailed sender documentation
├── receiver/
│   ├── receiver-3d.py         Receiver — reassembles frames, decodes, saves MP4
│   ├── run_receiver_eval.py   Batch evaluation service (pairs with run_sender_eval.py)
│   ├── H264_wrapper.py        H.264 codec interface
│   ├── H265_wrapper.py        H.265 / HEVC codec interface
│   ├── DCVCRT_wrapper.py      DCVC-RT neural codec interface
│   └── receiver.md            Detailed receiver documentation
└── lossrec/
    ├── rgb/                   RGB loss recovery model and inference script
    ├── depth/                 Depth loss recovery model and inference script
    └── lossrec.md             Usage, flag reference, and checkpoint documentation
```
Neural loss recovery training details coming soon.
```
Sender machine                Signaling server                Receiver machine
──────────────                ────────────────                ────────────────
sender-3d.py ──── SDP ────► signalling_server.py ◄──── SDP ──── receiver-3d.py
             ◄─── SDP ────                         ──── SDP ────►

        [WebRTC DataChannels established directly peer-to-peer]

sender-3d.py ────── rgb_payload / depth_payload ──────────► receiver-3d.py
                                                                 │
                                                         write_video_pyav()
                                                         saves rgb.mp4 + depth.mp4
```
- Signaling server relays SDP offer/answer between sender and receiver.
- Sender reads RGB and depth MP4 files, compresses each frame with the chosen codec, and transmits both streams over two unreliable WebRTC DataChannels.
- Receiver assembles incoming chunks into frames, decodes them, and writes the output to two MP4 files.
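The sender-side chunking and interleaving described above can be sketched as follows. This is an illustrative sketch only: the chunk size (1100 bytes) and the `chunk`/`interleave` helper names are assumptions, not ReVo's actual wire format.

```python
# Hypothetical sketch of splitting encoded frames into chunks and
# alternating RGB/depth chunks within each frame's send window.
def chunk(payload: bytes, size: int = 1100) -> list[bytes]:
    """Split an encoded frame into fixed-size chunks (last one may be short)."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]

def interleave(rgb: list[bytes], depth: list[bytes]) -> list[tuple[str, bytes]]:
    """Alternate RGB and depth chunks so a loss burst hits both streams evenly."""
    out = []
    for i in range(max(len(rgb), len(depth))):
        if i < len(rgb):
            out.append(("rgb", rgb[i]))
        if i < len(depth):
            out.append(("depth", depth[i]))
    return out

rgb_chunks = chunk(b"R" * 3000)    # 3 chunks
depth_chunks = chunk(b"D" * 2000)  # 2 chunks
sched = interleave(rgb_chunks, depth_chunks)
assert [label for label, _ in sched] == ["rgb", "depth", "rgb", "depth", "rgb"]
```

Because the streams alternate, a contiguous run of dropped packets degrades both the RGB and depth frames slightly rather than wiping out one of them entirely.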
| Design | Detail |
|---|---|
| Transport | WebRTC DataChannels — unreliable, unordered (no SCTP retransmit) |
| I-frame protection | Reed-Solomon FEC (zfec k-of-n, ≈50% parity overhead) |
| P-frame recovery | Best-effort: missing chunks are zero-padded |
| Chunk interleaving | RGB and depth chunks alternate within each frame to equalize loss impact |
| Send pacing | Chunks spread evenly across the frame's time slot |
| Deadline clock | Receiver starts a wall-clock deadline on the first decoded I-frame |
| Freeze strategy | Lost frames repeat the last good frame |
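To illustrate the I-frame protection idea, here is a toy single-parity (XOR) scheme that recovers any one lost chunk. ReVo itself uses Reed-Solomon coding via zfec (k-of-n), which tolerates multiple losses per frame; every name in this sketch is hypothetical.

```python
# Toy FEC: k data chunks plus one XOR parity chunk; any one lost
# chunk out of the k+1 sent can be rebuilt. Chunks must be equal length.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def add_parity(chunks: list[bytes]) -> list[bytes]:
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def recover(received: dict[int, bytes], total: int) -> list[bytes]:
    """Rebuild the k data chunks given any k of the k+1 sent chunks."""
    missing = [i for i in range(total) if i not in received]
    assert len(missing) <= 1, "single parity recovers at most one loss"
    if missing:
        rebuilt = None
        for c in received.values():
            rebuilt = c if rebuilt is None else xor_bytes(rebuilt, c)
        received[missing[0]] = rebuilt
    return [received[i] for i in range(total - 1)]  # drop the parity slot

data = [b"aaaa", b"bbbb", b"cccc"]
sent = add_parity(data)                              # 4 chunks on the wire
lost = {i: c for i, c in enumerate(sent) if i != 1}  # chunk 1 dropped
assert recover(lost, len(sent)) == data
```

Reed-Solomon generalizes this: with k data blocks and n − k parity blocks, any k of the n sent blocks suffice, which is why ReVo can afford roughly 50% parity overhead on I-frames only.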
```bash
conda create -n revo python=3.12
conda activate revo
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install tqdm scipy pybind11 pillow pandas matplotlib pyyaml torchmetrics av aiortc aiohttp torchcodec==0.3 numpy opencv-python zfec einops pytorch_msssim timm
```

Install DCVC-RT dependencies:
Step 1: Clone the DCVC-RT repository.

```bash
git clone https://github.com/microsoft/DCVC.git
```

Step 2: Build and install the C++ extension.

```bash
cd ./DCVC/src/cpp/
pip install --no-build-isolation .
```

Step 3: Download checkpoints from https://github.com/microsoft/DCVC
For more details, or to troubleshoot the DCVC-RT installation, please refer to the official documentation at https://github.com/microsoft/DCVC.
NOTE: Ensure the codec wrapper modules (H265_wrapper.py, H264_wrapper.py,
DCVCRT_wrapper.py) are present in both src/sender/ and src/receiver/.
Run this on any machine reachable by both sender and receiver:
```bash
cd src/
python signalling_server.py
```

The server listens on 0.0.0.0:8080.
The receiver must be running before the sender initiates the WebRTC handshake.
- Edit `src/receiver/run_receiver_eval.py` with your machine IPs, codec, and directory paths.
- On the receiver machine:

  ```bash
  cd src/receiver/
  python run_receiver_eval.py
  ```

- Edit `src/sender/run_sender_eval.py` with your machine IPs, codec, and directory paths.
- Populate `src/sender/trace_map.txt` with `(category, video_stem, trace_path)` entries.
- On the sender machine:

  ```bash
  cd src/sender/
  sudo python run_sender_eval.py
  ```

See sender.md and receiver.md for full configuration details.
Once a streaming session completes, the receiver writes a log file per video.
Parse those logs into per-video .npy frame masks (required by the loss recovery step):
- Edit the `LOG_DIR` and `SAVE_DIR` variables at the top of `scripts/generate_lost_frame_map.py` to point to the receiver log directory and your desired mask output directory.
- Run from the repository root:

  ```bash
  python scripts/generate_lost_frame_map.py
  ```

Download the pre-trained checkpoints first (one-time setup):
```bash
bash scripts/download_checkpoints.sh
```

Then run inference; see lossrec.md for the full command reference for both RGB and depth streams.
| Flag | Codec |
|---|---|
| `h265` | H.265 / HEVC (default) |
| `h264` | H.264 / AVC |
| `dcvcrt` | DCVC-RT (neural video codec) |
Network emulation runs automatically — run_sender_eval.py launches
run_loss_trace.py as a subprocess for each run, passing the trace file
resolved from trace_map.txt. No manual invocation is needed.
Trace files are whitespace-separated with columns:

```
<timestamp_s> <bandwidth_mbps> <rtt_ms> <loss_0_to_1>
```
RTT is fixed at 40 ms in the current setup; only bandwidth and loss columns are
applied via Linux tc / netem. run_sender_eval.py requires sudo for
the tc commands.
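A sketch of how one trace row could be turned into a netem command; the interface name (`eth0`), the `netem_cmd` helper, and the exact `tc` argument layout are assumptions (netem's `change` verb also presumes a netem qdisc was already added). See run_loss_trace.py for the actual invocation.

```python
# Build a tc/netem command from one trace row. Only the bandwidth and
# loss columns are applied, matching the setup described above (RTT fixed).
def netem_cmd(line: str, dev: str = "eth0") -> str:
    ts, bw_mbps, rtt_ms, loss = line.split()
    loss_pct = float(loss) * 100  # trace stores loss as a 0..1 fraction
    return (f"tc qdisc change dev {dev} root netem "
            f"rate {float(bw_mbps)}mbit loss {loss_pct:g}%")

print(netem_cmd("0.0 12.5 40 0.02"))
```

Running such commands requires root, which is why the batch script is invoked with sudo.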
See FAQ.md for known issues and workarounds.
If you find ReVo useful in your research, please cite:
```bibtex
@misc{aditya2026revocrosslayerreliablevolumetric,
      title={ReVo: A Cross-Layer Reliable Volumetric Videoconferencing System},
      author={Ankur Aditya and Diptyaroop Maji and Lingdong Wang and Bhavya Ramakrishna and Ramesh Sitaraman and Prashant Shenoy},
      year={2026},
      eprint={2604.27441},
      archivePrefix={arXiv},
      primaryClass={cs.NI},
      url={https://arxiv.org/abs/2604.27441},
}
```

This codebase builds on several excellent open-source projects:
- The neural loss recovery module borrows the ViViT backbone from VideoMAE (Wang et al., NeurIPS 2022). We thank the MCG-NJU team for releasing their code.
- The `dcvcrt` codec integration uses DCVC-RT from Microsoft Research. We thank the DCVC team for open-sourcing their neural video compression framework.