Ankur Aditya1*, Diptyaroop Maji1*, Lingdong Wang1, Bhavya Ramakrishna2, Ramesh Sitaraman1,3, Prashant Shenoy1
1University of Massachusetts Amherst 2Dolby Labs 3Akamai Technologies
* Student authors with equal contribution
ReVo streams synchronized RGB + depth video between two machines in real time using WebRTC DataChannels. It supports traditional codecs (H.264, H.265) and the neural DCVC-RT codec, with forward error correction on I-frames and deadline-based frame assembly on the receiver.
```
FAQ.md
scripts/
├── generate_frame_masks.py    Builds per-video frame-corruption masks from receiver logs
└── download_checkpoints.sh    Downloads pre-trained LossRec checkpoints from Hugging Face
src/
├── signalling_server.py       WebRTC signaling server (run on any machine reachable by sender and receiver)
├── sender/
│   ├── sender-3d.py           Sender — reads video files, encodes, streams over WebRTC
│   ├── run_sender_eval.py     Batch evaluation script (iterates over traces × videos)
│   ├── run_loss_trace.py      Network emulator — applies tc/netem from a trace file
│   ├── H264_wrapper.py        H.264 codec interface
│   ├── H265_wrapper.py        H.265 / HEVC codec interface
│   ├── DCVCRT_wrapper.py      DCVC-RT neural codec interface
│   ├── trace_map.txt          Maps (category, video) → trace file for batch eval
│   ├── traces/                Network trace files organized by category (wifi/, cell/, …)
│   └── sender.md              Detailed sender documentation
├── receiver/
│   ├── receiver-3d.py         Receiver — reassembles frames, decodes, saves MP4
│   ├── run_receiver_eval.py   Batch evaluation service (pairs with run_sender_eval.py)
│   ├── H264_wrapper.py        H.264 codec interface
│   ├── H265_wrapper.py        H.265 / HEVC codec interface
│   ├── DCVCRT_wrapper.py      DCVC-RT neural codec interface
│   └── receiver.md            Detailed receiver documentation
└── lossrec/
    ├── rgb/                   RGB loss recovery model and inference script
    ├── depth/                 Depth loss recovery model and inference script
    └── lossrec.md             Usage, flag reference, and checkpoint documentation
```
Neural loss recovery training details coming soon.
```
Sender machine                Signaling server                Receiver machine
──────────────                ────────────────                ────────────────
sender-3d.py ──── SDP ────► signalling_server.py ◄──── SDP ──── receiver-3d.py
             ◄─── SDP ────                         ──── SDP ────►

        [WebRTC DataChannels established directly peer-to-peer]

sender-3d.py ────── rgb_payload / depth_payload ──────────► receiver-3d.py
                                                                 │
                                                         write_video_pyav()
                                                         saves rgb.mp4 + depth.mp4
```
- Signaling server relays SDP offer/answer between sender and receiver.
- Sender reads RGB and depth MP4 files, compresses each frame with the chosen codec, and transmits both streams over two unreliable WebRTC DataChannels.
- Receiver assembles incoming chunks into frames, decodes them, and writes the output to two MP4 files.
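The sender-side chunking and interleaving described above can be sketched as follows. This is an illustrative sketch only: the chunk size (1100 bytes) and the `chunk`/`interleave` helper names are assumptions, not ReVo's actual wire format.

```python
# Hypothetical sketch of splitting encoded frames into chunks and
# alternating RGB/depth chunks within each frame's send window.
def chunk(payload: bytes, size: int = 1100) -> list[bytes]:
    """Split an encoded frame into fixed-size chunks (last one may be short)."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]

def interleave(rgb: list[bytes], depth: list[bytes]) -> list[tuple[str, bytes]]:
    """Alternate RGB and depth chunks so a loss burst hits both streams evenly."""
    out = []
    for i in range(max(len(rgb), len(depth))):
        if i < len(rgb):
            out.append(("rgb", rgb[i]))
        if i < len(depth):
            out.append(("depth", depth[i]))
    return out

rgb_chunks = chunk(b"R" * 3000)    # 3 chunks
depth_chunks = chunk(b"D" * 2000)  # 2 chunks
sched = interleave(rgb_chunks, depth_chunks)
assert [label for label, _ in sched] == ["rgb", "depth", "rgb", "depth", "rgb"]
```

Because the streams alternate, a contiguous run of dropped packets degrades both the RGB and depth frames slightly rather than wiping out one of them entirely.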
| Design | Detail |
|---|---|
| Transport | WebRTC DataChannels — unreliable, unordered (no SCTP retransmit) |
| I-frame protection | Reed-Solomon FEC (zfec k-of-n, ≈50% parity overhead) |
| P-frame recovery | Best-effort: missing chunks are zero-padded |
| Chunk interleaving | RGB and depth chunks alternate within each frame to equalize loss impact |
| Send pacing | Chunks spread evenly across the frame's time slot |
| Deadline clock | Receiver starts a wall-clock deadline on the first decoded I-frame |
| Freeze strategy | Lost frames repeat the last good frame |
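To illustrate the I-frame protection idea, here is a toy single-parity (XOR) scheme that recovers any one lost chunk. ReVo itself uses Reed-Solomon coding via zfec (k-of-n), which tolerates multiple losses per frame; every name in this sketch is hypothetical.

```python
# Toy FEC: k data chunks plus one XOR parity chunk; any one lost
# chunk out of the k+1 sent can be rebuilt. Chunks must be equal length.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def add_parity(chunks: list[bytes]) -> list[bytes]:
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def recover(received: dict[int, bytes], total: int) -> list[bytes]:
    """Rebuild the k data chunks given any k of the k+1 sent chunks."""
    missing = [i for i in range(total) if i not in received]
    assert len(missing) <= 1, "single parity recovers at most one loss"
    if missing:
        rebuilt = None
        for c in received.values():
            rebuilt = c if rebuilt is None else xor_bytes(rebuilt, c)
        received[missing[0]] = rebuilt
    return [received[i] for i in range(total - 1)]  # drop the parity slot

data = [b"aaaa", b"bbbb", b"cccc"]
sent = add_parity(data)                              # 4 chunks on the wire
lost = {i: c for i, c in enumerate(sent) if i != 1}  # chunk 1 dropped
assert recover(lost, len(sent)) == data
```

Reed-Solomon generalizes this: with k data blocks and n − k parity blocks, any k of the n sent blocks suffice, which is why ReVo can afford roughly 50% parity overhead on I-frames only.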
```bash
conda create -n revo python=3.12
conda activate revo
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install tqdm scipy pybind11 pillow pandas matplotlib pyyaml torchmetrics av aiortc aiohttp torchcodec==0.3 numpy opencv-python zfec einops pytorch_msssim timm
```

Install DCVC-RT dependencies:
Step 1: Clone the DCVC-RT repository.

```bash
git clone https://github.com/microsoft/DCVC.git
```

Step 2: Build and install the C++ extension.

```bash
cd ./DCVC/src/cpp/
pip install --no-build-isolation .
```

Step 3: Download checkpoints from https://github.com/microsoft/DCVC
For more details, or to troubleshoot the DCVC-RT installation, please refer to the official documentation at https://github.com/microsoft/DCVC.
NOTE: Ensure the codec wrapper modules (H265_wrapper.py, H264_wrapper.py,
DCVCRT_wrapper.py) are present in both src/sender/ and src/receiver/.
Run this on any machine reachable by both sender and receiver:
```bash
cd src/
python signalling_server.py
```

The server listens on 0.0.0.0:8080.
The receiver must be running before the sender initiates the WebRTC handshake.
- Edit `src/receiver/run_receiver_eval.py` with your machine IPs, codec, and directory paths.
- On the receiver machine:

  ```bash
  cd src/receiver/
  python run_receiver_eval.py
  ```

- Edit `src/sender/run_sender_eval.py` with your machine IPs, codec, and directory paths.
- Populate `src/sender/trace_map.txt` with `(category, video_stem, trace_path)` entries.
- On the sender machine:

  ```bash
  cd src/sender/
  sudo python run_sender_eval.py
  ```

See sender.md and receiver.md for full configuration details.
Once a streaming session completes, the receiver writes a log file per video.
Parse those logs into per-video .npy frame masks (required by the loss recovery step):
- Edit the `LOG_DIR` and `SAVE_DIR` variables at the top of `scripts/generate_lost_frame_map.py` to point to the receiver log directory and your desired mask output directory.
- Run from the repository root:

  ```bash
  python scripts/generate_lost_frame_map.py
  ```

Download the pre-trained checkpoints first (one-time setup):
```bash
bash scripts/download_checkpoints.sh
```

Then run inference; see lossrec.md for the full command reference for both RGB and depth streams.
| Flag | Codec |
|---|---|
| `h265` | H.265 / HEVC (default) |
| `h264` | H.264 / AVC |
| `dcvcrt` | DCVC-RT (neural video codec) |
Network emulation runs automatically — run_sender_eval.py launches
run_loss_trace.py as a subprocess for each run, passing the trace file
resolved from trace_map.txt. No manual invocation is needed.
Trace files are whitespace-separated with columns:

```
<timestamp_s> <bandwidth_mbps> <rtt_ms> <loss_0_to_1>
```
RTT is fixed at 40 ms in the current setup; only bandwidth and loss columns are
applied via Linux tc / netem. run_sender_eval.py requires sudo for
the tc commands.
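A sketch of how one trace row could be turned into a netem command; the interface name (`eth0`), the `netem_cmd` helper, and the exact `tc` argument layout are assumptions (netem's `change` verb also presumes a netem qdisc was already added). See run_loss_trace.py for the actual invocation.

```python
# Build a tc/netem command from one trace row. Only the bandwidth and
# loss columns are applied, matching the setup described above (RTT fixed).
def netem_cmd(line: str, dev: str = "eth0") -> str:
    ts, bw_mbps, rtt_ms, loss = line.split()
    loss_pct = float(loss) * 100  # trace stores loss as a 0..1 fraction
    return (f"tc qdisc change dev {dev} root netem "
            f"rate {float(bw_mbps)}mbit loss {loss_pct:g}%")

print(netem_cmd("0.0 12.5 40 0.02"))
```

Running such commands requires root, which is why the batch script is invoked with sudo.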
See FAQ.md for known issues and workarounds.
If you find ReVo useful in your research, please cite:
```bibtex
@misc{aditya2026revocrosslayerreliablevolumetric,
      title={ReVo: A Cross-Layer Reliable Volumetric Videoconferencing System},
      author={Ankur Aditya and Diptyaroop Maji and Lingdong Wang and Bhavya Ramakrishna and Ramesh Sitaraman and Prashant Shenoy},
      year={2026},
      eprint={2604.27441},
      archivePrefix={arXiv},
      primaryClass={cs.NI},
      url={https://arxiv.org/abs/2604.27441},
}
```

This codebase builds on several excellent open-source projects:
- The neural loss recovery module borrows the ViViT backbone from VideoMAE (Wang et al., NeurIPS 2022). We thank the MCG-NJU team for releasing their code.
- The `dcvcrt` codec integration uses DCVC-RT from Microsoft Research. We thank the DCVC team for open-sourcing their neural video compression framework.