

Multiview Compressive Coding for 3D Reconstruction

Chao-Yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari


This is a PyTorch implementation of Multiview Compressive Coding (MCC), which has been modified to use Rerun for visualization.

@article{wu2023multiview,
  author    = {Wu, Chao-Yuan and Johnson, Justin and Malik, Jitendra and Feichtenhofer, Christoph and Gkioxari, Georgia},
  title     = {Multiview Compressive Coding for 3{D} Reconstruction},
  journal   = {arXiv preprint arXiv:2301.08247},
  year      = {2023},
}

Installation

Conda

Assuming conda is installed on your system, dependencies can be installed with

conda env create -f environment.yml

virtualenv + pip

Make sure nvcc is available (CUDA 11.7 is assumed and was tested here; feel free to try other versions). Then create a virtualenv and install the dependencies.

python3 -m venv env
source env/bin/activate
pip install torch==1.13.1 torchvision==0.14.1 # install these manually first, because pytorch3d doesn't specify its install dependencies properly
pip install -r requirements.txt
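
Since the pip route depends on nvcc being available, it can help to confirm that before installing. The following is a minimal sketch, not part of this repository: the helper names `find_nvcc` and `nvcc_version` are hypothetical, and the script only reports what is on your PATH without checking that the version matches 11.7.

```python
import shutil
import subprocess


def find_nvcc():
    """Return the path to nvcc if it is on PATH, else None."""
    return shutil.which("nvcc")


def nvcc_version(nvcc_path):
    """Return the raw `nvcc --version` output for manual inspection."""
    result = subprocess.run(
        [nvcc_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_nvcc()
    if path is None:
        print("nvcc not found on PATH; install the CUDA toolkit or fix PATH first")
    else:
        # Compare the reported release against the version you intend to use.
        print(nvcc_version(path))
```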

Quick Rerun visualization


This fork uses Rerun to make the method easier to visualize. For a quick demo, run

python demo.py

Data

Please see DATASET.md for information on data preparation.

Running MCC on CO3D v2

To train an MCC model on the CO3D v2 dataset, please run

python submitit_mcc.py \
    --use_volta32 \
    --job_dir ./output \
    --nodes 4 \
    --co3d_path [path to CO3D dataset] \
    --resume [path to pretrained weights for RGB encoder (optional)] \
    --holdout_categories
  • With --holdout_categories, we hold out a subset of categories during training and evaluate on the held-out categories only. To train on all categories, please remove the argument.
  • Here we use 4 nodes (machines) by default; users may use a different value.
  • Optional: The RGB encoder may be initialized from a pre-trained image model. An ImageNet1K-MAE-pretrained model is available [here]. Using a pre-trained model may speed up training but does not affect the final results much.

Running MCC on Hypersim

To train an MCC model on the Hypersim dataset, please run

python submitit_mcc.py \
    --use_volta32 \
    --job_dir ./output \
    --nodes 4 \
    --hypersim_path [path to Hypersim dataset] \
    --resume [path to pretrained weights for RGB encoder (optional)] \
    --use_hypersim \
    --viz_granularity 0.2 --eval_granularity 0.2 \
    --blr 5e-5 \
    --epochs 50 \
    --train_epoch_len_multiplier 3200 \
    --eval_epoch_len_multiplier 200
  • Here we additionally specify --use_hypersim for running Hypersim scene reconstruction experiments.
  • We use slightly different hyperparameters to accommodate the scene reconstruction task.

Testing on iPhone captures

To test on iPhone captures, please use the Record3D App on an iPhone to capture an RGB image and the corresponding point cloud (.obj) file. To generate the segmentation mask, we used a private segmentation model; users may use other tools or models to obtain the mask. Two example captures are available in the demo folder.
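
To inspect a captured point cloud before running inference, the sketch below reads vertex lines from Wavefront OBJ text. It is an illustration only and not how demo.py loads the file: the function name `parse_obj_vertices` is hypothetical, and it assumes vertices are stored as `v x y z` lines, optionally followed by three color values, which may not match every export.

```python
def parse_obj_vertices(text):
    """Collect vertex positions (and optional per-vertex colors) from OBJ text.

    Assumes lines of the form `v x y z` or `v x y z r g b`; every other
    line type (faces, normals, comments) is skipped.
    """
    points, colors = [], []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[0] != "v":
            continue
        values = [float(p) for p in parts[1:]]
        points.append(tuple(values[:3]))
        # Colors are optional; record None when a vertex carries none.
        colors.append(tuple(values[3:6]) if len(values) >= 6 else None)
    return points, colors


if __name__ == "__main__":
    sample = "v 1 2 3 0.5 0.5 0.5\nvn 0 0 1\nv 0 0 1\n"
    pts, cols = parse_obj_vertices(sample)
    print(len(pts), "vertices;", sum(c is not None for c in cols), "with color")
```

To use it on a real capture, read the file contents first, e.g. `parse_obj_vertices(open("demo/quest2.obj").read())`.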

To run MCC inference on the example, please use, e.g.,

python demo.py --image demo/quest2.jpg --point_cloud demo/quest2.obj --seg demo/quest2_seg.png \
--checkpoint [path to model checkpoint]

One may use a checkpoint from the training step above or download a model that is already trained on all CO3D v2 categories [here]. One may set the --score_thresholds argument to specify the score thresholds (more points are shown with a lower threshold, but the predictions might be noisier). The script will generate an HTML file showing an interactive visualization of the MCC output with plotly.
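
The effect of a score threshold can be illustrated with a small sketch. This is not the repository's code: `filter_by_score` is a hypothetical helper, and it assumes each predicted point comes with a score in [0, 1] and is shown only when its score exceeds the threshold.

```python
def filter_by_score(points, scores, threshold):
    """Keep only the points whose score is strictly above the threshold."""
    return [p for p, s in zip(points, scores) if s > threshold]


if __name__ == "__main__":
    # Made-up points and scores, purely for illustration.
    points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    scores = [0.05, 0.2, 0.6, 0.9]
    for threshold in (0.1, 0.5):
        kept = filter_by_score(points, scores, threshold)
        print(f"threshold={threshold}: {len(kept)} points kept")
```

Lowering the threshold keeps more points, including lower-scoring ones, which is why the output gets denser but noisier.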


Acknowledgement

Part of this implementation is based on the MAE codebase. We thank Sasha Sax for help on loading Hypersim and Taskonomy data.

License

Multiview Compressive Coding is released under the CC-BY-NC 4.0 license.
