Differentiable Blocks World:
Qualitative 3D Decomposition by Rendering Primitives
Tom Monnier Jake Austin Angjoo Kanazawa Alexei Efros Mathieu Aubry
Modified PyTorch implementation of Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives.
This fork has been modified to use Rerun for visualization.
Check out this webpage for more video results!
This repository contains:
- scripts to download and load datasets
- configs to optimize the models from scratch
- evaluation pipelines to reproduce quantitative results
- guidelines to run the model on a new scene
If you find this code useful, don't forget to star the repo ⭐ and cite the paper 👇
```
@article{monnier2023dbw,
  title={{Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives}},
  author={Monnier, Tom and Austin, Jake and Kanazawa, Angjoo and Efros, Alexei A. and Aubry, Mathieu},
  journal={{arXiv:2307.05473 [cs.CV]}},
  year={2023},
}
```
Starting from an empty Python environment (e.g., `python3 -m venv dbw`) with CUDA 11.7 or 11.8 available (check that `nvcc --version` works), install torch for your CUDA version (see the PyTorch website for details). Then install all other requirements with:

```
pip install -r requirements.txt
```

Note that this installs pytorch3d from source, which requires torch to be installed first, hence the separate torch installation step above.

TODO make sure this works as expected

```
conda env create -f environment.yml
conda activate dbw
```
Optional live monitoring 📉
Some monitoring routines are implemented; you can use them by specifying your visdom port in the config file. You will need to install visdom from source beforehand:

```
git clone https://github.com/facebookresearch/visdom
cd visdom && pip install -e .
```
Optional Nerfstudio dataloading 🚜
If you want to load data processed by Nerfstudio (e.g., for a custom scene), you will need to install nerfstudio as described here. In general, executing the following lines should do the job:

```
pip install ninja==1.10.2.3 git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install nerfstudio==0.1.15
```
```
bash scripts/download_data.sh
```
This command will download one of the following sets of scenes presented in the paper:

- DTU: paper / dataset (1.86GB, pre-processing conventions come from IDR, big thanks to the team!)
- BlendedMVS: paper / dataset (115MB, thanks to the VolSDF team for hosting the dataset)
- Nerfstudio: paper / repo / dataset (2.67GB, images and Nerfacto models for the 2 scenes in the paper)

It may happen that `gdown` hangs; if so, download the file manually and move it to the `datasets` folder.
To launch a training from scratch, run:

```
cuda=gpu_id config=filename.yml tag=run_tag ./scripts/pipeline.sh
```

where `gpu_id` is a device id, `filename.yml` is a config in the `configs` folder, and `run_tag` is a tag for the experiment. Results are saved at `runs/${DATASET}/${DATE}_${run_tag}`, where `DATASET` is the dataset name specified in `filename.yml` and `DATE` is the current date in `mmdd` format.
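For reference, the output path above can be reproduced as follows. This is only a sketch of the naming convention; the `run_dir` helper is hypothetical and the actual path construction lives in the training scripts:

```python
from datetime import datetime

def run_dir(dataset: str, run_tag: str) -> str:
    """Build the results path runs/${DATASET}/${DATE}_${run_tag},
    where DATE is the current date in mmdd format (hypothetical helper)."""
    date = datetime.now().strftime("%m%d")
    return f"runs/{dataset}/{date}_{run_tag}"

# e.g. runs/dtu/0712_my_run (date part depends on the current day)
print(run_dir("dtu", "my_run"))
```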
Available configs 🔆
- `dtu/*.yml` for each DTU scene
- `bmvs/*.yml` for each BlendedMVS scene
- `nerfstudio/*.yml` for each Nerfstudio scene
NB: to run on Nerfstudio scenes, you need to install the nerfstudio library (see the installation section).
Computational cost 💰
Optimization takes roughly 4 hours on a single GPU.
Our model is evaluated at the end of each run; scores are written to `dtu_scores.tsv` for the official Chamfer evaluation and to `final_scores.tsv` for training losses, transparencies, and image rendering metrics.
To reproduce our results on a single DTU scene, run the following command, which launches 5 sequential runs with different seeds (the `auto` score is the one with minimal training loss):

```
cuda=gpu_id config=dtu/scanXX.yml tag=default_scanXX ./scripts/multi_pipeline.sh
```
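The `auto` selection above amounts to picking, among the 5 seeded runs, the one with the lowest training loss. A minimal sketch of that logic, assuming a hypothetical tab-separated layout (the real column names in `final_scores.tsv` may differ):

```python
import csv
import io

# Hypothetical sample in the spirit of final_scores.tsv: one row per seeded run.
SAMPLE_TSV = """run\ttrain_loss\tchamfer
seed0\t0.052\t1.91
seed1\t0.047\t1.85
seed2\t0.061\t2.10
"""

def auto_run(tsv_text: str) -> str:
    """Return the run with minimal training loss (the 'auto' selection)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    best = min(rows, key=lambda r: float(r["train_loss"]))
    return best["run"]

print(auto_run(SAMPLE_TSV))  # seed1
```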
Get numbers for EMS and MBF baselines 📋
For completeness, we provide scripts for processing data and evaluating the following baselines:
- EMS: run `scripts/ems_pproc.sh`, then apply EMS using the official repo, then run `scripts/ems_eval.sh` to evaluate the 3D decomposition
- MBF: run `scripts/mbf_pproc.sh`, then apply MBF using the official repo, then run `scripts/mbf_eval.sh` to evaluate the 3D decomposition

Do not forget to update the paths of the baseline repos in `src/utils/path.py`. Results will also be computed with the preprocessing step that removes the ground from the 3D input.
If you want to run our model on a custom scene, we recommend using the Nerfstudio framework and guidelines to process your multi-view images, obtain the cameras, and check their quality by optimizing their default 3D model. The resulting data and output model should be moved to the `datasets/nerfstudio` folder in the same format as the other Nerfstudio scenes (you can also use symlinks). Then, add the model path in the custom Nerfstudio dataloader (`src/datasets/nerfstudio.py`), create a new config from one of our nerfstudio configs, and run the model. One thing that is specific to each scene is the initialization of `R_world` and `T_world`, which can be roughly estimated by visual comparison in plotly or Blender using the pseudo ground-truth point cloud.
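When eyeballing an initial `R_world` this way, it can help to build candidate rotations from Euler angles and check them against the point cloud. The sketch below is only an illustration: the axis order and angle convention expected by the configs are assumptions, not confirmed by the repo.

```python
import numpy as np

def euler_to_rotation(yaw: float, pitch: float, roll: float) -> np.ndarray:
    """Rotation matrix from Euler angles in radians, composed Z-Y-X.
    Hypothetical convention for sketching an initial R_world; verify
    the convention your config actually uses before relying on it."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

R = euler_to_rotation(np.pi / 2, 0.0, 0.0)
# A valid rotation is orthonormal with determinant +1.
assert np.allclose(R @ R.T, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)
```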
If you like this project, check out related works from our group: