This repository contains the source code for the cuboid-based scene decomposition method described in our paper Robust Shape Fitting for 3D Scene Abstraction. It is extension of our previous work Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images. Please refer to the following repository if you are looking for the source code of the previous version: cuboids_revisited_cvpr21.
If you use this code, please cite both papers:
@article{kluger2024robust,
title={Robust Shape Fitting for 3D Scene Abstraction},
author={Kluger, Florian and Brachmann, Eric and Yang, Michael Ying and Rosenhahn, Bodo},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2024}
}
@inproceedings{kluger2021cuboids,
title={Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images},
author={Kluger, Florian and Ackermann, Hanno and Brachmann, Eric and Yang, Michael Ying and Rosenhahn, Bodo},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2021}
}
For depth estimation, we utilise BTS. If you do as well, please also cite their paper:
@article{lee2019big,
title={From big to small: Multi-scale local planar guidance for monocular depth estimation},
author={Lee, Jin Han and Han, Myung-Kyu and Ko, Dong Wook and Suh, Il Hong},
journal={arXiv preprint arXiv:1907.10326},
year={2019}
}
Get the code:
git clone --recurse-submodules https://github.com/fkluger/cuboids_revisited.git
cd cuboids_revisited
git submodule update --init --recursive
Set up the Python environment using Anaconda:
conda env create -f environment.yml
source activate cuboids_pami
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge
In order to use the NYU Depth v2 dataset, you need to obtain the original MAT-file and convert it to a version 7 MAT-file in MATLAB so that we can load it via scipy:
load('nyu_depth_v2_labeled.mat')
save('nyu_depth_v2_labeled.v7.mat','-v7')
Then, extract all images and depth maps to separate Pickle files using our helper script:
python util/extract_nyu_to_files.py --source nyu_depth_v2_labeled.v7.mat --destination ./datasets/nyu_depth/files
Download the SMH dataset with depth maps from here: https://github.com/fkluger/smh
Download our pre-trained models, which we used for the experiments in our paper from
here and place the files in the models
directory.
If you want to train our method for RGB input, please also obtain the pre-trained weights for the BTS depth estimator
from here and place them in
the models
folder as well.
In order to repeat the main experiments from our paper using pre-trained neural networks, you can run the following commands:
python evaluate.py --load models/nyu_gt_numerical/run1/consac_weights_best.net
This will run our method for depth input on the NYU test set with the parameters used in the paper and report all
evaluation metrics at the end.
Replace run1
with run2 ... run5
to evaluate one of the other training runs.
python evaluate.py --data_path datasets/nyu_depth/files --load models/nyu_gt_neural/run1/consac_weights_best.net --minsolver transformer --load_solver models/nyu_gt_neural/run1/primitive_fit_weights_best.net
python evaluate.py --data_path datasets/nyu_depth/files --depth_model bts --load models/nyu_bts_numerical/run1/consac_weights_best.net --load_depth models/nyu_bts_numerical/run1/depth_weights_best.net
python evaluate.py --data_path datasets/nyu_depth/files --depth_model bts --load models/nyu_bts_neural/run1/consac_weights_best.net --load_depth models/nyu_bts_neural/run1/depth_weights_best.net --minsolver transformer --load_solver models/nyu_bts_neural/run1/primitive_fit_weights_best.net
python evaluate.py --dataset smh --data_path /path/to/smh -t 0.04 --a_min 2 --a_max 30 --load models/smh_gt_numerical/run1/consac_weights_best.net --instances 16 --fitting_lr 0.5
python evaluate.py --dataset smh --data_path /path/to/smh -t 0.04 --a_min 2 --a_max 30 --load models/smh_gt_neural/run1/consac_weights_best.net --minsolver transformer --load_solver models/smh_gt_neural/run1/primitive_fit_weights_best.net
Add the option --visualise
to save plots visualising the results for each image.
Set the destination folder with the --eval_results PATH
option.
python train.py --train_consac --hyps 32 --data_path datasets/nyu_depth/files
python pretrain_solver.py --dataset nyu
python train.py --dataset smh --data_path /path/to/smh --train_consac --hyps 32 --consac_lr 1e-6 --maximise_second_entropy 0.1 --epochs 100 --a_min 2.0 --a_min 30 --fitting_lr 0.5
python pretrain_solver.py --dataset smh
python train.py --minsolver transformer --train_consac --train_solver --consac_lr 1e-7 --solver_lr 1e-7 --softmax_alpha 1000 --max_prob_loss 0 --minimise_corr 0 --maximise_second_entropy 0 --hyps 32 --load models/nyu_gt_numerical/run1/consac_weights_best.net --load_solver models/solver/nyu/primitive_fit_weights_best.net --data_path datasets/nyu_depth/files
python train.py --depth_model bts --train_consac --train_depth --consac_lr 1e-7 --depth_lr 1e-7 --softmax_alpha 1000 --max_prob_loss 0 --minimise_corr 0 --maximise_second_entropy 0 --hyps 32 --load models/nyu_gt_numerical/run1/consac_weights_best.net
python train.py --minsolver transformer --depth_model bts --train_consac --train_solver --train_depth --consac_lr 1e-7 --solver_lr 1e-7 --depth_lr 1e-7 --softmax_alpha 1000 --max_prob_loss 0 --minimise_corr 0 --maximise_second_entropy 0 --hyps 32 --load models/nyu_gt_numerical/run1/consac_weights_best.net --load_solver models/solver/nyu/primitive_fit_weights_best.net --data_path datasets/nyu_depth/files
python train.py --dataset smh --data_path /path/to/smh -t 0.04 --a_min 2 --a_max 30 --minsolver transformer --train_consac --train_solver --consac_lr 1e-8 --solver_lr 1e-8 --softmax_alpha 1000 --max_prob_loss 0 --minimise_corr 0 --maximise_second_entropy 0 --hyps 32 --load models/smh_gt_numerical/run1/consac_weights_best.net --load_solver models/solver/smh/primitive_fit_weights_best.net
You may want to use a separate GPU for --depth_gpu
, as the whole pipeline does not fit on a single GPU with 12GB memory.