Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers

This source code release accompanies the paper

Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers
Stephan Richter and Stefan Roth. In CVPR 2018.
Paper Supplemental

Please cite our work if you use code or data from this repository.

Requirements and set up

Clone the repository via git clone https://bitbucket.org/visinf/projects-2018-matryoshka ./matryoshka. Assuming you have set up an Anaconda or Miniconda environment, the following commands should get you started:

conda create -y -n matryoshka python=3.7
source activate matryoshka
conda install -y numpy scipy pillow
conda install -y pytorch torchvision -c pytorch

General notes

The shape layer representation works better the more consistent your input shapes are with respect to occlusions and nesting of 3D shapes. Meshes from different sources will probably not be consistent; in that case, fewer layers are likely to work better. Keep in mind that even a few layers can often reconstruct shapes remarkably well. If mesh quality varies within the dataset (as in ShapeNet), you are probably better off using a single shape layer and increasing the number of inner residual blocks (--block) or the number of inner feature channels (--ngf).
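The intuition behind a shape layer can be illustrated with a minimal NumPy sketch, simplified to a single layer along the z axis (the repository's voxel2layer implements the full nested decomposition, so details will differ): each column of the voxel grid is summarized by a front and a back depth map, and re-voxelization fills everything in between.

```python
import numpy as np

def first_shape_layer(vox):
    """Compute one shape layer (front/back depth maps along z)
    from a binary voxel grid of shape (D, H, W).
    Empty columns get the sentinel pair front=0, back=-1."""
    depth = vox.shape[0]
    occ = vox.any(axis=0)                         # (H, W): columns that hit the shape
    front = np.argmax(vox, axis=0)                # first occupied slice per column
    back = depth - 1 - np.argmax(vox[::-1], axis=0)  # last occupied slice per column
    front = np.where(occ, front, 0)
    back = np.where(occ, back, -1)
    return front, back

def layer_to_voxel(front, back, depth):
    """Re-voxelize one shape layer: fill every voxel between front and back."""
    z = np.arange(depth)[:, None, None]
    return (z >= front[None]) & (z <= back[None]) & (back[None] >= front[None])
```

For shapes that are convex along the chosen axis, a single layer reconstructs the grid exactly; concavities and interior cavities are what the additional nested layers capture.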

Datasets

This version supports ShapeNet in two variants: as used in 3D-R2N2 [1], and as used in PTN [2]. It also supports the high-resolution car experiment from OGN [3]. To run it with the respective datasets, please check DatasetCollector.py. It commonly expects only a base directory containing subdirectories for shapes and renderings. The renderings are expected to be 128x128 images (see below).

Adding a new dataset should be straightforward:

  1. Process images with crop_images.py.
  2. Convert binvox files to voxel grids, and voxel grids to shape layers with voxel2layer.
  3. Write an adapter inheriting from DatasetCollector, which collects the samples.
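The adapter in step 3 might look roughly like the following sketch. This is hypothetical — the actual base class interface, method names, and file layout in DatasetCollector.py may differ — but it shows the general job: pairing each shape file with its cropped renderings under a common base directory.

```python
import os

class MyDatasetCollector:
    """Hypothetical adapter sketch; adjust to the real DatasetCollector interface."""

    def __init__(self, basedir):
        self.basedir = basedir

    def collect(self):
        """Pair each shape file with its *.128.png renderings (matched by model id)."""
        samples = []
        shape_dir = os.path.join(self.basedir, 'shapes')
        render_dir = os.path.join(self.basedir, 'renderings')
        for name in sorted(os.listdir(shape_dir)):
            model_id, _ = os.path.splitext(name)
            views = [os.path.join(render_dir, f)
                     for f in sorted(os.listdir(render_dir))
                     if f.startswith(model_id) and f.endswith('.128.png')]
            if views:
                samples.append((os.path.join(shape_dir, name), views))
        return samples
```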

Input images

The networks are built to process input images of 128x128 pixels. For convenience, we provide a script that crops images to this size. Consequently, the DatasetCollector assumes that images are named *.128.png to indicate this format. Please have a look at crop_images.py and DatasetCollector.
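As a rough illustration of what such preprocessing entails — this is not the actual crop_images.py, which may behave differently — a resize-then-center-crop to 128x128 with Pillow could look like:

```python
from PIL import Image

def crop_to_128(path_in, path_out):
    """Resize the shorter side to 128 px, then center-crop to 128x128.
    A minimal stand-in for the repository's crop_images.py."""
    img = Image.open(path_in)
    w, h = img.size
    s = 128 / min(w, h)
    img = img.resize((round(w * s), round(h * s)), Image.BILINEAR)
    w, h = img.size
    left, top = (w - 128) // 2, (h - 128) // 2
    img.crop((left, top, left + 128, top + 128)).save(path_out)
```

Saving the result as model_id.128.png would then match the naming convention the DatasetCollector expects.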

References

[1] C. B. Choy, D. Xu, J. Gwak, K. Chen, and S. Savarese. 3D-R2N2: A unified approach for single and multi-view 3D object reconstruction. ECCV 2016

[2] X. Yan, J. Yang, E. Yumer, Y. Guo, and H. Lee. Perspective transformer nets: Learning single-view 3D object reconstruction without 3D supervision. NIPS 2016

[3] M. Tatarchenko, A. Dosovitskiy, and T. Brox. Octree generating networks: Efficient convolutional architectures for high-resolution 3D outputs. ICCV 2017
