
GEOMetrics

This is a repository to reproduce the methods from the paper "GEOMetrics: Exploiting Geometric Structure for Graph-Encoded Objects". The project combines several new ideas for mesh generation, applied to reconstructing mesh objects from single images. The goal of this project is to produce mesh objects which properly take advantage of the graceful scaling properties of their graph-based representation.

Example of the variation in face density our method achieves

There are 4 main ideas proposed in this project:

  • A differentiable surface sampling of faces, allowing a point-to-point loss and a point-to-surface loss to be introduced. This is examined in the Loss_Comparison directory.
  • A latent loss based on minimizing the distance between encodings of mesh objects, produced through a mesh-to-voxel mapping procedure.
  • An extension to the standard Graph Convolution Network called 0N-GCN which prevents vertex smoothing. This is defined in Layers.py; a rough sketch of the idea follows this list.
  • An adaptive face splitting procedure which analyses local face curvature to encourage local complexity to emerge.
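
As a rough illustration of the 0N-GCN idea (the actual layer is defined in Layers.py; the class name and the mean-aggregation choice below are assumptions made for this sketch), the layer can be thought of as giving a vertex's own features a weight matrix separate from the one applied to its aggregated neighbors, so that stacking layers does not average every vertex toward its neighborhood:

```python
import torch
import torch.nn as nn

class ZeroNeighborGCN(nn.Module):
    # Hypothetical sketch of a 0N-GCN-style layer: the vertex's own
    # features get a dedicated weight matrix, separate from the weights
    # applied to the aggregated neighbor features, so repeated
    # convolutions do not smooth every vertex toward its neighbors.
    def __init__(self, in_features, out_features):
        super(ZeroNeighborGCN, self).__init__()
        self.w_self = nn.Linear(in_features, out_features)   # self-connection
        self.w_neigh = nn.Linear(in_features, out_features)  # neighbor aggregation

    def forward(self, verts, adj):
        # verts: (N, in_features) vertex features
        # adj:   (N, N) adjacency matrix with zero diagonal
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)      # guard empty neighborhoods
        neigh = adj.mm(verts) / deg                          # mean over neighbors only
        return torch.relu(self.w_self(verts) + self.w_neigh(neigh))
```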

This project runs with the following dependencies:

  • python 2.7
  • pytorch 0.4.0
  • scipy
  • matplotlib
  • PIL
  • tqdm

Data Production

To produce the data needed to train and test the methods of this project, use the 'data_prep.py' script. This will download CAD models from the core classes of the ShapeNet dataset, produce the data required for the latent loss, sample the surface of each ground-truth mesh, render the objects as images, and split all the data into training, validation and test sets. This script makes use of the binvox executable, so first call

sudo chmod 777 scripts/binvox 

Blender is also needed for this project so please ensure it is installed before beginning.

sudo apt install blender

By default this script downloads the full chair class and renders 24 images for each object. To achieve this, call:

python data_prep.py

As an example of how to customize the data, to produce 1000 plane objects call:

python data_prep.py --object plane -no 1000

Differentiable Surface Losses

We introduce two new losses for reconstructing meshes. These losses are based on the idea of differentiating through the random selection of points on a triangular surface via the reparametrization trick. This allows the adoption of a Chamfer loss comparing samplings of the ground-truth and predicted mesh surfaces, which does not explicitly penalize the position of vertices. We call this the point-to-point loss. The same idea also allows the adoption of a more accurate loss which compares a sampled set of points to a surface directly, using the "3D point to triangle distance" algorithm. We call this the point-to-surface loss. In the Loss_Comparison directory we compare these two losses, and a loss which directly penalizes vertex positions, with respect to their ability to reconstruct surfaces.

These functions require a python package to be built. To do this call:

python chamfer_distance/build.py
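
As an illustration of the sampling step underlying both losses, here is a minimal sketch, assuming verts is a (V, 3) float tensor and faces an (F, 3) index tensor (this shows the reparametrization idea, not the repo's implementation):

```python
import torch

def sample_surface(verts, faces, n):
    # verts: (V, 3) vertex positions, faces: (F, 3) vertex indices.
    # Pick faces with probability proportional to their area.
    v1, v2, v3 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    areas = 0.5 * torch.cross(v2 - v1, v3 - v1, dim=1).norm(dim=1)
    idx = torch.multinomial(areas, n, replacement=True)
    a, b, c = v1[idx], v2[idx], v3[idx]
    # Uniform sample on each triangle via reparametrized barycentric
    # coordinates: the randomness is drawn independently of the mesh,
    # so the points remain differentiable w.r.t. vertex positions.
    r1 = torch.rand(n, 1).sqrt()
    r2 = torch.rand(n, 1)
    return (1 - r1) * a + r1 * (1 - r2) * b + r1 * r2 * c
```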

A diagram comparing different reconstruction losses.

Latent Loss

One of the main contributions of this project, and a principal loss term for the complete mesh generation pipeline, is the latent loss. To produce this loss we first train a mesh-to-voxel mapping. A mesh encoder, made up of our proposed 0N-GCN layers, takes as input a mesh object defined by vertex positions and an adjacency matrix, and outputs a small latent vector. This vector is passed to a voxel decoder which outputs a voxelized representation of the original mesh. This mapping is trained to minimize the MSE between the ground-truth voxelization of the mesh and the predicted voxelization. When training the complete mesh prediction system, the training objective is partly defined by the MSE between the latent embeddings of the ground-truth mesh and the predicted mesh.
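
A minimal sketch of how such a latent loss can be computed once the encoder is trained (the encoder signature and names here are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def latent_loss(encoder, gt_verts, gt_adj, pred_verts, pred_adj):
    # encoder: the pretrained 0N-GCN mesh encoder, kept frozen here.
    # The ground-truth embedding is treated as a fixed target.
    with torch.no_grad():
        target = encoder(gt_verts, gt_adj)
    pred = encoder(pred_verts, pred_adj)   # gradients flow to the predicted mesh
    return F.mse_loss(pred, target)
```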

To train this system call

python auto_encoder.py --object $obj$

where $obj$ is the object class you wish to train on.

A diagram illustrating the mesh-to-voxel mapping and how it is employed for producing the latent loss.

Mesh Reconstruction

The ideas put forth in this paper are applied to the task of reconstructing 3D meshes from single RGB images. This is accomplished by iteratively applying what we call a mesh reconstruction module to an input mesh and image pair. In each module, image features are extracted from the image and projected onto the input mesh. The mesh is then passed through a series of our proposed 0N-GCN layers to deform its shape. Finally, the surface of the mesh is adaptively redefined based on the local curvature of its faces. The first module is presented with a predefined mesh along with the target image, and each subsequent module takes the output of the previous module as its input mesh. The loss for this system is a combination of the latent loss, the differentiable surface losses, and two regularizers.
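
In pseudocode, the pipeline looks roughly as follows (the module structure and names such as project, deform and split_faces are hypothetical stand-ins for the components described above):

```python
import torch

def reconstruct(image, verts, adj, cnn, modules):
    # cnn extracts image features once; each reconstruction module then
    # refines the mesh produced by the previous one.
    feats = cnn(image)
    for module in modules:
        proj = module.project(feats, verts)           # per-vertex image features
        verts = module.deform(torch.cat([verts, proj], dim=1), adj)
        verts, adj = module.split_faces(verts, adj)   # curvature-driven refinement
    return verts, adj
```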

To train this system call

python GEOMetrics.py --object $obj$

where $obj$ is the object class you wish to train.

To render the results of a trained system on the test set call

python GEOMetrics.py --object $obj$ --render

where $obj$ is the object class of the model you wish to render.

A single mesh reconstruction module.

Reconstruction results

Reference:

Please cite the paper (https://arxiv.org/pdf/1802.09987.pdf) if you use this repo for research, with the following BibTeX:

        @incollection{NIPS2018_7883,
        title = {Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation},
        author = {Smith, Edward and Fujimoto, Scott and Meger, David},
        booktitle = {Advances in Neural Information Processing Systems 31},
        editor = {S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett},
        pages = {6479--6489},
        year = {2018},
        publisher = {Curran Associates, Inc.},
        url = {http://papers.nips.cc/paper/7883-multi-view-silhouette-and-depth-decomposition-for-high-resolution-3d-object-representation.pdf}
        }