This file is the deliverable for the first exercise of the course 194.077 Applied Deep Learning at TU Wien: https://tiss.tuwien.ac.at/course/courseDetails.xhtml?dswid=4761&dsrid=925&courseNr=194077&semester=2019W

CapsVoxGAN

Description

My project aims to build a three-dimensional generative adversarial network (GAN) that generates voxel models using a three-dimensional capsule network. CNNs are quite good at detecting features, but they do not take part-to-whole relationships into account. The following picture illustrates this:

Source: https://towardsdatascience.com/capsule-networks-the-new-deep-learning-network-bd917e6818e8

This picture looks somewhat like a face, but more like an abstract painting than an actual face. All the features required for a face are present, but their alignment and their relative sizes are odd. In two-dimensional settings this is not ideal, but the results of CNNs are still good enough in practice. In three-dimensional settings the situation is different: spatial arrangement matters more because of the extra dimension, especially when generating models.

Capsule networks were introduced by Geoffrey Hinton, who is not pleased with the pooling operations in CNNs:

The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster. (Source)

They are a relatively new concept, but they have already been used for GANs and for 3D data. To the best of my knowledge, this is the first attempt to use a GAN with a capsule network to generate voxel models.
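
To make the pooling criticism quoted above a bit more concrete, here is a minimal sketch in plain Python. The two 4x4 feature maps and the helper `max_pool_2x2` are hypothetical and only serve as an illustration; they are not part of the project code. Both maps contain the same four activations, once spread to the corners and once clustered in the centre, and 2x2 max pooling maps them to the same output.

```python
def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling over a 2D list with even side lengths."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Hypothetical feature maps: same activations, different spatial arrangement.
arrangement_a = [
    [9, 0, 0, 7],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [5, 0, 0, 3],
]
arrangement_b = [
    [0, 0, 0, 0],
    [0, 9, 7, 0],
    [0, 5, 3, 0],
    [0, 0, 0, 0],
]

pooled_a = max_pool_2x2(arrangement_a)
pooled_b = max_pool_2x2(arrangement_b)

# Both arrangements collapse to the same pooled map.
print(pooled_a == pooled_b)
```

After pooling, the information about how far apart the features were is gone, which is the kind of spatial relationship a capsule network is meant to preserve.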

Dataset

Machine learning with 3D objects or scenes is a relatively new area, so there is less annotated data available than for image recognition. Luckily, this has started to change in recent years; here you can find a good overview of available datasets.

The dataset of my choice is ModelNet40, which consists of 12,311 models from 40 categories. The models are polygon meshes, so I have to convert them into voxel models first. I'll do this using PyMesh; alternatively, I could use binvox together with Gmsh. I plan to set the voxel grid to 64x64x64 as a compromise between computational effort and quality of the results, but this is subject to change.
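
As a rough illustration of the mesh-to-voxel step described above, the following sketch maps points into a 64x64x64 occupancy grid using only the Python standard library. It is not the PyMesh or binvox API: the function `voxelize_vertices` and the cube input are hypothetical, and the sketch only marks cells that contain a mesh vertex, whereas a real conversion would also have to cover the faces between vertices.

```python
def voxelize_vertices(vertices, resolution=64):
    """Return the set of occupied (i, j, k) cells for a list of (x, y, z) points."""
    if not vertices:
        return set()
    mins = [min(v[axis] for v in vertices) for axis in range(3)]
    maxs = [max(v[axis] for v in vertices) for axis in range(3)]
    # One uniform scale for all axes keeps the model's aspect ratio.
    # Fall back to 1.0 if all points coincide, to avoid dividing by zero.
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    occupied = set()
    for v in vertices:
        cell = tuple(
            # Clamp so that points on the upper boundary land in the last cell.
            min(int((v[axis] - mins[axis]) / extent * resolution), resolution - 1)
            for axis in range(3)
        )
        occupied.add(cell)
    return occupied


# Hypothetical input: the eight corners of a unit cube.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
grid = voxelize_vertices(cube, resolution=64)

# Number of cells a dense grid of this resolution would hold.
dense_cells = 64 ** 3
```

A dense 64x64x64 grid holds 262,144 cells per model, and doubling the resolution to 128 multiplies that by eight, which is why the grid size is a trade-off between computational effort and detail.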

Project Type

Bring your own method.

Work-Breakdown:

| Task | Hours |
| --- | --- |
| Getting familiar with the data / used libraries | 10 |
| In-depth reading of related publications | 10 |
| Coding of solution | 25 |
| Creating presentation of results | 10 |

N.B.

I am aware that I might be biting off more than I can chew, but whatever the final result will be, the journey is its own reward 😃