
poc-sparse-feature-maps

POC for using sparse feature maps

Vision

It could be pleasing to have the final layer of some future CNN contain one feature map for each type of object, e.g. bicycles, grand pianos, and so on. If you put a '1' in one of these feature maps, and then back-propagated onto white noise, it could be cool if that somehow generated, say, a grand piano at that position in the image.

For this to happen, we would need maybe ten thousand, or even a million, feature maps at the output. Most of these would be all zeros: they would be sparse. Since there are so many of them, ideally we need some kind of sparse representation.
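As a rough illustration of the back-propagation idea, here is a minimal sketch, assuming Torch7 with the nn package; the toy one-layer net and the choice of plane 5 are made up purely for illustration:

require 'nn'

-- toy stand-in for "some future CNN": 3 input planes -> 8 output planes
local net = nn.SpatialConvolution(3, 8, 5, 5)

local input = torch.randn(3, 32, 32)   -- white-noise image
local output = net:forward(input)      -- 8 x 28 x 28

-- put a '1' into one output feature map, at one position
local gradOutput = output:clone():zero()
gradOutput[5][14][14] = 1

-- back-propagate onto the noise, and take a small gradient-ascent step
local gradInput = net:backward(input, gradOutput)
input:add(0.1, gradInput)

Repeating the last three steps would gradually nudge the noise towards whatever most activates that plane at that position.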

There are two parts to achieving this:

  • theoretical: how to enforce sparseness?
  • engineering: how to store sparse tensors, implement a sparse library, etc.

Engineering / representation

Representation

Multiple sparse representations exist, each with its own strengths and weaknesses. Currently, I am considering a representation where each plane, in both the weights and the activations, is stored densely, but not all planes are stored: some are considered to be entirely zeroed out. This has advantages:

  • facilitates implementation
  • can continue to use standard GPU implementations for convolution etc.

On the downside, it's quite non-standard, and might not achieve the same sparsity, in terms of actual storage used, as some more standard representation. I'm a bit concerned too that it will have all of the implementation issues that 0-norms have, i.e. non-differentiability, combinatorial complexity etc :-(
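As a minimal sketch of what such a representation could look like (the names here are hypothetical, not the actual sparseplanar API): planes absent from the table are implicitly all-zero, so storage is only paid for materialized planes.

require 'torch'

local SparsePlanar = {}

function SparsePlanar.new(numPlanes, height, width)
  return { numPlanes = numPlanes, height = height, width = width, planes = {} }
end

-- fetch a plane, materializing it lazily; absent planes are implicitly zero
function SparsePlanar.getPlane(sp, p)
  if sp.planes[p] == nil then
    sp.planes[p] = torch.zeros(sp.height, sp.width)
  end
  return sp.planes[p]
end

local act = SparsePlanar.new(1000000, 28, 28)
SparsePlanar.getPlane(act, 42):fill(1)  -- only plane 42 is actually stored

Since each stored plane is an ordinary dense tensor, it can still be handed to standard GPU convolution routines.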

Implementation

To run the POC:

luarocks make rocks/sparseplanar-scm-1.rockspec  && th test2.lua

Prerequisites:

  • torch must be installed
  • torch must be activated (i.e. source ~/torch/install/bin/torch-activate)

Theory / model

Two things need to be made sparse:

  • the activations
  • the weights

The weights must be sparse too; otherwise they will be massive (I guess?), or at least propagation will need a lot of computation.

For making the activations sparse, we can use a KL-divergence penalty, e.g. http://ufldl.stanford.edu/wiki/index.php/Autoencoders_and_Sparsity.
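For concreteness, here is a sketch of that penalty applied per plane (my own phrasing of the formula from the UFLDL page, not code from this repo; it assumes activations in (0,1), e.g. after a sigmoid):

require 'torch'

-- activations: nPlanes x height x width, values assumed in (0,1)
-- rho: target mean activation, e.g. 0.05
local function klSparsityPenalty(activations, rho)
  local nPlanes = activations:size(1)
  local rhoHat = activations:view(nPlanes, -1):mean(2):squeeze()
  local oneMinusRhoHat = rhoHat:clone():mul(-1):add(1)
  -- KL(rho || rhoHat_j) = rho*log(rho/rhoHat_j) + (1-rho)*log((1-rho)/(1-rhoHat_j))
  local kl = torch.log(rhoHat):mul(-1):add(math.log(rho)):mul(rho)
  kl:add(torch.log(oneMinusRhoHat):mul(-1):add(math.log(1 - rho)):mul(1 - rho))
  return kl:sum()  -- summed over planes; added to the training loss
end

print(klSparsityPenalty(torch.rand(8, 28, 28), 0.05))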

However, this doesn't address how to make the weights sparse. Also, ideally we'd enforce sparseness on a per-plane basis, i.e., once a plane has a single non-zero value, it doesn't matter how many it has. Though, this might run into the difficulties associated with 0-norms, combinatorial complexity etc.
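For illustration, that per-plane notion amounts to counting the planes that contain any non-zero value, i.e. a 0-norm over planes (a hypothetical sketch; as noted, the count itself is non-differentiable, so it could only serve as a measurement, not directly as a training penalty):

require 'torch'

local function nonZeroPlaneCount(activations)
  local count = 0
  for p = 1, activations:size(1) do
    -- one non-zero value costs the same as a full plane of them
    if activations[p]:ne(0):sum() > 0 then
      count = count + 1
    end
  end
  return count
end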
