Add norm constraints #8

Merged

merged 9 commits into pluskid:master from stokasto:add_constraints on Dec 1, 2014

Conversation

stokasto (Contributor)

This pull request implements a constraints mechanism for the solver, effectively allowing us to implement e.g. norm constraints, which are often used either in conjunction with or instead of L2/L1 regularization on the weights.
I also added an implementation of an L2 norm constraint on the weights, as this is often used in combination with dropout (see e.g. the dropout/maxout papers).
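
For readers who haven't used the technique: a hard L2 (max-norm) constraint projects each weight vector back onto an L2 ball after every update. A minimal Julia sketch, assuming a plain weight matrix `W` and radius `threshold` (illustrative names, not Mocha's actual API):

```julia
# Minimal sketch of a hard L2 (max-norm) weight constraint.
function project_max_norm!(W::Matrix{Float64}, threshold::Float64)
    for j in 1:size(W, 2)               # one column per output unit
        nrm = sqrt(sum(W[:, j] .^ 2))   # L2 norm of the column
        if nrm > threshold
            W[:, j] *= threshold / nrm  # project back onto the ball
        end
    end
    return W
end

# Hypothetical use inside a solver loop: take the gradient step first,
# then enforce the constraint on the updated weights:
#   sgd_update!(W, grad, lr)    # (hypothetical update function)
#   project_max_norm!(W, 3.0)
```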

@coveralls

Coverage increased (+0.11%) when pulling 8716cf7 on stokasto:add_constraints into 77274da on pluskid:master.

[Review thread on a comment block in the MNIST example script:]

# This difference is likely due to slight differences in the
# learning parameters. Also note that our hyperparameters
# are not chosen using a validation set, as one would do
# for a paper.
############################################################
pluskid (Owner)


I just want to add a comment here that the script used to convert the MNIST dataset to HDF5 format does some randomization of the order of the data samples, and I didn't fix the random seed there. So other people might not get exactly the same results if they prepare their HDF5 MNIST dataset separately. A fix might be to simply fix the random seed in the data conversion script.

stokasto (Contributor, Author)


Agreed. I think I'll simply fix the random seed in the conversion script then, as it is quite useful to be able to reproduce exact results. I'll still add a comment here though, since different GPUs/CUDA versions etc. could potentially lead to small changes as well.

pluskid (Owner) commented Dec 1, 2014

@stokasto Thank you very much for this PR! This is a great add-on. I will merge this PR after I take a closer look. But I do want to mention one thing that I dislike: I don't like allocating a new blob and destroying it every iteration. I will look at that after merging. Maybe we could either eliminate those temp blobs or, if that is possible, make norm constraints work like data-transformers: they would have a state where they could store and re-use temp blobs.
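
A rough sketch of that idea (all names hypothetical; this is not the merged code): the constraint owns its scratch buffer the way a data-transformer owns its state, so the allocation happens once at setup time rather than once per iteration:

```julia
# Hypothetical stateful constraint: the scratch vector is allocated exactly
# once, outside the solver loop, and reused on every call.
mutable struct L2ConstraintState
    threshold :: Float64
    col_norms :: Vector{Float64}   # persistent scratch, one entry per column
end

setup_constraint(threshold::Float64, n_cols::Int) =
    L2ConstraintState(threshold, zeros(n_cols))   # one-time allocation

function constrain!(st::L2ConstraintState, W::Matrix{Float64})
    for j in 1:size(W, 2)
        st.col_norms[j] = sqrt(sum(W[:, j] .^ 2))  # norms land in the reused buffer
    end
    for j in 1:size(W, 2)
        if st.col_norms[j] > st.threshold
            W[:, j] *= st.threshold / st.col_norms[j]
        end
    end
    return W
end
```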

stokasto (Contributor, Author) commented Dec 1, 2014

@pluskid yes, I am not very fond of the blob allocation in each iteration either :).
However, as I wrote in the code, simply allocating a blob statically for each weight blob beforehand would double the memory footprint of the model! If we (hopefully soon) try larger models (e.g. AlexNet, the VGG nets, etc.) this will be a big problem.
I quickly added the cublasSnrm2_v2 function and computed the norm inside the for loop for each column, but that turned out to be quite slow for some reason. Do you have any other idea how to solve this better?
EDIT: One option would of course be to write a separate kernel for doing the normalization, since in principle it does not have to use additional memory.
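
For illustration, a CPU-side Julia sketch of what such a fused kernel would compute: two passes per column, a reduction for the squared norm followed by an in-place rescale, so no memory beyond the weights themselves is touched. On a GPU this could map to e.g. one thread block per column; all names here are illustrative:

```julia
# CPU sketch of an allocation-free column renormalization, the work a
# dedicated CUDA kernel could do without any temporary blob.
function renorm_columns!(W::Matrix{Float32}, threshold::Float32)
    rows, cols = size(W)
    for j in 1:cols
        s = 0.0f0
        for i in 1:rows          # pass 1: reduce to the squared column norm
            s += W[i, j] * W[i, j]
        end
        nrm = sqrt(s)
        if nrm > threshold
            scale = threshold / nrm
            for i in 1:rows      # pass 2: rescale the column in place
                W[i, j] *= scale
            end
        end
    end
    return W
end
```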

stokasto (Contributor, Author) commented Dec 1, 2014

I am letting the example run again with the fixed random seed and will then adapt the description in the script.

@coveralls

Coverage decreased (-0.01%) when pulling 78d6250 on stokasto:add_constraints into 77274da on pluskid:master.

stokasto (Contributor, Author) commented Dec 1, 2014

OK, I changed the script to reflect the new behavior; should be good to go!

@coveralls

Coverage decreased (-0.1%) when pulling cb3ffaf on stokasto:add_constraints into 77274da on pluskid:master.

pluskid (Owner) commented Dec 1, 2014

@stokasto I'm merging this PR.

Could you please isolate the code you used to benchmark the different ways of implementing the norm constraint in CUDA and put it in the benchmarks directory? I will try to look at it when I have time and see if using a custom CUDA kernel would be better. This is also related to the current implementation of the L2 norm regularizer, which is not very meaningful because the whole parameter blob is treated as a single vector.

pluskid merged commit 74dae49 into pluskid:master on Dec 1, 2014
pluskid (Owner) commented Dec 1, 2014

@stokasto BTW: thanks for fixing the data conversion script. The reproducibility could also act as a regression test for Mocha, making sure we do not break things when new stuff is introduced.
