Doc Fix
Ragav Venkatesan committed Feb 16, 2017
1 parent e70ee3e commit 0ff8f2e
Showing 8 changed files with 90 additions and 54 deletions.
11 changes: 1 addition & 10 deletions docs/source/organization.rst
@@ -47,13 +47,4 @@ The above figure shows a cooked network. The objects that are in gray and are sh
parts of the network. Once cooked, the network is ready for training and testing, all by using
other methods within the network. The network class also has several properties, such as
``layers``, which is a dictionary of the layers that were added to it, and ``params``, which is a
dictionary of all the parameters. All layers and modules contain a property called ``id`` by which
they are referred to.
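
As a rough illustration of these two dictionaries (a sketch that assumes a ``net`` network object
to which layers have already been added; it is not taken from the toolbox source):

.. code-block:: python

    # ``net`` is assumed to be a yann network object with some layers already added.
    print net.layers.keys()   # the layers added so far, keyed by their id
    print net.params.keys()   # the parameters of the network
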
13 changes: 9 additions & 4 deletions docs/source/pantry/tutorials/lenet.rst
@@ -38,9 +38,10 @@ could be added thus,
verbose = verbose
)
Refer to the APIs for more details on the convpool layer.
It is often useful to visualize the filters learnt in a CNN, so we introduce the visualizer module
here along with the CNN tutorial. The visualizer can be set up using the ``add_module`` method of
the ``net`` object.


.. code-block:: python
@@ -66,11 +67,15 @@ where the ``visualizer_params`` is a dictionary of the following format.
``root`` is the location where the visualizations are saved, ``frequency`` is how often (in epochs)
the visualizations are saved, and ``sample_size`` is the number of images saved each time.
``rgb_filters`` makes the filters save in color. Along with the activities of each layer for the
exact same images as the data itself, the filters of the neural network are also saved down.
For more options and parameters of the visualizer, refer to the `visualizer documentation`_.

.. _visualizer documentation: http://yann.readthedocs.io/en/master/yann/modules/visualizer.html
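
As a rough sketch based on the keys described above (the exact keys and default values here are
assumptions and should be checked against the visualizer documentation), ``visualizer_params``
might look like:

.. code-block:: python

    visualizer_params = {
        "root"        : '.',     # where the visualizations are saved
        "frequency"   : 1,       # save every so many epochs
        "sample_size" : 32,      # number of images saved each time
        "rgb_filters" : True,    # save the filters in color
        "id"          : 'main'
    }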

The full code for this tutorial with additional commentary can be found in the file
``pantry.tutorials.lenet.py``. This tutorial runs a LeNet-style CNN.
If you have the toolbox cloned or downloaded, or just the tutorials downloaded, run the code using:

.. automodule:: pantry.tutorials.lenet
   :members:
70 changes: 66 additions & 4 deletions docs/source/pantry/tutorials/mat2yann.rst
@@ -5,14 +5,76 @@ Cooking a matlab dataset for Yann.

By virtue of being here, it is assumed that you have gone through the :ref:`quick_start`.

This tutorial will help you convert a dataset from the matlab workspace to yann. To begin, let us
acquire `Google's Street View House Numbers dataset in Matlab`_ [#]_. Download the three .mat
files from that URL: test_32x32.mat, train_32x32.mat and extra_32x32.mat. Once downloaded, we need
to divide this mat dump of data into training, testing and validation minibatches as used by yann.
This can be accomplished by the steps outlined in the code
``yann\pantry\matlab\make_svhn.m``. This will create data with 500 samples per minibatch, with
56 training batches, 42 testing batches and 28 validation batches.

Once the mat files are set up appropriately, they are ready for yann to load and convert into
yann data. In case your data is not from SVHN, you can open one of the 'batch' files in matlab
to understand how the data is laid out. Typically, the ``x`` variable holds vectorized images, in
this case 500x3072 (500 images per batch, 32*32*3 pixels per image). ``y`` is a vector of integer
labels, going from 0-9 in this case.
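
If you would rather inspect one of these batch files from Python instead of matlab, a quick check
along the following lines works (a sketch; ``batch_0.mat`` is a placeholder for whichever file
``make_svhn.m`` produced):

.. code-block:: python

    import scipy.io

    # 'batch_0.mat' is a placeholder name; substitute one of the generated batch files.
    batch = scipy.io.loadmat('batch_0.mat')
    print batch['x'].shape   # expected (500, 3072): 500 images, 32*32*3 pixels each
    print batch['y'].shape   # integer labels for the same 500 images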

.. rubric:: References

.. [#] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng.
   Reading Digits in Natural Images with Unsupervised Feature Learning. NIPS Workshop
   on Deep Learning and Unsupervised Feature Learning, 2011.
To convert the data into yann, we can use the ``setup_dataset`` class in the
``yann.utils.dataset.py`` file. Simply call the initializer as,

.. code-block:: python

    dataset = setup_dataset(dataset_init_args = data_params,
                            save_directory = save_directory,
                            preprocess_init_args = preprocess_params,
                            verbose = 3 )

where ``data_params`` contains information about the dataset as follows:

.. code-block:: python

    data_params = {
        "source"           : 'matlab',
        # "name"           : 'yann_svhn', # some name.
        "location"         : location,    # some location to load from.
        "height"           : 32,
        "width"            : 32,
        "channels"         : 3,
        "batches2test"     : 42,
        "batches2train"    : 56,
        "batches2validate" : 28,
        "mini_batch_size"  : 500 }

and ``preprocess_params`` contains information on how to process the images as follows:

.. code-block:: python

    preprocess_params = {
        "normalize" : True,
        "ZCA"       : False,
        "grayscale" : False,
        "zero_mean" : False,
    }

``save_directory`` is simply a location to save the yann dataset. Customarily, it is
``save_directory = '_datasets'``.

The full code for this tutorial with additional commentary can be found in the file
``pantry.tutorials.mat2yann.py``.

If you have the toolbox cloned or downloaded, or just the tutorials downloaded, run the code using:

.. automodule:: pantry.tutorials.mat2yann
   :members:

.. autoclass:: yann.utils.dataset.setup_dataset

.. _Google's Street View House Numbers dataset in Matlab: http://ufldl.stanford.edu/housenumbers/

13 changes: 8 additions & 5 deletions docs/source/pantry/tutorials/mlp.rst
@@ -75,14 +75,17 @@ options. A typical ``optimizer setup`` is:
We have now successfully added Polyak momentum with RmsProp back propagation, with some
:math:`L_1` and :math:`L_2` coefficients that will be applied to the layers for which we passed
the argument ``regularize = True``. For more options and parameters of the optimizer, refer to the
`optimizer documentation`_. This optimizer will therefore minimize the following error:

.. _optimizer documentation: http://yann.readthedocs.io/en/master/yann/modules/optimizer.html

.. math::

   e(\bf{w_2,w_1,w_{\sigma}}) = \sigma(d_2(d_1(\bf{x},w_1),w_2)w_{\sigma}) +
   0.0001(\vert w_2 \vert + \vert w_1 \vert + \vert w_{\sigma} \vert) +
   0.0002(\vert\vert w_2 \vert\vert + \vert\vert w_1 \vert\vert +
   \vert\vert w_{\sigma} \vert\vert)

where :math:`e` is the error, :math:`\sigma(.)` is the sigmoid layer and :math:`d_i(.)` is the
ith layer of the network. Once we are done, we can cook, train and test as usual:
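
As a minimal sketch of these three steps (assuming ``net`` is the network assembled above, with an
optimizer, datastream and classifier module already added; the exact arguments used in the
tutorial file may differ):

.. code-block:: python

    net.cook( verbose = verbose )
    net.train( verbose = verbose )
    net.test( verbose = verbose )
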
15 changes: 1 addition & 14 deletions docs/source/setup.rst
@@ -226,27 +226,14 @@ Dependencies for visualization
pip install matplotlib
For visualization of images, yann also uses
`pylearn2 <http://deeplearning.net/software/pylearn2/>`_. Pylearn2 used to be a library that was
supported by the developers of theano, but development on it was stopped due to overheads from
the development of blocks and fuel. I still use pylearn2 only for visualization, so it shouldn't
be affected by the halt in development. It has a setup.py file, which can be used for a pip
install as follows:

.. code-block:: bash

    pip install git+git://github.com/lisa-lab/pylearn2.git

cPickle, gzip and hdf5py
------------------------

Most often, `cPickle` and `gzip` come with the python installation; if not, please install them.
Yann uses these for saving down models and such.

For datasets, at the moment, yann uses cPickle. In the future, yann will migrate to hdf5 for
datasets. We don't use hdf5py at the moment. Install hdf5py by running either,

.. code-block:: bash
1 change: 0 additions & 1 deletion pantry/matlab/make_svhn.m
@@ -53,7 +53,6 @@
throw_away = 420;
batch_size = 500;


data = x (1:length(x) - throw_away,:);
labels = y (1:length(y) - throw_away) - 1; % because labels go from 1-10

7 changes: 0 additions & 7 deletions pantry/tutorials/lenet.py
@@ -1,10 +1,3 @@
"""
TODO:
Something is off with the visualizations of the CNN filters. Need to check what is going on.
"""

from yann.network import network
from yann.utils.graph import draw_network

14 changes: 5 additions & 9 deletions yann/modules/optimizer.py
@@ -56,17 +56,13 @@ class optimizer(module):
    optimizer_params =  {
        "momentum_type"   : <option> 'false' <no momentum>, 'polyak', 'nesterov'.
                            Default value is 'false'
        "momentum_params" : (<option in range [0,1]>, <option in range [0,1]>, <int>)
                            (momentum coefficient at start, at end,
                            at what epoch to end momentum increase)
                            Default is the tuple (0.5, 0.95, 50)
        "optimizer_type"  : <option>, 'sgd', 'adagrad', 'rmsprop', 'adam'.
                            Default is 'sgd'
        "id"              : id of the optimizer
                        }
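
For illustration (not part of ``optimizer.py`` itself), a dictionary following the format above
could be built and attached to a network roughly as below; the values are arbitrary and the
``net`` object with its ``add_module`` method is assumed to already exist:

    optimizer_params = {
        "momentum_type"   : 'polyak',
        "momentum_params" : (0.5, 0.95, 50),  # momentum at start, at end, epoch to stop increasing
        "optimizer_type"  : 'rmsprop',
        "id"              : 'main'
    }
    net.add_module(type = 'optimizer', params = optimizer_params)
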
@@ -95,12 +91,12 @@ def __init__(self, optimizer_init_args, verbose = 1):
if "momentum_type" in optimizer_init_args.keys():
self.momentum_type = optimizer_init_args [ "momentum_type" ]
else:
self.momentum_type = 'polyak'
self.momentum_type = 'false'

if "optimizer_type" in optimizer_init_args.keys():
self.optimizer_type = optimizer_init_args [ "optimizer_type" ]
else:
self.optimizer_type = 'rmsprop'
self.optimizer_type = 'sgd'

if verbose >= 3:
print "... Optimizer is initiliazed"

