
Commit

Merge pull request #111 from bartvm/docs
Installation instructions and quick-start tutorial
bartvm committed Jan 18, 2015
2 parents 9656c02 + fa35111 commit 2ae9fb8
Showing 13 changed files with 235 additions and 82 deletions.
20 changes: 13 additions & 7 deletions README.rst
@@ -6,19 +6,25 @@

.. image:: https://readthedocs.org/projects/blocks/badge/?version=latest
:target: https://blocks.readthedocs.org/
:alt: Documentation Status

Blocks
======
Blocks is a framework that helps you build neural network models on top of
Theano. Currently it supports and provides:

Bricks and blocks are Theano functions with parameters. Furthermore, the
plan is to support:
* Constructing parametrized Theano operations, called "bricks"
* Pattern matching to select variables and bricks in large models
* A pipeline for loading and iterating over training data
* Algorithms to optimize your model
* Automatic creation of monitoring channels (*limited support*)
* Application of graph transformations, such as dropout (*limited support*)

* Lazy initialization
In the future we also hope to support:

* Saving and resuming of training
* Monitoring and analyzing values during training progress (on the training set
as well as on test sets)
* Dimension, type and axes-checking
* Automatic creation of monitoring channels
* Easy pattern matching to select the bricks you want in large graphs
* Application of graph transformations, such as dropout

Please see the documentation_ for more information.

2 changes: 2 additions & 0 deletions blocks/bricks/__init__.py
@@ -1059,6 +1059,8 @@ def apply(self, input_):
Tanh = _activation_factory('Tanh', tensor.tanh)
Sigmoid = _activation_factory('Sigmoid', tensor.nnet.sigmoid)
Softmax = _activation_factory('Softmax', tensor.nnet.softmax)
Rectifier = _activation_factory('Rectifier',
                                lambda x: tensor.switch(x > 0, x, 0))


class Sequence(Brick):
2 changes: 1 addition & 1 deletion blocks/bricks/sequence_generators.py
@@ -64,7 +64,7 @@ class BaseSequenceGenerator(Initializable):
| A scheme of the algorithm described above follows.
.. image:: sequence_generator_scheme.png
.. image:: _static/sequence_generator_scheme.png
:height: 500px
:width: 500px
4 changes: 0 additions & 4 deletions docs/_static/.gitignore

This file was deleted.

Binary file added docs/_static/mnist.png
File renamed without changes
2 changes: 0 additions & 2 deletions docs/blocks.rst → docs/bricks.rst
@@ -1,5 +1,3 @@
.. _bricks:

Bricks
======

14 changes: 3 additions & 11 deletions docs/getting_started.rst → docs/bricks_overview.rst
@@ -1,5 +1,5 @@
Getting started
===============
Bricks
======

Blocks is a framework that is supposed to make it easier to build complicated
neural network models on top of Theano_. In order to do so, we introduce the
@@ -104,7 +104,7 @@ explicitly. Consider the following example:
>>> linear3.params
[W, b]

Nested blocks
Nested bricks
-------------

Many neural network models, especially more complex ones, can be considered
@@ -174,11 +174,3 @@ bricks' configuration.

.. _machine translation models: http://arxiv.org/abs/1409.0473
.. _here: :class:`blocks.bricks.Brick`

Examples
--------

You can find examples using the Groundhog main loop in the folder
``blocks/groundhog/examples``. Case studies of language modeling, Markov
chains and sinewave generation are available. They are planned to be replaced
by PyLearn2 based examples in the near future.
43 changes: 27 additions & 16 deletions docs/index.rst
@@ -6,41 +6,52 @@

.. image:: https://readthedocs.org/projects/blocks/badge/?version=latest
:target: https://blocks.readthedocs.org/
:alt: Documentation Status

|
Welcome to Blocks' documentation!
==================================

Blocks is a framework that helps you build neural network models on top of
Theano. It also helps you manage your model by doing error-checking, creating
monitoring channels, and allowing for easy configuration of your model. Features
include:
Theano. Currently it supports and provides:

* Dimension, type and axes-checking
* Automatic creation of monitoring channels
* Easy pattern matching to select the bricks you want in large graphs
* Lazy initialization of bricks
* Application of graph transformations, such as dropout
* Constructing parametrized Theano operations, called "bricks"
* Pattern matching to select variables and bricks in large models
* A pipeline for loading and iterating over training data
* Algorithms to optimize your model
* Automatic creation of monitoring channels (*limited support*)
* Application of graph transformations, such as dropout (*limited support*)

Table of contents
-----------------
In the future we also hope to support:

* Saving and resuming of training
* Monitoring and analyzing values during training progress (on the training set
as well as on test sets)
* Dimension, type and axes-checking

Getting started
---------------
.. toctree::
setup
quickstart

getting_started
In-depth
--------
.. toctree::
bricks_overview
configuration
blocks
developer_guidelines

API Reference
-------------
.. toctree::
bricks
initialization
datasets
utils
serialization
graph
developer_guidelines

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
164 changes: 164 additions & 0 deletions docs/quickstart.rst
@@ -0,0 +1,164 @@
Quickstart
==========

In this quick-start tutorial we will use the Blocks framework to train a
`multilayer perceptron`_ (MLP) to perform handwriting recognition on the `MNIST
handwritten digit database`_.

The Task
--------
MNIST is a dataset which consists of 70,000 handwritten digits. Each digit is a
grayscale image of 28 by 28 pixels. Our task is to classify each of the images
into one of the 10 categories representing the numbers from 0 to 9.

.. figure:: _static/mnist.png
:align: center

Sample MNIST digits

The Model
---------
We will train a simple MLP with a single hidden layer that uses the rectifier_
activation function. Our output layer will consist of a softmax_ function with
10 units, one for each class. Mathematically speaking, our model is parametrized
by the weight matrices :math:`\mathbf{W}_h` and :math:`\mathbf{W}_y`, and bias
vectors :math:`\mathbf{b}_h` and :math:`\mathbf{b}_y`. The rectifier activation
function is defined as

.. math:: \mathrm{ReLU}(\mathbf{x})_i = \max(0, \mathbf{x}_i)

and our softmax output function is defined as

.. math:: \mathrm{softmax}(\mathbf{x})_i = \frac{e^{\mathbf{x}_i}}{\sum_{j=1}^n e^{\mathbf{x}_j}}

Hence, our complete model is

.. math:: f(\mathbf{x}) = \mathrm{softmax}(\mathbf{W}_y\mathrm{ReLU}(\mathbf{W}_h\mathbf{x} + \mathbf{b}_h) + \mathbf{b}_y)

Since the output of a softmax represents a categorical probability distribution,
we can consider :math:`f(\mathbf{x}) = \hat p(\mathbf{y} \mid \mathbf{x})`,
where :math:`\mathbf{x}` is the 784-dimensional (28 × 28) input and
:math:`\mathbf{y}` the class it belongs to, with labels
:math:`i = 0,\dots,9`. We can train the parameters of our model by minimizing
the negative log-likelihood, i.e. the categorical cross-entropy between our
model's output and the target distribution. That is, we minimize the sum of

.. math:: - \sum_{i=0}^{9} p(\mathbf{y} = i) \log \hat p(\mathbf{y} = i \mid \mathbf{x})

over all examples. We do so by using `stochastic gradient descent`_ (SGD) on
mini-batches.
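
To make the arithmetic concrete, here is a small NumPy sketch of the forward
pass and cost for a single example. This is purely illustrative and not part of
the Blocks API; the random weights and the target label ``3`` are made up for
the example.

>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> W_h = 0.01 * rng.randn(100, 784); b_h = np.zeros(100)  # hidden layer
>>> W_y = 0.01 * rng.randn(10, 100); b_y = np.zeros(10)    # output layer
>>> x_example = rng.rand(784)
>>> h = np.maximum(0, W_h.dot(x_example) + b_h)            # ReLU
>>> activation = W_y.dot(h) + b_y
>>> p_hat = np.exp(activation) / np.exp(activation).sum()  # softmax, sums to 1
>>> cost = -np.log(p_hat[3])  # cross-entropy when the true digit is 3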

Building the model
------------------
Constructing the model with Blocks is very simple. We start by defining the
input variable using Theano.

.. tip::
Want to follow along with the Python code? If you are using IPython, enable
the `doctest mode`_ using the special ``%doctest_mode`` command so that you
can copy-paste the examples below (including the ``>>>`` prompts) straight
into the IPython interpreter.

>>> from theano import tensor
>>> x = tensor.matrix('features')

Note that we picked the name ``'features'`` for our input. This is important,
because the name needs to match the name of the data source we want to train on.
MNIST defines two data sources: ``'features'`` and ``'targets'``.

For the sake of this tutorial, we will go through building an MLP the long way.
For a much quicker way, skip right to the end of this section. We begin with
applying the linear transformations and activations.

>>> from blocks.bricks import Linear, Rectifier, Softmax
>>> input_to_hidden = Linear(name='input_to_hidden', input_dim=784, output_dim=100)
>>> h = Rectifier().apply(input_to_hidden.apply(x))
>>> hidden_to_output = Linear(name='hidden_to_output', input_dim=100, output_dim=10)
>>> y_hat = Softmax().apply(hidden_to_output.apply(h))

Blocks uses "bricks" to build models. Bricks are parametrized Theano ops. What
this means is that we start by initializing them with certain parameters, e.g.
``input_dim``. After initialization we can apply our bricks to Theano variables
to build the model we want.
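
The parameters of a brick also need numerical initial values before training
can do anything useful. A minimal sketch of how this could look, assuming the
``blocks.initialization`` module provides ``IsotropicGaussian`` and
``Constant`` schemes (see the API reference for the exact names):

>>> from blocks.initialization import IsotropicGaussian, Constant  # doctest: +SKIP
>>> input_to_hidden.weights_init = IsotropicGaussian(0.01)  # doctest: +SKIP
>>> input_to_hidden.biases_init = Constant(0)  # doctest: +SKIP
>>> input_to_hidden.initialize()  # doctest: +SKIP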

Now that we have built our model, let's define the cost to minimize. For this,
we will need the Theano variable representing the target labels.

>>> y = tensor.lmatrix('targets')
>>> from blocks.bricks.cost import CategoricalCrossEntropy
>>> cost = CategoricalCrossEntropy().apply(y.flatten(), y_hat)

That's it! But creating a simple MLP this way is rather cumbersome. In practice,
we would have simply used the :class:`~blocks.bricks.MLP` class.

>>> from blocks.bricks import MLP
>>> mlp = MLP(activations=[Rectifier(), Softmax()], dims=[784, 100, 10]).apply(x)
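
If you take this shortcut, the cost is constructed in exactly the same way,
only applied to the output of the ``MLP`` brick:

>>> cost = CategoricalCrossEntropy().apply(y.flatten(), mlp)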

Training your model
-------------------
Besides helping you build models, Blocks also provides the other main
components needed to train a model. It has a set of training algorithms (like
SGD), an interface to datasets, and a training loop that allows you to monitor
and control the training process.

We want to train our model on the training set of MNIST.

>>> from blocks.datasets.mnist import MNIST
>>> mnist = MNIST("train")
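
As a quick sanity check, the dataset can tell you how many examples it holds;
the ``sources`` attribute shown below is an assumption borrowed from later
versions of the dataset interface, so treat this as a sketch:

>>> mnist.num_examples  # doctest: +SKIP
60000
>>> mnist.sources  # doctest: +SKIP
('features', 'targets')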

Datasets only provide an interface to the data. For actual training, we will
need to iterate over the data in minibatches. This is done by initiating a data
stream which makes use of a particular iteration scheme. We will use an
iteration scheme that iterates over our MNIST examples sequentially in batches
of size 256.

>>> from blocks.datasets import DataStream
>>> from blocks.datasets.schemes import SequentialScheme
>>> data_stream = DataStream(mnist, iteration_scheme=SequentialScheme(
... num_examples=mnist.num_examples, batch_size=256))
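
If you want to peek at the stream before training, you could draw a single
batch by hand. The ``get_epoch_iterator`` method is an assumption based on
later versions of this interface:

>>> features_batch, targets_batch = next(data_stream.get_epoch_iterator())  # doctest: +SKIP
>>> features_batch.shape  # doctest: +SKIP
(256, 784)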

As our algorithm we will use straightforward SGD with a fixed learning rate.

>>> from blocks.algorithms import GradientDescent, SteepestDescent
>>> algorithm = GradientDescent(cost=cost, step_rule=SteepestDescent(learning_rate=0.1))

That's all we need! We can use the :class:`~blocks.main_loop.MainLoop` to
combine all the different pieces. Let's train our model for a single epoch and
print the progress to see how it works.

>>> from blocks.main_loop import MainLoop
>>> from blocks.extensions import FinishAfter, Printing
>>> main_loop = MainLoop(model=mlp, data_stream=data_stream, algorithm=algorithm,
... extensions=[FinishAfter(after_n_epochs=1), Printing()])
>>> main_loop.run() # doctest: +SKIP
-------------------------------------------------------------------------------
BEFORE FIRST EPOCH
-------------------------------------------------------------------------------
Training status:
iterations_done: 0
epochs_done: 0
Log records from the iteration 0:
-------------------------------------------------------------------------------
AFTER ANOTHER EPOCH
-------------------------------------------------------------------------------
Training status:
iterations_done: 235
epochs_done: 1
Log records from the iteration 235:
training_finish_requested: True
-------------------------------------------------------------------------------
TRAINING HAS BEEN FINISHED:
-------------------------------------------------------------------------------
Training status:
iterations_done: 235
epochs_done: 1
Log records from the iteration 235:
training_finish_requested: True
training_finished: True

.. _multilayer perceptron: https://en.wikipedia.org/wiki/Multilayer_perceptron
.. _MNIST handwritten digit database: http://yann.lecun.com/exdb/mnist/
.. _rectifier: https://en.wikipedia.org/wiki/Rectifier_%28neural_networks%29
.. _softmax: https://en.wikipedia.org/wiki/Softmax
.. _stochastic gradient descent: https://en.wikipedia.org/wiki/Stochastic_gradient_descent
.. _doctest mode: http://ipython.org/ipython-doc/dev/interactive/tips.html#run-doctests
25 changes: 25 additions & 0 deletions docs/setup.rst
@@ -0,0 +1,25 @@
Installation
============
The easiest way to install Blocks is using the Python package manager ``pip``.
Blocks isn't listed yet on the Python Package Index (PyPI), so you will have to
grab it directly from GitHub.

.. code-block:: bash

   pip install --upgrade --no-deps git+git://github.com/bartvm/blocks.git --user

If you have administrative rights, remove ``--user`` to install the package
system-wide. The ``--no-deps`` flag is there to make sure that ``pip`` doesn't
try to update NumPy and SciPy, possibly overwriting the optimised version on
your system with a newer but slower version.
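
For example, a system-wide installation (without ``--user``) would be:

.. code-block:: bash

   pip install --upgrade --no-deps git+git://github.com/bartvm/blocks.git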

If you want to update Blocks, simply repeat the command above to pull the latest
version from GitHub.

Requirements
------------
Blocks' only requirements are Theano and six. We develop using the bleeding-edge
version of Theano, so be sure to follow the `relevant installation
instructions`_ to make sure that your Theano version is up to date.

.. _relevant installation instructions: http://deeplearning.net/software/theano/install.html#bleeding-edge-install-instructions
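
For reference, installing bleeding-edge Theano typically follows the same
pattern as the Blocks command above; the exact command below is an assumption,
so defer to the linked instructions:

.. code-block:: bash

   pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user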
34 changes: 0 additions & 34 deletions examples/mnist.py

This file was deleted.
