Neural Turing Machines library in Theano with Lasagne


NTM-Lasagne is a library to create Neural Turing Machines (NTMs) in Theano using the Lasagne library. If you want to learn more about NTMs, check out our blog post.

This library features:

  • A Neural Turing Machine layer, NTMLayer, whose components (controller, heads, memory) are fully customizable.
  • Two types of controllers: a feed-forward DenseController and a "vanilla" recurrent RecurrentController.
  • A dashboard to visualize the inner mechanism of the NTM.
  • Generators to sample examples from algorithmic tasks.
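
To give an idea of what sampling from an algorithmic task looks like, here is a minimal NumPy sketch of a generator for the copy task from the NTM paper. It is only an illustration: sample_copy_task and its parameters are hypothetical names, not this library's generator API, and the exact input layout (one extra channel for the end-of-sequence delimiter) is an assumption.

```python
import numpy as np


def sample_copy_task(batch_size=1, max_length=5, size=8, rng=None):
    """Sample one batch of the copy task (hypothetical helper, not the library API).

    Returns (inputs, targets), both of shape
    (batch_size, 2 * length + 1, size + 1), where length is drawn
    uniformly from 1..max_length.
    """
    rng = np.random.RandomState() if rng is None else rng
    length = rng.randint(1, max_length + 1)

    # Random binary sequence that the network has to remember.
    sequence = rng.binomial(1, 0.5, size=(batch_size, length, size))

    inputs = np.zeros((batch_size, 2 * length + 1, size + 1), dtype=np.float32)
    targets = np.zeros_like(inputs)

    # Presentation phase: the sequence, then a delimiter on the extra channel.
    inputs[:, :length, :size] = sequence
    inputs[:, length, size] = 1.0

    # Recall phase: inputs stay at zero, the target is the sequence itself.
    targets[:, length + 1:, :size] = sequence
    return inputs, targets
```

Calling sample_copy_task(batch_size=2, rng=np.random.RandomState(0)) returns a pair of arrays with identical shapes; the second half of the targets repeats the first half of the inputs.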


This library is compatible with Python 2.7.8, and may be partly compatible with Python 3.x. NTM-Lasagne requires the bleeding-edge versions of Lasagne and Theano. Check the Lasagne installation instructions for details, or install them with pip install -r requirements.txt. To install this library, clone this repository and then run the install script:

git clone
cd ntm-lasagne/
pip install -r requirements.txt
python install


Here is a minimal example that defines an NTMLayer:

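import lasagne
# Module paths below are assumed from the repository layout; see the examples folder.
from ntm.layers import NTMLayer
from ntm.memory import Memory
from ntm.controllers import DenseController
from ntm.heads import WriteHead, ReadHead

# Neural Turing Machine Layer (l_input is a Lasagne input layer defined beforehand)
memory = Memory((128, 20), memory_init=lasagne.init.Constant(1e-6),
    learn_init=False, name='memory')
controller = DenseController(l_input, memory_shape=(128, 20),
    num_units=100, num_reads=1)
heads = [
    WriteHead(controller, num_shifts=3, memory_shape=(128, 20),
        learn_init=False, name='write'),
    ReadHead(controller, num_shifts=3, memory_shape=(128, 20),
        learn_init=False, name='read')
]
l_ntm = NTMLayer(l_input, memory=memory, controller=controller, heads=heads)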
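
To make the role of the heads and of parameters such as memory_shape=(128, 20) and num_shifts=3 more concrete, the following NumPy sketch walks through one addressing step as described in the NTM paper (content addressing, interpolation, circular shift, sharpening). It is a standalone illustration, not this library's Theano implementation; the function name address and its argument names are hypothetical.

```python
import numpy as np


def address(memory, key, beta, gate, shift, gamma, w_prev):
    """One NTM addressing step (illustrative sketch, not the library code).

    memory: (N, M) array, key: (M,) array, shift: (num_shifts,) distribution,
    w_prev: (N,) previous weighting. Returns the new weighting of shape (N,).
    """
    # Content addressing: softmax over cosine similarities scaled by beta.
    eps = 1e-8
    similarity = memory.dot(key) / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    scores = beta * similarity
    w_content = np.exp(scores - scores.max())
    w_content /= w_content.sum()

    # Interpolation between the content weighting and the previous weighting.
    w_gated = gate * w_content + (1.0 - gate) * w_prev

    # Circular convolution: shift[k] is the probability of moving the focus
    # by offsets[k] positions (offsets are -1, 0, +1 when num_shifts is 3).
    num_shifts = len(shift)
    offsets = np.arange(num_shifts) - num_shifts // 2
    w_shifted = sum(s * np.roll(w_gated, o) for s, o in zip(shift, offsets))

    # Sharpening, followed by renormalisation.
    w_sharp = w_shifted ** gamma
    return w_sharp / w_sharp.sum()


rng = np.random.RandomState(0)
memory = rng.randn(128, 20)            # same shape as in the example above
w_prev = np.full(128, 1.0 / 128)       # uniform previous weighting
w = address(memory, key=memory[5], beta=100.0, gate=1.0,
            shift=np.array([0.0, 0.0, 1.0]), gamma=1.0, w_prev=w_prev)
read_vector = w.dot(memory)            # what a read head would return
```

Here the key matches row 5 of the memory and the shift distribution puts all its mass on +1, so the resulting weighting concentrates on row 6.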

For more detailed examples, check the examples folder. If you would like to train a Neural Turing Machine on one of these examples, simply run the corresponding script, like

PYTHONPATH=. python examples/

and be patient while Theano compiles the model ;-). Graph optimization is computationally intensive. If you encounter suspiciously long compilation times (more than a few minutes) while running in a virtual machine, you may need to increase the amount of memory allocated to it. Alternatively, temporarily turning off the swap (with swapoff/swapon) may help for debugging. Note: an unlucky initialisation of the parameters might lead to a diverging solution (NaN scores).
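
One possible way to cope with a diverging run is to watch the loss for NaN and restart from a fresh initialisation. The sketch below shows the idea in plain Python; train_until_finite and make_train_step are hypothetical names, and make_train_step stands for whatever code builds a newly initialised model and returns a function performing one training update.

```python
import math


def train_until_finite(make_train_step, num_iterations, max_restarts=3):
    """Train, restarting with a new initialisation if the loss becomes NaN.

    make_train_step(seed) is a hypothetical factory: it builds a freshly
    initialised model and returns a function that performs one update
    and returns the loss as a number.
    """
    for seed in range(max_restarts + 1):
        train_step = make_train_step(seed)
        losses = []
        for _ in range(num_iterations):
            loss = float(train_step())
            if math.isnan(loss):
                break  # diverged: give up on this initialisation
            losses.append(loss)
        else:
            # The inner loop finished without hitting a NaN.
            return seed, losses
    raise RuntimeError("training diverged for every initialisation tried")
```

The function returns the seed that led to a finite run together with the recorded losses, so the successful initialisation can be reproduced later.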

Alex Graves, Greg Wayne, Ivo Danihelka, Neural Turing Machines, [arXiv]