
The Theano-based convis package

This Python package provides an implementation of the Virtual Retina model developed by Adrien Wohrer. It uses Theano to simulate spike trains of retinal ganglion cells by directing the input through a number of computation nodes. Each node can perform linear or nonlinear computations, e.g. convolving the input with a spatio-temporal kernel or applying gain control.

⚠️ A new version is available at github.com/jahuth/convis that uses PyTorch as a backend.

This Theano version will not be actively developed in the future, since the implementation in PyTorch is cleaner and simpler. If you are interested in a continued version of convis based on Theano, let me know!


This version only supports Python 2.7 right now.

Convis is under development and some features might not work in the current master branch or the PyPI releases. If you discover unexpected behaviour, please open an issue on github.

There are also two mailing lists for people interested in Convis (for both the Theano and PyTorch versions):

Usage Example:

import numpy as np
from numpy.random import rand
from matplotlib.pyplot import plot
import convis

ret = convis.retina.Retina()
inp = np.zeros((200,50,50))
inp[:,20:30,20:30] = 255.0*(rand(*inp[:,20:30,20:30].shape)<0.2)  # sparse random flicker in a central patch
out = ret.run(inp)

plot(np.mean(out[0],(1,2))) # On cells
plot(np.mean(out[1],(1,2))) # Off cells

An older version was published as the retina package

Installation

Installing convis and Theano is not complicated. Both can be installed with pip, either from the PyPI releases or directly from github. Requirements for the base installation are: Python 2.7 or Python 3.5, NumPy, SciPy and a BLAS library (either OpenBLAS (the openblas-dev package) or MKL).

pip install convis

or install the latest version from github:

pip install git+https://github.com/jahuth/convis.git

or clone the repository and install it locally:

git clone https://github.com/jahuth/convis.git
# change something in the source code
pip install -e convis

I recommend installing OpenCV and the Jupyter notebook, if you do not already have them installed:

pip install convis notebook
# e.g. for Ubuntu:
sudo apt-get install python-opencv

This installation will run, but might lack GPU support.

The GPU support in the current version of Theano uses pygpuarray. The recommendation of the Theano team is to use conda (from Anaconda or Miniconda) to install pygpuarray and its binary dependencies (see the pygpuarray website). You will have to install a graphics card driver, CUDA and cuDNN in appropriate versions before installing pygpuarray.

# optional: you can create a conda environment for certain versions of the dependencies:
conda create -n convis_conda python=2.7
source activate convis_conda
conda install Theano notebook opencv
pip install convis

A workaround, if you do not want to use conda, is to either compile pygpuarray yourself or to use an older Theano version (e.g. Theano==0.7).

Introduction

convis provides a retina model which is identical to VirtualRetina, as well as tools to create simpler or more complex models.

A preprint of a paper about convis is available on bioRxiv: "Convis: A Toolbox To Fit and Simulate Filter-based Models of Early Visual Processing".

The Retina model

A description of all parameters for the retina model can be obtained directly from an instantiated model. This description contains doc strings for each parameter.

import convis
retina = convis.retina.Retina()
convis.describe(retina.parameters)

Here is a graph of the model:

To use the model, supply a numpy array to the run function (for short input) or the run_in_chunks function:

inp = np.ones((100,20,20))
output = retina.run(inp)

inp = np.ones((10000,20,20))
output = retina.run_in_chunks(inp,200)

It will return an object containing all outputs of the model (by default the retina has two outputs: spikes of On and Off cells).

convis.describe(output[0]) # will show information about this output

If instead of spikes, only the firing rate should be returned, the retina can be initialized without a spiking mechanism:

retina_without_spikes = convis.retina.Retina(ganglion_spikes = False)

Examples

The examples folder contains Jupyter notebooks covering different use cases and examples.

You can view them directly on github:

Defining a Model

model = convis.Model()
A,B,C,D,E,F,G = convis.E('A'),convis.E('B'),convis.E('C'),convis.E('D'),convis.E('E'),convis.E('F'),convis.E('G')
convis.connect([A,B,[[C,D],[E,F]],G])
#or:
B.add_input(A)
C.add_input(B)
D.add_input(C)
E.add_input(B)
F.add_input(E)
G.add_input(D)
G.add_input(F)
model.outputs.append(G)

Filter Nodes

convis models can be created from predefined filter nodes, each of which manages its own parameters and state. A filter node is added to the model if it is part of any graph that leads to the model's outputs, e.g. if its output is itself an output of the model.

import convis

M = convis.Model()
E = convis.ConvolutionFilter1d({'size':10},'E',M) # 1d filter in time
M.outputs.append(E.graph)

print E.parameters.kernel is M.parameters.E.kernel # references the same shared parameter

convis.describe can be used to inspect filter nodes, variables or models:

convis.describe(E.parameters.kernel)

In iPython notebooks this results in HTML output, including a "best guess" visualization of tensor contents:

parameter: kernel
(…)
E_kernel
value: Numpy array (10,)
---

E.parameters.kernel is a "shared parameter", which means that to Theano it is a shared variable that is accessible to the computational graph as well as to the Python interface. Its value can be accessed with get_value() and set_value(..). When accessed through the parameter list, values can also be assigned like this:

E.parameters.kernel = 1.0/np.arange(1,11)  # start at 1 to avoid a division by zero
# shorthand for E.parameters.kernel.set_value(1.0/np.arange(1,11))

Additions to the theano architecture

The toolbox is based on theano but provides added convenience by labeling variables and subgraphs. A subgraph - defined by a single output - can be wrapped in a convis.base.GraphWrapper or convis.base.N.

A GraphWrapper will explore the graph from the output backwards and label all variables as belonging to this subgraph. This can be done hierarchically and each variable tracks its depth in a path attribute, which gets longer each time it is wrapped in a subgraph.

A Layer node is a GraphWrapper which also manages the configuration of its parameters.
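To make the backwards exploration concrete, here is a minimal plain-Theano sketch (illustrative only, not the actual GraphWrapper code) that walks a subgraph from its single output back to its inputs; these are the variables a GraphWrapper would label and whose path attribute it would extend:

# Illustrative sketch only: plain Theano, not the convis implementation.
import theano.tensor as T

x = T.dmatrix('x')
y = ((x * 2) + 1).sum()      # a small subgraph with a single output

def walk(variable, visited=None):
    # collect every variable reachable from `variable` by going backwards
    if visited is None:
        visited = set()
    if variable in visited:
        return visited
    visited.add(variable)
    if variable.owner is not None:          # owner is the Apply node that produced it
        for inp in variable.owner.inputs:   # recurse into that node's inputs
            walk(inp, visited)
    return visited

for v in walk(y):
    print v, type(v)    # GraphWrapper would attach its labels to these variables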

Variable types

Labeled variables are classified into different types:

  • inputs
  • states and out_states
  • parameters (shared parameters and input parameters)
  • outputs
  • other variables

Parameters

The most useful addition is the labeling of parameters. Instead of supplying the run function created by the model with all of the variable values that are part of the computation, the "parameters" of the model can be saved, loaded, inspected or optimized, and they are supplied to the theano function automatically. The difference between an input and a parameter is artificial but helpful: the parameters define the overall behavior of the model, while the input itself has no relation to the model.

Parameters can be shared or normal theano variables. If a parameter is not shared, the model will automatically add it to the inputs of the compiled function.

Parameters can have special attributes attached that define default values and how to interpret configuration values.
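As a minimal plain-Theano sketch of this distinction (illustrative only, not the convis machinery): a shared parameter lives inside the compiled function and can be changed with set_value, while a non-shared parameter has to be passed as an extra argument on every call.

# Plain-Theano sketch of shared vs. non-shared parameters (not convis code).
import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')                               # the actual input
gain_shared = theano.shared(2.0, name='gain')    # a "shared parameter"
gain_input = T.dscalar('gain_input')             # an "input parameter"

f_shared = theano.function([x], x * gain_shared)
f_input = theano.function([x, gain_input], x * gain_input)

data = np.ones(3)
print f_shared(data)          # uses the current value of the shared variable
gain_shared.set_value(5.0)    # change the parameter without recompiling
print f_shared(data)
print f_input(data, 5.0)      # the non-shared parameter is supplied on each call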

States

Similar to parameters, states are internal variables of the model. Since some computations have temporal memory, some slices of the input or the last values of a recursion need to be remembered such that the computation can be interrupted and resumed at any point without changing the result.

This is done by defining in-states and out-states: the in-state is the state between now and the previous computation, the out-state the state between now and the next computation.

Again there are two ways to implement this: as shared or as input/output variables. For a shared state variable the out-state is used as an update expression. For a non-shared variable, the last state is supplied as an input to the function and the new state is requested as an output.
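A minimal plain-Theano sketch of the shared-state variant (illustrative names, not convis code): the out-state is passed as an update expression, so the loop can be stopped and resumed at any point without changing the result.

# Plain-Theano sketch of a shared state whose out-state is an update expression.
import numpy as np
import theano
import theano.tensor as T

x = T.dscalar('x')
state = theano.shared(0.0, name='state')   # the in-state
new_state = 0.9 * state + x                # the out-state of a leaky accumulator
step = theano.function([x], new_state,
                       updates={state: new_state})  # out-state becomes the next in-state

for value in np.ones(5):
    print step(value)    # interrupting and resuming this loop gives the same values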

Inputs and Outputs

Inputs and outputs work on the level of Layer nodes to connect subgraphs together. They have no special attributes.

Other Variables

It is possible to label variables without assigning a type. These variables can be used for visualization or to add their value as an output. Their only important attributes are their name and their html_name (for prettier typesetting).

How Models and Layers compile a function

In plain theano, a function can be created from a list of inputs and a desired output. If there is a tree* with all inputs as leaves and the output as root, the tree will be optimized and a function will be compiled and wrapped that computes the output when supplied with the matching number of numerical arguments.

%pylab inline
import convis
M = convis.Model()
E = convis.ConvolutionFilter1d({'kernel': 1.0/np.arange(1,10)},'E',M) # 1d filter in time
M.outputs.append(E.graph)

inp = np.random.rand(1000,50,50)
f = convis.theano.function(inputs=[E.variables.input], outputs=[E.graph])
print inp.shape, f(inp)[0].shape # f returns a list of outputs

* The tree can have converging branches, which is still a tree if you duplicate the branch; but since the result should be immutable, the branch has to be computed only once.

If the input is larger than the available memory on the device, the input has to be chunked up into smaller pieces. The Layer and Model classes make this process easier by managing states, splitting input and combining outputs.
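Conceptually, chunked processing looks like the sketch below (illustrative only; it assumes the first output behaves like a numpy array, as in the plotting examples above, and it ignores the state handling at the chunk boundaries that run_in_chunks takes care of):

# Conceptual sketch of chunked processing; prefer retina.run_in_chunks in practice.
import numpy as np

chunk_size = 200
long_input = np.ones((1000, 20, 20))

chunks_out = []
for start in range(0, long_input.shape[0], chunk_size):
    chunk = long_input[start:start + chunk_size]
    out = retina.run(chunk)                  # `retina` as created in "The Retina model" section
    chunks_out.append(np.asarray(out[0]))    # first output (On cells), assumed array-like
full_output = np.concatenate(chunks_out, axis=0)

In practice, retina.run_in_chunks(long_input, chunk_size) should be preferred over such a manual loop.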

Automatic Differentiation and Optimization

As an example, let's create a ground truth and try to find the parameter tau=0.00003:

%pylab inline
import convis, theano
import theano.tensor as T
from numpy.random import randint
m = convis.Model()
E1 = convis.filters.simple.E_1d_recursive_filter({'tau__sec':0.00003},name="E1",model=m)
m.outputs.append(E1.graph)
m.create_function()
the_input = np.zeros((200,40,40))
for i in range(20):
    the_input[randint(the_input.shape[0]),20,20] = 1.0
o_goal = m.run(the_input)
plot(o_goal[0].mean((1,2)))

In three expressions, an optimization can be defined:

  • An error term: e.g. the squared difference of the means: error_term = (some_output.mean((1,2))-some_goal.mean((1,2)))**2
  • A gradient: e.g. the gradient of the error with respect to the parameter tau
  • An update rule: e.g. simple gradient descent: tau_new = tau_old - learning_rate * gradient
tau = 0.00001
m2 = convis.Model()
E2 = convis.filters.simple.E_1d_recursive_filter(config={'tau__sec':tau},name="E2",model=m2)
m2.outputs.append(E2.graph)
error_term = (E2.graph.mean((1,2))-o_goal[0].mean((1,2)))**2
grad_term = theano.grad(T.mean(error_term),E2.parameters.tau)

# to be able to visualize the error and gradients we can add them  as outputs
m2.outputs.append(error_term)
m2.outputs.append(grad_term)

m2.create_function(updates = {E2.parameters.tau: E2.parameters.tau - 1.0 * grad_term})
colors = cm.gnuplot2(np.linspace(0.1,0.9,50))
taus = []
for trial in range(50):
    m2.clear_states()
    o_fit = m2.run(the_input)
    subplot(211)
    plot(o_fit[0].mean((1,2)),color=colors[trial])
    subplot(212)
    plot(o_fit[1],color=colors[trial])
    taus.append(E2.parameters.tau.get_value())
subplot(211)
plot(o_goal[0].mean((1,2)),'r',lw=2)

figure()
plot(taus)

Since the error gets smaller and smaller, the gradient steps shrink as well, so convergence slows down and the actual value is never quite reached. A better result can be obtained by using only the direction of the gradient:

{E2.parameters.tau: E2.parameters.tau - 0.0005 * T.sgn(grad_term)}
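Plugged into the same kind of create_function call as above (an illustrative variant of the earlier update rule, not taken from the original text), this would read:

m2.create_function(updates = {E2.parameters.tau: E2.parameters.tau - 0.0005 * T.sgn(grad_term)})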

Or a momentum-based approach:

tau = 0.00001
m2 = convis.Model()
E2 = convis.filters.simple.E_1d_recursive_filter(config={'tau__sec':tau},name="E2",model=m2)
m2.outputs.append(E2.graph)
error_term = (E2.graph.mean((1,2))-o_goal[0].mean((1,2)))**2
grad_term = theano.grad(T.mean(error_term),E2.parameters.tau)

# defining a new shared variable that can dynamically change is easy:
speed = theano.shared(0.0)

m2.outputs.append(error_term)
m2.outputs.append(grad_term)

# the update dictionary now contains two equations:
m2.create_function(updates = {E2.parameters.tau: E2.parameters.tau - speed,
                              speed: 0.8*speed + grad_term})
colors = cm.gnuplot2(np.linspace(0.1,0.9,50))
taus = []
speeds = []
for trial in range(50):
    m2.clear_states()
    o_fit = m2.run(the_input)
    subplot(211)
    plot(o_fit[0].mean((1,2)),color=colors[trial])
    subplot(212)
    plot(o_fit[1],color=colors[trial])
    taus.append(E2.parameters.tau.get_value())
    speeds.append(speed.get_value())
subplot(211)
plot(o_goal[0].mean((1,2)),'r',lw=2)

figure()
plot(taus,label='tau')
plot(speeds,label='speed')
legend(loc='upper left')
