# Chainer for Theano Users

As we mentioned [here](https://chainer.org/general/2017/09/29/thank-you-theano.html), development of Theano will stop in a few weeks. Many aspects of Chainer were inspired by Theano's clean interface design, so we would like to introduce Chainer here by comparing it with Theano. We hope this article helps Theano users move to Chainer quickly.

In this post, we assume that the modules below have been imported.

In [3]:
import numpy as np

In [1]:
import theano
import theano.tensor as T

In [50]:
import chainer
import chainer.functions as F
import chainer.links as L

## Define a parametric function

A neural network is basically composed of many parametric functions and activation functions, which are commonly called "layers". Let's look at how creating a new parametric function differs between Theano and Chainer. To show how to do the same thing in the two libraries, this example defines a 2D convolution function. Note that Chainer already provides `chainer.links.Convolution2D`, so in practice you don't need to write the code below to use 2D convolution as a building block of a network.

### Theano:

In [42]:
class TheanoConvolutionLayer(object):
    
    def __init__(self, input, filter_shape, image_shape):
        # Prepare initial values of the parameter W
        spatial_dim = np.prod(filter_shape[2:])
        fan_in = filter_shape[1] * spatial_dim
        fan_out = filter_shape[0] * spatial_dim
        scale = np.sqrt(3. / fan_in)
        
        # Create the parameter W
        W_init = np.random.uniform(-scale, scale, filter_shape)
        self.W = theano.shared(W_init.astype(np.float32), borrow=True)

        # Create the parameter b
        b_init = np.zeros((filter_shape[0],))
        self.b = theano.shared(b_init.astype(np.float32), borrow=True)

        # Describe the convolution operation
        conv_out = T.nnet.conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape)
        
        # Add a bias
        self.output = conv_out + self.b.dimshuffle('x', 0, 'x', 'x')
        
        # Store parameters
        self.params = [self.W, self.b]

How can we use this class? In Theano, the code defines the computation symbolically and performs no actual computation at that point. In other words, the computational graph is defined before it is run. To use the defined graph, we need to create a callable with `theano.function`, which takes the input variables and the output variable.

In [43]:
batchsize = 32
input_shape = (batchsize, 1, 28, 28)
filter_shape = (6, 1, 5, 5)

# Create a tensor that represents a minibatch
x = T.fmatrix('x')
input = x.reshape(input_shape)

conv = TheanoConvolutionLayer(input, filter_shape, input_shape)
f = theano.function([input], conv.output)

In [48]:
actual_x = np.random.rand(32, 1, 28, 28).astype(np.float32)

y = f(actual_x)

print(y.shape, type(y))

(32, 6, 24, 24) <class 'numpy.ndarray'>
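
The output shape follows from the filter size: with no padding and a stride of 1, each spatial axis shrinks from 28 to 28 - 5 + 1 = 24. The plain NumPy sketch below reproduces that arithmetic and the bias broadcasting done by `dimshuffle('x', 0, 'x', 'x')`. The helper names `conv_output_size` and `naive_conv2d` are hypothetical and exist only for this illustration. The sketch does not flip the filters, so it is not guaranteed to match Theano's values element for element; only the shapes are meant to correspond.

```python
import numpy as np

def conv_output_size(size, ksize, stride=1, pad=0):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * pad - ksize) // stride + 1

def naive_conv2d(x, W, b):
    """Unpadded, stride-1 2D cross-correlation plus a per-channel bias.

    x: (batch, in_channels, height, width)
    W: (out_channels, in_channels, kh, kw)
    b: (out_channels,)
    """
    n, _, h, w = x.shape
    out_c, _, kh, kw = W.shape
    out_h = conv_output_size(h, kh)
    out_w = conv_output_size(w, kw)
    y = np.empty((n, out_c, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, :, i:i + kh, j:j + kw]  # (n, in_channels, kh, kw)
            # Contract over in_channels and both kernel axes -> (n, out_channels).
            y[:, :, i, j] = np.tensordot(patch, W, axes=([1, 2, 3], [1, 2, 3]))
    # Broadcast the bias over the batch and spatial axes.
    return y + b.reshape(1, -1, 1, 1)

x = np.random.rand(32, 1, 28, 28).astype(np.float32)
W = np.random.uniform(-1, 1, (6, 1, 5, 5)).astype(np.float32)
b = np.zeros(6, dtype=np.float32)

y = naive_conv2d(x, W, b)
print(conv_output_size(28, 5), y.shape)  # 24 (32, 6, 24, 24)
```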


### Chainer:

In [53]:
class ChainerConvolutionLayer(chainer.Link):
    
    def __init__(self, filter_shape):
        super().__init__()
        with self.init_scope():
            # Specify how to initialize the parameters
            W_init = chainer.initializers.LeCunUniform()
            b_init = chainer.initializers.Zero()
        
            # Create a parameter object
            self.W = chainer.Parameter(W_init, filter_shape)          
            self.b = chainer.Parameter(b_init, filter_shape[0])
            
    def __call__(self, x):
        return F.convolution_2d(x, self.W, self.b)

In [55]:
conv = ChainerConvolutionLayer(filter_shape)

y = conv(actual_x)

print(y.shape, type(y), type(y.array))

(32, 6, 24, 24) <class 'chainer.variable.Variable'> <class 'numpy.ndarray'>
