## What is a "backend"?

Keras is a model-level library, providing high-level building blocks for developing deep learning models. It does not itself handle low-level operations such as tensor products, convolutions, and so on. Instead, it relies on a specialized, well-optimized tensor manipulation library that serves as the "backend engine" of Keras. Rather than picking a single tensor library and tying the implementation of Keras to it, Keras handles the problem in a modular way, so several different backend engines can be plugged seamlessly into Keras.

At this time, Keras has two backend implementations available: the TensorFlow backend and the Theano backend.

* TensorFlow is an open-source symbolic tensor manipulation framework developed by Google, Inc.
* Theano is an open-source symbolic tensor manipulation framework developed by the LISA/MILA Lab at Université de Montréal.

In the future, we are likely to add more backend options. Go ask Microsoft how their CNTK backend project is doing.

We will focus only on the TensorFlow backend, but I'll show off some important parts of both.

## Switching from one backend to another

If you have run Keras at least once, you will find the Keras configuration file, keras.json, in ~/.keras:

In [1]:
ls ~/.keras

[1m[34mdatasets[m[m/   keras.json  [1m[34mmodels[m[m/


The default configuration file looks like this:

```
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
```

Simply change the backend field to either "theano" or "tensorflow", and Keras will use the new configuration the next time you run any Keras code.

You can also define the environment variable KERAS_BACKEND and this will override what is defined in your config file.
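
The precedence described above (environment variable first, config file second) can be sketched in plain Python. This is only an illustration under stated assumptions: `effective_backend` is a hypothetical helper, not part of Keras, and it works on a throwaway config file rather than your real one.

```python
import json
import os
import tempfile

def effective_backend(config_path, environ=None):
    """Hypothetical helper (not part of Keras): which backend name wins?

    Mirrors the precedence described above: KERAS_BACKEND, if set,
    overrides the "backend" field of the config file.
    """
    environ = os.environ if environ is None else environ
    backend = "tensorflow"  # value shown in the default config above
    if os.path.exists(config_path):
        with open(config_path) as f:
            backend = json.load(f).get("backend", backend)
    return environ.get("KERAS_BACKEND", backend)

# Demo on a temporary config so the real ~/.keras/keras.json is untouched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keras.json")
    with open(path, "w") as f:
        json.dump({"backend": "theano"}, f)
    from_file = effective_backend(path, environ={})
    from_env = effective_backend(path, environ={"KERAS_BACKEND": "tensorflow"})

print(from_file)  # theano
print(from_env)   # tensorflow
```

In practice the variable has to be set before Keras is imported, for example in the shell that launches Python, because the backend is chosen when Keras loads.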

## keras.json details

You can change these settings by editing $HOME/.keras/keras.json.

* image_data_format: string, either "channels_last" or "channels_first". It specifies which data format convention Keras will follow. (keras.backend.image_data_format() returns it.)
    * For 2D data (e.g. images), "channels_last" assumes (rows, cols, channels) while "channels_first" assumes (channels, rows, cols).
    * For 3D data, "channels_last" assumes (conv_dim1, conv_dim2, conv_dim3, channels) while "channels_first" assumes (channels, conv_dim1, conv_dim2, conv_dim3).
* epsilon: float, a numeric fuzzing constant used to avoid dividing by zero in some operations.
* floatx: string, "float16", "float32", or "float64". Default float precision.
* backend: string, "tensorflow" or "theano".
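
To make the two image_data_format conventions concrete, here is a small sketch for the 2D case. `image_shape` is a hypothetical helper written for this illustration; it only builds the shape tuple and does not call Keras.

```python
def image_shape(rows, cols, channels, data_format="channels_last"):
    """Hypothetical helper: shape of one 2D image under each convention."""
    if data_format == "channels_last":
        return (rows, cols, channels)
    if data_format == "channels_first":
        return (channels, rows, cols)
    raise ValueError("unknown data_format: %r" % (data_format,))

# A 28x32 RGB image under both conventions.
last = image_shape(28, 32, 3, "channels_last")
first = image_shape(28, 32, 3, "channels_first")
print(last)   # (28, 32, 3)
print(first)  # (3, 28, 32)
```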

## Using the abstract Keras backend to write new code

If you want the Keras modules you write to be compatible with both Theano (th) and TensorFlow (tf), you have to write them via the abstract Keras backend API. Here's an intro.

You can import the backend module via:

In [2]:
from keras import backend as K

Using TensorFlow backend.


The code below instantiates an input placeholder. It's equivalent to tf.placeholder() or th.tensor.matrix(), th.tensor.tensor3(), etc.

In [3]:
input = K.placeholder(shape=(2, 4, 5))
# also works:
input = K.placeholder(shape=(None, 4, 5))
# also works:
input = K.placeholder(ndim=3)
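
In the three calls above, None marks a dimension whose size is left open, and ndim=3 leaves all three dimensions open. The sketch below spells out which concrete array shapes each placeholder would accept. `fits_placeholder` is a hypothetical helper for illustration only, not a Keras function.

```python
def fits_placeholder(placeholder_shape, concrete_shape):
    """Hypothetical helper: does a concrete array shape fit a placeholder shape?

    None stands for a dimension whose size is not fixed in advance.
    """
    if len(placeholder_shape) != len(concrete_shape):
        return False
    return all(p is None or p == c
               for p, c in zip(placeholder_shape, concrete_shape))

print(fits_placeholder((2, 4, 5), (2, 4, 5)))        # True
print(fits_placeholder((2, 4, 5), (8, 4, 5)))        # False: first dim is fixed at 2
print(fits_placeholder((None, 4, 5), (8, 4, 5)))     # True: any batch size
print(fits_placeholder((None, None, None), (8, 4)))  # False: rank must still be 3
```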

The code below instantiates a shared variable. It's equivalent to tf.Variable() or th.shared().

In [4]:
import numpy as np
val = np.random.random((3, 4, 5))
var = K.variable(value=val)

# all-zeros variable:
var = K.zeros(shape=(3, 4, 5))
# all-ones:
var = K.ones(shape=(3, 4, 5))

Most tensor operations you will need can be done as you would in TensorFlow or Theano:

In [5]:
# Initializing tensors with random numbers
b = K.random_uniform_variable(shape=(3, 4), low=0, high=1)  # uniform distribution
c = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)  # Gaussian distribution
d = K.random_normal_variable(shape=(3, 4), mean=0, scale=1)
# Tensor arithmetic
a = b + c * K.abs(d)                # elementwise, shape (3, 4)
c = K.dot(a, K.transpose(b))        # matrix product, shape (3, 3)
a = K.sum(b, axis=1)                # row sums, shape (3,)
a = K.softmax(b)                    # softmax over the last axis, shape (3, 4)
a = K.concatenate([b, c], axis=-1)  # joined along the last axis, shape (3, 7)
# etc...

## Backend functions


In [6]:
K.backend()

u'tensorflow'

In [7]:
# K.set_epsilon(1e-5)
K.epsilon()

1e-07
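
As a rough illustration of what a fuzz constant like this is for, the sketch below adds it to a denominator so that normalizing an all-zero vector does not divide by zero. `l2_normalize` is a hypothetical pure-Python function; it is not how Keras itself applies epsilon internally.

```python
import math

EPSILON = 1e-07  # same value as K.epsilon() above

def l2_normalize(vec, eps=EPSILON):
    """Scale vec to unit length; eps keeps the division defined for all-zero input."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

print(l2_normalize([0.0, 0.0]))  # [0.0, 0.0] instead of a ZeroDivisionError
print(l2_normalize([3.0, 4.0]))  # approximately [0.6, 0.8]
```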

In [8]:
# K.set_floatx('float16')
K.floatx()

'float32'

In [9]:
import numpy
arr = numpy.array([1.0, 2.0], dtype='float64')

arr.dtype

dtype('float64')

In [10]:
new_arr = K.cast_to_floatx(arr)
new_arr

array([ 1.,  2.], dtype=float32)
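
The values 1.0 and 2.0 survive the cast unchanged, but that is not true of every float64. The following sketch uses only the standard library to round a Python float to float32 and back, as an illustration of what casting to a float32 floatx can discard. `as_float32` is a hypothetical helper, not a Keras or NumPy function.

```python
import struct

def as_float32(x):
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

print(as_float32(1.0) == 1.0)  # True: 1.0 is exactly representable
print(as_float32(0.1) == 0.1)  # False: 0.1 loses low-order bits in float32
```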

In [11]:
np_var = numpy.array([1, 2])
K.is_keras_tensor(np_var)

False

In [12]:
keras_var = K.variable(np_var)
K.is_keras_tensor(keras_var)  # Returns True here: a variable created via K.variable counts as a Keras tensor.

True

In [13]:
keras_placeholder = K.placeholder(shape=(2, 4, 5))
K.is_keras_tensor(keras_placeholder)  # A placeholder is a Tensor.

True

In [14]:
# Resets the TF graph
K.clear_session()

In [15]:
# The learning phase flag is a bool tensor (0 = test, 1 = train)
# set_learning_phase(value)
K.learning_phase()

<tf.Tensor 'keras_learning_phase:0' shape=<unknown> dtype=bool>

## How can I obtain the output of an intermediate layer?

One simple way is to create a new Model that outputs the layers you are interested in:

In [16]:
from keras.models import Model, Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(3,), name='dense_1'))
model.add(Dense(10, name='dense_2'))
model.add(Dense(10, name='dense_3'))
model.add(Dense(10, name='dense_4'))
model.add(Dense(10, name='dense_5'))

layer_name = 'dense_1'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)

intermediate_output = intermediate_layer_model.predict(np.array([[1,2,3]]))

intermediate_output

array([[ 1.4809525 ,  2.22503519, -0.45266402,  1.25213385, -1.0109905 ,
        -0.11991906, -1.82992721,  0.01926237,  0.8722105 ,  2.27682734]], dtype=float32)
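
Conceptually, the sub-model above runs the forward pass only as far as the requested layer and then stops. The pure-Python sketch below illustrates that idea with made-up weights. `dense`, `forward_until`, and the two-layer `layers` list are hypothetical and do not reproduce how Keras builds or executes its graph.

```python
def dense(x, weights, bias):
    """One Dense layer with linear activation: y_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
            for j in range(len(bias))]

def forward_until(x, layers, stop_name):
    """Run the layers in order and return the output of the layer named stop_name."""
    for name, weights, bias in layers:
        x = dense(x, weights, bias)
        if name == stop_name:
            return x
    raise KeyError(stop_name)

# Made-up weights: 3 inputs -> 2 units -> 1 unit.
layers = [
    ("dense_1", [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.5]),
    ("dense_2", [[2.0], [1.0]], [0.0]),
]

print(forward_until([1.0, 2.0, 3.0], layers, "dense_1"))  # [4.0, 5.5]
print(forward_until([1.0, 2.0, 3.0], layers, "dense_2"))  # [13.5]
```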

Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example:

In [17]:
# with a Sequential model
# layer_name is still 'dense_1', so this returns the first layer's output
get_layer_output = K.function([model.layers[0].input],
                              [model.get_layer(layer_name).output])
layer_output = get_layer_output([np.array([[1,2,3]])])[0]

layer_output

array([[ 1.4809525 ,  2.22503519, -0.45266402,  1.25213385, -1.0109905 ,
        -0.11991906, -1.82992721,  0.01926237,  0.8722105 ,  2.27682734]], dtype=float32)

Note that if your model behaves differently in the training and testing phases (e.g. if it uses Dropout, BatchNormalization, etc.), you will need to pass the learning phase flag to your function:

In [18]:
X = np.array([[1,2,3]])

# model.layers[3] is dense_4 (layer indices start at 0)
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])

# output in test mode = 0
layer_output = get_3rd_layer_output([X, 0])[0]

# output in train mode = 1
# (this model has only Dense layers, so both modes give the same values)
layer_output = get_3rd_layer_output([X, 1])[0]
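
To see why the flag matters for a layer such as Dropout, here is a minimal pure-Python sketch of inverted dropout, where the same input gives different results depending on the phase. The `dropout` function and its `training` argument are illustrative assumptions, not Keras's implementation.

```python
import random

def dropout(values, rate, training, rng=random):
    """Inverted dropout on a list of floats (illustrative, not Keras's code)."""
    if not training:
        return list(values)  # test phase: the input passes through unchanged
    keep = 1.0 - rate
    # train phase: drop each entry with probability `rate`, rescale the rest
    return [v / keep if rng.random() < keep else 0.0 for v in values]

x = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)  # seeded so the run is repeatable
test_out = dropout(x, rate=0.5, training=False, rng=rng)
train_out = dropout(x, rate=0.5, training=True, rng=rng)
print(test_out)   # identical to x
print(train_out)  # each entry is either 0.0 or the input divided by 0.5
```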

Finally, K.eval also works for evaluating a tensor, and K.get_value works for variables:

In [19]:
kvar = K.variable(np.array([[1, 2], [3, 4]]), dtype='float32')
K.eval(kvar)

array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)

In [21]:
K.count_params(kvar)

4
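
The count is just the product of the dimensions of the variable's shape. A small sketch, assuming a fully known static shape; this `count_params` is a stand-alone illustration on shape tuples, not the Keras function.

```python
def count_params(shape):
    """Number of scalars in a tensor of the given static shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

print(count_params((2, 2)))  # 4, matching kvar above

# A Dense(10) layer on 3 inputs: a (3, 10) kernel plus a (10,) bias.
print(count_params((3, 10)) + count_params((10,)))  # 40
```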

In [20]:
# K.set_value()
K.get_value(kvar)

array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)