# Keras

[Keras](https://keras.io/) is an API that provides high-level building blocks for developing machine learning models.

Keras does not itself implement low-level operations such as tensor manipulation and differentiation; it delegates them to a backend engine.

Several different backend engines can be plugged into Keras:

 * [TensorFlow (Google)](https://www.tensorflow.org/)
 * [Theano (MILA lab, Université de Montréal)](http://deeplearning.net/software/theano/)
 * [Microsoft Cognitive Toolkit (CNTK)](https://github.com/Microsoft/CNTK)

Keras models can be run with any of these backends without having to change the code.

 


## Anatomy of a Keras model

A Keras model contains the following objects:

 * Layers, which are combined into a model
 * The input data and labels
 * The loss function, which defines the feedback signal used for learning
 * The optimizer, which determines how learning proceeds

<img src="images/keras_model.png" height="250" width="400"/> 
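
As a rough sketch of how these four pieces fit together, the example below builds a one-layer model and trains it briefly. It assumes TensorFlow with `tensorflow.keras` is installed; the synthetic data, the target function `y = 2*x1 + 3*x2`, and all variable names are illustrative assumptions, not part of the original material.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Input data and labels (synthetic, for illustration only): y = 2*x1 + 3*x2
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 2)).astype("float32")
y_train = x_train @ np.array([[2.0], [3.0]], dtype="float32")

# Layers, combined into a model
inputs = Input(shape=(2,))
outputs = Dense(1, activation="linear")(inputs)
model = Model(inputs=inputs, outputs=outputs)

# Loss function (feedback signal) and optimizer (how learning proceeds)
model.compile(optimizer="sgd", loss="mse")

# Training ties everything together; the loss is recorded once per epoch
history = model.fit(x_train, y_train, epochs=5, batch_size=16, verbose=0)
losses = history.history["loss"]
print("Loss per epoch:", losses)
```

On noise-free data like this, the recorded loss should shrink from the first epoch to the last.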

## Layers

A layer is a function that takes one or more tensors as input and returns one or more tensors as output.

Some layers are stateless, but more frequently layers have a state: the layer’s weights, one or several tensors learned with stochastic gradient descent.

Examples of stateless layers:
 * Dropout: regularization to reduce overfitting in models
 * Merge layers: concatenate, sum, mean, min, max etc.

Examples of stateful layers:
 * Dense layers
 * Recurrent layers
 * Convolution layers
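
One way to see the difference between stateless and stateful layers is to inspect a layer's trainable weights. The sketch below assumes TensorFlow with `tensorflow.keras` is installed; the layer sizes are arbitrary choices made for illustration.

```python
from tensorflow.keras.layers import Input, Dense, Dropout

inputs = Input(shape=(4,))
dense = Dense(3)
dropout = Dropout(0.5)

# Calling the layers on a tensor builds them and creates any weights
outputs = dropout(dense(inputs))

# A dense layer holds a kernel and a bias; dropout has nothing to learn
dense_shapes = [tuple(w.shape) for w in dense.trainable_weights]
dropout_shapes = [tuple(w.shape) for w in dropout.trainable_weights]

print("Dense trainable weights:", dense_shapes)
print("Dropout trainable weights:", dropout_shapes)
```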

Different layers are appropriate for different types of data processing:

 * Vector data, stored in 2D tensors of shape (batch_size, features), is usually processed by dense layers
 * Sequence data, stored in 3D tensors of shape (batch_size, timesteps, features), is usually processed by recurrent layers
 * Image data, stored in 4D tensors of shape (batch_size, height, width, colors), is usually processed by convolution layers
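
To make these shapes concrete, the following NumPy sketch creates one placeholder tensor per data type. The specific sizes (8 samples, 16 features, 10 timesteps, 32x32 RGB images) are assumptions chosen only for illustration.

```python
import numpy as np

batch_size = 8

# Vector data: (batch_size, features)
vectors = np.zeros((batch_size, 16))

# Sequence data: (batch_size, timesteps, features)
sequences = np.zeros((batch_size, 10, 16))

# Image data: (batch_size, height, width, colors)
images = np.zeros((batch_size, 32, 32, 3))

for name, tensor in [("vectors", vectors),
                     ("sequences", sequences),
                     ("images", images)]:
    print(f"{name}: {tensor.ndim}D tensor of shape {tensor.shape}")
```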

You can think of layers as LEGO bricks.

Models are built by clipping together compatible layers to form useful data-transformation pipelines.

Layer compatibility here means that every layer only accepts input tensors of a certain shape and returns output tensors of a certain shape.
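
Shape compatibility can be illustrated without Keras. The NumPy sketch below chains two linear transforms and then tries a third whose weight matrix does not match the incoming shape. The helper `dense` and all weight values are hypothetical and stand in for real layers.

```python
import numpy as np

def dense(x, W, b):
    """Linear transform mapping (batch_size, input_size) to (batch_size, output_size)."""
    return np.dot(x, W) + b

x = np.ones((1, 2))                      # one sample with 2 features
W1, b1 = np.ones((2, 5)), np.zeros(5)    # accepts 2 features, returns 5
W2, b2 = np.ones((5, 3)), np.zeros(3)    # accepts 5 features, returns 3
W_bad = np.ones((4, 3))                  # expects 4 features: incompatible

h = dense(x, W1, b1)    # shape (1, 5)
y = dense(h, W2, b2)    # shape (1, 3)

try:
    dense(h, W_bad, b2)
    compatible = True
except ValueError:
    # NumPy rejects the product because the inner dimensions (5 and 4) differ
    compatible = False

print("h shape:", h.shape, "y shape:", y.shape, "W_bad compatible:", compatible)
```

Keras performs a comparable check when layers are connected, so a mismatch is reported while the model is being built rather than during training.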

A model is a directed, acyclic graph of layers. 

The most common instance is a linear stack of layers, mapping a single input to a single output. 

More complex models will have multiple inputs/outputs or short-cut connections.
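
A short-cut connection can be sketched with the functional API as follows. This assumes TensorFlow with `tensorflow.keras` is installed; the layer sizes and the use of an `Add` merge layer are illustrative choices, not a recommended architecture.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Add
from tensorflow.keras.models import Model

inputs = Input(shape=(4,))
hidden = Dense(4, activation="relu")(inputs)

# Short-cut: the input bypasses the hidden layer and is added back in,
# so the graph is no longer a plain linear stack
shortcut = Add()([inputs, hidden])

outputs = Dense(1)(shortcut)
model = Model(inputs=inputs, outputs=outputs)

# Forward pass on a dummy batch of two samples
y = model.predict(np.zeros((2, 4), dtype="float32"), verbose=0)
print("Output shape:", y.shape)
```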

For each problem class, there are usually one or more standard model architectures.

It is a good idea to start with one of these models.

In general, picking the right model architecture is more an art than a science.


## Layers in Keras

The following sections demonstrate the function and behavior of some Keras layers. 

By building a single-layer model and running a forward pass (e.g. by calling `predict()`), you can inspect the behavior of a layer in isolation.

For more information see [Keras layers](https://keras.io/layers/about-keras-layers/).

### Dense layer

A dense layer performs the computation `output = activation(dot(input, W) + b)`.

A dense layer takes a tensor of shape (batch_size, input_size) as input and returns a tensor of shape (batch_size, output_size).

 * `W` is an (input_size, output_size) weight matrix
 * `b` is a bias vector of length output_size

Some frequently used [activation functions](https://keras.io/activations) are:
 * `linear`: identity function, i.e. no activation is applied
 * `relu`: rectified linear unit
 * `sigmoid`: sigmoid function, used in binary classification
 * `softmax`: softmax function, used in multi-class classification
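
The activation functions listed above can be written in a few lines of NumPy. This is a sketch of their standard mathematical definitions, not Keras's own implementation; the function names and the sample input are chosen for illustration.

```python
import numpy as np

def linear(z):
    return z

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([[-1.0, 0.0, 2.0]])
print("linear: ", linear(z))
print("relu:   ", relu(z))
print("sigmoid:", sigmoid(z))
print("softmax:", softmax(z))
```

`sigmoid` squashes each value independently into the range (0, 1), whereas `softmax` normalizes each row so that its entries sum to 1.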

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Fixed weights of shape (input_size, output_size) = (2, 5) and a zero bias
W = np.array([
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5]])
b = np.array([0, 0, 0, 0, 0])

inputs = Input(shape=(2,))
dense = Dense(5, activation='linear')
outputs = dense(inputs)
# Overwrite the randomly initialized weights with the fixed values
dense.set_weights([W, b])
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='sgd', loss='mse')

print('Input shape', model.input.shape)
print('Output shape', model.output.shape)

# predict() expects a batch, so add a leading batch dimension
x = np.array([1, 2])
print('Output:', model.predict(np.expand_dims(x, 0)))

# Cross-check against the same computation in plain NumPy
np_result = np.dot(x, W) + b
print('Numpy result:', np_result)
```

```
Input shape (None, 2)
Output shape (None, 5)
Output: [[ 3.  6.  9. 12. 15.]]
Numpy result: [ 3  6  9 12 15]
```
