# Introduction to Keras

So excited for this module. This means progress. Imagine, 21 days ago I was just starting out in deep learning. Now I have a working knowledge of neural network basics and I am already able to create one with numpy. Onwards.

A bit of research on [Keras](https://keras.io/): it is meant to run on top of TensorFlow to enable faster experimentation with deep neural networks. This is in line with the guiding principles of Keras. __User friendliness__: designed for human beings, not machines, so user experience comes first. __Modularity__: the components of a neural network are fully configurable modules that can be plugged together. __Easy extensibility__: as an extension of modularity (pun intended), new modules are easy to create, which allows total control and advanced research. __Work with Python__: Keras has no separate model configuration files; models are written in Python code and are therefore compact and easier to debug.

### Moving on to Keras Basics

The core data structure for Keras is a __model__, a way to organize layers. The simplest model type is _sequential_, which is a linear stack of layers. Complex architectures are also available through the [Keras functional API](https://keras.io/getting-started/functional-api-guide) that allows arbitrary graphs of layers, not just linear.

For this exercise we go over the different Keras calls by creating an __AND gate__.
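For reference, the truth table of a 2-input AND gate can be generated with plain numpy. This is only a small sketch; the names `inputs`, `and_targets` and `xor_targets` are mine and not part of the exercise, and XOR is included purely for comparison.

```python
import numpy as np

# All four input pairs of a 2-input gate.
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# AND is 1 only when both inputs are 1.
and_targets = np.logical_and(inputs[:, 0], inputs[:, 1]).astype(int)
# XOR is 1 when exactly one input is 1 (shown for comparison).
xor_targets = np.logical_xor(inputs[:, 0], inputs[:, 1]).astype(int)

print(and_targets.tolist())  # [0, 0, 0, 1]
print(xor_targets.tolist())  # [0, 1, 1, 0]
```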

In [2]:
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import to_categorical

# Note: Python names are case sensitive, so Sequential, Dense and Activation must be capitalized exactly as shown.

Using TensorFlow backend.


First we import all dependencies. We still use __numpy__ for the inputs since we need arrays. We import __Sequential__ from keras.models as our basic model. Then we import __Dense__ and __Activation__ from keras.layers.core. __Dense__ is the _densely-connected NN layer_. From the documentation it implements output = activation(dot(input, kernel) + bias), where kernel is the layer's weight matrix. With a sigmoid activation this is the basic y = sigmoid(wx + b) of the simple perceptron; if no activation is given, Dense applies none and the output is just the linear wx + b. __Activation__ applies an _activation function_ to the output of the previous layer. Keras already includes the common activation functions like _sigmoid, tanh, ELU, SELU, ReLU, hard sigmoid_ and [more](https://keras.io/activations/). More information can be found in the [Core Layers Documentation](https://keras.io/layers/core/).
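To make the Dense formula concrete, here is a rough numpy sketch of what one such layer computes. It is not how Keras implements it internally; `dense_forward`, the random `kernel` and the zero `bias` are all hypothetical names and values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(inputs, kernel, bias, activation=None):
    """Compute activation(dot(inputs, kernel) + bias)."""
    z = inputs @ kernel + bias
    # With no activation the layer is purely linear.
    return z if activation is None else activation(z)

X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
kernel = rng.normal(size=(2, 32))  # hypothetical weights: 2 inputs -> 32 units
bias = np.zeros(32)                # hypothetical biases, one per unit

linear_out = dense_forward(X_demo, kernel, bias)
sigmoid_out = dense_forward(X_demo, kernel, bias, activation=sigmoid)
print(linear_out.shape)  # (4, 32)
```

Each of the 4 input rows produces 32 outputs, one per unit, which is where the (None, 32) output shape of such a layer comes from.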

In [3]:
X = np.array([[0,0],[0,1],[1,0],[1,1]]).astype('float32')
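# Note: [0, 1, 1, 0] is the XOR truth table; an AND gate would be [[0],[0],[0],[1]].
y = np.array([[0],[1],[1],[0]]).astype('float32')
y = to_categorical(y)  # one-hot encode: 0 -> [1, 0], 1 -> [0, 1]

Here we provide our __features and targets__. The aim is a simple 2-input __AND gate__, so we have 4 input pairs. Note, however, that the targets as written (0, 1, 1, 0) are the truth table of XOR rather than AND, which would be 0, 0, 0, 1, so the results further down are for the targets as written. The output is binary even though it is stored as float32. We also apply __encoding__ with __to_categorical__, which converts our class vector (integers) to a binary class matrix with one column per class.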
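What the encoding step produces can be sketched in plain numpy. The helper `one_hot` below is a hypothetical stand-in written for illustration, not the Keras function itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into a binary class matrix, one row per label."""
    labels = np.asarray(labels).astype(int).ravel()
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

y_demo = np.array([[0], [1], [1], [0]]).astype("float32")  # targets as written in the cell
encoded = one_hot(y_demo, num_classes=2)
print(encoded)
```

Class 0 becomes the row [1, 0] and class 1 becomes [0, 1], so the targets end up with shape (4, 2).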

In [4]:
model = Sequential()

Here we just define our __model__ as sequential, which is a linear stack of layers. A good read for this is the [Guide](https://keras.io/getting-started/sequential-model-guide/) provided by Keras for __sequential models__. We could have given it a more descriptive name such as __and_gate__ instead of model (plain `and` is a reserved word in Python), but that is just a preference.

In [5]:
model.add(Dense(32,input_dim=X.shape[1]))

Here we add the first layer: a __Dense__ layer with __32 units__ (the size of its output, not the batch size) and __input_dim__ set to the number of input features, which in this case is 2.

In [6]:
model.add(Activation('softmax'))

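Then we add the __activation function__ for this layer, which in this case is __softmax__. Softmax is more commonly seen on an output layer; for a hidden layer something like ReLU or tanh is the usual choice.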

In [7]:
model.add(Dense(2))

We then define our next layer, which in this case is the __output__. The 2 is the number of units in the layer, one per class, matching the two columns produced by to_categorical. It is not a batch size.

In [8]:
model.add(Activation('sigmoid'))

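Then we also add an activation function for the output. Since we want each output to range between 0 and 1 we use __sigmoid__. Note that sigmoid squashes each unit independently, so the two outputs do not have to sum to 1; softmax is the more usual pairing with one-hot targets and categorical cross-entropy.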
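The difference between the two activations on a 2-unit output can be seen with a quick numpy sketch. The `logits` values are made up for illustration and are not taken from the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[-0.05, -0.09]])  # made-up pre-activation values for one sample

sig = sigmoid(logits)   # each unit squashed to (0, 1) on its own
soft = softmax(logits)  # units normalized so each row sums to 1

print(sig, sig.sum(axis=-1))
print(soft, soft.sum(axis=-1))
```

With sigmoid both outputs land a little below 0.5 and their sum falls short of 1, while softmax always yields a row that sums to 1.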

In [9]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

The __compile__ method is needed to define the learning process. It takes an [__optimizer__](https://keras.io/optimizers/), a [__loss function__](https://keras.io/losses/) and, optionally, a _LIST_ of __metrics__ to report during training and evaluation.
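For intuition about the loss chosen here, this is a minimal numpy sketch of the textbook form of categorical cross-entropy. The function below is my own illustration and the Keras implementation may differ in details; the predictions are made-up values.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=-1)
    return float(per_sample.mean())

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])     # one-hot targets
undecided = np.array([[0.5, 0.5], [0.5, 0.5]])  # made-up predictions
confident = np.array([[0.9, 0.1], [0.1, 0.9]])  # made-up predictions

loss_undecided = categorical_crossentropy(y_true, undecided)
loss_confident = categorical_crossentropy(y_true, confident)
print(loss_undecided, loss_confident)
```

Predictions stuck at 0.5 give a loss of ln 2, roughly 0.693, and the loss shrinks as the probability assigned to the correct class grows.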

In [10]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 32)                96        
_________________________________________________________________
activation_1 (Activation)    (None, 32)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 66        
_________________________________________________________________
activation_2 (Activation)    (None, 2)                 0         
=================================================================
Total params: 162
Trainable params: 162
Non-trainable params: 0
_________________________________________________________________


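I did not find much about the __summary__ method in the Keras documentation, but based on the output it prints a table of the model you have built: each layer in order, its output shape and its number of parameters. The None in each output shape is the batch dimension, which is not fixed until data is passed in. The parameter counts follow from the layer sizes: the first Dense layer has 2 x 32 weights plus 32 biases = 96, and the second has 32 x 2 weights plus 2 biases = 66, for 162 in total. Activation layers have no parameters.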
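The Param # column can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_param_count` below is hypothetical.

```python
def dense_param_count(input_dim, units):
    """Weights (input_dim * units) plus one bias per unit."""
    return input_dim * units + units

first = dense_param_count(2, 32)   # dense_1: 2 inputs -> 32 units
second = dense_param_count(32, 2)  # dense_2: 32 inputs -> 2 units

print(first, second, first + second)  # 96 66 162
```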

In [14]:
history = model.fit(X, y, epochs = 100, verbose=0)  # Probably overkill of epochs.

The __fit__ method is used to __train__ the model. Think of it as the train function we wrote back in Numpy. We pass our __features__ and __targets__, and can also set batch_size, validation_split and more. We define how many passes over the data we want via __epochs__, which in this case is 100. __verbose__ controls whether progress is printed as training goes: 0 is silent, which is what we use here, while 1 shows a progress bar with the loss and metrics for each epoch. It's similar to the meter we used in _Sentiment Analysis_, where we could see _accuracy_ and _epoch count_.
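How many weight updates this call performs can be worked out with a little arithmetic. The sketch assumes the Keras default batch_size of 32, since the call does not set one.

```python
import math

num_samples = 4   # rows in X
batch_size = 32   # Keras default when batch_size is not passed to fit
epochs = 100

# All 4 samples fit in a single batch, so each epoch is one update.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 1 100
```

With only one gradient step per epoch, 100 epochs means just 100 updates in total, which is worth keeping in mind when judging whether the epoch count is overkill.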

In [15]:
score = model.evaluate(X,y)



Here we __evaluate__ our model to get its __loss value__ and __metrics values__. The result is a list with the loss first, followed by each metric from compile. Since we asked for __accuracy__ there, score is [loss, accuracy].

In [16]:
print('\nAccuracy =', score[-1])
print('\nPredictions =')
print(model.predict_proba(X))


Accuracy = 0.75

Predictions =
[[0.48811308 0.47764763]
 [0.48274556 0.48457536]
 [0.4801547  0.48669934]
 [0.4821377  0.4898439 ]]


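Here we just print the accuracy from the evaluation above, taking the last element of score. An accuracy of 0.75 means the model gets 3 of the 4 input pairs right. We also used the __predict_proba__ method, which I could not find in the Keras documentation. It is a method of Sequential models in this version of Keras; later releases removed it, and model.predict(X) returns the same values.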
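The reported accuracy can be reproduced from the printed predictions by taking the larger of the two outputs in each row as the predicted class. The sketch below copies the numbers from the output above and uses the class labels from the targets cell; the variable names are mine.

```python
import numpy as np

# Predictions copied from the printed output above.
predictions = np.array([
    [0.48811308, 0.47764763],
    [0.48274556, 0.48457536],
    [0.4801547, 0.48669934],
    [0.4821377, 0.4898439],
])
targets = np.array([0, 1, 1, 0])  # class labels as defined in the targets cell

# The predicted class is the index of the larger output in each row.
predicted_classes = predictions.argmax(axis=1)
accuracy = float((predicted_classes == targets).mean())

print(predicted_classes.tolist())  # [0, 1, 1, 1]
print(accuracy)  # 0.75
```

Only the last input pair, [1, 1], is misclassified, and every output is still close to 0.5, so the model has barely moved away from guessing.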