## The Keras API

**Contents:**
1. Keras layers (input, dense and output)
2. Keras model (build, compile, visualize, save)
3. Fit/train and evaluate/test the model
4. Questions (unanswered)

### 1. Keras layers (input, dense and output)

**Layers and Tensors:**

Layers are used to construct a deep learning model, and tensors are used to define the dataflow through the model.

**Input layers**

- The first step in creating a neural network model is to define the Input layer.
- This layer takes in raw data, usually in the form of NumPy arrays.
- The shape of the Input layer defines how many variables the neural network will use. For example, if the input data has 10 columns, define an Input layer with a shape of `(10,)`.
- The Input layer is how the model loads data.
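As a small illustration of the shape rule above, the per-sample shape can be read off a NumPy array by dropping its first (batch) axis. This is only a sketch: the array `X` is a hypothetical dataset, and the resulting tuple is the kind of value one would pass as `Input(shape=...)`.

```python
import numpy as np

# Hypothetical dataset: 5 rows (samples) and 10 columns (variables).
X = np.zeros((5, 10), dtype="float32")

# Drop the first axis (number of rows) to get the shape of one sample.
input_shape = X.shape[1:]
print(input_shape)  # (10,)
```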

In [2]:
# Import Input from keras.layers
from keras.layers import Input

# Create an input layer of shape 1
input_tensor = Input(shape=(1,))

In [3]:
# input is a tensor, check type
print(input_tensor)

Tensor("input_1:0", shape=(?, 1), dtype=float32)


**Dense layers**

- Once you have an Input layer, the next step is to add a Dense layer.
- Dense layers learn a weight matrix, where the first dimension of the matrix is the dimension of the input data, and the second dimension is the dimension of the output data. In the example above, the Input layer has a shape of 1. The Dense layer below also has an output of shape 1, so it learns a 1x1 weight matrix (plus one bias term).
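To make the weight-matrix description concrete, here is a minimal NumPy sketch of what a Dense layer with no activation computes. It is not Keras code: `dense_forward` is a hypothetical helper, and the weights are random placeholders rather than learned values.

```python
import numpy as np

def dense_forward(x, W, b):
    """Affine map of a dense layer without activation: x @ W + b."""
    return x @ W + b

rng = np.random.default_rng(0)
in_dim, out_dim = 1, 1

# Weight matrix: (input dimension, output dimension); one bias per output unit.
W = rng.normal(size=(in_dim, out_dim))
b = np.zeros(out_dim)

x = np.array([[1.0], [2.0], [3.0]])  # batch of 3 samples, 1 variable each
y = dense_forward(x, W, b)

n_params = W.size + b.size
print(y.shape)   # (3, 1)
print(n_params)  # 2
```

The count of 2 (one weight plus one bias) agrees with the `Param #` that `model.summary()` reports for the dense layer later in this section.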

In [4]:
# Load layers
from keras.layers import Input, Dense

# Input layer
input_tensor = Input(shape=(1,))

# Dense layer
output_layer = Dense(1)

# Connect the dense layer to the input_tensor
output_tensor = output_layer(input_tensor)

In [5]:
# output is a tensor, check type
print(output_tensor)

Tensor("dense_1/BiasAdd:0", shape=(?, 1), dtype=float32)


**Output layers**

- Output layers are simply Dense layers.
- Output layers are used to reduce the dimension of the inputs to the dimension of the outputs. 
- The output layer allows the model to make predictions.

In [51]:
# Load layers
from keras.layers import Input, Dense

# Input layer
input_tensor = Input(shape=(1,))

# Create a dense layer and connect it to input_tensor in one step
# (the earlier cell did this in two steps)
output_tensor = Dense(1)(input_tensor)

In [7]:
# layer is a keras object
print(output_layer)

<keras.layers.core.Dense object at 0x000000000E606C18>


### 2. Keras Model

**Build a model**

- Once you've defined an input layer and an output layer, you can build a Keras model. 
- The model object is how you tell Keras where the model starts and stops: where data comes in and where predictions come out.

In [39]:
# Input/dense/output layers
from keras.layers import Input, Dense
input_tensor = Input(shape=(1,))
output_tensor = Dense(1)(input_tensor)

# Build the model
from keras.models import Model
model = Model(input_tensor, output_tensor)

# The model above is a complete neural network; once compiled, it is ready to learn from data and make predictions.

**Compile a model**

- The final step in creating a model is compiling it: a model must be compiled before it can be fit to data. This finalizes the model, freezes all its settings, and prepares it to meet some data.

- During compilation, we specify the optimizer to use for fitting the model to the data, and a loss function. 'adam' is a good default optimizer to use, and most of the time works well.

1. Choose the optimizer, which controls how the weights are updated from the gradients, including the step size (learning rate). Adam is a popular and efficient choice.
    - Optimizers: GD, SGD, RMSprop, Nesterov accelerated gradient (NAG), AdaDelta, AdaGrad, Adam
2. Adam (Adaptive Moment Estimation) adapts the step size for each parameter as it performs gradient descent.
3. Loss function: the loss function depends on the problem at hand. Mean squared error is a common loss function and optimizes for predicting the mean, as is done in least squares regression. Mean absolute error optimizes for the median, which is the 0.5-quantile case of quantile regression. For classification, use binary_crossentropy (2 classes) or categorical_crossentropy (multiclass).
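The claim that mean squared error targets the mean while mean absolute error targets the median can be checked numerically. The sketch below uses made-up target values with one outlier and searches over constant predictions. It illustrates the loss functions only and does not involve Keras.

```python
import numpy as np

# Hypothetical targets; the outlier pulls the mean far away from the median.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Candidate constant predictions on a grid with step 0.01.
candidates = np.linspace(0.0, 100.0, 10001)

# Loss of each candidate, averaged over all targets.
diff = y[None, :] - candidates[:, None]
mse = (diff ** 2).mean(axis=1)
mae = np.abs(diff).mean(axis=1)

best_mse = candidates[mse.argmin()]
best_mae = candidates[mae.argmin()]

print("MSE minimizer:", best_mse, "mean:", np.mean(y))
print("MAE minimizer:", best_mae, "median:", np.median(y))
```

With an outlier in the data the two minimizers differ noticeably, which is one reason the choice of loss matters.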

**compile**
- Definition: `compile(optimizer, loss=None, metrics=None, loss_weights=None, sample_weight_mode=None, weighted_metrics=None, target_tensors=None, **kwargs)`

In [45]:
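# Compile the model
# Note: binary_crossentropy expects predictions in [0, 1], so it is usually
# paired with a sigmoid activation on the output layer.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])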
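To give a sense of what `optimizer='adam'` selects, here is a NumPy sketch of the Adam update rule as it is usually stated (moving averages of the gradient and squared gradient, with bias correction). This is not Keras's implementation: `adam_minimize` is a hypothetical helper, the toy objective is (w - 3)^2, and the learning rate of 0.1 is chosen for this toy problem only.

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a 1-D function with Adam, given its gradient function."""
    w, m, v = float(w0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g        # moving average of gradients
        v = beta2 * v + (1 - beta2) * g * g    # moving average of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter scaled step
    return w

# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_star)
```

Dividing by the square root of `v_hat` is what makes the step size adaptive: parameters with consistently large gradients take proportionally smaller steps.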

**Visualize a model**

- Now that you've compiled the model, take a look at the result. You can do this by looking at the model summary, as well as its plot.
- The summary will tell you the names of the layers, as well as how many units they have and how many parameters are in the model.
- The plot will show how the layers connect to each other.

In [52]:
# Import the plotting function
from keras.utils import plot_model
import matplotlib.pyplot as plt

# Summarize the model
model.summary()

# Plot the model
#plot_model(model, to_file='model.png')

# Display the image
#data = plt.imread('model.png')
#plt.imshow(data)
#plt.show()

## to see the graph you have to install pydot and graphviz

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_5 (InputLayer)         (None, 1)                 0         
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 2         
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________


### 3. Fit and Evaluate the model

**Fit and evaluate the model**

- Syntax for fitting the model: `model.fit(feature, target, batch_size=..., validation_split=..., verbose=...)`

    - feature: a NumPy matrix of input variables, ideally ones that are well correlated with the target
    - target: the item you want to predict
    - batch size: how many rows of data are used for each step of stochastic gradient descent
    - validation split: the fraction of the training data Keras holds out; Keras reports the loss and metrics on that holdout set. This is useful for checking that the model will perform well on new data
    - verbose: when set to True, Keras prints a log during training. Useful for debugging.
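The holdout sizes follow directly from the validation split. The sketch below assumes the cut point is computed as `int(n * (1 - validation_split))`, with the validation rows taken from the end of the data. `split_for_validation` is a hypothetical helper, not a Keras function, and 623 is the sum of the training and validation counts in the fit log further down.

```python
def split_for_validation(n_samples, validation_split):
    """Return (number of training rows, number of validation rows)."""
    split_at = int(n_samples * (1.0 - validation_split))
    return split_at, n_samples - split_at

# 623 rows in total, 20% held out for validation.
n_train, n_val = split_for_validation(623, 0.2)
print(n_train, n_val)  # 498 125
```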

**What does fitting do?**
1. Applies backpropagation and gradient descent to the data to update the weights
    - note: scaling the data before fitting can ease optimization
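The weight update can be sketched in NumPy for the simplest case, a single 1x1 dense layer trained with mean squared error and mini-batch gradient descent. Everything here is an assumption made for illustration: the synthetic data, the learning rate, and the batch size. With only one layer, backpropagation reduces to a single application of the chain rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the "true" relationship is y = 3x + 0.5.
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * x + 0.5

W = np.zeros((1, 1))  # 1x1 weight matrix
b = np.zeros(1)       # bias
lr, batch_size, epochs = 0.1, 20, 200

for epoch in range(epochs):
    order = rng.permutation(len(x))  # shuffle rows each epoch
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = x[idx], y[idx]
        err = xb @ W + b - yb                # forward pass and error
        grad_W = 2.0 * xb.T @ err / len(xb)  # d(MSE)/dW
        grad_b = 2.0 * err.mean(axis=0)      # d(MSE)/db
        W -= lr * grad_W                     # gradient descent step
        b -= lr * grad_b

print(W[0, 0], b[0])
```

After training, the weight and bias should be close to 3 and 0.5, the values used to generate the data.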


- Once we fit a model, we need to evaluate it on new/unseen data.
Even if we use a validation set during training, we need a second check on a new set of data to make sure the model performs as expected.
To do this, we call the model's evaluate() method and pass it the feature and target variables of the new data.
    - model.evaluate(test_feature, test_target)

In [42]:
import pandas as pd
data=pd.read_csv('titanic.csv') # we will use titanic dataset for fitting the model

# we will consider Age as feature and Survived as target
feature=data['Age']
target=data['Survived']


from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test=train_test_split(feature,target, test_size=.3, random_state=123)

In [47]:
# fit the model
model.fit(X_train, y_train, batch_size=100, validation_split=.2, verbose=True)

Train on 498 samples, validate on 125 samples
Epoch 1/1


<keras.callbacks.History at 0xeab4cc0>

**Evaluate the model:**

We will give the model a new X matrix (also called feature test data), allow it to make predictions, and then compare to the known y variable (also called target data).

In [38]:
# Evaluate the model
model.evaluate(X_test,y_test)



nan
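One plausible cause of the `nan` above, assuming the `Age` column contains missing values (as it does in commonly distributed copies of the Titanic data), is that a single NaN input propagates through the prediction into the averaged loss. The sketch below reproduces that effect with hypothetical stand-in arrays and arbitrary weights. It does not use the actual dataset or model.

```python
import numpy as np

# Hypothetical stand-ins for a feature column with one missing value.
age = np.array([22.0, np.nan, 35.0, 54.0])
survived = np.array([1.0, 0.0, 1.0, 0.0])

w, b = 0.01, 0.0  # arbitrary weights, for illustration only

# One NaN input makes the averaged loss NaN.
loss_with_nan = np.mean((w * age + b - survived) ** 2)
print(loss_with_nan)  # nan

# Dropping rows with a missing feature gives a finite loss.
keep = ~np.isnan(age)
loss_clean = np.mean((w * age[keep] + b - survived[keep]) ** 2)
print(loss_clean)
```

Under that assumption, removing or imputing the missing ages before calling `fit` and `evaluate` would be the first thing to try.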

Ref: Advanced Deep Learning with Keras in Python by **Zachary Deane-Mayer**

**Questions:**
- How does an activation function add non-linearity to a model?
- How to decide how many nodes to add in a hidden layer?
- How to decide how many hidden layers to add?
- Can I use a different number of nodes in each layer, or do I need to keep it the same for all layers?
- How to know which function is processed or dropped at each node?
- What are the different optimizers, and why is Adam so widely used (rather than GD or SGD)?
- How to choose batch size?
- How to choose validation split?