# Keras

Keras is an API that sits on top of Google’s TensorFlow, Microsoft Cognitive Toolkit (CNTK), and other machine learning frameworks. The goal is to have a single API to work with all of those and to make that work easier.

In [None]:
import pandas as pd
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Keras ships with several data sets that can be used to train neural networks, for example the Fashion MNIST data set.

In [None]:
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images,train_labels),(test_images, test_labels)=fashion_mnist.load_data()

This dataset has 60000 images for training and 10000 images for testing. Each image is a 28 x 28 matrix of pixels.

In [None]:
print(test_images.shape)
print(train_images.shape)


Each image belongs to one of ten classes. Here is the list of label names:

In [None]:
labels = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
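
Each entry in `train_labels` is an integer from 0 to 9 that indexes into this list. The following sketch shows the lookup with made-up values; the array `example_labels` is hypothetical and not taken from the dataset.

```python
import numpy as np

labels = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
          'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hypothetical integer labels, in the same format as train_labels
example_labels = np.array([9, 0, 3], dtype=np.uint8)

# Translate each integer label into its human-readable name
example_names = [labels[i] for i in example_labels]
print(example_names)
```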

In [None]:
sample1 = train_images[0]
sample2 = train_images[1]
sample3 = train_images[2]
fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize = (5,5))
ax1.imshow(sample1)
ax2.imshow(sample2)
ax3.imshow(sample3)

## Preprocessing
The data that is used to train a neural network usually needs to be preprocessed. For this example, we rescale each pixel value from the range 0 to 255 to the range 0 to 1. This helps the network fit the data more easily.

In [None]:
train_images = train_images / 255
test_images = test_images / 255
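
To see what this division does, here is a minimal sketch on a tiny made-up array. The array `raw` is hypothetical and only mimics the `uint8` pixel format of the dataset.

```python
import numpy as np

# Hypothetical 2 x 2 grayscale "image" with the same dtype as the dataset (uint8)
raw = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by 255 converts to floating point and rescales to the range [0, 1]
scaled = raw / 255

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Note that the result is a floating point array, so the integer pixel values are not truncated.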

## Creating the model
Next we create the model using Keras `Sequential`, which stacks layers one after another. In the following code, the `Flatten` layer turns each 28 x 28 image into 784 input values. The next layer has 120 nodes, each densely connected to all nodes in the previous layer. The output layer has 10 nodes, one for each of the ten labels of the dataset.

In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28,28)),
    tf.keras.layers.Dense(120,activation="relu"),
    tf.keras.layers.Dense(10,activation="softmax")
])
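
As a rough check of the layer sizes described above, the number of trainable parameters can be worked out by hand. This is a sketch in plain NumPy; the variable names are chosen for illustration only.

```python
import numpy as np

# Flatten: a 28 x 28 image becomes a vector of 784 values
image = np.zeros((28, 28))
flat = image.reshape(-1)
n_inputs = flat.shape[0]

# A dense layer has one weight per (input, unit) pair plus one bias per unit
hidden_params = n_inputs * 120 + 120
output_params = 120 * 10 + 10
total_params = hidden_params + output_params

print(n_inputs, hidden_params, output_params, total_params)
```

These numbers can be compared against the output of `model.summary()`.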

After creating the structure of the model, we need to set the remaining training parameters: the loss function, the optimizer and the metrics. This is done with the `compile` method.

In [None]:
model.compile(optimizer="adam" , loss="sparse_categorical_crossentropy" , metrics=["accuracy"])
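
The "sparse" in `sparse_categorical_crossentropy` means that the labels are integer class indices rather than one-hot vectors. In essence, the loss for one sample is the negative logarithm of the probability the model assigns to the true class. Below is a minimal sketch with made-up probabilities; it ignores implementation details such as numerical clipping.

```python
import numpy as np

# Hypothetical softmax output for one image over three classes
probs = np.array([0.7, 0.2, 0.1])
true_label = 0  # integer class index, not a one-hot vector

# Cross-entropy for one sample: negative log of the probability of the true class
loss_value = -np.log(probs[true_label])
print(loss_value)
```

The more probability the model puts on the correct class, the closer this value gets to zero.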

## Training the model
Finally we use the `fit` method to train our model.

In the following example we use a batch size of 100, meaning that in each training step Keras runs a forward pass on 100 samples, computes the loss, and uses its gradients to update the weights. For 60000 samples this gives 600 steps per epoch. We repeat this for 10 epochs, so overall the weights are updated 6000 times.

In [None]:
model.fit(train_images , train_labels , epochs=10 , batch_size=100)
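
The step counts mentioned above follow from simple arithmetic, sketched here with the values used in this notebook:

```python
import math

n_samples = 60000
batch_size = 100
epochs = 10

# Number of weight updates in one pass over the training data
steps_per_epoch = math.ceil(n_samples / batch_size)

# Total number of weight updates over all epochs
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

If `batch_size` is omitted, Keras uses its default of 32, and the same calculation gives 1875 steps per epoch.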

As you can see, the loss generally decreases as training proceeds through the batches and epochs. However, we can also stop the fitting process earlier.

The following code shows how to use a callback to stop the training process once the loss value drops below a specific threshold:

In [None]:
class LossCallback(tf.keras.callbacks.Callback):
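    def on_epoch_end(self, epoch, logs=None):
        # logs holds the metrics of the epoch that just finished
        if logs is not None and logs.get("loss", float("inf")) < 0.4:
            print("\nLoss dropped below 0.4, stopping training")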
            self.model.stop_training = True

loss = LossCallback()
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28,28)),
    tf.keras.layers.Dense(120,activation="relu"),
    tf.keras.layers.Dense(10,activation="softmax")
])
model.compile(optimizer="adam" , loss="sparse_categorical_crossentropy" , metrics=["accuracy"])
model.fit(train_images , train_labels , epochs=10 , batch_size=100 , callbacks=[loss])
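
The stopping rule can be traced by hand without running Keras. The loss values below are made up for illustration; they are not results from this model.

```python
# Hypothetical per-epoch loss values, as Keras would report them in logs["loss"]
epoch_losses = [0.52, 0.44, 0.39, 0.36, 0.34]
threshold = 0.4

stopped_at = None
for epoch, loss_value in enumerate(epoch_losses):
    # Same check as in on_epoch_end: stop once the loss is below the threshold
    if loss_value < threshold:
        stopped_at = epoch
        break

print(stopped_at)
```

Because the check runs at the end of an epoch, training always finishes the current epoch before it stops. Keras also ships `tf.keras.callbacks.EarlyStopping`, which stops when a monitored metric no longer improves rather than at a fixed threshold.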

## Evaluating the model
Now that we have a model with an accuracy of about 90 percent on our training data, we need to evaluate it on our test data set, because it is the accuracy on unseen data that really matters.

In [None]:
model.evaluate(test_images,test_labels , verbose=1)
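
With `metrics=["accuracy"]`, `evaluate` returns the loss followed by the accuracy. The accuracy is simply the fraction of samples whose most probable class matches the true label, as the following sketch shows with made-up predictions.

```python
import numpy as np

# Hypothetical softmax outputs for four images over three classes
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
true_labels = np.array([0, 1, 2, 1])

predicted = np.argmax(probs, axis=1)          # most probable class per row
accuracy = np.mean(predicted == true_labels)  # fraction of correct predictions
print(predicted, accuracy)
```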

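As you can see, the accuracy on the test data set is lower than on the training data, which suggests that our model is overfitting the training data to some degree. So we should try other parameters and test the model again. For example, training for fewer epochs may help to avoid overfitting. In the following code I use `epochs=2`. Note that calling `fit` on an already trained model continues from its current weights, so for a clean comparison the model should be re-created and compiled again first.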

In [None]:
model.fit(train_images , train_labels , epochs=2)

In [None]:
model.evaluate(test_images,test_labels , verbose=1)

As you can see, the accuracy on the test data is higher after using `epochs=2`. This process is called hyperparameter tuning. Choosing good values takes experience, knowledge and experimentation.

## Using the model
We can use the model to predict the label of new images:

In [None]:
sample = test_images[20]
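# predict expects a batch, so add a leading batch dimension: (28, 28) -> (1, 28, 28)
predictions = model.predict(np.expand_dims(sample, axis=0))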
print(predictions)
print(labels[np.argmax(predictions)])
fig , axes = plt.subplots(1,1,figsize=(3,3))
axes.imshow(sample)
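
Keras models expect a leading batch dimension, so a single 28 x 28 image has to be passed as an array of shape (1, 28, 28). The sketch below shows the shapes involved and how the predicted label is read off. The `predictions` array is made up and stands in for the output of `model.predict`.

```python
import numpy as np

labels = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
          'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Placeholder for one test image
sample = np.zeros((28, 28))

# Add a batch dimension in front: (28, 28) -> (1, 28, 28)
batch = np.expand_dims(sample, axis=0)
print(batch.shape)

# Hypothetical model output for that batch: one row of 10 class probabilities
predictions = np.array([[0.01, 0.01, 0.02, 0.01, 0.05,
                         0.10, 0.02, 0.70, 0.03, 0.05]])

# The index of the largest probability is the predicted class
index = int(np.argmax(predictions[0]))
print(labels[index])
```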