<a href="https://colab.research.google.com/github/FrancescoCrecchi/Intro_Keras/blob/master/The_Keras_Framework.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setup

Before starting, make sure you have enabled GPU hardware acceleration in the notebook. To do this:

 `Edit` -> `Notebook settings` -> `Hardware Accelerator` -> `GPU`

Download required files

In [0]:
!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1fnM_B3BZ2-2c4brcG95p1LUYQQt0sqXU' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1fnM_B3BZ2-2c4brcG95p1LUYQQt0sqXU" -O dset.zip && rm -rf /tmp/cookies.txt
!unzip -q dset.zip

# What is it

<img src='https://drive.google.com/uc?id=11pDAM6ND7NzijoK8ZEJ_Ro9UGVj6HzAb' width='60%'>

Keras is a **high-level neural networks API**, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.

<img src="https://drive.google.com/uc?id=1P1CndCWowe8pTbLsAvOEPK4Z5NQsTw35" width=60%>

It was developed with a focus on enabling **fast experimentation**: 

_Being able to go from idea to result with the least possible delay is key to doing good research_.

# Use Keras if

You need a Deep Learning (DL) library that:
- Allows for **easy and fast prototyping** (through user friendliness, modularity, and extensibility).
- Supports both **convolutional networks and recurrent networks**, as well as combinations of the two.
- Runs seamlessly on CPU and **GPU**.

# Keras Goodness

- Minimalist, highly modular neural network library written in **Python**: no separate configuration files in a declarative format!
- Capable of running on top of **TensorFlow, Theano, or CNTK**.
- API for **human beings**, not machines! (Right, TensorFlow?)
- Strong **multi-GPU** support and **distributed training** support.

---

# 30-secs to Keras

Load a sample dataset

In [0]:
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

In [0]:
# TODO: STANDARDIZE DATA
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(X)

X.mean(), X.std()
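
To make the effect of `StandardScaler` concrete, here is a minimal NumPy sketch of the same per-feature transformation, `(x - mean) / std`. The toy array and the helper name `standardize` are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def standardize(X):
    """Zero-center each column and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)  # population std (ddof=0), which StandardScaler also uses
    return (X - mean) / std

# Toy data: 4 samples, 2 features on very different scales.
X_toy = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0],
                  [4.0, 400.0]])
X_std = standardize(X_toy)

print(X_std.mean(axis=0))  # approximately 0 for every feature
print(X_std.std(axis=0))   # approximately 1 for every feature
```

After this step every feature contributes on a comparable scale, which usually helps gradient-based training converge.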

In [0]:
# TODO: PERFORM 'ONE-HOT-ENCODING' OF THE LABELS, OBTAINING A MATRIX WITH ONE ROW PER SAMPLE IN X
from sklearn.preprocessing import OneHotEncoder

y = OneHotEncoder().fit_transform(y.reshape(-1,1))
X.shape, y.shape

In [0]:
y[:10].todense()
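
For intuition, one-hot encoding can also be sketched directly in NumPy: each integer label selects one row of an identity matrix. The helper `one_hot` and the toy labels are hypothetical and only illustrate what `OneHotEncoder` produces here.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    return np.eye(num_classes, dtype="float32")[labels]

labels = np.array([0, 2, 1, 2])          # toy integer class labels
encoded = one_hot(labels, num_classes=3)
print(encoded)
print(encoded.argmax(axis=1))            # recovers the original labels
```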

In [0]:
# TODO: HOLD-OUT A TEST SET OF SIZE 0.25
from sklearn.model_selection import train_test_split

X_tr, X_ts, y_tr, y_ts = train_test_split(X, y, test_size=0.25, random_state=1234)

X_tr.shape, y_tr.shape, X_ts.shape, y_ts.shape

The core data structure of Keras is a **model**, a way to organize layers. The simplest type of model is the `Sequential` model, a **linear stack of layers**. 

Here is the `Sequential` model:

In [0]:
import keras
from keras.models import Sequential

model = Sequential()

Stacking layers is as easy as `.add()`:

In [0]:
from keras.layers import Dense

model.add(Dense(units=64, activation='relu', input_dim=4))
model.add(Dense(units=3, activation='softmax'))
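
To see what these two `Dense` layers compute, the following NumPy sketch runs a forward pass with the same shapes (4 inputs, 64 hidden units, 3 outputs). The weights are random placeholders rather than anything Keras would learn, and the helper names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Randomly initialised parameters with the same shapes as the two Dense layers.
W1, b1 = rng.normal(size=(4, 64)) * 0.1, np.zeros(64)
W2, b2 = rng.normal(size=(64, 3)) * 0.1, np.zeros(3)

x = rng.normal(size=(5, 4))          # a batch of 5 samples with 4 features
hidden = relu(x @ W1 + b1)           # Dense(64, activation='relu')
probs = softmax(hidden @ W2 + b2)    # Dense(3, activation='softmax')

print(probs.shape)        # (5, 3)
print(probs.sum(axis=1))  # each row sums to 1, up to rounding
```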

Once your model looks good, configure its learning process with `.compile()`:

In [0]:
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
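
As a rough illustration of the `categorical_crossentropy` loss chosen above, here is a NumPy sketch of the formula (the mean over samples of `-sum(y_true * log(y_pred))`). The function is a simplified stand-in, not Keras's own implementation, and the probabilities are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum_k y_true[k] * log(y_pred[k])."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[1, 0, 0],
                   [0, 1, 0]], dtype="float32")
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
uniform = np.full((2, 3), 1.0 / 3.0)

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident)  # small: the right class gets most of the probability
print(loss_uniform)    # log(3), the loss of an uninformed guess over 3 classes
```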

And inspect the structure of your model with `.summary()`:

In [0]:
model.summary()
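
The parameter counts reported by `.summary()` can be checked by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. A small sketch, where `dense_params` is a hypothetical helper:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_params(4, 64)   # first Dense layer: 4 features -> 64 units
output_params = dense_params(64, 3)   # second Dense layer: 64 units -> 3 classes
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 320 195 515
```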

You can now iterate on your training data in batches:

In [0]:
history = model.fit(X_tr, y_tr, epochs=30, batch_size=32, validation_split=0.3)
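
It may help to spell out what `validation_split` and `batch_size` imply. Keras takes the validation samples from the end of the arrays passed to `fit`, before any shuffling, and trains on the rest in mini-batches. The sketch below mimics that bookkeeping; the helper name and the sample count of 120 are illustrative assumptions.

```python
import math

def split_for_validation(n_samples, validation_split, batch_size):
    """Mimic how fit() carves a validation set off the end of the data."""
    split_at = int(n_samples * (1.0 - validation_split))
    n_train = split_at                  # the first samples are used for training
    n_val = n_samples - split_at        # the last samples are held out for validation
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

# 120 is an illustrative training-set size, not a value taken from the notebook.
n_train, n_val, steps = split_for_validation(120, validation_split=0.3, batch_size=32)
print(n_train, n_val, steps)  # 84 36 3
```

Because the split is positional, data that is sorted by class should be shuffled before calling `fit` with `validation_split`.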

In [0]:
# TODO: VISUALIZE TRAINING-VALIDATION CURVES
import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()

Evaluate your performance in one line:

In [0]:
loss, acc = model.evaluate(X_ts, y_ts, batch_size=128)
print("Test set accuracy: {0:.2f}".format(acc))

Or generate predictions on new data:

In [0]:
pred = model.predict(X_ts, batch_size=128)
pred.argmax(axis=1)

In [0]:
model.predict_classes(X_ts, batch_size=128)
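
The two cells above give the same result: for a softmax output, the predicted class is simply the index of the largest probability. Newer Keras releases have dropped `predict_classes`, so the `argmax` form is the more portable of the two. A small sketch with made-up probabilities and hypothetical ground-truth labels:

```python
import numpy as np

# Made-up softmax outputs for 3 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3]])

predicted = probs.argmax(axis=1)          # index of the most probable class
true_labels = np.array([0, 2, 2])         # hypothetical ground truth
accuracy = float((predicted == true_labels).mean())

print(predicted)  # [0 2 1]
print(accuracy)   # 2 of 3 predictions are correct
```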

Building a question answering system, an image classification model, a Neural Turing Machine, or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?

---

# MNIST classification using a Multi-layer Perceptron (MLP)

<img src="https://drive.google.com/uc?id=1lRjIAnAV9kPNMWPOcF3ub4odChbTkFwR" style="max-height:50%">

## Data loading and preprocessing

In [0]:
from IPython.display import Image, SVG
import matplotlib.pyplot as plt

%matplotlib inline

import numpy as np
import keras
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, Dropout

In [0]:
# Loads the training and test data sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [0]:
first_image = X_train[0, :, :]
plt.imshow(first_image, cmap=plt.cm.Greys);

In [0]:
num_classes = len(np.unique(y_train))
num_classes

In [0]:
# 60K training 28 x 28 (pixel) images
X_train.shape

In [0]:
# 10K test 28 x 28 (pixel) images
X_test.shape

In [0]:
input_dim = np.prod(X_train.shape[1:])
input_dim

In [0]:
# The training and test data sets are integers, ranging from 0 to 255.
# We reshape the training and test data sets to be matrices with 784 (= 28 * 28) features.
X_train = X_train.reshape(60000, input_dim).astype('float32')
X_test = X_test.reshape(10000, input_dim).astype('float32')

In [0]:
# Scales the training and test data to range between 0 and 1.
max_value = X_train.max()
X_train /= max_value
X_test /= max_value

In [0]:
# The training and test labels are integers from 0 to 9 indicating the class label
(y_train, y_test)

In [0]:
# We convert the class labels to binary class matrices
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)

y_train.shape, y_test.shape

## Multi-layer Perceptron (MLP)

Technically, we're building a perceptron with one hidden layer.

In [0]:
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(input_dim,)))
model.add(Dense(num_classes, activation='softmax'))

Summarize model

In [0]:
model.summary()

## Train Classifier

In [0]:
# Trains the model, iterating on the training data in batches of 32 in 3 epochs.
# Using the SGD optimizer.
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
# Keep the returned History object so the accuracy curves can be plotted afterwards.
history = model.fit(X_train, y_train, batch_size=32, epochs=3, verbose=1, validation_split=0.25)

In [0]:
# TODO: VISUALIZE TRAINING AND VALIDATION ACCURACY
import matplotlib.pyplot as plt

plt.plot(history.history['acc'], label='training accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()

## Model Evaluation

In [0]:
loss, acc = model.evaluate(X_test, y_test)
print("Test set loss: {0:.2f} -> accuracy: {1:.2f}".format(loss, acc))

## Predicting Some Held-Out Images

Choose a random test sample to predict:

In [0]:
idx = np.random.choice(X_test.shape[0])

first_test_image = X_test[idx, :]
pred = model.predict_classes(first_test_image[None, :])[0]

In [0]:
# TODO: VISUALIZE IMAGE WITH PREDICTION AS TITLE
plt.imshow(first_test_image.reshape(28, 28), cmap=plt.cm.Greys)
plt.title(str(pred))
plt.show()

----

# MNIST classification using a Convolutional Neural Network (ConvNet)

<img src="https://drive.google.com/uc?id=1rAe5QRDGUstZMrPirJS0hNHf3E8x_05S" width=90%>

## Data loading and preprocessing

In [0]:
# Loads the training and test data sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [0]:
# We reshape the training and test data sets to be a 4D tensor.
# Dimensions: num_images x 28 x 28 x 1
# The 1 is because we have a single channel (greyscale). With RGB color images, we'd have 3 channels.
X_train = X_train.reshape(60000, 28, 28, 1).astype('float32')
X_test = X_test.reshape(10000, 28, 28, 1).astype('float32')
input_shape = (28, 28, 1)

In [0]:
# Scales the training and test data to range between 0 and 1.
max_value = X_train.max()
X_train /= max_value
X_test /= max_value

In [0]:
# The training and test labels are integers from 0 to 9 indicating the class label
(y_train, y_test)

In [0]:
# We convert the class labels to binary class matrices
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)

## Convolutional Neural Net (ConvNet)

In [0]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), strides=(1, 1),
                 activation='relu',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Dropout(0.25))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))

Summarize model

In [0]:
model.summary()
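
The output shapes and parameter counts in this summary follow from simple arithmetic. The sketch below reproduces them for the architecture defined above, assuming Keras defaults (unpadded `'valid'` convolutions, pooling stride equal to the pool size) and 10 classes; the helper functions are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Spatial output size of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Spatial output size of non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels filter per output channel, plus biases."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

size = 28
size = conv_out(size, 3)      # Conv2D(32, 3x3)  -> 26
size = pool_out(size, 2)      # MaxPooling2D     -> 13
size = conv_out(size, 3)      # Conv2D(64, 3x3)  -> 11
size = pool_out(size, 2)      # MaxPooling2D     -> 5
flat = size * size * 64       # Flatten          -> 1600

# Dropout layers have no parameters, so they do not appear in the sum.
total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(flat, total)  # 1600 225034
```

Most of the parameters sit in the first fully connected layer, which is typical for small ConvNets of this shape.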

## Train Classifier

In [0]:
# Trains the model, iterating on the training data in batches of 128 in 5 epochs.
# Using the SGD optimizer.
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=128, epochs=5, verbose=1, validation_split=0.25)

## Model Evaluation

In [0]:
loss, acc = model.evaluate(X_test, y_test)
print("Test set accuracy: {0:.2f}".format(acc))

---

# Using a pre-trained model

Achieving state-of-the-art image recognition performance by training a neural network from scratch is a **very hard** task. Luckily, there is no need to reinvent the wheel!
We can simply stand on the shoulders of giants by using a **pre-trained** deep neural network in our image recognition application, or by using it as an extremely powerful feature extractor, which leaves us with just the task of *specializing* this general-purpose model for the task at hand!

Let's pretend you are working for a smart/IoT/intelligent/etc. pet feeding bowl startup, and your job is to make the bowl recognize cats and dogs so that it can provide the right type of food. In other words, you need to tell a dog from a cat as accurately as possible. How would you do that?

One possibility is to create a new ConvNet model from scratch, as we did for MNIST.

In [0]:
import os
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K

## Data loading and preprocessing

In [0]:
DATA_DIR = 'dataset'

### Visualize some images

Inspecting the dataset folder structure, it looks like this:

```
dataset/
    training_set/
        dogs/
            dog.1.jpg
            dog.2.jpg
            ...
        cats/
            cat.1.jpg
            cat.2.jpg
            ...
    test_set/
        dogs/
            dog.1.jpg
            dog.2.jpg
            ...
        cats/
            cat.1.jpg
            cat.2.jpg
            ...
```

So we can load a few training images directly from their paths:

In [0]:
import matplotlib.pyplot as plt
from PIL import Image

fig, axes = plt.subplots(ncols=4, figsize=(10,15))
for i in range(4):
    if i % 2 == 0:
        _type = 'dog'
    else:
        _type = 'cat'
    axes[i].imshow(Image.open(os.path.join(DATA_DIR, "training_set", _type+'s', "{0}.{1}.jpg".format(_type, i+1))))
plt.show()

## Model building

In [0]:
# dimensions of our images.
img_width, img_height = 224, 224

In [0]:
if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

In [0]:
model.summary()

## Model training

In [0]:
train_data_dir = os.path.join(DATA_DIR, 'training_set')
test_data_dir =  os.path.join(DATA_DIR, 'test_set')
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 20
batch_size = 128


# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=test_generator,
    validation_steps=nb_validation_samples // batch_size)
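
The `steps_per_epoch` and `validation_steps` arguments decide how many batches are drawn from each generator per epoch. With the values used above, the integer division works out as follows (a plain-Python sketch of the arithmetic, nothing Keras-specific):

```python
nb_train_samples = 2000
nb_validation_samples = 800
batch_size = 128

steps_per_epoch = nb_train_samples // batch_size         # 15
validation_steps = nb_validation_samples // batch_size   # 6

# Upper bound on the images drawn from each generator in one epoch
# (a generator may yield a smaller batch when it reaches the end of the data).
train_seen = steps_per_epoch * batch_size                # 1920
val_seen = validation_steps * batch_size                 # 768
print(steps_per_epoch, train_seen, validation_steps, val_seen)
```

Because of the floor division, each epoch sees slightly fewer than `nb_train_samples` images; the generator keeps cycling, so the remaining images are picked up in later epochs.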

In [0]:
loss, acc = model.evaluate_generator(test_generator, steps=nb_validation_samples // batch_size)
print("Test set accuracy: {0:.2f}".format(acc))

We have reached ~80% accuracy with just 40 lines of code: not bad!

## Save model for reuse

In [0]:
model.save("cat_vs_dogs_from_scratch.hd5")

In [0]:
!ls cat*

---

## Standing on the giant's shoulders!

We can _stand on the giant's shoulders_ by using a state-of-the-art convolutional model as a feature extractor and _fine-tuning_ its layers to perform the _cat-vs-dog_ classification task.

We will use the _ResNet-50_ architecture, pre-trained on the ImageNet dataset. Because the ImageNet dataset contains several "cat" classes (persian cat, siamese cat...) and many "dog" classes among its total of 1000 classes, this model will already have learned features that are relevant to our classification problem. In fact, it is possible that merely recording the softmax predictions of the model over our data rather than the bottleneck features would be enough to solve our dogs vs. cats classification problem extremely well. However, the method we present here is more likely to generalize well to a broader range of problems, including problems featuring classes absent from ImageNet.

Here's what the ResNet architecture looks like:

<img src="https://drive.google.com/uc?id=1AminLP0xNTlbbQ8FFKMOGRSw5oeisGqC">

Our strategy will be as follows: we will use the pre-trained convolutional part of the model as a _very_ powerful feature extractor feeding a simple feed-forward binary classifier that distinguishes between images of cats and dogs.

In [0]:
# Import from the public `tensorflow.keras` namespace rather than the private `tensorflow.python` one.
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Flatten, Dense, Dropout
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator


IMAGE_SIZE    = (224, 224)
NUM_CLASSES   = 2
BATCH_SIZE    = 64  # try reducing batch size or freeze more layers if your GPU runs out of memory
FREEZE_LAYERS = 2  # number of leading layers to freeze (exclude from training)
NUM_EPOCHS    = 5
WEIGHTS_FINAL = 'model-resnet50-final.h5'


train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   channel_shift_range=10,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
train_batches = train_datagen.flow_from_directory(os.path.join(DATA_DIR, 'training_set'),
                                                  target_size=IMAGE_SIZE,
                                                  interpolation='bicubic',
                                                  class_mode='categorical',
                                                  shuffle=True,
                                                  batch_size=BATCH_SIZE)

valid_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
valid_batches = valid_datagen.flow_from_directory(os.path.join(DATA_DIR, 'test_set'),
                                                  target_size=IMAGE_SIZE,
                                                  interpolation='bicubic',
                                                  class_mode='categorical',
                                                  shuffle=False,
                                                  batch_size=BATCH_SIZE)

# show class indices
print('****************')
for cls, idx in train_batches.class_indices.items():
    print('Class #{} = {}'.format(idx, cls))
print('****************')

# build our classifier model based on pre-trained ResNet50:
# 1. we don't include the top (fully connected) layers of ResNet50
# 2. we add a DropOut layer followed by a Dense (fully connected)
#    layer which generates softmax class score for each class
# 3. we compile the final model using an Adam optimizer, with a
#    low learning rate (since we are 'fine-tuning')
net = ResNet50(include_top=False, weights='imagenet', input_tensor=None,
               input_shape=(IMAGE_SIZE[0],IMAGE_SIZE[1],3))

x = net.output
x = Flatten()(x)
x = Dropout(0.5)(x)
output_layer = Dense(NUM_CLASSES, activation='softmax', name='softmax')(x)
net_final = Model(inputs=net.input, outputs=output_layer)
for layer in net_final.layers[:FREEZE_LAYERS]:
    layer.trainable = False
for layer in net_final.layers[FREEZE_LAYERS:]:
    layer.trainable = True
net_final.compile(optimizer=Adam(lr=1e-5),
                  loss='categorical_crossentropy', metrics=['accuracy'])
net_final.summary()  # summary() prints the table itself and returns None

# train the model
net_final.fit_generator(train_batches,
                        steps_per_epoch = train_batches.samples // BATCH_SIZE,
                        validation_data = valid_batches,
                        validation_steps = valid_batches.samples // BATCH_SIZE,
                        epochs = NUM_EPOCHS)
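
The freezing logic in the cell above is worth isolating. With `FREEZE_LAYERS = 2`, only the first two layers are excluded from training, so almost the whole network is fine-tuned; raising the value freezes more of the pre-trained layers and reduces memory use. The sketch below shows the slicing on stand-in objects (`FakeLayer` and the layer names are hypothetical, not Keras classes).

```python
class FakeLayer:
    """Hypothetical stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

def freeze_first(layers, n_frozen):
    """Mark the first n_frozen layers as non-trainable and the rest as trainable."""
    for layer in layers[:n_frozen]:
        layer.trainable = False
    for layer in layers[n_frozen:]:
        layer.trainable = True
    return [layer.name for layer in layers if layer.trainable]

layers = [FakeLayer(n) for n in ["input", "conv_a", "conv_b", "flatten", "softmax"]]
trainable = freeze_first(layers, n_frozen=2)
print(trainable)  # ['conv_b', 'flatten', 'softmax']
```

In Keras, changes to `trainable` take effect when the model is compiled, which is why the flags are set before `compile()` in the cell above.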

In [0]:
# save the trained model (architecture and weights)
net_final.save(WEIGHTS_FINAL)

# Show time!

Let's see how the model performs on a new image!

<img src="https://drive.google.com/uc?id=1poVN2BQxwcoxrGK3zZyT3pbCGF_CfiZx" width=60%>

Load the trained model

In [0]:
from tensorflow.keras.models import load_model

net = load_model('model-resnet50-final.h5')

Load a new image

In [0]:
# DOG
# url = "https://img.huffingtonpost.com/asset/5cc1e090260000340070fb57.jpeg"
url = "https://www.mille-animali.com/wp/wp-content/uploads/2018/12/Jack-Russel.jpg"

# CAT
# url = "https://cdn.pixabay.com/photo/2018/04/20/17/18/cat-3336579__340.jpg"
# url = "https://www.lastampa.it/rf/image_500/Pub/p4/2019/06/07/LaZampa/Foto/RitagliWeb/d4819b0c-8910-11e9-aa38-fcaa0c76e025_gatto_escursione001-kdYE-U11203580860820DSB-1024x576%40LaStampa.it.jpg"

In [0]:
!curl -o new_image.jpg $url 

In [0]:
!ls -lt new_image.jpg

In [0]:
from IPython.display import Image
Image('new_image.jpg', width=600)

Let's test it on new samples!

In [0]:
from keras.preprocessing import image

cls_list = ['cats', 'dogs']

img = image.load_img('new_image.jpg', target_size=(img_width, img_height))
x = image.img_to_array(img)

x = preprocess_input(x)
x = np.expand_dims(x, axis=0)
pred = net.predict(x)[0]
top_inds = pred.argsort()[::-1][:5]
for i in top_inds:
    print('    {:.3f}  {}'.format(pred[i], cls_list[i]))
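
The `preprocess_input` call matters because a new image must be transformed exactly as the training images were. For ResNet50, that preprocessing converts RGB to BGR and subtracts the per-channel ImageNet means, without rescaling. The NumPy sketch below approximates this for a channels-last array; the function name and the 1x1 test "image" are illustrative assumptions, and the real function should be used in practice.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the 'caffe' preprocessing mode.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype="float32")

def caffe_style_preprocess(rgb_image):
    """Approximate ResNet50 preprocessing for a channels-last RGB array in [0, 255]."""
    bgr = rgb_image[..., ::-1].astype("float32")   # RGB -> BGR
    return bgr - IMAGENET_BGR_MEAN                 # zero-center, no rescaling

# A single made-up 1x1 "image" whose RGB value is pure red.
pixel = np.array([[[255.0, 0.0, 0.0]]])
out = caffe_style_preprocess(pixel)
print(out)  # blue and green channels go negative, red stays positive
```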