<div class="alert alert-info">
    <h2 align="center"> Week 10: Deep Learning and Convolutional Neural Networks</h2>
    <h3 align="center"><a href="http://www.snrazavi.ir">Seyed Naser RAZAVI</a></h3>
</div>

# Recall
- **Machine Learning:** Data + Model + Optimization

### Data:
- **Image Data:** CIFAR-10

### Model
- **Linear classification:** linear score function
- **Non-linear classification:** Multi-layer Neural Networks

### Optimization
- **Loss function:** SVM or Softmax
- **Optimization:** Gradient Descent, SGD, SGD + momentum, etc.
- **Backpropagation:** Computing gradients using chain rule
- **Techniques:** Regularization, Batch Normalization, Dropout, Data Augmentation, etc.

# Today's Topics
- Introduction to CNN
- Introduction to Keras
- Implementing CNN Using Keras 

<img align="left" src="http://www.snrazavi.ir/imgs/Keras.png" width="30%"/>

# Why Keras?
- A simple and popular framework for **Deep Learning** (2nd place among deep learning frameworks in 2017).
- It is easy to learn and easy to use.
- Flexible and powerful: it runs on top of **TensorFlow** from Google.

For more information, see the <a href="https://keras.io">Keras website</a>.

<img src="http://www.snrazavi.ir/imgs/DL-frameworks.png" width="75%"/>

## How to install Keras
- If you are using the Anaconda distribution for Python, installing Keras is simple.
- Type one of the following at the command prompt:

#### GPU version:
<code>
> conda install -c anaconda keras-gpu
</code>

#### CPU version:
<code>
> conda install -c anaconda keras
</code>

In [3]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Activation


In [4]:
image_size = 32
num_channels = 3
num_features = image_size * image_size * num_channels
num_classes = 10

num_train = 49000

# Load CIFAR10 Dataset

In [None]:
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

print('Train data shape: {}'.format(X_train.shape))
print('Test  data shape: {}'.format(X_test.shape))

## Data Visualization

In [None]:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
samples_per_class = 5

plt.figure(figsize=(16, 8))

for cls, name in enumerate(class_names):
    idxs = np.flatnonzero(y_train == cls)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt.subplot(samples_per_class, num_classes, i * num_classes + cls + 1)
        plt.imshow(X_train[idx], interpolation='spline16')
        plt.axis('off')
        if i == 0:
            plt.title(class_names[cls])

## Data Preprocessing

In [None]:
# Convert 4D arrays to 2D arrays
X_train = X_train.reshape([-1, num_features])
X_test  =  X_test.reshape([-1, num_features])

print('Train data shape: {}'.format(X_train.shape))
print('Test  data shape: {}'.format(X_test.shape))

In [None]:
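# Zero-center with the per-pixel training mean, then scale by 255
# (values end up roughly in [-1, 1], not [0, 1]).
X_train = X_train.astype('float32')
# x_test is the normalized copy used for prediction; X_test keeps the raw pixels for plotting
x_test  = X_test.astype('float32')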

mu = np.mean(X_train, axis=0)

X_train -= mu
X_train /= 255.0

x_test -= mu
x_test /= 255.0
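
The cell above subtracts the per-pixel training mean and then divides by 255. A minimal sketch on made-up data (the array sizes and the names `X` and `X_norm` are illustrative assumptions, not part of the notebook) shows what that does to the value range:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for flattened uint8 images: 100 samples, 12 features (illustrative sizes)
X = rng.integers(0, 256, size=(100, 12)).astype('float32')

mu = X.mean(axis=0)          # per-feature mean over the samples
X_norm = (X - mu) / 255.0    # zero-centered, then scaled

# Every value lies within [-1, 1] and every feature has mean close to 0
print(X_norm.min() >= -1.0, X_norm.max() <= 1.0)
print(np.abs(X_norm.mean(axis=0)).max())
```

The same `mu` computed on the training data should be reused for the test data, as the notebook does.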

## One-hot encoding
- We have 10 classes: 0, 1, ..., 9

In one-hot encoding, each label is represented by a vector of length 10 that is 1 at the class index and 0 everywhere else.

```python
0: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
2: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
.
.
.
9: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```
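
As a rough illustration of the mapping above, the same vectors can be built with plain NumPy by indexing the rows of an identity matrix. The `labels` array here is a made-up example; the notebook itself uses `keras.utils.to_categorical`.

```python
import numpy as np

num_classes = 10
labels = np.array([0, 2, 9])   # example integer labels (assumed for illustration)

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)            # (3, 10)
print(one_hot.argmax(axis=1))   # [0 2 9]
```

Taking `argmax` along the last axis recovers the original integer labels, which is how the notebook later converts `y_test` back to class indices.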

In [None]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test  = keras.utils.to_categorical(y_test,  num_classes)

print(y_train.shape)
print(y_test.shape)

# Linear Classifier

## $$f(x, W, b) = Wx+b$$

<img src="http://www.snrazavi.ir/imgs/linear_classifier.jpg" width="75%"/>
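
A minimal NumPy sketch of this score function, followed by a softmax over the scores, may help make the shapes concrete. The weights and the input are random placeholders (assumptions for illustration), not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

num_features = 32 * 32 * 3   # one flattened CIFAR-10 image
num_classes = 10

# Placeholder parameters and input, randomly initialised for illustration
W = rng.normal(scale=0.01, size=(num_classes, num_features))
b = np.zeros(num_classes)
x = rng.random(num_features)

# Linear score function f(x, W, b) = Wx + b: one score per class
scores = W @ x + b

# Softmax turns the scores into class probabilities
exp_scores = np.exp(scores - scores.max())   # subtract the max for numerical stability
probs = exp_scores / exp_scores.sum()

print(scores.shape)   # (10,)
print(probs.sum())    # approximately 1.0
```

The Keras model below does the same thing with a single `Dense` layer and a `softmax` activation.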

In [None]:
model = Sequential()  # a sequence of layers
model.add(Dense(num_classes, input_shape=(num_features,), activation='softmax'))

In [None]:
model.summary()

## Train Model

In [None]:
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
history = model.fit(X_train[:num_train], y_train[:num_train], 
                    batch_size=512, 
                    epochs=15,
                    verbose=2,
                    validation_data=(X_train[num_train:], y_train[num_train:]),
                    shuffle=True)

In [None]:
model.save('Linear-model.h5')

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(history.history['loss'], label="Training loss")
plt.plot(history.history['val_loss'], label="Validation loss")
plt.legend()
plt.title("Training vs Validation Loss")
plt.show()

In [None]:
plt.figure(figsize=(12, 6))
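# Older Keras logs 'acc'/'val_acc'; newer versions use 'accuracy'/'val_accuracy'
acc_key = 'acc' if 'acc' in history.history else 'accuracy'
plt.plot(history.history[acc_key], label="Training acc")
plt.plot(history.history['val_' + acc_key], label="Validation acc")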
plt.legend()
plt.title("Training vs Validation Accuracy")
plt.show()

## Predicting class for new images

In [None]:
# Select 25 random images from test images
idx = np.random.choice(x_test.shape[0], 25, replace=False)

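# predict class for selected images (argmax over the softmax outputs;
# this also works in versions where Sequential.predict_classes is unavailable)
y_pred = np.argmax(model.predict(x_test[idx]), axis=1)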

# True class for selected images
y_true = y_test[idx].argmax(axis=1)

print('Accuracy on the 25 sampled test images = %.2f%%' % (100.0 * np.mean(y_pred == y_true)))

In [None]:
fig, axes = plt.subplots(5, 5, figsize=(18, 18))

for i, ax in enumerate(axes.flat):
    ax.imshow(X_test[idx[i]].reshape(image_size, image_size, num_channels), interpolation='spline16')
    pred_class = class_names[y_pred[i]]
    true_class = class_names[y_true[i]]
    ax.set_xlabel('Pred: {}\nTrue: {}'.format(pred_class, true_class), fontsize=12)
    ax.set_xticks([])
    ax.set_yticks([])

# Multi-layer Neural Network

### $$f(x) = W_3 \max(0, W_2 \max(0, W_1 x))$$

<img src="http://www.snrazavi.ir/imgs/neural_net2.jpeg" width="50%"/>
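
The forward pass of such a network (two ReLU hidden layers followed by a linear output layer, with biases omitted) can be sketched in NumPy as follows. The weights are random placeholders and the hidden size of 100 is chosen to match the Keras model below; both are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    """Element-wise max(0, z)."""
    return np.maximum(0, z)

num_features, hidden, num_classes = 3072, 100, 10

# Placeholder weights, randomly initialised for illustration
W1 = rng.normal(scale=0.01, size=(hidden, num_features))
W2 = rng.normal(scale=0.01, size=(hidden, hidden))
W3 = rng.normal(scale=0.01, size=(num_classes, hidden))

x = rng.random(num_features)   # one flattened image

h1 = relu(W1 @ x)      # first hidden layer
h2 = relu(W2 @ h1)     # second hidden layer
scores = W3 @ h2       # class scores

print(scores.shape)    # (10,)
```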

In [None]:
model = Sequential()

# First hidden Layer
model.add(Dense(units=100, input_shape=(num_features,)))
model.add(Activation(activation='relu'))

# Second hidden Layer
model.add(Dense(units=100))
model.add(Activation(activation='relu'))

# Output Layer
model.add(Dense(units=num_classes, activation='softmax'))

In [None]:
model.summary()

### Training the model

In [None]:
optimizer = keras.optimizers.RMSprop(lr=0.001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
model.fit(X_train[:num_train], y_train[:num_train],
          batch_size=256,
          epochs=15,
          validation_data=(X_train[num_train:], y_train[num_train:]))

In [None]:
model.save('nn.h5')
# model = keras.models.load_model('nn.h5')

## Adding Dropout and Batch Normalization
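
Before adding these layers in Keras, it may help to see roughly what they compute during training. The following is a simplified NumPy sketch (the helper names `batch_norm` and `dropout` are hypothetical, and the Keras layers additionally learn the scale and shift and track moving statistics):

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of the units and rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

x = rng.normal(loc=3.0, scale=2.0, size=(256, 100))   # a batch of hidden pre-activations

h = batch_norm(x)                                 # per-feature mean ~0, std ~1
d = dropout(np.maximum(0, h), rate=0.2, rng=rng)  # ReLU, then dropout

print(h.mean(axis=0)[:3])
print(h.std(axis=0)[:3])
print(d.shape)   # (256, 100)
```

At prediction time Keras disables dropout and uses the moving averages collected by batch normalization, so neither layer adds randomness to the predictions.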

In [None]:
from keras.layers import BatchNormalization, Dropout

In [None]:
model = Sequential()

# First Hidden Layer
model.add(Dense(units=100, input_shape=(num_features,)))
model.add(BatchNormalization())
model.add(Activation(activation='relu'))
model.add(Dropout(0.2))

# Second Hidden Layer
model.add(Dense(units=100))
model.add(BatchNormalization())
model.add(Activation(activation='relu'))
model.add(Dropout(0.2))

# Output Layer
model.add(Dense(units=num_classes, activation='softmax'))

# print model
model.summary()

### Training the model

In [None]:
optimizer = keras.optimizers.Adam(lr=0.02, decay=1e-6)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
model.fit(X_train[:num_train], y_train[:num_train],
          batch_size=256,
          epochs=15,
          validation_data=(X_train[num_train:], y_train[num_train:]))

In [None]:
model.save('nn-dropout-bn.h5')
# model = keras.models.load_model('nn-dropout-bn.h5')

In [None]:
model.evaluate(x_test, y_test, batch_size=256)

# Convolutional Neural Networks

<img src="http://www.snrazavi.ir/imgs/CNN.png" width="75%"/>

<img src="http://www.snrazavi.ir/imgs/cnn_flowchart.png" width="100%"/>

## Convolution layer

<img src="http://www.snrazavi.ir/imgs/conv_layer.gif" width="80%" />
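
As a rough sketch of what one filter does, the loop below slides a 3x3 kernel over a single-channel image with stride 1 and no padding. The function name `conv2d_valid` and the toy inputs are assumptions for illustration; a real `Conv2D` layer also sums over input channels, adds a bias, and applies many filters at once.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 image
kernel = np.ones((3, 3)) / 9.0                     # 3x3 averaging filter

out = conv2d_valid(image, kernel)
print(out.shape)   # (3, 3)
print(out[0, 0])   # mean of the top-left 3x3 patch (6, up to rounding)
```

Like most deep learning libraries, this computes a cross-correlation (the kernel is not flipped), which is what is conventionally called convolution in CNNs.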

## Pooling layer

<img src="http://www.snrazavi.ir/imgs/maxpool.jpeg" width="60%"/>
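
A 2x2 max pooling with stride 2 can be sketched in NumPy with a reshape. The helper name `max_pool_2x2` and the example values are assumptions for illustration, and the sketch handles a single channel only.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single-channel H x W array."""
    h2, w2 = x.shape[0] // 2, x.shape[1] // 2
    x = x[:h2 * 2, :w2 * 2]   # drop a trailing odd row/column, if any
    return x.reshape(h2, 2, w2, 2).max(axis=(1, 3))

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

print(max_pool_2x2(x))
# [[6 8]
#  [3 4]]
```

Each output value is the maximum of one non-overlapping 2x2 window, so the spatial size is halved while the number of channels stays the same.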

### Visualizing and Understanding CNNs
See this <a href="https://www.youtube.com/watch?v=ghEmQSxT6tw&t=1001s">video from Matt Zeiler</a> for a better understanding of convolutional neural networks.

In [None]:
%%HTML
<iframe width="854" height="480" src="https://www.youtube.com/embed/ghEmQSxT6tw" frameborder="0" gesture="media" allow="encrypted-media" allowfullscreen></iframe>

## Converting dataset to 4D array

In [None]:
X_train = X_train.reshape((-1, image_size, image_size, num_channels))
X_test  =  X_test.reshape((-1, image_size, image_size, num_channels))
x_test  =  x_test.reshape((-1, image_size, image_size, num_channels))

In [None]:
from keras.layers import Conv2D, MaxPooling2D, Flatten

In [None]:
def create_cnn():
    model = Sequential()

    # Conv Block 1
    model.add(Conv2D(64, (3, 3), padding='same', input_shape=X_train.shape[1:], activation='relu'))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    # Conv Block 2
    model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    # Conv Block 3
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(Conv2D(256, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    # Classifier
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(num_classes, activation='softmax'))
    
    return model

model = create_cnn()

#print model
model.summary()
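
To follow the shapes reported by `model.summary()`, the spatial size can be traced by hand. The sketch below assumes 3x3 kernels with stride 1, where `'same'` padding keeps the size and the default `'valid'` padding shrinks it by 2, and 2x2 pooling that halves it (rounding down). The helper names are hypothetical.

```python
def conv_out(size, kernel=3, padding='valid'):
    """Spatial output size of a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial output size of non-overlapping pooling."""
    return size // pool

size = 32   # CIFAR-10 images are 32 x 32
for block, filters in enumerate([64, 128, 256], start=1):
    size = conv_out(size, padding='same')    # first conv of the block
    size = conv_out(size, padding='valid')   # second conv of the block
    size = pool_out(size)                    # 2x2 max pooling
    print('Block {}: {} x {} x {}'.format(block, size, size, filters))

flat = size * size * 256
print('Flatten:', flat)
# Block 1: 15 x 15 x 64
# Block 2: 6 x 6 x 128
# Block 3: 2 x 2 x 256
# Flatten: 1024
```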

In [None]:
optimizer = keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train[:num_train], y_train[:num_train],
          batch_size=200,
          epochs=1,
          validation_data=(X_train[num_train:], y_train[num_train:]))

## Data Augmentation
<img src="http://www.snrazavi.ir/imgs/09-Augmentation.jpg" width="80%"/>
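
The two augmentations used in this notebook (random horizontal flips and random shifts of up to 10% of the image size) can be sketched in NumPy as follows. The helpers `shift` and `augment` are hypothetical, and filling the uncovered border with zeros is a simplification; `ImageDataGenerator` fills it from the nearest pixels by default.

```python
import numpy as np

rng = np.random.default_rng(0)

def shift(image, dy, dx):
    """Shift an H x W x C image by (dy, dx) pixels, filling the gap with zeros."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def augment(image, max_shift=0.1, rng=rng):
    """Random horizontal flip followed by a random shift."""
    h, w = image.shape[:2]
    if rng.random() < 0.5:
        image = image[:, ::-1]   # horizontal flip
    dy = rng.integers(-int(max_shift * h), int(max_shift * h) + 1)
    dx = rng.integers(-int(max_shift * w), int(max_shift * w) + 1)
    return shift(image, dy, dx)

image = rng.random((32, 32, 3)).astype('float32')   # stand-in for one CIFAR-10 image
batch = np.stack([augment(image) for _ in range(4)])
print(batch.shape)   # (4, 32, 32, 3)
```

Because a new random variant is drawn every time an image is requested, the network rarely sees exactly the same input twice, which acts as a regularizer.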

In [None]:
model = create_cnn()
optimizer = keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
from keras.preprocessing.image import ImageDataGenerator

epochs = 5
batch_size = 256
data_augmentation = True


if not data_augmentation:
    print('Training without data augmentation.')
    model.fit(X_train[:num_train], y_train[:num_train], 
              batch_size=batch_size, 
              epochs=epochs,
              validation_data=(X_train[num_train:], y_train[num_train:]),
              shuffle=True)
else:
    print('Training using real-time data augmentation.')
    datagen = ImageDataGenerator(
        featurewise_center=False, 
        samplewise_center=False, 
        featurewise_std_normalization=False, 
        samplewise_std_normalization=False, 
        zca_whitening=False, 
        rotation_range=0, 
        width_shift_range=0.1, 
        height_shift_range=0.1, 
        horizontal_flip=True, 
        vertical_flip=False)
    
    datagen.fit(X_train[:num_train])
    
    model.fit_generator(datagen.flow(X_train[:num_train], y_train[:num_train], batch_size=batch_size),
                        steps_per_epoch=num_train//batch_size,
                        epochs=epochs,
                        validation_data=(X_train[num_train:], y_train[num_train:]))

### Saving and Loading model

In [None]:
# model.save('cnn_data_augmentation.h5')

In [None]:
model = keras.models.load_model('cnn_data_augmentation.h5')

In [None]:
model.evaluate(x_test, y_test, batch_size=250)

## Predicting class for new images

In [None]:
y_test = np.argmax(y_test, axis=1) # to 0, 1, ..., 9

In [None]:
plt.figure(figsize=(12, 24))
idx = np.random.choice(len(x_test), 10, replace=False)

p = model.predict(x_test[idx])

for i in range(len(idx)):
    plt.subplot(10, 2, 2*i+1)
    plt.imshow(X_test[idx[i]], interpolation='spline16')
    plt.title(class_names[y_test[idx[i]]])
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    
    pred_label = np.argsort(-p[i])[:3]
    pred_prob = [p[i][l] for l in pred_label]
    pred_label = [class_names[l] for l in pred_label]
    
    plt.subplot(10, 2, 2*i+2)
    plt.bar(range(3), pred_prob)
    plt.xticks(range(3), pred_label)

plt.show()

# Last Word
- Solve problems (<a href="https://www.kaggle.com">Kaggle</a> is a very good place to start)
- Read papers
- Write about your experiments (both failures and successes)
- Attend my Deep Learning workshop (within 4 to 6 weeks)

# What will be covered in DL workshop?
- A framework for coding (TensorFlow, PyTorch, Keras)
- Convolutional Neural Networks (a deeper look)
- Recurrent Neural Networks for temporal data (text, speech, video)
- Generative Models
- Deep Reinforcement Learning (robotics, games, self-driving cars)
- Some applications (Image Captioning, Sentiment Analysis, Machine Translation, etc.)

See you soon!