# Deep Computer Vision Using Convolutional Neural Networks

## Exercises

## 1.
A CNN learns to extract features from the input image automatically. Because its layers are only partially connected and reuse their weights, a CNN has far fewer parameters than a fully connected DNN, which makes it faster to train and less prone to overfitting. Once a kernel has learned to detect a particular feature, it can detect that feature anywhere in the image.
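To make the parameter gap concrete, here is a rough count for a single layer. The image size, neuron count, and filter count are hypothetical values chosen for illustration, not figures from the exercise.

```python
def dense_params(n_inputs, n_neurons):
    """Weights plus one bias per neuron for a fully connected layer."""
    return n_inputs * n_neurons + n_neurons


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus one bias per filter; independent of the image size."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


# Hypothetical example: a 100x100 RGB image.
height, width, channels = 100, 100, 3

# Dense layer with 1,000 neurons connected to every pixel value.
dense_count = dense_params(height * width * channels, 1000)
# Convolutional layer with 100 filters of size 3x3.
conv_count = conv2d_params(3, 3, channels, 100)

print(dense_count)  # 30001000
print(conv_count)   # 2800
```

The convolutional count does not depend on the image size at all, because the same kernel weights are reused at every position.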

## 3.
- Use 16-bit floats instead of 32-bit.
- Reduce the mini-batch size.
- Reduce dimensionality using a larger stride.
- Remove some layers.
- Distribute the CNN across multiple devices.
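The effect of the first three options can be estimated with some quick arithmetic. The sketch below only counts the memory needed to store one layer's output feature maps, assuming `"same"` padding; the layer dimensions and the helper name `feature_map_bytes` are made up for illustration.

```python
import math


def feature_map_bytes(height, width, n_maps, stride=1, batch_size=1,
                      bytes_per_value=4):
    """Bytes needed to store one conv layer's output, assuming "same" padding."""
    out_h = math.ceil(height / stride)
    out_w = math.ceil(width / stride)
    return batch_size * out_h * out_w * n_maps * bytes_per_value


# Hypothetical layer: 150x100 inputs, 200 feature maps, batches of 32, float32.
baseline = feature_map_bytes(150, 100, 200, batch_size=32)
half_precision = feature_map_bytes(150, 100, 200, batch_size=32, bytes_per_value=2)
smaller_batch = feature_map_bytes(150, 100, 200, batch_size=16)
larger_stride = feature_map_bytes(150, 100, 200, stride=2, batch_size=32)

print(baseline)        # 384000000
print(half_precision)  # 192000000
print(smaller_batch)   # 192000000
print(larger_stride)   # 96000000
```

Halving the precision or the batch size halves this cost, while a stride of 2 divides it by 4 since both spatial dimensions shrink.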

## 4.
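A max pooling layer has no parameters at all, whereas a convolutional layer with the same stride has one weight per kernel entry, input channel, and filter, plus the bias terms.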
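A minimal pure-Python sketch of the comparison: the pooling function below contains nothing to learn, while the parameter count of a convolutional layer grows with its kernel size and channel counts. The 64-channel configuration is an assumed example.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D list; there is nothing to learn."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def conv2d_params(kernel_size, in_channels, n_filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * n_filters + n_filters


pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 5],
                       [0, 1, 7, 2],
                       [3, 2, 4, 6]])
print(pooled)  # [[4, 5], [3, 7]]

# Hypothetical alternative: a 3x3 conv layer with stride 2, 64 channels in and out.
print(conv2d_params(3, 64, 64))  # 36928
```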

## 5.
A local response normalization layer encourages different feature maps to specialize, and pushes them apart, forcing them to explore a wider range of features. It is typically used in the lower layers to have a larger pool of low-level features that the upper layers can build upon.
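The competition between feature maps can be sketched for a single spatial position. This follows the general form used by `tf.nn.local_response_normalization` (each activation is divided by a power of the squared activations of its neighboring feature maps); the hyperparameter values below are illustrative assumptions, not the ones used in AlexNet.

```python
def local_response_norm(activations, depth_radius=2, bias=1.0, alpha=1.0, beta=0.5):
    """Normalize each feature map's activation at one spatial position.

    `activations` holds one value per feature map. Each value is divided by
    (bias + alpha * sum of squares over neighboring maps) ** beta.
    """
    n_maps = len(activations)
    normalized = []
    for i, a_i in enumerate(activations):
        low = max(0, i - depth_radius)
        high = min(n_maps - 1, i + depth_radius)
        sqr_sum = sum(a_j ** 2 for a_j in activations[low:high + 1])
        normalized.append(a_i / (bias + alpha * sqr_sum) ** beta)
    return normalized


# The middle map fires equally in both cases, but its neighbors differ.
weak_neighbors = local_response_norm([0.1, 2.0, 0.1], depth_radius=1)
strong_neighbors = local_response_norm([3.0, 2.0, 3.0], depth_radius=1)

# Strongly activated neighbors inhibit the middle map more.
print(weak_neighbors[1] > strong_neighbors[1])  # True
```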

## 6.
Compared to LeNet-5, AlexNet is much wider and deeper, and it stacks convolutional layers directly on top of each other instead of placing a pooling layer on top of each convolutional layer. GoogLeNet introduced inception modules, allowing for much deeper networks with fewer parameters. ResNet introduced skip connections. SENet introduced SE blocks, which recalibrate the relative importance of feature maps. Xception introduced the use of depthwise separable convolutional layers, which look at spatial patterns and depthwise patterns separately.
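As one illustration of these ideas, the parameter savings of a depthwise separable convolution can be worked out by hand. The sketch assumes one spatial filter per input channel followed by a 1 × 1 pointwise convolution with biases only on the pointwise step; the channel counts are hypothetical.

```python
def regular_conv_params(kernel_size, in_channels, n_filters):
    """Every filter looks at all input channels at once."""
    return kernel_size * kernel_size * in_channels * n_filters + n_filters


def separable_conv_params(kernel_size, in_channels, n_filters):
    """Spatial step: one kernel per input channel. Depthwise step: 1x1 conv."""
    spatial = kernel_size * kernel_size * in_channels
    pointwise = in_channels * n_filters + n_filters
    return spatial + pointwise


# Hypothetical layer: 3x3 kernels, 128 input channels, 256 output filters.
regular = regular_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)

print(regular)    # 295168
print(separable)  # 34176
```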

## 7.
A fully convolutional network is a neural network composed exclusively of convolutional and pooling layers. To convert dense layers to convolutional layers, replace the lowest dense layer with a convolutional layer with a kernel size equal to the layer's input size, with one filter per neuron in the dense layer, and using `"valid"` padding. The stride should generally be 1, and the activation function should be the same. The other layers should be converted the same way, but using $1 \times 1$ filters.
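The conversion of the lowest dense layer can be checked numerically. The NumPy sketch below uses made-up shapes (a 7 × 7 input with 8 feature maps and a 10-neuron dense layer) and a naive `conv2d_valid` helper written for this example; activations are omitted since they would be identical on both sides.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical shapes: 7x7 input with 8 feature maps, dense layer with 10 neurons.
height, width, channels, n_neurons = 7, 7, 8, 10
feature_maps = rng.normal(size=(height, width, channels))
dense_weights = rng.normal(size=(height * width * channels, n_neurons))
dense_bias = rng.normal(size=n_neurons)

# Dense layer: flatten the input, then apply the weight matrix.
dense_output = feature_maps.reshape(-1) @ dense_weights + dense_bias

# Equivalent conv layer: kernel size equal to the input size, one filter per neuron.
conv_kernel = dense_weights.reshape(height, width, channels, n_neurons)


def conv2d_valid(image, kernel, bias):
    """Naive 2D convolution with "valid" padding and stride 1."""
    k_h, k_w, _, n_filters = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    output = np.empty((out_h, out_w, n_filters))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + k_h, col:col + k_w, :]
            output[row, col] = np.tensordot(patch, kernel, axes=3) + bias
    return output


conv_output = conv2d_valid(feature_maps, conv_kernel, dense_bias)
print(conv_output.shape)                             # (1, 1, 10)
print(np.allclose(conv_output[0, 0], dense_output))  # True

# Unlike the dense layer, the conv layer also accepts larger inputs.
larger_input = rng.normal(size=(14, 14, channels))
print(conv2d_valid(larger_input, conv_kernel, dense_bias).shape)  # (8, 8, 10)
```

On a larger input the converted layer outputs a grid of predictions, one per 7 × 7 window, which is what makes the fully convolutional form useful.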

## 8.
The main problem is that much of the spatial information gets lost in a CNN as the signal flows through the network, especially in pooling layers and layers with a stride greater than 1.
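A quick calculation shows how fast the resolution shrinks. The sketch assumes `"same"` padding and a hypothetical network containing five layers with stride 2 (pooling or convolutional), applied to a 224 × 224 input.

```python
import math


def output_size(input_size, strides):
    """Spatial size after a chain of "same"-padded layers with the given strides."""
    size = input_size
    for stride in strides:
        size = math.ceil(size / stride)
    return size


# Hypothetical network with five stride-2 layers.
strides = [2, 2, 2, 2, 2]
final_size = output_size(224, strides)
downsampling_factor = math.prod(strides)

print(final_size)           # 7
print(downsampling_factor)  # 32
```

Each cell of the final 7 × 7 feature map then summarizes a 32 × 32 pixel region, so pixel-level predictions require recovering that lost resolution.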

## 9.

In [1]:
import os
from tensorflow import keras
import numpy as np

In [2]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.mnist.load_data()

X_train_full = X_train_full / 255.
X_test = X_test / 255.

X_train, X_val = X_train_full[:-5000], X_train_full[-5000:]
y_train, y_val = y_train_full[:-5000], y_train_full[-5000:]

X_train, X_val, X_test = X_train[..., np.newaxis], X_val[..., np.newaxis], X_test[..., np.newaxis]

In [3]:
model = keras.models.Sequential([
    keras.layers.Conv2D(16, kernel_size=5, padding='same', activation='relu'),
    keras.layers.Conv2D(32, kernel_size=3, padding='same', activation='relu'),
    keras.layers.MaxPool2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='nadam', metrics=['accuracy'])

In [4]:
hist = model.fit(X_train, y_train, epochs=10, batch_size=256, validation_data=(X_val, y_val))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [5]:
model.evaluate(X_test, y_test)



[0.02811438962817192, 0.9900000095367432]

## 10.

In [6]:
import tensorflow as tf
import tensorflow_datasets as tfds

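def preprocess(image, label):
    # Resize every image to 224x224, the input size used for MobileNetV2.
    resized_image = tf.image.resize(image, [224, 224])
    # Scale pixel values to the [-1, 1] range that MobileNetV2 expects.
    preprocessed_image = keras.applications.mobilenet_v2.preprocess_input(resized_image)
    return preprocessed_image, label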

In [7]:
train_set, val_set, test_set = tfds.load(
    'tf_flowers',
    as_supervised=True,
    shuffle_files=True,
    # tf_flowers only provides a 'train' split, so slice it 80/10/10.
    # The older tfds.Split.TRAIN.subsplit() API is no longer available.
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'])

In [8]:
batch_size = 32
train_set = train_set.map(preprocess).batch(batch_size).prefetch(1)
val_set = val_set.map(preprocess).batch(batch_size).prefetch(1)
test_set = test_set.map(preprocess).batch(batch_size).prefetch(1)

In [9]:
base_model = keras.applications.MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
# tf_flowers has 5 classes, so the output layer needs 5 units.
output = keras.layers.Dense(5, activation='softmax')(avg)

model = keras.models.Model(inputs=base_model.input, outputs=output)

In [10]:
for layer in base_model.layers:
    layer.trainable = False
    
optimizer = keras.optimizers.SGD(learning_rate=0.15, momentum=0.9, decay=0.01)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer,
              metrics=["accuracy"])

model.fit(train_set, validation_data=val_set, epochs=3);

Epoch 1/3
Epoch 2/3
Epoch 3/3


In [11]:
for layer in base_model.layers:
    layer.trainable = True

optimizer = keras.optimizers.SGD(learning_rate=0.001, momentum=0.9,
                                 nesterov=True, decay=0.001)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer,
              metrics=["accuracy"])
history = model.fit(train_set, validation_data=val_set, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [12]:
model.evaluate(test_set)



[6.775132179260254, 0.6805555820465088]