# AI and ML for Coders Ch 3

Let's try to have a model detect features from images to best learn how to classify them.

Image filters called convolutions help here: they transform an image's pixels to emphasize certain features, which gives the model a more efficient representation of the core features of the image.

This chapter also goes into image augmentation and transfer learning.


A convolution is a filter of weights that is multiplied by a pixel and its neighbors to produce a new value for the pixel.

For example, a 3x3 grid of pixels (the center pixel and its 8 surrounding pixels) can be multiplied element-wise by a 3x3 filter and the products summed, producing a new value that replaces the center pixel value.

Repeat the process for every pixel in the image and a new filtered image is constructed.

Different filters produce different resulting images. A filter with negative values on the left, zeros in the middle, and positive values on the right produces a filtered image in which little remains except vertical lines. If the negative, zero, and positive values run from top to bottom instead, all that is left are horizontal lines.
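
To make the filtering step concrete, here is a minimal sketch in plain Python (no libraries, so the arithmetic stays visible). The function name `convolve3x3` and the tiny test images are made up for illustration; a real layer such as `Conv2D` learns its filter values rather than using hand-picked ones.

```python
def convolve3x3(image, kernel):
    """Slide a 3x3 kernel over a 2D list of pixels (no padding).

    Each output value is the sum of the 9 pixel * weight products,
    so the output is 2 rows and 2 columns smaller than the input.
    """
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += image[r + dr][c + dc] * kernel[dr + 1][dc + 1]
            out_row.append(total)
        output.append(out_row)
    return output


# Negative on the left, zero in the middle, positive on the right.
vertical_kernel = [[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]]

# The same pattern running from top to bottom.
horizontal_kernel = [[-1, -1, -1],
                     [0, 0, 0],
                     [1, 1, 1]]

# Dark left half, bright right half: a single vertical edge.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
# A flat image with no edges at all.
flat_image = [[5, 5, 5, 5] for _ in range(4)]

vertical_response = convolve3x3(edge_image, vertical_kernel)
horizontal_response = convolve3x3(edge_image, horizontal_kernel)
flat_response = convolve3x3(flat_image, vertical_kernel)
```

The vertical filter responds strongly to the vertical edge, while the horizontal filter and the flat image both produce zeros. That is the sense in which a filter removes everything except the feature it is tuned to.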

Information is removed based on the filters. We could ideally _learn_ which filters reduce the image to features that are predictive for the labels.

When combined with _pooling_, we can reduce the amount of information in the image while maintaining the features.

### Pooling

Pooling involves reducing the size of an image while retaining its core content.

**Max pooling** does this by splitting an image into smaller arrays and replacing each smaller array with its max (highest) pixel value.

So a 4 x 4 array split into 2 x 2 arrays would result in 4 smaller arrays. Taking the max pixel value in each array will result in 4 pixel values that now replace the 4x4 array
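
The 4 x 4 example can be sketched in a few lines of plain Python. The helper name `max_pool` and the pixel values are invented for illustration; the sketch assumes the grid dimensions divide evenly by the pool size.

```python
def max_pool(grid, size=2):
    """Replace each non-overlapping size x size window with its max value."""
    pooled = []
    for r in range(0, len(grid) - size + 1, size):
        row = []
        for c in range(0, len(grid[0]) - size + 1, size):
            window = [grid[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


grid = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [7, 0, 3, 3],
        [1, 6, 2, 8]]

# Four 2 x 2 windows, one max value kept from each.
pooled = max_pool(grid)
```

The 16 original values shrink to 4, but the strongest response in each region survives.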



In [3]:
# let's use the same mnist as the last chapter but update the model to include CNNs

import tensorflow as tf
data = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = data.load_data()

training_images = training_images.reshape(60000, 28, 28, 1)
training_images = training_images / 255.0
test_images = test_images.reshape(10000, 28, 28, 1)
test_images = test_images / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.2479928731918335, 0.9106000065803528]

In [6]:
classifications = model.predict(test_images)
print(classifications[0])
print(test_labels[0])

[3.0219702e-08 5.2809389e-12 1.6438183e-09 3.7111588e-11 1.7116486e-10
 3.4847562e-05 6.5400046e-10 1.6792927e-05 7.4534330e-11 9.9994838e-01]
9
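
The ten numbers printed above are softmax probabilities, one per class, and the predicted label is the index of the largest one. A small sketch of that lookup, using the values from the output; the `class_names` list is taken from the Fashion MNIST documentation rather than from this notebook.

```python
# Probabilities copied from the prediction output for the first test image.
probabilities = [3.0219702e-08, 5.2809389e-12, 1.6438183e-09, 3.7111588e-11,
                 1.7116486e-10, 3.4847562e-05, 6.5400046e-10, 1.6792927e-05,
                 7.4534330e-11, 9.9994838e-01]

# Label names as listed in the Fashion MNIST documentation (index = label).
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# The predicted class is the position of the highest probability.
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
predicted_name = class_names[predicted_class]

# Softmax outputs should add up to (approximately) 1.
total_probability = sum(probabilities)
```

Here the largest value sits at index 9, which matches the printed test label.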


In [7]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 26, 26, 64)        640       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 13, 13, 64)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 11, 11, 64)        36928     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 1600)              0         
_________________________________________________________________
dense_4 (Dense)              (None, 128)               204928    
_________________________________________________________________
dense_5 (Dense)              (None, 10)               

Notice that the parameter counts grow significantly in the later layers even as the image size (output shape) shrinks.

The convolutional and dense layers contain weights and biases as parameters that are applied at each neuron.

For the first convolutional layer there are 64 3x3 filters. Each filter has 9 weights and 1 bias.

So (64 * 9) + 64 = 640 parameters

The second convolution will have to consider the previous 64 filters

So (64 * (64 * 9)) + 64 = 36928 parameters

The first dense layer has to deal with the flattened 5x5 output of 64 filters: 5 * 5 * 64 = 1600 values

That 1600 is then multiplied by the 128 neurons and added to the 128 biases

So (1600 * 128) + 128 = 204928 parameters

The final dense layer takes the 128 outputs of the previous layer, multiplies them by the weights of its 10 neurons, and adds 10 biases

(128 * 10) + 10 = 1290 parameters
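
The arithmetic above can be wrapped in two small helpers to check the numbers in the summary. The function names are made up for this sketch; the formulas are the ones worked through in the text.

```python
def conv2d_params(filters, kernel_size, in_channels):
    """Weights (kernel_h * kernel_w * in_channels per filter) plus one bias per filter."""
    kernel_h, kernel_w = kernel_size
    return filters * (kernel_h * kernel_w * in_channels) + filters


def dense_params(inputs, units):
    """One weight per input per neuron, plus one bias per neuron."""
    return inputs * units + units


layer_params = {
    "conv2d_4": conv2d_params(64, (3, 3), in_channels=1),   # grayscale input
    "conv2d_5": conv2d_params(64, (3, 3), in_channels=64),  # 64 feature maps in
    "dense_4": dense_params(5 * 5 * 64, 128),               # flattened 5x5x64
    "dense_5": dense_params(128, 10),
}

total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count}")
print("total:", total_params)
```

The pooling and flatten layers are left out because they have no weights to learn.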

### Horse or Humans Dataset

Let's try a trickier dataset, one where the images are not centered and the important objects are not the same size.

We'll need to use the ImageDataGenerator to label the dataset as well

In [5]:
import urllib.request
import zipfile

url = "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/horse-or-human.zip"
filename = "horse-or-human.zip"
training_dir = 'horse-or-human/training/'

urllib.request.urlretrieve(url, filename)

zip_ref = zipfile.ZipFile(filename, 'r')
zip_ref.extractall(training_dir)
zip_ref.close()

In [15]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255

train_datagen = ImageDataGenerator(rescale=1/255)

train_generator = train_datagen.flow_from_directory(
  training_dir,
  target_size=(300, 300),
  class_mode='binary'
)


Found 1027 images belonging to 2 classes.


Note: Images are much larger than the MNIST dataset at 300x300

They're also color images and so will have 3 channels in the third dimension

Also, this is a binary classifier, so we will use one output neuron to produce a number between 0 and 1.
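
As a rough sketch of what that single output means: a sigmoid squashes the neuron's raw value into the 0 to 1 range, a threshold turns it into a class, and binary cross-entropy scores how far the probability is from the true label. The helper names and the 0.5 threshold are assumptions for illustration. Which class maps to 1 depends on how the generator indexes the folders (`train_generator.class_indices` reports the mapping).

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(probability, threshold=0.5):
    """Turn the output probability into a 0/1 class label."""
    return 1 if probability >= threshold else 0


def binary_crossentropy(label, probability):
    """Loss for one example: small when the probability matches the label."""
    return -(label * math.log(probability)
             + (1 - label) * math.log(1 - probability))


confident = sigmoid(4.0)    # close to 1
uncertain = sigmoid(0.0)    # exactly 0.5

good_loss = binary_crossentropy(1, 0.9)   # confident and right: low loss
bad_loss = binary_crossentropy(1, 0.1)    # confident and wrong: high loss
```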

In [12]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
      input_shape=(300, 300, 3)),
  tf.keras.layers.MaxPool2D(2, 2),
  tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
  tf.keras.layers.MaxPool2D(2, 2),
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPool2D(2, 2),
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPool2D(2, 2),
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPool2D(2, 2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(1, activation='sigmoid')
])

In [13]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_7 (Conv2D)            (None, 298, 298, 16)      448       
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 149, 149, 16)      0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 147, 147, 32)      4640      
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 73, 73, 32)        0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 71, 71, 64)        18496     
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 35, 35, 64)        0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 33, 33, 64)       
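
The output shapes in the summary can be reproduced by hand, and the remaining layers follow the same arithmetic. This sketch assumes the Keras defaults used in the model: 3x3 convolutions with no padding and stride 1, and 2x2 pooling that rounds down. The helper names are invented for illustration.

```python
def conv_output(size, kernel=3):
    """A 'valid' convolution loses kernel - 1 pixels per dimension."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Non-overlapping pooling divides the size, rounding down."""
    return size // pool


def trace_shapes(size, conv_filters):
    """List (height, width, channels) after each Conv2D and pooling layer."""
    shapes = []
    for filters in conv_filters:
        size = conv_output(size)
        shapes.append((size, size, filters))
        size = pool_output(size)
        shapes.append((size, size, filters))
    return shapes


# The five convolution/pooling pairs of the horse-or-human model.
shapes = trace_shapes(300, [16, 32, 64, 64, 64])
for shape in shapes:
    print(shape)

# Number of values the Flatten layer hands to the first Dense layer.
last_h, last_w, last_c = shapes[-1]
flattened_units = last_h * last_w * last_c
```

The same function applied to the 28x28 Fashion MNIST model, `trace_shapes(28, [64, 64])`, ends at 5x5x64, matching the 1600 flattened values seen earlier.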

In [14]:
# compile with a binary_crossentropy loss
model.compile(loss='binary_crossentropy',
  optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
  metrics=['accuracy']
)



In [15]:
!pip install pillow



In [16]:
# train by passing the train_generator created earlier
# (model.fit accepts generators directly; fit_generator is deprecated in TF 2.x)

history = model.fit(train_generator, epochs=15)



Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


### Validation 

Let's load in the validation dataset (from a separate zip) to validate the model

In [17]:
validation_url = "https://storage.googleapis.com/laurencemoroney-blog.appspot.com/validation-horse-or-human.zip"

validation_filename = "validation-horse-or-human.zip"
validation_dir= "horse-or-human/validation"
urllib.request.urlretrieve(validation_url, validation_filename)

zip_ref = zipfile.ZipFile(validation_filename, 'r')
zip_ref.extractall(validation_dir)
zip_ref.close()

In [18]:
validation_datagen = ImageDataGenerator(rescale=1/255)

validation_generator = validation_datagen.flow_from_directory(
  validation_dir,
  target_size=(300, 300),
  class_mode='binary'
)

Found 256 images belonging to 2 classes.


In [19]:
hist = model.fit(
  train_generator,
  epochs=15,
  validation_data=validation_generator
)



Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


Test out this model in Colab. Here's the [GitHub Link](http://bit.ly/horsehuman)

## Image Augmentation

The training images are CGI, yet the model performs decently well on real images. However, it still misclassifies images where humans or horses are positioned in ways the model hasn't seen yet.

Image augmentation transforms the training images so the model can better handle such cases. With the `ImageDataGenerator` you can perform many transforms:

```python

train_datagen = ImageDataGenerator(
  rescale=1./255,
  rotation_range=40,
  width_shift_range=0.2,
  height_shift_range=0.2,
  shear_range=0.2,
  zoom_range=0.2,
  horizontal_flip=True,
  fill_mode='nearest'
)
```

This covers the common transformations:

- Rotation (randomly up to 40 degrees left or right)
- Shifting horizontally (up to 20%)
- Shifting vertically (up to 20%)
- Shearing (by up to 20%)
- Zooming (by up to 20%)
- Flipping (randomly, horizontally)
- Filling in any missing pixels after a move or shear with nearest neighbors
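
To get a feel for two of these transforms, here is a toy sketch on a tiny grid of pixel values. These are hand-written stand-ins, not how `ImageDataGenerator` works internally: the generator picks a random amount for each image, while the hypothetical helpers below apply a fixed flip and a fixed shift.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels):
    """Shift every row right by `pixels` (assumed smaller than the width).

    The gap on the left is filled by repeating the nearest edge pixel,
    which is the idea behind fill_mode='nearest'.
    """
    shifted = []
    for row in image:
        kept = row[:len(row) - pixels]
        shifted.append([row[0]] * pixels + kept)
    return shifted


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
```

Each transformed copy keeps the same label as the original, so the model sees more varied examples of the same class without any new data being collected.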

## Transfer Learning


What if we took a large model with many more parameters, already trained on far more images, and fine-tuned it for our use case?


Let's try this out with Inception version 3, a large model trained on ImageNet.




In [6]:
from tensorflow.keras.applications.inception_v3 import InceptionV3

weights_url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5"

weights_file = "inception_v3.h5"
urllib.request.urlretrieve(weights_url, weights_file)

pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
  include_top=False,
  weights=None)

pre_trained_model.load_weights(weights_file)

In [7]:
pre_trained_model.summary()

Model: "inception_v3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 150, 150, 3) 0                                            
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 74, 74, 32)   864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 74, 74, 32)   96          conv2d_4[0][0]                   
__________________________________________________________________________________________________
activation (Activation)         (None, 74, 74, 32)   0           batch_normalization[0][0]        
_______________________________________________________________________________________

In [8]:
# freeze the model from retraining and crop the network

for layer in pre_trained_model.layers:
  layer.trainable = False

last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape: ', last_layer.output_shape)
last_output = last_layer.output

last layer output shape:  (None, 7, 7, 768)


In [9]:
# let's add our dense layers underneath

# Flatten the output layer to 1 dimension
x = tf.keras.layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = tf.keras.layers.Dense(1024, activation='relu')(x)
# Add a final sigmoid layer for classification
x = tf.keras.layers.Dense(1, activation='sigmoid')(x)

In [11]:
model = tf.keras.Model(pre_trained_model.input, x)

model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.0001),
  loss='binary_crossentropy',
  metrics=['acc'])
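
Because every Inception layer is frozen, only the new dense layers are trained. A quick sketch of how many parameters that is, based on the printed `mixed7` output shape; the helper name is made up for illustration.

```python
def dense_params(inputs, units):
    """One weight per input per neuron, plus one bias per neuron."""
    return inputs * units + units


# Output shape of the 'mixed7' layer, as printed above (batch dimension dropped).
mixed7_shape = (7, 7, 768)

# Flatten turns the 7 x 7 x 768 block into one long vector.
flattened = mixed7_shape[0] * mixed7_shape[1] * mixed7_shape[2]

hidden_layer = dense_params(flattened, 1024)  # Dense(1024, relu)
output_layer = dense_params(1024, 1)          # Dense(1, sigmoid)

trainable_params = hidden_layer + output_layer
print("flattened inputs:", flattened)
print("trainable parameters:", trainable_params)
```

Almost all of the trainable weights sit in the first dense layer, since it connects every flattened value to each of its 1,024 neurons.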

## Multiclass Classification


Let's see how to set up multiclass classification with a categorical loss.

First let's load the image dataset, build a model, and fit it.

In [12]:
!wget --no-check-certificate \
  https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip \
    -O /tmp/rps.zip

--2021-09-25 15:34:34--  https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.165.144, 142.250.65.208, 142.250.64.112, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.165.144|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 200682221 (191M) [application/zip]
Saving to: ‘/tmp/rps.zip’


2021-09-25 15:34:39 (45.4 MB/s) - ‘/tmp/rps.zip’ saved [200682221/200682221]



In [16]:
local_zip = '/tmp/rps.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp/')
zip_ref.close()

TRAINING_DIR = "/tmp/rps/"
training_datagen = ImageDataGenerator(
  rescale=1./255,
  rotation_range=40,
  width_shift_range=0.2,
  height_shift_range=0.2,
  shear_range=0.2,
  zoom_range=0.2,
  horizontal_flip=True,
  fill_mode='nearest'
)

In [17]:
# Set class mode to categorical to support more than two labels

train_generator = training_datagen.flow_from_directory(TRAINING_DIR, 
  target_size=(150, 150), 
  class_mode='categorical')

Found 2520 images belonging to 3 classes.
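
With `class_mode='categorical'` each label is delivered as a one-hot vector instead of a single 0 or 1, which pairs with a softmax output and a categorical cross-entropy loss. A minimal plain-Python sketch of those three pieces; the helper names and the example scores are invented for illustration.

```python
import math


def one_hot(index, num_classes):
    """Encode a class index as a vector with a single 1."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(target, probabilities):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p)
                for t, p in zip(target, probabilities) if t > 0)


target = one_hot(0, 3)                    # true class is index 0 of 3
probabilities = softmax([2.0, 1.0, 0.1])  # hypothetical raw model outputs
loss = categorical_crossentropy(target, probabilities)
```

The loss only looks at the probability given to the correct class, so it drops toward 0 as the model grows more confident in the right answer.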


In [19]:
model = tf.keras.models.Sequential([
  # first convolution
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
    input_shape=(150, 150, 3)),
  tf.keras.layers.MaxPooling2D(2, 2),
  # second convolution
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  # third convolution
  tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  # fourth
  tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  # flatten results for DNN
  tf.keras.layers.Flatten(),
  # 512 neuron hidden layer
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(3, activation='softmax')

])

In [20]:
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

In [22]:
history = model.fit(train_generator, epochs=25, verbose=1)

Epoch 1/25
Epoch 2/25

KeyboardInterrupt: 


## Dropout Regularization


Overfitting often happens when training NNs for many epochs.

This is because neurons can become specialized, and eventually the entire network becomes overspecialized as neighboring neurons in the hidden layers end up with similar weights and biases.


Removing a random number of neurons while training prevents neighboring neurons from settling on similar weights and biases and therefore becoming overspecialized. This is called dropout, and it helps to improve generalization.

`tf.keras.layers.Dropout(0.2)`
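
A rough sketch of what a dropout rate of 0.2 does to a layer's outputs during training, written in plain Python. The function name and the seeded random generator are assumptions for illustration. Like the Keras `Dropout` layer, the sketch scales the surviving values by 1 / (1 - rate) so their expected sum is unchanged; at inference time nothing is dropped.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)     # seeded so the sketch is repeatable
activations = [1.0] * 1000  # pretend outputs of a hidden layer

dropped = dropout(activations, rate=0.2, rng=rng)

# Roughly 20% of the neurons are silenced on this training step.
zero_fraction = dropped.count(0.0) / len(dropped)
print("fraction dropped:", zero_fraction)
```

A different random subset is dropped on every training step, so no neuron can rely on a particular neighbor always being present.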