# Building a CNN from Scratch - Lab

## Introduction

Now that you have background knowledge of how CNNs work and how to build them using Keras, it's time to practice those skills a little more independently by building a CNN on your own to solve an image recognition problem. In this lab, you'll practice building an image classifier from start to finish using a CNN.

## Objectives

In this lab you will: 

- Load images from a hierarchical file structure using an image data generator 
- Apply data augmentation to image files before training a neural network 
- Build a CNN using Keras 
- Visualize and evaluate the performance of CNN models 

## Loading the Images

The data for this lab consists of lung X-ray images labeled for pneumonia. The original dataset is from Kaggle. We have downsampled this dataset to reduce training time when you design and fit your model. ⏰ Training is expected to take approximately one hour on a standard machine, although times will vary depending on your particular computer and setup. At the end of this lab, you are welcome to try training on the complete dataset and observe the impact on the model's overall accuracy. 

You can find the initial downsampled dataset in a subdirectory, **chest_xray**, of this repository. 

In [45]:
# Import libraries
import os
import time
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from PIL import Image
from scipy import ndimage
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

np.random.seed(123)

In [46]:
# Establish image directories
train_dir = 'chest_xray_downsampled/train/'
validation_dir = 'chest_xray_downsampled/val/'
test_dir = 'chest_xray_downsampled/test/' 

In [55]:
# Rescale all pixel values by 1/255 (from the range [0, 255] to [0, 1])
train_gen = ImageDataGenerator(rescale=1./255)
validation_gen = ImageDataGenerator(rescale=1./255)

#Flow the data in from the directories
train_generator = train_gen.flow_from_directory(train_dir,
                                                target_size=(150, 150),
                                                batch_size=20,
                                                class_mode='binary')
validation_generator = validation_gen.flow_from_directory(validation_dir,
                                                          target_size=(150,150),
                                                          batch_size=20,
                                                          class_mode='binary')


Found 1738 images belonging to 2 classes.
Found 24 images belonging to 2 classes.
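
The image counts printed above, together with `batch_size=20`, determine how many batches make up one full pass over each directory. A quick sketch of that arithmetic follows; the helper name `batches_per_epoch` is ours for illustration and is not part of Keras.

```python
import math


def batches_per_epoch(n_images, batch_size):
    """Batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(n_images / batch_size)


train_steps = batches_per_epoch(1738, 20)  # training images found above
val_steps = batches_per_epoch(24, 20)      # validation images found above
print("training batches per epoch:", train_steps)
print("validation batches per epoch:", val_steps)
```

These counts are useful later if you choose to pass explicit step counts when fitting the model.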


## Designing the Model

Now it's time to design your CNN using Keras! Remember a few things when doing this: 

- You should alternate convolutional and pooling layers
- Later layers should have a larger number of filters (and therefore parameters) in order to detect more abstract patterns
- Add some final dense layers to add a classifier to the convolutional base 
- Compile this model 

In [60]:
# Your code here; design and compile the model
from keras import models, layers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
# input_shape is only needed on the first layer; later layers infer it
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
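
To see why the convolutional base shrinks the image before the dense layers, it can help to trace the shapes by hand. The sketch below assumes the Keras defaults used in the cell above (3x3 kernels with no padding and stride 1, and 2x2 max pooling); the helper names are ours. The totals should agree with what `model.summary()` reports for this architecture.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


size, channels, total = 150, 3, 0
for filters in (32, 64, 128, 128):
    total += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels
total += flat * 512 + 512  # Dense(512): weights + biases
total += 512 * 1 + 1       # Dense(1): weights + bias
print("flattened features:", flat)
print("total trainable parameters:", total)
```

Most of the parameters sit in the first dense layer, which is one reason the pooling layers matter: each one cuts the number of flattened features by roughly a factor of four.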

In [62]:
from keras import optimizers
# Note: newer Keras releases name this argument `learning_rate` instead of `lr`
model.compile(optimizer=optimizers.RMSprop(lr=1e-4),
              loss='binary_crossentropy',
              metrics=['acc'])

## Training and Evaluating the Model

Remember that training deep networks is resource intensive: depending on the size of the data, even a CNN with 3-4 successive convolutional and pooling layers is apt to take hours to train on a high-end laptop. Using 30 epochs and 8 layers (alternating between convolutional and pooling), our model took about 40 minutes to run on a year-old MacBook Pro.


If you are concerned with runtime, you may want to set your model to run the training epochs overnight.  

**If you are going to run this process overnight, be sure to also write the code for the data augmentation section below. Check your code twice (or more), then use "Run All" (or an equivalent) so that both models train overnight.** 

In [None]:
# Set the model to train 
# ⏰ This cell may take several minutes to run 
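
One possible way to fill in this cell is sketched below. It assumes a Keras version whose `model.fit` accepts generators directly (older releases used `model.fit_generator` for this) and uses the 30 epochs mentioned above. The wrapper name `train_model` is ours, not part of Keras.

```python
import math


def train_model(model, train_generator, validation_generator, epochs=30):
    """Fit `model` on the two generators and return the Keras History object."""
    # One epoch = enough batches to see every image in the directory once
    steps_per_epoch = math.ceil(
        train_generator.samples / train_generator.batch_size)
    validation_steps = math.ceil(
        validation_generator.samples / validation_generator.batch_size)
    return model.fit(
        train_generator,
        steps_per_epoch=steps_per_epoch,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=validation_steps,
    )


# In the notebook, with the objects defined in the cells above:
# history = train_model(model, train_generator, validation_generator)
```

Keeping the returned `history` object around makes it possible to plot the training curves afterwards.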


In [None]:
# Plot history
import matplotlib.pyplot as plt
%matplotlib inline
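
A minimal plotting sketch follows. It assumes the training call returned a Keras `History` object whose `.history` dictionary has the keys `acc`, `val_acc`, `loss`, and `val_loss` (the key names follow the `metrics=['acc']` setting used at compile time and may differ in your setup). The function names and the example numbers are ours and purely illustrative.

```python
def best_epoch(history, metric="val_acc"):
    """Return (1-based epoch, value) at which `metric` was highest."""
    values = history[metric]
    idx = max(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


def plot_history(history):
    """Plot training vs. validation accuracy and loss from a history dict."""
    # Imported here so that best_epoch can be used without matplotlib
    import matplotlib.pyplot as plt

    epochs = range(1, len(history["acc"]) + 1)
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(12, 4))
    ax_acc.plot(epochs, history["acc"], label="training")
    ax_acc.plot(epochs, history["val_acc"], label="validation")
    ax_acc.set(title="Accuracy", xlabel="epoch")
    ax_acc.legend()
    ax_loss.plot(epochs, history["loss"], label="training")
    ax_loss.plot(epochs, history["val_loss"], label="validation")
    ax_loss.set(title="Loss", xlabel="epoch")
    ax_loss.legend()
    return fig


# Made-up numbers, only to show the expected structure of `history.history`
example = {"acc": [0.74, 0.85, 0.90], "val_acc": [0.62, 0.81, 0.75],
           "loss": [0.55, 0.36, 0.27], "val_loss": [0.70, 0.45, 0.52]}
print(best_epoch(example))

# In the notebook: plot_history(history.history)
```

A widening gap between the training and validation curves is the usual sign of overfitting, which is what the data augmentation section later in the lab tries to reduce.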


## Save the Model

In [None]:
# Your code here; save the model for future reference 
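
A small sketch of one way to do this, relying on the `save` method of a Keras model. The directory and file names below are hypothetical, and the `.h5` extension is an assumption (some newer Keras releases prefer a `.keras` file).

```python
import os


def save_model(model, directory="saved_models", name="chest_xray_cnn.h5"):
    """Save `model` under `directory` and return the path that was written."""
    os.makedirs(directory, exist_ok=True)  # create the folder if needed
    path = os.path.join(directory, name)
    model.save(path)
    return path


# In the notebook: save_model(model)
```

A model saved this way can typically be restored later with `keras.models.load_model(path)`, which avoids repeating a long training run.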

## Data Augmentation

Recall that data augmentation is almost always a necessary step when using a small dataset such as the one you have been provided. As such, if you haven't already, implement a data augmentation setup.

**Warning: ⏰ This process took nearly 4 hours to run on a relatively new MacBook Pro. As such, it is recommended that you simply code the setup and compare it to the solution branch, or set the process to run overnight if you do choose to actually run the code.** 

In [None]:
# Add data augmentation to the model setup and set the model to train; 
# See warnings above if you intend to run this block of code 
# ⏰ This cell may take several hours to run 
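
A hedged sketch of an augmentation setup is shown below. All of the transformation values are illustrative assumptions to tune, not recommendations from this lab, and the function name is ours. Only the training images are augmented; the validation images are just rescaled so that validation scores reflect unmodified data.

```python
# Illustrative settings; every value here is an assumption to experiment with
AUGMENTATION_KWARGS = dict(
    rescale=1. / 255,
    rotation_range=15,       # random rotations, in degrees
    width_shift_range=0.1,   # horizontal shift, as a fraction of width
    height_shift_range=0.1,  # vertical shift, as a fraction of height
    shear_range=0.1,
    zoom_range=0.1,
    fill_mode="nearest",     # how to fill pixels exposed by a transform
)


def make_augmented_generators(train_dir, validation_dir,
                              target_size=(150, 150), batch_size=20):
    """Return (train, validation) generators; only training data is augmented."""
    # Same import as the first cell; kept here so the settings can be
    # inspected without loading Keras
    from keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(**AUGMENTATION_KWARGS)
    val_datagen = ImageDataGenerator(rescale=AUGMENTATION_KWARGS["rescale"])
    flow_kwargs = dict(target_size=target_size, batch_size=batch_size,
                       class_mode="binary")
    return (train_datagen.flow_from_directory(train_dir, **flow_kwargs),
            val_datagen.flow_from_directory(validation_dir, **flow_kwargs))


# In the notebook:
# train_generator, validation_generator = make_augmented_generators(
#     train_dir, validation_dir)
```

The resulting generators can be passed to the same training call used for the first model, which makes the two runs directly comparable.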


Save the model for future reference.  

In [None]:
# Save the model 


## Final Evaluation

Now use the test set to perform a final evaluation on your model of choice. 

In [None]:
# Your code here 
# Perform a final evaluation using the test set
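
In the notebook, the final evaluation would typically flow the images from `test_dir` through a rescaling generator (as was done for the training and validation sets) and hand that generator to the model's `evaluate` or `predict` method. The sketch below covers only the last step: turning predicted probabilities into summary metrics. The function name, the 0.5 threshold, and the example values are assumptions made for illustration.

```python
def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, and recall for binary labels and probabilities."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "confusion": {"tp": tp, "fp": fp, "tn": tn, "fn": fn},
    }


# Made-up labels and probabilities, only to show the call
y_true = [1, 1, 0, 0, 1]
y_prob = [0.9, 0.4, 0.2, 0.7, 0.8]
metrics = binary_metrics(y_true, y_prob)
print(metrics)
```

For a medical screening task such as this one, recall on the pneumonia class is often worth reporting alongside accuracy, since a missed case is usually costlier than a false alarm.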

## Level Up (Optional): Adding More Data to the Model

As discussed, the dataset we worked with is a subset of a dataset hosted on Kaggle. Increasing the amount of data used to train the model is likely to improve performance, but it will also lengthen training times and be more resource intensive.

⏰ It is estimated that training on the full dataset will take approximately 4 hours (and potentially significantly longer) depending on your computer's specifications.

In order to test the impact of training on the full dataset, start by downloading the data from Kaggle here: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia.   

In [None]:
# Optional extension; Your code here
# ⏰ This cell may take several hours to run 

## Summary

Well done! In this lab, you practiced building your own CNN for image recognition, which drastically outperformed our previous attempts using a standard deep learning model alone. In the upcoming sections, we'll continue to investigate further techniques associated with CNNs, including visualizing the representations they learn and ways to further bolster their performance when training data is limited, as it is here.

In [47]:
#Collect images from their respective directories!
# train_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#         train_dir,
#         target_size=(150, 150), batch_size=200)
# val_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#         validation_dir,
#         target_size=(150, 150), batch_size=200)
# test_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#         test_dir,
#         target_size=(150, 150), batch_size=500)

Found 1738 images belonging to 2 classes.
Found 24 images belonging to 2 classes.
Found 188 images belonging to 2 classes.


In [48]:
#Bring in the data!
# train_images, train_labels = next(train_generator)
# val_images, val_labels = next(val_generator)
# test_images, test_labels = next(test_generator)

In [49]:
#Explore the dataset some more
# m_train = train_images.shape[0]
# m_val = val_images.shape[0]
# m_test = test_images.shape[0]

# print('Number of Training Images:', m_train)
# print('train_images shape:', train_images.shape)
# print('train_labels shape:', train_labels.shape)

# print('\nNumber of Val Images:', m_val)
# print('val_images shape:', val_images.shape)
# print('val_labels shape:', val_labels.shape)

# print('\nNumber of Test Images:', m_test)
# print('test_images shape:', test_images.shape)
# print('test_labels shape:', test_labels.shape)

Number of Training Images: 200
train_images shape: (200, 150, 150, 3)
train_labels shape: (200, 2)

Number of Val Images: 24
val_images shape: (24, 150, 150, 3)
val_labels shape: (24, 2)

Number of Test Images: 188
test_images shape: (188, 150, 150, 3)
test_labels shape: (188, 2)


In [50]:
#Reshape our image data for modeling
# train_img = train_images.reshape(train_images.shape[0], -1)
# val_img = val_images.reshape(val_images.shape[0], -1)
# test_img = test_images.reshape(test_images.shape[0], -1)

#Check shapes
# print('train_img shape:', train_img.shape)
# print('val_img shape:', val_img.shape)
# print('test_img shape:', test_img.shape)

train_img shape: (200, 67500)
val_img shape: (24, 67500)
test_img shape: (188, 67500)


In [54]:
#Reshape our label data for modeling
# train_y = np.reshape(train_labels[:,0], (200, 1))
# val_y = np.reshape(val_labels[:,0], (24, 1))
# test_y = np.reshape(test_labels[:,0], (188, 1))

# print('train_y shape:', train_y.shape)
# print('val_y shape:', val_y.shape)
# print('test_y shape:', test_y.shape)

train_y shape: (200, 1)
val_y shape: (24, 1)
test_y shape: (188, 1)


In [None]:
#Build baseline fully connected model
# from keras import models
# from keras import layers

# model = models.Sequential()
# model.add(layers.Dense())