<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 3 Assignment 2*
# Convolutional Neural Networks (CNNs)

# Assignment

- <a href="#p1">Part 1:</a> Pre-Trained Model
- <a href="#p2">Part 2:</a> Custom CNN Model
- <a href="#p3">Part 3:</a> CNN with Data Augmentation


You will apply three different CNN models to a binary image classification problem using Keras. Classify images of mountains (`./data/mountain/*`) and images of forests (`./data/forest/*`). Treat mountains as the positive class (1) and forests as the negative class (0).

|Mountain (+)|Forest (-)|
|---|---|
|![](./data/mountain/art1131.jpg)|![](./data/forest/cdmc317.jpg)|

The problem is relatively difficult because the sample is tiny: there are about 350 observations per class. This is the kind of sample size you might expect when prototyping an image classification solution at work. Get accustomed to evaluating several different candidate models.

# Pre-Trained Model
<a id="p1"></a>

Load a pretrained network from Keras, [ResNet50](https://tfhub.dev/google/imagenet/resnet_v1_50/classification/1) - a 50 layer deep network trained to recognize [1000 objects](https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt). Starting usage:

```python
import numpy as np

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model # This is the functional API

resnet = ResNet50(weights='imagenet', include_top=False)

```

Setting `include_top=False` in `ResNet50` removes the fully connected layers from the top of the ResNet model. The next step is to turn off training for the ResNet layers: we want to use the learned parameters without updating them in future training passes.

```python
for layer in resnet.layers:
    layer.trainable = False
```
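
To see what freezing does, here is a minimal sketch that uses a tiny, hypothetical stand-in for ResNet50 (two small convolutional layers, made up for illustration) so it runs quickly without downloading weights. After freezing, only the weights of the newly added head should remain trainable.

```python
from tensorflow.keras import layers
from tensorflow.keras.models import Model

# Hypothetical tiny "pretrained" base standing in for ResNet50 (illustration only)
inputs = layers.Input(shape=(32, 32, 3))
features = layers.Conv2D(4, (3, 3), activation='relu')(inputs)
features = layers.Conv2D(8, (3, 3), activation='relu')(features)
base = Model(inputs, features)

# Freeze every layer of the base, as in the loop above
for layer in base.layers:
    layer.trainable = False

# Add a new head; these layers stay trainable
x = layers.GlobalAveragePooling2D()(base.output)
predictions = layers.Dense(1, activation='sigmoid')(x)
toy_model = Model(base.input, predictions)

# Each Conv2D and Dense layer holds a kernel and a bias
print(len(toy_model.trainable_weights))      # 2 (Dense kernel + bias)
print(len(toy_model.non_trainable_weights))  # 4 (two frozen Conv2D layers)
```

Running the same two `print` calls on your real model is a quick way to confirm the ResNet layers are frozen before you call `fit`.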

Using the Keras functional API, we need to add new fully connected layers to our model. When we removed the top layers, we removed all of the previous fully connected layers; in other words, we kept only the feature-extraction portion of the network. You can experiment with additional layers beyond what's listed here. The `GlobalAveragePooling2D` layer works like a really fancy flatten: it averages each feature map produced by the last convolutional layer (each of which is still two-dimensional), leaving one value per channel.

```python
x = resnet.output
x = GlobalAveragePooling2D()(x) # This layer is a really fancy flatten
x = Dense(1024, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(resnet.input, predictions)
```
```
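
The "fancy flatten" can be reproduced in plain NumPy. This is a sketch of the idea rather than the Keras implementation: the random array below is a hypothetical stand-in for the last convolutional output, using the `(batch, 8, 8, 2048)` shape that ResNet50 should produce for 256x256 inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the last conv output: (batch, height, width, channels)
feature_maps = rng.normal(size=(2, 8, 8, 2048))

# Global average pooling: average over the two spatial axes, one value per channel
pooled = feature_maps.mean(axis=(1, 2))

# A plain flatten keeps every spatial position instead
flattened = feature_maps.reshape(feature_maps.shape[0], -1)

print(pooled.shape)     # (2, 2048)
print(flattened.shape)  # (2, 131072)
```

The pooled representation is 64 times smaller than the flattened one, so the `Dense` layer that follows needs far fewer weights. That helps with a sample this small.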

Your assignment is to apply the transfer learning approach above to classify images of mountains (`./data/mountain/*`) and images of forests (`./data/forest/*`). Treat mountains as the positive class (1) and forests as the negative class (0).

Steps to complete assignment: 
1. Load the image data into NumPy arrays (`X`)
2. Create a `y` for the labels
3. Train your model with pretrained layers from resnet
4. Report your model's accuracy

## Load in Data

![skimage-logo](https://scikit-image.org/_static/img/logo.png)

Check out [`skimage`](https://scikit-image.org/) for useful functions related to processing the images. In particular, check out the documentation for `skimage.io.imread_collection` and `skimage.transform.resize`.
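
If your images do not all share one size, they cannot be stacked into a single 4-D array. A minimal sketch of bringing them to a common shape with `skimage.transform.resize` follows; the random arrays are hypothetical stand-ins for images read from disk, and the 256x256 target is an assumption chosen to match this dataset.

```python
import numpy as np
from skimage.transform import resize

rng = np.random.default_rng(0)

# Hypothetical stand-ins for images of different sizes read from disk
raw_images = [
    rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8),
    rng.integers(0, 256, size=(200, 320, 3), dtype=np.uint8),
]

# Resize only the spatial dimensions; the channel axis is kept
target_shape = (256, 256)
resized = [resize(img, target_shape, anti_aliasing=True) for img in raw_images]

# Stack into one (n_images, height, width, channels) array
X_demo = np.stack(resized)
print(X_demo.shape)  # (2, 256, 256, 3)
print(X_demo.dtype)
```

Note that `resize` returns floating-point images by default, so check the value range of the result before mixing resized and unresized arrays.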

In [1]:
# data imports
import numpy as np
from skimage import io
from sklearn.model_selection import train_test_split

In [2]:
# read in each data folder, turn them into numpy arrays
mountains = io.imread_collection('./data/mountain/*.jpg')
forests = io.imread_collection('./data/forest/*.jpg')

forests_array = np.asarray(forests)
mountains_array = np.asarray(mountains)

In [3]:
# get forests shape
forests_array.shape

(328, 256, 256, 3)

In [4]:
# create labels for forests
y_forests = np.zeros((forests_array.shape[0],1))
y_forests.shape

(328, 1)

In [5]:
# get mountains shape
mountains_array.shape

(374, 256, 256, 3)

In [6]:
# create labels for mountains
y_mountains = np.ones((mountains_array.shape[0],1))
y_mountains.shape

(374, 1)

In [7]:
# concatenate images. check shape.
X = np.concatenate((forests_array, mountains_array))
X.shape

(702, 256, 256, 3)

In [8]:
# concatenate labels, check shape.
y = np.concatenate((y_forests, y_mountains))
y.shape

(702, 1)

In [9]:
# split into a training and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y, random_state=42)
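
The `stratify=y` argument keeps the class balance roughly the same in both splits. Here is a small sketch that checks this using only the class counts reported above (328 forests, 374 mountains); the feature array is a placeholder, not real image data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Labels with the same class counts as the dataset above
y_demo = np.concatenate([np.zeros(328), np.ones(374)])
X_placeholder = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder features

_, _, y_tr, y_te = train_test_split(
    X_placeholder, y_demo, test_size=0.1, stratify=y_demo, random_state=42
)

print(len(y_tr), len(y_te))  # 631 71
# Share of mountains overall, in train, and in test should be close
print(y_demo.mean(), y_tr.mean(), y_te.mean())
```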

## Instantiate Model

In [10]:
# model imports
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Conv2D, Flatten, Dropout, AveragePooling2D, MaxPooling2D
from tensorflow.keras.models import Model, Sequential

In [25]:
# check our gpu
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 8149124108141569387
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3181130547
locality {
  bus_id: 1
  links {
  }
}
incarnation: 9749605802428794662
physical_device_desc: "device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0"
]


In [12]:
# instantiate resnet
resnet = ResNet50(weights='imagenet', include_top=False)

In [13]:
# don't train resnet layers
for layer in resnet.layers:
    layer.trainable = False

In [14]:
# wrap pooling, dense layer, and output around resnet
das_model = resnet.output
das_model = GlobalAveragePooling2D()(das_model) # This layer is a really fancy flatten
das_model = Dense(1024, activation='relu')(das_model)
predictions = Dense(1, activation='sigmoid')(das_model)
model = Model(resnet.input, predictions)

In [15]:
# compile
model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['accuracy'])

## Fit Model

In [16]:
# Go fast
model.fit(X_train, y_train, batch_size=32, epochs=10, validation_data=(X_test, y_test))

Train on 631 samples, validate on 71 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x18173944438>

In [17]:
# test accuracy

test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)

71/71 - 1s - loss: 1.2182e-07 - accuracy: 1.0000
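
The accuracy reported by `evaluate` can be reproduced by thresholding the sigmoid outputs yourself. The sketch below uses made-up probabilities and labels (not results from this model) and assumes the usual 0.5 cut-off.

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels (illustration only)
probs = np.array([[0.92], [0.08], [0.61], [0.35], [0.49]])
labels = np.array([[1.0], [0.0], [0.0], [0.0], [1.0]])

# Predict the positive class (mountain) when the probability is at least 0.5
preds = (probs >= 0.5).astype(int)

accuracy = (preds == labels).mean()
print(accuracy)  # 0.6
```

On the real model, `probs` would come from `model.predict(X_test)`. With only 71 test images, a single misclassified image moves the accuracy by about 1.4 percentage points.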


# Custom CNN Model

In this step, write and train your own convolutional neural network using Keras. You can use any architecture that suits you as long as it has at least one convolutional and one pooling layer at the beginning of the network - you can add more if you want. 

In [40]:
# custom model

model = Sequential()
model.add(Conv2D(128, (3,3), activation='relu', input_shape=(256,256,3)))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(rate=0.3))
model.add(Conv2D(128, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(rate=0.3))
model.add(Conv2D(128, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(rate=0.2))
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(rate=0.2))
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(rate=0.2))
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_21 (Conv2D)           (None, 254, 254, 128)     3584      
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 127, 127, 128)     0         
_________________________________________________________________
dropout_16 (Dropout)         (None, 127, 127, 128)     0         
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 125, 125, 128)     147584    
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 62, 62, 128)       0         
_________________________________________________________________
dropout_17 (Dropout)         (None, 62, 62, 128)       0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 60, 60, 128)      
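
The output shapes and parameter counts in the summary above can be checked by hand. This sketch assumes what the model code uses: 3x3 kernels with no padding, stride 1, and 2x2 max pooling. The helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


# Walk the first two conv/pool pairs and the third convolution
shapes = []
size = 256
for _ in range(2):
    size = conv_out(size)
    shapes.append(size)
    size = pool_out(size)
    shapes.append(size)
shapes.append(conv_out(size))

print(shapes)                    # [254, 127, 125, 62, 60]
print(conv_params(3, 3, 128))    # 3584
print(conv_params(3, 128, 128))  # 147584
```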

In [41]:
# Compile Model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

In [42]:
# go fast
model.fit(X_train, y_train, batch_size=16, epochs=10, validation_data=(X_test, y_test))

Train on 631 samples, validate on 71 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x18231e5c198>

In [37]:
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)

71/71 - 1s - loss: 0.2277 - accuracy: 0.8873


# Custom CNN Model with Image Manipulations
## *This is a stretch goal, and it's relatively difficult*

To simulate a larger sample of images, you can apply image manipulation techniques: cropping, rotation, stretching, etc. Luckily, Keras has some handy functions for applying these techniques to our mountain and forest example. Check out these resources to help you get started:

1. [Keras `ImageDataGenerator` Class](https://keras.io/preprocessing/image/#imagedatagenerator-class)
2. [Building a powerful image classifier with very little data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html)
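
To get a feel for what augmentation does, here is a minimal NumPy sketch of one manipulation, a horizontal flip. It illustrates the idea only and is not how Keras implements it; the random batch is a hypothetical stand-in for real images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 RGB images with labels (illustration only)
X_batch = rng.integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)
y_batch = np.array([[0], [1], [1], [0]])

# Flip left-to-right by reversing the width axis; the label does not change
flipped = X_batch[:, :, ::-1, :]

# Original plus flipped copies doubles the number of training examples
X_aug = np.concatenate([X_batch, flipped])
y_aug = np.concatenate([y_batch, y_batch])

print(X_aug.shape, y_aug.shape)  # (8, 256, 256, 3) (8, 1)
```

A mirrored mountain is still a mountain, so this flip is a safe choice here. A vertical flip would be harder to justify for landscape photos.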
 

In [38]:
# State Code for Image Manipulation Here

from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [39]:
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

In [43]:
datagen.fit(X_train)
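
Calling `datagen.fit` is needed because `featurewise_center` and `featurewise_std_normalization` rely on statistics computed from the training data. The sketch below shows the idea in NumPy, assuming per-channel mean and standard deviation. It is an approximation of the behaviour, not the Keras source, and the arrays are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for training and test images
train_images = rng.uniform(0, 255, size=(16, 32, 32, 3))
test_images = rng.uniform(0, 255, size=(4, 32, 32, 3))

# Statistics come from the training set only, one value per channel
mean = train_images.mean(axis=(0, 1, 2), keepdims=True)
std = train_images.std(axis=(0, 1, 2), keepdims=True)


def standardize(images):
    """Center and scale with the training-set statistics."""
    return (images - mean) / (std + 1e-6)


train_std = standardize(train_images)
test_std = standardize(test_images)  # same statistics, no refitting

print(train_std.mean(axis=(0, 1, 2)))  # close to 0 for each channel
print(train_std.std(axis=(0, 1, 2)))   # close to 1 for each channel
```

One thing to watch for: arrays passed directly as `validation_data` do not go through the generator, so they are not standardized unless you do it yourself, for example with `datagen.standardize` or a second `datagen.flow`.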

In [48]:
# custom model

model_gen = Sequential()
model_gen.add(Conv2D(128, (3,3), activation='relu', input_shape=(256,256,3)))
model_gen.add(MaxPooling2D((2,2)))
model_gen.add(Dropout(rate=0.3))
model_gen.add(Conv2D(128, (3,3), activation='relu'))
model_gen.add(MaxPooling2D((2,2)))
model_gen.add(Dropout(rate=0.3))
model_gen.add(Conv2D(128, (3,3), activation='relu'))
model_gen.add(MaxPooling2D((2,2)))
model_gen.add(Dropout(rate=0.2))
model_gen.add(Conv2D(64, (3,3), activation='relu'))
model_gen.add(MaxPooling2D((2,2)))
model_gen.add(Dropout(rate=0.2))
model_gen.add(Conv2D(64, (3,3), activation='relu'))
model_gen.add(MaxPooling2D((2,2)))
model_gen.add(Dropout(rate=0.2))
model_gen.add(Conv2D(64, (3,3), activation='relu'))
model_gen.add(Flatten())
model_gen.add(Dense(64, activation='relu'))
model_gen.add(Dense(1, activation='sigmoid'))

model_gen.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_33 (Conv2D)           (None, 254, 254, 128)     3584      
_________________________________________________________________
max_pooling2d_26 (MaxPooling (None, 127, 127, 128)     0         
_________________________________________________________________
dropout_26 (Dropout)         (None, 127, 127, 128)     0         
_________________________________________________________________
conv2d_34 (Conv2D)           (None, 125, 125, 128)     147584    
_________________________________________________________________
max_pooling2d_27 (MaxPooling (None, 62, 62, 128)       0         
_________________________________________________________________
dropout_27 (Dropout)         (None, 62, 62, 128)       0         
_________________________________________________________________
conv2d_35 (Conv2D)           (None, 60, 60, 128)      

In [50]:
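model_gen.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])  # model_gen must be compiled before training
model_gen.fit_generator(datagen.flow(X_train, y_train, batch_size=16), epochs=10, validation_data=(X_test, y_test))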

Train for 40 steps, validate on 71 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x182344c2470>

# Resources and Stretch Goals

Stretch goals
- Enhance your code to use classes/functions and accept terms to search and classes to look for in recognizing the downloaded images (e.g. download images of parties, recognize all that contain balloons)
- Check out [other available pretrained networks](https://tfhub.dev), try some and compare
- Image recognition/classification is somewhat solved, but identifying *relationships* between entities and describing an image are not - check out some of the extended resources (e.g. [Visual Genome](https://visualgenome.org/)) on the topic
- Transfer learning - using images you source yourself, [retrain a classifier](https://www.tensorflow.org/hub/tutorials/image_retraining) with a new category
- (Not CNN related) Use [piexif](https://pypi.org/project/piexif/) to check out the metadata of images passed in to your system - see if they're from a national park! (Note - many images lack GPS metadata, so this won't work in most cases, but still cool)

Resources
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - influential paper (introduced ResNet)
- [YOLO: Real-Time Object Detection](https://pjreddie.com/darknet/yolo/) - an influential convolution based object detection system, focused on inference speed (for applications to e.g. self driving vehicles)
- [R-CNN, Fast R-CNN, Faster R-CNN, YOLO](https://towardsdatascience.com/r-cnn-fast-r-cnn-faster-r-cnn-yolo-object-detection-algorithms-36d53571365e) - comparison of object detection systems
- [Common Objects in Context](http://cocodataset.org/) - a large-scale object detection, segmentation, and captioning dataset
- [Visual Genome](https://visualgenome.org/) - a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language