<div class="alert alert-danger">
    <p><strong>NOTE:</strong></p>
    <p>Before you submit this assignment, <strong>make sure everything runs as expected</strong>:</p>
    <ol>
        <li><strong>restart the kernel</strong> (in the menubar, select <strong>Kernel → Restart</strong>)</li>
        <li><strong>run all cells</strong> (in the menubar, select <strong>Cell → Run All</strong>)</li>
    </ol>
    <p>Make sure to complete every cell that states "<strong><TT>YOUR CODE HERE</TT></strong>" or "<strong><TT>YOUR ANSWER HERE</TT></strong>".</p>
</div>

---

<div class="alert alert-info">
    <h3 style="text-decoration:underline;">Version 2: Changes</h3>
    <p>Please view the blue bubbles (similar to the one encapsulating this text) with the heading <strong>Update</strong> for revised instructions, clarifications, or added details.</p>
</div>

---


# Coding Assignment #2: Version 2

This assignment involves **classification**. It extends **Coding Assignment #1**, which involved **regression**.

You will classify a set of images using simple implementations of classifiers (linear, SVM, and decision tree classifiers) and an ensemble of the three classifiers. Each image will consist of either a cat or a dog. The classifier must correctly label each image as either an image of a *cat* or as an image of a *dog*.
![alt text](https://storage.googleapis.com/tfds-data/visualization/fig/cats_vs_dogs-4.0.0.png "Dogs & Cats Image")

We will then create a more diverse and obfuscated set of test images to evaluate the robustness of the models. Such an obfuscated image could appear as follows:

![alt text](https://www.tensorflow.org/tutorials/generative/deepdream_files/output_tEfd00rr0j8Z_0.png "DeepDreamed Image")

# Preliminaries & Dependencies

You will need to install the following **Python** packages to complete this assignment (you likely already have most of these libraries installed from previous assignments, quizzes, etc.):

    pip install matplotlib numpy
    pip install scikit-learn
    pip install tensorflow tensorflow-datasets

# Overview

We want to determine how much a model's performance changes when it is evaluated on data that has been altered or obfuscated to make the task more difficult.

The theme of this assignment is *composition* ("*the principle of progressive disclosure of complexity*", to borrow from [Keras](https://keras.io/about)). We will extend and build upon the ideas we have previously explored.


# Objectives

Access a *real-world* dataset (e.g., used in kaggle.com competitions, etc.) rather than a *toy* dataset.

Generate data from existing, actual data (contrast this to generating synthetic data).

Download and implement already trained models.

Evaluate the robustness of a model by testing it on data that is designed to be more *complex* than the training data.

Develop and evaluate ensembles of models.

# Acquire Data

We will use the [Cats and Dogs dataset](https://www.tensorflow.org/datasets/catalog/cats_vs_dogs) (size: 786.7 MB).
The dataset consists of images (the input) and labels, either *cat* or *dog* (the output).
Information on the dataset is [here](https://www.microsoft.com/en-us/download/details.aspx?id=54765).

Our **goal** is to predict which of two labels (*cat* or *dog*) to apply to an image. This is a binary classification task (i.e., *two categories, two labels, two classes*, etc.).

Downloading a dataset (example taken from https://www.tensorflow.org/tutorials/images/data_augmentation#apply_augmentation_to_a_dataset, where the tutorial loads the `tf_flowers` dataset):

In [None]:
import tensorflow_datasets as tfds

(train_datasets, val_ds, test_ds), metadata = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)

f, y = (features, output)


# Transforming Data

We will essentially **augment** our dataset. Augmenting data "*increases the diversity of your training set by applying random (but realistic) transformations such as image rotation*" (https://www.tensorflow.org/tutorials/images/data_augmentation).

In our case we are *not* augmenting the data to collect more diverse **training data** but to *evaluate* the models on more diverse/complex **testing data**. Thus we will **only** transform the **test data**.



## Examples Of Data Augmentation

We will be doing something similar to:
https://www.tensorflow.org/datasets/catalog/moving_mnist
where the MNIST handwritten digit image dataset was transformed into a video of animated/moving handwritten digits (click *Display Examples* to view the videos).

Augmenting a dataset of flower images:
https://www.tensorflow.org/tutorials/images/data_augmentation#using_tfimage
which uses warping and colour transformations on flower pictures.

## Example Image Transformations

<div class="alert alert-danger">
    <h4>NOTE</h4>
    This resize_and_rescale function example is provided for your information. We will <strong>not</strong> be using the code in this example.
</div>

The code below (taken from **TensorFlow**'s explanation for [transforming an image dataset](https://www.tensorflow.org/tutorials/images/data_augmentation#apply_augmentation_to_a_dataset)) shows how simple the image processing step is.


Create a function that resizes and rescales the images (so as to "*unify the size and scale of images in the dataset*"):

In [None]:
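import tensorflow as tf

IMG_SIZE = 180  # assumed target size (the TensorFlow tutorial uses 180); not defined in the original example

def resize_and_rescale(image, label):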
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    image = (image / 255.0)
    return image, label

<div class="alert alert-danger">
    <h4>NOTE</h4>
    This example (the augment function) is provided for your information. We will <strong>not</strong> be using the code in this example.
</div>

Create an *augment* function that applies random transformations to an image:

In [None]:
def augment(image_label, seed):
    image, label = image_label
    image, label = resize_and_rescale(image, label)
    image = tf.image.resize_with_crop_or_pad(image, IMG_SIZE + 6, IMG_SIZE + 6)
    
    # Make a new seed
    new_seed = tf.random.experimental.stateless_split(seed, num=1)[0, :]
    
    # Random crop back to the original size
    image = tf.image.stateless_random_crop(image, size=[IMG_SIZE, IMG_SIZE, 3], seed=seed)
    
    # Random brightness
    image = tf.image.stateless_random_brightness(image, max_delta=0.5, seed=new_seed)
    
    image = tf.clip_by_value(image, 0, 1)
    
    return image, label

# Stylistic Transformation

We will transform the image data *stylistically* by processing images with [**DeepDream**](https://www.tensorflow.org/tutorials/generative/deepdream).

For example, this image of a labrador:

![alt text](https://www.tensorflow.org/tutorials/generative/deepdream_files/output_Y5BPgc8NNbG0_0.png "Original Image")

transforms into the following image after being processed by **DeepDream**:

![alt text](https://www.tensorflow.org/tutorials/generative/deepdream_files/output_tEfd00rr0j8Z_0.png "DeepDreamed Image")

A description and tutorial implementation of **DeepDream** (aka **Inceptionism**) is [here](https://www.tensorflow.org/tutorials/generative/deepdream). Another type of transformation we could have applied to the images is [Style Transfer](https://www.tensorflow.org/tutorials/generative/style_transfer).

<div class="alert alert-danger">
    <h4>NOTE</h4>
    We are <strong>not</strong> concerned with the details of <strong>DeepDream</strong>. We will use it as an off-the-shelf component in our system to augment our image dataset for evaluating how a model performs on the <strong>DeepDreamed</strong> images.
</div>

<div class="alert alert-success">
    <h4>Code To Execute Begins Here</h4>
</div>

## Imports

Import the following libraries.

In [None]:
import tensorflow as tf
import numpy as np
import matplotlib as mpl
import IPython.display as display
import PIL.Image
from tensorflow.keras.preprocessing import image

## Image Preparation

**DeepDream**'s image preparation (code is from https://www.tensorflow.org/tutorials/generative/deepdream#choose_an_image_to_dream-ify):


In [None]:
# Download an image and read it into a NumPy array.
def download(url, max_dim=None):
    name = url.split('/')[-1]
    image_path = tf.keras.utils.get_file(name, origin=url)
    img = PIL.Image.open(image_path)
    if max_dim:
        img.thumbnail((max_dim, max_dim))
    return np.array(img)

# Normalize an image
def deprocess(img):
    img = 255*(img + 1.0)/2.0
    return tf.cast(img, tf.uint8)

# Display an image
def show(img):
    display.display(PIL.Image.fromarray(np.array(img)))

# image we will process
url = 'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg'

# Downsizing the image makes it easier to work with.
original_img = download(url, max_dim=500)
show(original_img)
display.display(display.HTML('Image cc-by: <a href="https://commons.wikimedia.org/wiki/File:Felis_catus-cat_on_snow.jpg">Von.grzanka</a>'))

## Download Pre-Trained Model

Download the [pre-trained model **InceptionV3**](https://keras.io/api/applications/inceptionv3) (size: **92 MB**).

FYI other pre-trained models are available here:
https://keras.io/api/applications/#available-models

In [None]:
base_model = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet')

## Prepare Feature Extraction Model

We do not need to be concerned with the details (we will use the code as it is, *off-the-shelf*), but feel free to play with the **layers** to see what effect they have on an image.

The explanation of the following code is provided FYI and is taken from https://www.tensorflow.org/tutorials/generative/deepdream#prepare_the_feature_extraction_model.

> ...the layers of interest are those where the convolutions are concatenated. There are 11 of these layers in InceptionV3, named 'mixed0' through 'mixed10'. Using different layers will result in different dream-like images. Deeper layers respond to higher-level features (such as eyes and faces), while earlier layers respond to simpler features (such as edges, shapes, and textures). Feel free to experiment with the layers selected below, but keep in mind that deeper layers (those with a higher index) will take longer to train on since the gradient computation is deeper.

> The complexity of the features incorporated depends on layers chosen by you, i.e, lower layers produce strokes or simple patterns, while deeper layers give sophisticated features in images, or even whole objects.



In [None]:
# Maximize the activations of these layers
names = ['mixed3', 'mixed5']
layers = [base_model.get_layer(name).output for name in names]

# Create the feature extraction model
dream_model = tf.keras.Model(inputs=base_model.input, outputs=layers)

## Calculate Loss

The following is from https://www.tensorflow.org/tutorials/generative/deepdream#calculate_loss.
Again, we aren't concerned with the details and will use the code as it is.

> The **loss** is the sum of the activations in the chosen layers. The loss is normalized at each layer so the contribution from larger layers does not outweigh smaller layers. Normally, *loss is a quantity you wish to minimize via gradient descent*. In **DeepDream**, you will *maximize this loss via gradient ascent*.

In [None]:
def calc_loss(img, model):
    # Pass forward the image through the model to retrieve the activations.
    # Converts the image into a batch of size 1.
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]

    losses = []
    for act in layer_activations:
        loss = tf.math.reduce_mean(act)
        losses.append(loss)

    return tf.reduce_sum(losses)

## Gradient Ascent

The following is from https://www.tensorflow.org/tutorials/generative/deepdream#gradient_ascent.
Again, we aren't concerned with the details and will use the code as it is.

> After the loss for the chosen layers is calculated, calculate the gradients with respect to the image and add them to the original image.
Adding the gradients to the image **enhances the patterns seen by the neural network**. At each step, we create an image that **increasingly excites the activations of certain layers** in the network.

> The method that does this is wrapped in a `tf.function` for performance. It uses an `input_signature` to ensure that the function is not retraced for different image sizes or `steps/step_size` values.
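
As a rough illustration of the idea in the quote (not the DeepDream code itself), the sketch below runs gradient ascent on a toy one-dimensional function. The function `f`, its hand-written gradient, and the step size are assumptions chosen only for this illustration. Each step moves `x` in the direction that increases `f`, much as DeepDream moves the image in the direction that increases the layer activations.

```python
# Toy gradient ascent (illustrative only; not part of the DeepDream code).
# We maximize f(x) = -(x - 3)^2, whose maximum is at x = 3.

def f(x):
    return -(x - 3.0) ** 2

def grad_f(x):
    # Derivative of f with respect to x.
    return -2.0 * (x - 3.0)

def gradient_ascent(x0, steps=100, step_size=0.1):
    x = x0
    for _ in range(steps):
        # Ascent: ADD the gradient (gradient descent would subtract it).
        x = x + step_size * grad_f(x)
    return x

x_final = gradient_ascent(x0=0.0)
print(round(x_final, 4))  # approaches 3.0, where f is largest
```

In the DeepDream class the same update appears as `img = img + gradients*step_size`, with the image pixels playing the role of `x`.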

In [None]:
class DeepDream(tf.Module):
    def __init__(self, model):
        self.model = model

    @tf.function(
        input_signature=(
            tf.TensorSpec(shape=[None,None,3], dtype=tf.float32),
            tf.TensorSpec(shape=[], dtype=tf.int32),
            tf.TensorSpec(shape=[], dtype=tf.float32),)
    )
    def __call__(self, img, steps, step_size):
        print("Tracing")
        loss = tf.constant(0.0)
        for n in tf.range(steps):
            with tf.GradientTape() as tape:
                # This needs gradients relative to `img`
                # `GradientTape` only watches `tf.Variable`s by default
                tape.watch(img)
                loss = calc_loss(img, self.model)

            # Calculate the gradient of the loss with respect to the pixels of the input image.
            gradients = tape.gradient(loss, img)

            # Normalize the gradients.
            gradients /= tf.math.reduce_std(gradients) + 1e-8 

            # In gradient ascent, the "loss" is maximized so that the input image increasingly "excites" the layers.
            # You can update the image by directly adding the gradients (because they're the same shape!)
            img = img + gradients*step_size
            img = tf.clip_by_value(img, -1, 1)

        return loss, img

deepdream = DeepDream(dream_model)

## DeepDream: Main Loop

From https://www.tensorflow.org/tutorials/generative/deepdream#main_loop.
Again, we are not concerned with the details and will use the code as it is.

In [None]:
def run_deep_dream_simple(img, steps=100, step_size=0.01):
    # Convert from uint8 to the range expected by the model.
    img = tf.keras.applications.inception_v3.preprocess_input(img)
    img = tf.convert_to_tensor(img)
    
    step_size = tf.convert_to_tensor(step_size)
    steps_remaining = steps
    step = 0
    
    while steps_remaining:
        if steps_remaining>100:
            run_steps = tf.constant(100)
        else:
            run_steps = tf.constant(steps_remaining)
        steps_remaining -= run_steps
        step += run_steps

        loss, img = deepdream(img, run_steps, tf.constant(step_size))

        display.clear_output(wait=True)
        show(deprocess(img))
        print ("Step {}, loss {}".format(step, loss))

    result = deprocess(img)
    display.clear_output(wait=True)
    show(result)

    return result

Process an image and view the result. This should process the labrador image mentioned earlier and display the **DeepDreamed** image to the screen.

In [None]:
dream_img = run_deep_dream_simple(img=original_img, steps=100, step_size=0.01)

<div class="alert alert-success">
    <h2>The assignment to be submitted begins here.</h2>
</div>


# Generate New Image Data Via DeepDream

<div class="alert alert-info">
    <h3 style="text-decoration:underline;">UPDATE</h3>
    <p>The <strong>Cats & Dogs</strong> dataset consists of 20,000 images, which is too many to process with <strong>DeepDream</strong> in a reasonable time (training the simpler classification models on all of them would also take too long).<br>
    Possible approach: use only 1,000 images from the original dataset (800 training, 200 testing).</p>
    <p>One method to read the <strong>Cats & Dogs</strong> data is by using the <strong>Tensorflow-Datasets</strong> module:</p>
</div>

In [None]:
#import tensorflow_datasets as tfds

# (train_dataset, test_dataset), metadata = tfds.load(
#     'cats_vs_dogs',
#     split=['train[:80%]', 'train[80%:]'],
#     with_info=True,
#     as_supervised=True,
# )

<div class="alert alert-info">
    <h3 style="text-decoration:underline;">UPDATE</h3>
    <p>Load <strong>Cats & Dogs</strong> dataset, reshape/resize, and save to a new file.</p>
    <p><strong>Keras</strong> provides image preprocessing tools we can use.</p>
</div>

In [None]:
from os import listdir
from numpy import asarray
from numpy import save
from tensorflow import keras
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array

# location of downloaded Dogs vs. Cats image dataset on my computer
#folder = '/Volumes/WDBook/dogs-vs-cats/train/'

# subset of original dataset consisting of 1,000 images
folder = "/Users/cyrus/tensorflow_datasets/cats_vs_dogs/train1000/"
# subset of original dataset consisting of 10,000 images
#folder = '/Volumes/WDBook/dogs-vs-cats/train-10000/'

<div class="alert alert-info">
    <h4 style="text-decoration:underline;">UPDATE</h4>
    <p>Because the images in the dataset have many different dimensions, we will need to resize the images so they are all 200x200 pixels.<br>
    This resizing distorts (e.g., stretches or squashes) every image that is not already square so that it conforms to 200x200 pixels.</p>
    <p></p>
    <p>Sample code to resize/reshape images:</p>
</div>

In [None]:
# from keras.preprocessing.image import load_img
# resized_photo = load_img("/path/to/original_image_file", target_size=(200, 200))

<div class="alert alert-info">
    <h4 style="text-decoration:underline;">UPDATE</h4>
    Convert the resized image to a NumPy array:
</div>

In [None]:
# resized_photo = img_to_array(resized_photo)

<div class="alert alert-info">
    <h4 style="text-decoration:underline;">UPDATE</h4>
    <p>After resizing the images, calculate the amount of memory (RAM) the computer will require to process the entire <strong>Cats & Dogs</strong> image dataset.</p>
    <h4 style="text-decoration:underline;">Size Of Dataset With 20,000 Images</h4>
    <p>Using all <strong>20,000 images</strong> of the original dataset:<br>
    20,000 images <strong>x</strong> 200 x 200 pixels x 3 colour channels per image<br>
     = 2,400,000,000 32-bit values<br>
     = 76,800,000,000 bits<br>
     = 9,600,000,000 bytes (conversion: 8 bits in 1 byte)<br>
     = <strong>9.6 GB</strong></p>
    <p><br></p>
    <h4 style="text-decoration:underline;">Size Of Dataset With 1,000 Images</h4>
    <p>Using only <strong>1,000 resized images</strong> from the original dataset:<br>
    1,000 images <strong>x</strong> 200 x 200 pixels x 3 colour channels per image<br>
     = 120,000,000 32-bit values = 3,840,000,000 bits = 480,000,000 bytes<br>
     = <strong>0.48 GB</strong></p>
    <p><br></p>
    <h4 style="text-decoration:underline;">Size Of Dataset With 10,000 Images</h4>
    <p>Using <strong>10,000 resized images</strong> from the original dataset:<br>
    10,000 images <strong>x</strong> 200 x 200 pixels x 3 colour channels per image<br>
     = 1,200,000,000 32-bit values = 38,400,000,000 bits = 4,800,000,000 bytes<br>
     = <strong>4.8 GB</strong></p>
</div>
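
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `dataset_bytes` is made up for this illustration, and it assumes every value is stored in 4 bytes (32 bits), as in the calculation above.

```python
# Estimate the RAM needed to hold an image dataset as one array.
# Assumption: every value is stored as a 32-bit (4-byte) number.

def dataset_bytes(num_images, height=200, width=200, channels=3, bytes_per_value=4):
    return num_images * height * width * channels * bytes_per_value

for n in (1_000, 10_000, 20_000):
    gigabytes = dataset_bytes(n) / 1e9  # decimal gigabytes (1 GB = 10^9 bytes)
    print(f"{n:>6,} images: {gigabytes:.2f} GB")
```

Changing `height`, `width`, or `num_images` shows quickly how large a subset will fit in the RAM you have available.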

<div class="alert alert-danger">
    <h3 style="text-decoration:underline;">NOTE</h3>
    <p>If the dataset is larger than the amount of available <strong>RAM</strong>, then JupyterLab will display a message similar to:<br>
    "The kernel for A2.ipynb appears to have died. It will restart automatically."</p>
    <p></p>
    <p>Successful execution of the code below will display something similar to:<br>
    (1000, 200, 200, 3) (1000,)</p>
</div>

In [None]:
photos, labels = list(), list()

# process every file in the folder
for file in listdir(folder):
    # determine label of image from filename (cat = 1, dog = 0)
    output = 0.0
    if file.startswith('cat'):
        output = 1.0

    # load the image, resizing it to 200x200 pixels
    photo = load_img(folder + file, target_size=(200, 200))

    # convert image to a NumPy array
    photo = img_to_array(photo)

    # store converted image & its corresponding output label
    photos.append(photo)
    labels.append(output)

# convert the lists of images and labels to NumPy arrays
photos = asarray(photos)
labels = asarray(labels)

print(photos.shape, labels.shape)

# save resized photos to avoid having to repeat the above process
save('/Users/cyrus/tensorflow_datasets/cats_vs_dogs/dogs_vs_cats_photos.npy', photos)
save('/Users/cyrus/tensorflow_datasets/cats_vs_dogs/dogs_vs_cats_labels.npy', labels)

<div class="alert alert-info">
    <h3 style="text-decoration:underline;">UPDATE</h3>
    Load the saved reshaped images then confirm their shape (i.e., dimension).
</div>

In [None]:
from numpy import load

photos = load('/Users/cyrus/tensorflow_datasets/cats_vs_dogs/dogs_vs_cats_photos.npy')
labels = load('/Users/cyrus/tensorflow_datasets/cats_vs_dogs/dogs_vs_cats_labels.npy')

print(photos.shape, labels.shape)

### Analysis Of Dataset BONUS (5 Marks)

Provide empirical information about the Cats & Dogs dataset.

<div class="alert alert-success">
    <h4>BONUS</h4>
    Write code to provide empirical information about the <strong>Cats & Dogs</strong> dataset. Use visual elements where possible.
</div>

In [None]:
# YOUR CODE HERE
import IPython.display as display
from PIL import Image
cat = Image.open(r'C:\Users\cyrus\tensorflow_datasets\cats_vs_dogs\train1000\cat(1).jpg')
dog = Image.open(r'C:\Users\cyrus\tensorflow_datasets\cats_vs_dogs\train1000\dog(1).jpg')
display.display(cat)
display.display(dog)

photos.shape
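
One simple piece of empirical information is the class balance. The sketch below counts labels derived from filenames using the same `startswith('cat')` rule as the loading code. The filenames listed are hypothetical stand-ins, not files from the actual dataset.

```python
from collections import Counter

# Hypothetical filenames standing in for the contents of the image folder.
filenames = ["cat(1).jpg", "cat(2).jpg", "dog(1).jpg", "dog(2).jpg", "dog(3).jpg"]

# Same labelling rule as the loading code: cat = 1.0, everything else (dog) = 0.0.
labels_demo = [1.0 if name.startswith("cat") else 0.0 for name in filenames]

counts = Counter(labels_demo)
total = len(labels_demo)
for label, name in ((1.0, "cat"), (0.0, "dog")):
    print(f"{name}: {counts[label]} images ({counts[label] / total:.0%})")
```

With the real `labels` array, `np.unique(labels, return_counts=True)` gives the same counts, which can then be shown as a bar chart.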

### Create Testing & Training Datasets (5 Marks)

Separate the **Cats & Dogs** dataset into a test set and a training set.

<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code to separate the data into a test set and a training set below.
</div>

In [None]:
# YOUR CODE HERE
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(photos, labels, test_size=0.33, random_state=42) 

### Process Testing Dataset Via DeepDream (10 Marks)

Create a **DeepDreamed Test Set** by processing the original test set via **DeepDream**. Display a few of the resulting **DeepDreamed** images.

<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code that processes images via <strong>DeepDream</strong> below.
</div>

In [None]:
# YOUR CODE HERE
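import copy
deepDreamTest = []

# Work on a copy so that the original test images are left untouched
# (the preprocessing inside run_deep_dream_simple may modify float arrays in place).
X_testCopy = copy.deepcopy(X_test)
for test_image in X_testCopy:
    newimg = run_deep_dream_simple(img=test_image, steps=100, step_size=0.01)
    deepDreamTest.append(newimg)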

### Style Transfer BONUS (10 Marks)

<div class="alert alert-success">
    <p>Process the test set using <a href="https://www.tensorflow.org/tutorials/generative/style_transfer">Style Transfer</a>.</p>
    <p>Display a few images processed using <strong>Style Transfer</strong>.</p>
</div>

In [None]:
# YOUR CODE HERE
for img in deepDreamTest[:5]:
    show(img)

# Evaluating Individual Models (10 Marks)

The following classification models are to be evaluated (default parameters can be used in both models):
* [SVM classifier](https://scikit-learn.org/stable/modules/svm.html#classification) 
* [decision tree classifier](https://scikit-learn.org/stable/modules/tree.html#classification)


Models will be *trained* on the **training data**.

Two *separate evaluations* will be performed:
* models will be evaluated on the **original test data**
* models will be evaluated on the **transformed test data**


<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code below to create and evaluate the classifiers.
</div>

<div class="alert alert-info">
    <h4>TIP</h4>
    <p>Classifiers take two arrays as input:<br>
    <strong>array X</strong> of shape (number_of_samples, number_of_features) containing the training samples' feature data<br>
    <strong>array y</strong> of class labels (strings or integers) of shape (number_of_samples)</p>
    <p></p>
    <p>print(photos.shape, labels.shape)</br>
    num_samples = labels.shape[0]</br>
    x = np.reshape(photos, (num_samples, -1))</p>
</div>

In [None]:
# Flatten each RGB image into a 1-D feature vector for input to a classifier model
print(X_train.shape, y_train.shape)
num_samples = y_train.shape[0]
test_samples = y_test.shape[0]
X_trainShaped = np.reshape(X_train, (num_samples, -1))
deepDream_testShaped = np.reshape(deepDreamTest, (test_samples, -1))
X_testShaped = np.reshape(X_test, (test_samples, -1))

#### SVM Classifier

In [None]:
# SVM classifier
from sklearn.svm import SVC
from sklearn import svm

SVM = svm.SVC()
# Train model with training data
SVM.fit(X_trainShaped, y_train)

In [None]:
# Evaluating using original test data
from sklearn.model_selection import cross_val_score
from sklearn import metrics

y_pred = SVM.predict(X_testShaped)
SVMOrigACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
SVMOrigACC

In [None]:
from sklearn.metrics import confusion_matrix

SVMOrigConf = confusion_matrix(y_test, y_pred)

In [None]:
# Evaluating using transformed test data
y_pred = SVM.predict(deepDream_testShaped)
SVMTransACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
SVMTransACC

In [None]:
SVMTransConf = confusion_matrix(y_test, y_pred)

In [None]:
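# Cross-validate on the combined (DeepDreamed + original) test data.
# np.concatenate stacks the samples; `+` on NumPy arrays would add them element-wise.
SVMScores = cross_val_score(SVM, np.concatenate([deepDream_testShaped, X_testShaped]), np.concatenate([y_test, y_test]))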
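
A note on combining datasets: for NumPy arrays the `+` operator adds values element by element; it does not append one array to another the way it does for Python lists. The toy arrays below (invented for this illustration) show the difference, and `np.concatenate` is one way to stack two sets of samples.

```python
import numpy as np

# Two tiny "datasets" with 2 samples and 3 features each (made-up values).
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[10, 20, 30], [40, 50, 60]])
y = np.array([0.0, 1.0])

added = a + b                      # element-wise sum: still 2 samples
stacked = np.concatenate([a, b])   # joined along axis 0: 4 samples

print(added.shape, stacked.shape)  # (2, 3) (4, 3)
print(y + y)                       # [0. 2.]  (labels are changed, not repeated)
print(np.concatenate([y, y]))      # [0. 1. 0. 1.]
```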

#### Decision tree classifier

In [None]:
# Decision tree classifier
from sklearn import tree

DTC = tree.DecisionTreeClassifier()
# Train model with training data
DTC.fit(X_trainShaped, y_train)

In [None]:
# Evaluating using original test data
y_pred = DTC.predict(X_testShaped)
DTCOrigACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
DTCOrigACC

In [None]:
DTCOrigConf = confusion_matrix(y_test, y_pred)

In [None]:
# Evaluating using transformed test data
y_pred = DTC.predict(deepDream_testShaped)
DTCTransACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
DTCTransACC

In [None]:
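# Cross-validate on the combined (DeepDreamed + original) test data.
# np.concatenate stacks the samples; `+` on NumPy arrays would add them element-wise.
DTCScores = cross_val_score(DTC, np.concatenate([deepDream_testShaped, X_testShaped]), np.concatenate([y_test, y_test]))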

In [None]:
DTCTransConf = confusion_matrix(y_test, y_pred)

In [None]:
# Visualizing the tree
tree.plot_tree(DTC)

# Create Ensemble Models

View the lecture videos for a *brief* explanation of ensemble models. Scikit-learn provides a [concise explanation of ensembles](https://scikit-learn.org/stable/modules/ensemble.html#ensemble).

### Random Forest (5 Marks)

The **scikit-learn** implementation of **Random Forest** "*combines classifiers by averaging their probabilistic prediction, instead of letting each classifier vote for a single class*".

Code examples for implementing a **Random Forest Classifier** are [here](https://scikit-learn.org/stable/modules/ensemble.html#forest).
API information on **Random Forest Classifiers** is [here](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier).

Using **scikit-learn**, create a **Random Forest** classifier with the following parameters (model parameters not specified can be left at their default values):
* number of estimators = 100



<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code to create a Random Forest classifier ensemble below.
</div>

In [None]:
# YOUR CODE HERE
from sklearn.ensemble import RandomForestClassifier
RFC = RandomForestClassifier(n_estimators=100)
RFC.fit(X_trainShaped, y_train)

### Voting Ensemble (10 Marks)

A **Voting Ensemble** classifier is used to "*combine conceptually different machine learning classifiers and use a majority vote (hard vote) or the average predicted probabilities (soft vote) to predict the class labels*".

Using **scikit-learn**, create two **Voting Ensemble Classifiers** (*hard voting* and *soft voting*) consisting of the models created in the previous sections (**Evaluating Individual Models**, etc.):
* SVM classifier
* decision tree classifier
* Random Forest classifier

Use the default parameters for both *hard voting* and *soft voting* classifiers.

Code examples for implementing a **Voting Ensemble Classifier** are [here](https://scikit-learn.org/stable/modules/ensemble.html#voting-classifier).
API information on **Voting Ensemble Classifier** is [here](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html#sklearn.ensemble.VotingClassifier).
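
The difference between the two voting schemes can be seen on a single sample. The sketch below is plain Python, not the scikit-learn implementation, and the three probabilities are made-up values for three hypothetical classifiers. It shows a case where hard and soft voting disagree.

```python
# Hard vs. soft voting for ONE sample and three hypothetical classifiers.
# Each number is a classifier's predicted probability that the image is class 1.
prob_class1 = [0.45, 0.40, 0.95]

# Hard voting: each classifier casts one vote for its most likely class.
votes = [1 if p >= 0.5 else 0 for p in prob_class1]
hard_prediction = 1 if sum(votes) > len(votes) / 2 else 0

# Soft voting: average the probabilities, then pick the most likely class.
mean_prob = sum(prob_class1) / len(prob_class1)
soft_prediction = 1 if mean_prob >= 0.5 else 0

print(votes, hard_prediction)                # [0, 0, 1] 0
print(round(mean_prob, 2), soft_prediction)  # 0.6 1
```

Soft voting needs every model to output probabilities, which is why an `SVC` has to be created with `probability=True` before it can take part in a soft-voting ensemble.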

<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code to create a Voting Ensemble classifier below.
</div>

In [None]:
# Hard voting
from sklearn.ensemble import VotingClassifier

VCHard = VotingClassifier(estimators=[('SVM', SVM), ('Tree', DTC), ('Forest', RFC)], 
                          voting='hard')
VCHard.fit(X_trainShaped, y_train)

In [None]:
# Setting up new SVM to adjust for soft voting
SVMSoft = svm.SVC(probability=True)
SVMSoft.fit(X_trainShaped, y_train)

In [None]:
# Soft voting
VCSoft = VotingClassifier(estimators=[('SVM', SVMSoft), ('Tree', DTC), ('Forest', RFC)], 
                          voting='soft')
VCSoft.fit(X_trainShaped, y_train)

# Evaluate Ensembles (10 Marks)

Models will be *trained* on the **training data**.

Two *separate evaluations* will be performed:
* ensemble model will be evaluated on the **original test data**
* ensemble model will be evaluated on the **transformed test data**

<div class="alert alert-danger">
    <h4>WRITE CODE</h4>
    Write the code below to evaluate the ensembles on the test data.
</div>

#### Random Forest evaluation

In [None]:
# Evaluating using original test data
y_pred = RFC.predict(X_testShaped)
RFCOrigACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
RFCOrigACC

In [None]:
RFCOrigConf = confusion_matrix(y_test, y_pred)

In [None]:
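# Cross-validate on the combined (DeepDreamed + original) test data.
# np.concatenate stacks the samples; `+` on NumPy arrays would add them element-wise.
RFCScores = cross_val_score(RFC, np.concatenate([deepDream_testShaped, X_testShaped]), np.concatenate([y_test, y_test]))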

In [None]:
# Evaluating using transformed test data
y_pred = RFC.predict(deepDream_testShaped)
RFCTransACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
RFCTransACC

In [None]:
RFCTransConf = confusion_matrix(y_test, y_pred)

#### Hard voting evaluation

In [None]:
# Evaluating using original test data
y_pred = VCHard.predict(X_testShaped)
VCHardOrigACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
VCHardOrigACC

In [None]:
VCHardOrigConf = confusion_matrix(y_test, y_pred)

In [None]:
# Evaluating using transformed test data
y_pred = VCHard.predict(deepDream_testShaped)
VCHardTransACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
VCHardTransACC

In [None]:
VCHardTransConf = confusion_matrix(y_test, y_pred)

In [None]:
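# Cross-validate on the combined (DeepDreamed + original) test data.
# np.concatenate stacks the samples; `+` on NumPy arrays would add them element-wise.
VCHardScores = cross_val_score(VCHard, np.concatenate([deepDream_testShaped, X_testShaped]), np.concatenate([y_test, y_test]))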

#### Soft voting evaluation

In [None]:
# Evaluating using original test data
y_pred = VCSoft.predict(X_testShaped)
VCSoftOrigACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
VCSoftOrigACC

In [None]:
VCSoftOrigConf = confusion_matrix(y_test, y_pred)

In [None]:
# Evaluating using transformed test data
y_pred = VCSoft.predict(deepDream_testShaped)
VCSoftTransACC = metrics.accuracy_score(y_test, y_pred)

In [None]:
VCSoftTransACC

In [None]:
VCSoftTransConf = confusion_matrix(y_test, y_pred)

In [None]:
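# Cross-validate on the combined (DeepDreamed + original) test data.
# np.concatenate stacks the samples; `+` on NumPy arrays would add them element-wise.
VCSoftScores = cross_val_score(VCSoft, np.concatenate([deepDream_testShaped, X_testShaped]), np.concatenate([y_test, y_test]))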

# Discussion Of Results (40 Marks)

Discuss the results similar to **Coding Assignment #1** (using charts and graphs where possible).
Compare the performance of the various models.
Was the performance what you expected?

Which system performed best? Why?
Which system had the worst performance? Why?

Provide some ideas to try that could improve the performance of the models (i.e., *Future Work*).

<div class="alert alert-danger">
    <h4>DISCUSSION</h4>
    <p>Provide your <strong>Discussion Of Results</strong> below. Use visual elements where possible.</p>
    <p>Feel free to include both <em>Markdown</em> cells and <em>Code</em> cells where necessary.</p>
</div>

### BEGIN ANSWER

### END ANSWER

With each model used, performance was measured through three different metrics: accuracy, cross validation, and the confusion matrix.<br>
Accuracy was used since the dataset is known to have the exact same number of cat pictures and dog pictures. When the data is split into training and testing sets, the probability of having extremely skewed or biased sets is very low. Having balanced datasets is required to ensure the accuracy metric is reliable. <br>
Cross validation was used for a couple of different reasons. One of the biggest is to use all of the available data to evaluate the model instead of just a single train/test split. The original cats_vs_dogs dataset contains 20,000 photos, but to save time and resources this assignment only used 1002 of them. Another reason is to detect overfitting. Since cross validation trains and tests on every part of the data rather than on one fixed split, an overfitted model will show up through much lower scores. <br>
A confusion matrix was used to visualize exactly how many predictions were correct, how many were incorrect, and which types of errors were committed. Having this visual helps with tuning the models' parameters to adjust for those errors. <br>
<br>
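As a rough illustration of how the three metrics are obtained with scikit-learn, here is a minimal sketch. It uses synthetic data from `make_classification` as a stand-in for the flattened images and a decision tree as a placeholder model, so the names `X`, `y`, and `clf` are assumptions and not the variables used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (assumption: stands in for the flattened cat/dog images).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)            # fraction of correct predictions
conf = confusion_matrix(y_test, y_pred)         # rows: true label, columns: predicted label
cv_scores = cross_val_score(clf, X, y, cv=10)   # one accuracy score per fold

print("accuracy:", acc)
print("confusion matrix:\n", conf)
print("10-fold mean and std:", cv_scores.mean(), cv_scores.std())
```

The accuracy equals the sum of the confusion matrix diagonal divided by the number of test samples, which is why the two metrics tell a consistent story on balanced data.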
To begin with the results, the accuracy scores on the original test data and on the transformed test data were almost the same for each model. Most of the time, the accuracy on the original test data was slightly higher than on the transformed images. This is to be expected since transformed images were not part of the training set, so the transformed data acts like completely new images that are not very similar to regular cats and dogs.
<br>
The accuracy scores of the individual models, the SVM classifier and the decision tree classifier, are shown in the graph below.

In [None]:
import matplotlib.pyplot as plt
# Bar chart of test accuracy for the individual models (original vs. transformed test data)
fig = plt.figure()
ax = fig.add_axes([0,0,2,1])
models = ['SVM Original', 'SVM Transformed', 'Decision Tree Original', 'Decision Tree Transformed']
accuracy = [SVMOrigACC, SVMTransACC, DTCOrigACC, DTCTransACC]
ax.bar(models,accuracy)
ax.set_ylabel('Accuracy')
plt.show()

SVM on the original data has the highest score, but it beats the other models only by a small margin. Both decision tree predictions share the lowest score, as the difference between them is less than 0.004. I expected the performance on the original data to be much higher than on its transformed counterpart. This would make sense because each model was trained and fitted on the original pictures without transformations, so it should predict other original pictures well and predict transformed pictures poorly. However, the accuracy scores say otherwise: all scores are almost the same.

The accuracy scores of the ensemble models (random forest, hard vote, and soft vote) are given below.

In [None]:
# Bar chart of test accuracy for the ensemble models (original vs. transformed test data)
fig = plt.figure()
ax = fig.add_axes([0,0,2,1])
models = ['Forest Original', 'Forest Transformed', 'Hard Vote Original', 'Hard Vote Transformed', 
          'Soft Vote Original' ,'Soft Vote Transformed']
accuracy = [RFCOrigACC, RFCTransACC, VCHardOrigACC, VCHardTransACC, VCSoftOrigACC, VCSoftTransACC]
ax.bar(models,accuracy)
ax.set_ylabel('Accuracy')
plt.show()

As expected, the hard vote model on the original data has the highest accuracy score, since it incorporates three different models (SVM, decision tree, and random forest) to create a prediction. Using default parameters is a clear handicap for soft vote: it averages the predicted class probabilities of the models, and because no weights were specified, every model's probabilities counted equally. As a result, hard voting comes out with the highest score. Even hard voting on the transformed data was more accurate than all other models except its original variant. Across all other models, the original data set scored slightly higher than the transformed data, a similar result to the individual models.
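To make the distinction between the two voting modes concrete, the following is a minimal sketch on synthetic data. Hard voting takes the majority of the predicted labels, while soft voting averages the predicted class probabilities, which is why the SVM needs `probability=True` in that case. The data and the helper `make_estimators` are hypothetical and only mirror the three base models named above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (assumption: stands in for the flattened images).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)


def make_estimators():
    """Hypothetical helper: fresh copies of the three base models."""
    return [
        ("svm", SVC(probability=True, random_state=0)),  # probabilities are needed for soft voting
        ("dtc", DecisionTreeClassifier(random_state=0)),
        ("rfc", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]


# Hard voting: majority vote over the predicted class labels.
hard = VotingClassifier(make_estimators(), voting="hard").fit(X_train, y_train)
# Soft voting: argmax of the (equally weighted) average of predicted probabilities.
soft = VotingClassifier(make_estimators(), voting="soft").fit(X_train, y_train)

hard_acc = hard.score(X_test, y_test)
soft_acc = soft.score(X_test, y_test)
soft_proba = soft.predict_proba(X_test)  # averaged class probabilities

print("hard vote accuracy:", hard_acc)
print("soft vote accuracy:", soft_acc)
```

Which mode scores higher depends on the data and on how well calibrated the base models' probabilities are, so neither result here should be read as general.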

The accuracy scores of all models can be seen in this graph.

In [None]:
# Bar chart of test accuracy for all models (O = original test data, T = transformed test data)
fig = plt.figure()
ax = fig.add_axes([0,0,2,1])
models = ['SVM O', 'SVM T', 'D Tree O', 'D Tree T',
          'Forest O', 'Forest T', 'H Vote O', 'H Vote T', 
          'S Vote O' , 'S Vote T']
accuracy = [SVMOrigACC, SVMTransACC, DTCOrigACC, DTCTransACC, RFCOrigACC, RFCTransACC, 
            VCHardOrigACC, VCHardTransACC, VCSoftOrigACC, VCSoftTransACC]
ax.bar(models, accuracy)
ax.set_ylabel('Accuracy')
plt.show()

Next we will look at the cross validation scores of all models. The main difference between the accuracy score and the cross validation score is how the score is obtained. 10-fold cross validation was used. The original data and the transformed data were combined, split, trained on, and then predicted, rather than training solely on original data. Essentially, all data was used together for training and testing rather than keeping the sets separate.
It is difficult to predict the difference between the cross validation scores and the accuracy scores. The scores can increase since transformed data is used in both training and testing, so the transformed photos are no longer completely foreign to each model. However, the scores can also decrease because these photos can skew the training data, since there is less consistency across the photos. The model may not have enough photos to learn from, especially since there are far fewer transformed photos than original photos.<br>
The 10-fold cross validation scores of all models can be seen in the graph below.

In [None]:
scores = [SVMScores, DTCScores, RFCScores, VCHardScores, VCSoftScores]
names = ['SVM', 'Decision Tree', 'Random Forest', 'Hard Vote', 'Soft Vote']
fig = plt.figure()
ax = fig.add_axes([0,0,2,1])
plt.boxplot(scores, labels=names, showmeans=True)
plt.show()

This box plot shows how similar the models are in terms of cross validation score, since all medians are close to one another. The highest mean, shown by the green triangle, appears to belong to both the SVM and hard vote classifiers. Random forest has the tightest spread, as shown by the smallest box, which means most of its scores were very similar, whereas the decision tree classifier has a large box and long whiskers.
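For reference, here is a minimal sketch of cross validation over a combined set of original and transformed samples. It assumes the data are NumPy arrays, in which case the sets must be stacked with `np.concatenate` because `+` would add the arrays element-wise. `cross_val_score` uses 5 folds by default in recent scikit-learn versions, so 10 folds have to be requested explicitly. The arrays `X_orig` and `X_trans` are hypothetical: the "transformed" data here is just the original plus noise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic "original" samples (assumption: stands in for the flattened images).
X_orig, y_orig = make_classification(n_samples=200, n_features=20, random_state=0)

# Hypothetical stand-in for the transformed images: originals plus Gaussian noise.
rng = np.random.default_rng(0)
X_trans = X_orig + rng.normal(scale=0.5, size=X_orig.shape)

# Stack the rows of both sets; the labels are repeated in the same order.
X_all = np.concatenate([X_orig, X_trans])
y_all = np.concatenate([y_orig, y_orig])

# "+" on arrays is element-wise addition, so the row count would NOT grow.
elementwise = X_orig + X_trans

# Explicit 10-fold stratified split with shuffling so both sets appear in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(), X_all, y_all, cv=cv)

print(X_all.shape, elementwise.shape)
print("fold scores:", scores)
```

Shuffling is used because the stacked array is ordered by source, and an unshuffled split could otherwise place mostly original or mostly transformed samples in a fold.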

In [None]:
print('SVM original\n',  SVMOrigConf)
print('SVM transformed\n',  SVMTransConf)
print('Decision tree original\n',  DTCOrigConf)
print('Decision tree transformed\n',  DTCTransConf)
print('RFC original\n',  RFCOrigConf)
print('RFC transformed\n',  RFCTransConf)
print('Hard vote original\n',  VCHardOrigConf)
print('Hard vote transformed\n',  VCHardTransConf)
print('Soft vote original\n',  VCSoftOrigConf)
print('Soft vote transformed\n',  VCSoftTransConf)


In these confusion matrices, the top left number and the bottom right number are the correct predictions, while the other numbers are the incorrect predictions. The hard vote classifier, on both original and transformed data, seems to have outperformed all other models, with SVM on original and transformed data trailing close behind. All other models performed about the same, reaching around 90 correct predictions for each label.
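The layout described above can be checked on a small hand-made example. This sketch assumes label 0 means cat and label 1 means dog, which is an illustrative assumption; the label arrays are made up and are not results from this notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels (assumption: 0 = cat, 1 = dog).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_hat = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 0])

conf = confusion_matrix(y_true, y_hat)
# For two classes, ravel() yields: true negatives, false positives,
# false negatives, true positives (rows are true labels, columns are predictions).
tn, fp, fn, tp = conf.ravel()

correct = tn + tp                  # top left + bottom right
accuracy = correct / conf.sum()
cat_recall = tn / (tn + fp)        # share of cats labelled as cats
dog_recall = tp / (tp + fn)        # share of dogs labelled as dogs

print(conf)
print(accuracy, cat_recall, dog_recall)
```

Comparing the two off-diagonal entries shows whether a model tends to mistake cats for dogs or the reverse, which the single accuracy number hides.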


Overall, the hard vote classifier performed the best according to all metrics presented above. This was expected since this classifier uses all the other models presented (except for its counterpart, soft voting) to balance the strengths and weaknesses of each model. The worst performing model was the decision tree, whose predictions on the original data set were actually worse than its predictions on the transformed data set.

There could be many ways to improve each prediction model. The easiest solution could be to increase the size of the data set so that models can be trained on more photos and make predictions on more photos. The drawback is the amount of time and resources this would require. Even with this data set, it already takes a long time to transform each photo, fit the models, and calculate the predictions. <br>
Another improvement would be to set some parameters for each model. The model that performed the worst, the decision tree, could benefit greatly from this. Limiting the depth of the tree may improve performance, since the tree would stop looking at specifics and focus more on larger parts of the photo. When attempting to read the tree that was created, it is impossible to follow all the different branches, values, and leaves.

In [None]:
tree.plot_tree(DTC)
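One way to explore the depth limit suggested above is to compare trees with different `max_depth` values. The following is only a sketch on synthetic data with many uninformative features; the candidate depths 3 and 5 are arbitrary choices for illustration, not tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with mostly uninformative features (assumption: loosely mimics raw pixels).
X, y = make_classification(n_samples=400, n_features=50, n_informative=5, random_state=0)

results = {}
for max_depth in [None, 3, 5]:  # None means the tree grows until its leaves are pure
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    cv_mean = cross_val_score(clf, X, y, cv=5).mean()
    clf.fit(X, y)  # refit on all data only to inspect the size of the tree
    results[max_depth] = {
        "depth": clf.get_depth(),
        "leaves": clf.get_n_leaves(),
        "cv_mean": cv_mean,
    }
    print(max_depth, results[max_depth])
```

A shallow tree is also far easier to read with `plot_tree`, since a tree of depth 3 has at most 8 leaves.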

In terms of parameters, soft voting could have been improved with custom weights. Soft voting is based on weighted average probabilities, using weights to give more importance to significant values, but none were used, in keeping with the requirements. Finding the optimal weights, however, would not be as simple as entering random numbers, and it is unclear how to choose the weights that give the best results.
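One possible way to choose weights, sketched here under assumptions, is to treat `weights` as a hyperparameter and let `GridSearchCV` compare a few candidates by cross validation. The candidate weight lists and the synthetic data are hypothetical; a real search would use the training images and probably a wider grid.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (assumption: stands in for the flattened training images).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

soft = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("dtc", DecisionTreeClassifier(random_state=0)),
        ("rfc", RandomForestClassifier(n_estimators=25, random_state=0)),
    ],
    voting="soft",
)

# Hypothetical candidates: one weight per estimator, in the order listed above.
param_grid = {"weights": [[1, 1, 1], [2, 1, 1], [1, 1, 2], [2, 1, 2]]}

# Each candidate is scored with 3-fold cross validation on accuracy.
search = GridSearchCV(soft, param_grid, cv=3).fit(X, y)

print("best weights:", search.best_params_["weights"])
print("best mean CV accuracy:", search.best_score_)
```

The search only ranks the candidates it is given, so the best weights found this way are relative to the grid and the data split, not a guaranteed optimum.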

One key thing to note is that all results are based on a single split of the data. All of these scores can change depending on which images were selected for training, testing, and transformation. Since all scores were so similar, a different split could change them enough to raise or lower the ranking of any model. For example, another iteration could result in the decision tree performing the best with the highest scores while hard voting performs the worst. On top of this, the data set was selected manually by taking the first 501 cat images and the first 501 dog images from the original cats vs dogs data set, which contained 20,000 photos. If different photos were selected, the scores, and therefore the performance rankings, could differ. Again, this is only a concern because all scores were essentially less than 0.1 points away from one another. No model clearly performed much better, and no model clearly performed the worst. All models performed at a very similar level, and the differences could be attributed to computation error or measurement error. The slight differences might still give insight into which model to focus on and optimize in the future.
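The sensitivity to the split could be estimated by repeating the train/test split with different seeds and looking at the spread of the accuracies. This is a minimal sketch on synthetic data with two placeholder models; the number of repetitions (10) and the test size are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (assumption: stands in for the flattened images).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

models = {"SVM": SVC(), "Decision Tree": DecisionTreeClassifier(random_state=0)}
results = {name: [] for name in models}

# Repeat the split with a different seed each time and record the test accuracy.
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    for name, model in models.items():
        results[name].append(model.fit(X_train, y_train).score(X_test, y_test))

for name, accs in results.items():
    print(f"{name}: mean={np.mean(accs):.3f}, std={np.std(accs):.3f}")
```

If the standard deviation across splits is as large as the gap between two models, the ranking between them cannot be trusted from a single split.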