<a href="https://colab.research.google.com/github/wetherc/data-2000/blob/main/exams/final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DATA-2000 Final Exam

## Grading Rubric

This final will be worth 15% of your total grade for this course. It will be graded out of 50 points, divided into two sections:

  - Model Building: 25 points
    - 15 points will be awarded for the actual model building (evaluating your Python code)
    - 10 points will be awarded for the text commentary narrating your choices and explaining your rationale
  - Model Validation/Evaluation: 25 points
    - 5 points will be awarded by default, but points may be deducted if substantial errors in your model building negatively affect the validity of your model
    - 10 points will be awarded for the actual model validation and evaluation (evaluating your Python code)
    - 10 points will be awarded for the text commentary narrating your choices and explaining your rationale

> **NOTE:** You will NOT be evaluated on whether your model actually makes accurate predictions.

## Using Additional Resources

This is an open-resource exam. You may use any available resources as references. I will be available for any questions that you have during the exam.

Remember that all work must still be your own, and that this exam is governed by the [Policy on Academic Honesty outlined in our course syllabus](https://docs.google.com/document/d/1Aoh7LvTKTEZO74eOsNhLzorkLtljkuchpg3ScNM_VEs/edit#heading=h.r0b18a8gh450).

-----

# Image Classification: Horse or Human


For this exercise, we are going to use a dataset of images of both horses and humans, taken from [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/horses_or_humans).

Our dataset contains 1,027 training images (300x300 pixels in full color) and 256 testing images, as well as a category label for each image.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

## Importing the Data

First, let's download our dataset and take a look at what it contains:

In [None]:
X_train, y_train = tfds.load(
    'horses_or_humans',
    split='train',
    shuffle_files=True,
    as_supervised=True,
    batch_size=-1)
X_test, y_test = tfds.load(
    'horses_or_humans',
    split='test',
    shuffle_files=True,
    as_supervised=True,
    batch_size=-1)

In [None]:
viz_data, ds_info = tfds.load(
    'horses_or_humans',
    split='train[:10]',
    shuffle_files=True,
    with_info=True)
tfds.visualization.show_examples(viz_data, ds_info)

## Extra Credit

For 3 points of extra credit, use TensorFlow's Keras preprocessing layers to create **synthetic training data**. To do this, you can, for example:
  - Create new records that rotate the original images a random number of degrees;
  - Create new records that mirror the original images left-to-right or top-to-bottom;
  - Create new records that partially crop the original images;
  - Create records that introduce noise to the original images;
  - etc.

  For more detail on how to do this, refer to the [Data Augmentation](https://www.tensorflow.org/tutorials/images/data_augmentation) TensorFlow tutorial, and take a look at the [Image Super-Resolution](#scrollTo=aNGRuJahuk26) section of the final below.
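
As a rough illustration of the idea (not a required or complete solution), the sketch below chains two Keras preprocessing layers and applies them to a stand-in batch of random pixels. The layer choices, the rotation factor of 0.1, and the names `augment` and `fake_batch` are assumptions made for this example; they are not part of the exam starter code.

```python
import numpy as np
import tensorflow as tf

# Hypothetical augmentation pipeline; layer choices and factors are illustrative.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # mirror left-to-right at random
    tf.keras.layers.RandomRotation(0.1),       # rotate by up to +/-36 degrees (0.1 of a full turn)
])

# Stand-in for a batch of four 300x300 RGB images with values in [0, 1].
fake_batch = np.random.default_rng(0).random((4, 300, 300, 3)).astype("float32")

# Random layers are only active when called with training=True.
augmented = augment(fake_batch, training=True)

# Augmentation changes pixel content but not the image dimensions.
print(augmented.shape)
```

To grow the training set rather than replace it, the augmented images (with their unchanged labels) could be concatenated with the originals, for example with `tf.concat`.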

## Model Building

Build a Convolutional Neural Network to classify each image as either a horse or a human.

Provide a narrative explanation of your choices to accompany any code. Your narrative should be substantive and enough for someone with little to no familiarity with CNNs to be able to understand what you are doing. By way of example, this should include discussion of what the major elements of a CNN are and what they do, as well as detail about your choices in parameters such as filter size and stride (or others as necessary).
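
When explaining filter size and stride, it can help to work out how each layer changes the image size. The helper below is a small sketch using the standard output-size formula for a convolution or pooling layer, `floor((n + 2p - k) / s) + 1`. The function name `conv_output_size` and the example layer settings are hypothetical and only meant to show the arithmetic.

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width (or height) of a convolution/pooling layer on an n-pixel input."""
    return (n + 2 * padding - kernel) // stride + 1

size = 300
size = conv_output_size(size, kernel=3)            # 3x3 conv, stride 1 -> 298
size = conv_output_size(size, kernel=2, stride=2)  # 2x2 max pool       -> 149
size = conv_output_size(size, kernel=3)            # 3x3 conv, stride 1 -> 147
size = conv_output_size(size, kernel=2, stride=2)  # 2x2 max pool       -> 73
print(size)  # 73
```

Larger filters or strides shrink the image faster, which reduces computation but discards spatial detail sooner.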

## Model Evaluation

After training your model, evaluate its performance. What metric(s) did you choose to optimize on? Would you say that your model performed well or poorly? How did you evaluate its performance to arrive at that conclusion?

Minimally, you should consider evaluating:
  - Your model's accuracy on the training and testing datasets;
  - Your model's loss over time as it trained;
  - A confusion matrix of your model's true and false positive and negative predictions; and
  - Holistically whether your model performs "well" enough for the classification task, and why or why not
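
The confusion matrix in the list above can be computed directly from true and predicted labels. The sketch below uses NumPy on made-up labels; `y_true`, `y_pred`, and the convention that 0 means horse and 1 means human are assumptions for illustration (check `ds_info` for the dataset's actual label order).

```python
import numpy as np

# Made-up labels for eight images (assumed: 0 = horse, 1 = human).
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 0])

# Rows index the true class, columns the predicted class.
confusion = np.zeros((2, 2), dtype=int)
np.add.at(confusion, (y_true, y_pred), 1)

# Correct predictions sit on the diagonal.
accuracy = np.trace(confusion) / confusion.sum()

print(confusion)  # [[3 1]
                  #  [1 3]]
print(accuracy)   # 0.75
```

Reading the off-diagonal cells separately shows which kind of mistake the model makes more often, something a single accuracy number hides.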

-----

<a id="scrollTo=aNGRuJahuk26"></a>

# Image Super-Resolution

> **NOTE:** This section of the final is **optional**. If you choose to complete it, it will contribute to both the "Model Building" and "Model Evaluation" portions of the grading rubric in addition to the image classifier you have already built. This will mean that grading is more lenient; however, you will have to do additional work. There is no penalty for choosing to not complete this section.

For this task, you will build an autoencoder that takes an image and creates a super-resolution version of that image. I.e., it _upscales_ the image to fill in more detail than was originally present.

To build this model, we will use the same dataset as in the previous example, with a small twist. Your training data will be images of horses and humans that have been downsampled to 150x150 pixels (and then stretched back to 300x300 pixels), and your model's target output will be the **exact same** images at the original 300x300 pixel resolution. To help get started, I have prepared a training and testing dataset of these images for you:

In [None]:
resize_and_rescale = tf.keras.Sequential([
  tf.keras.layers.Resizing(150, 150),
  tf.keras.layers.Resizing(300, 300),
  tf.keras.layers.Rescaling(1./255)
])

X_train_2 = resize_and_rescale(X_train, training=True)
X_test_2 = resize_and_rescale(X_test, training=True)

The cell below plots an original image (left) next to its downscaled version (right). The downscaled image shows an obvious loss of detail compared to the original. Let's see if our super-resolution autoencoder is able to clear up the image resolution!

In [None]:
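# Take the first image from the original and the downscaled data
orig_img = next(iter(X_train))
downscaled_img = next(iter(X_train_2))

plt.figure(figsize=(10, 10))

# Left: original 300x300 image
ax = plt.subplot(1, 2, 1)
ax.set_title('Original')
plt.imshow(orig_img)

# Right: image downscaled to 150x150 and stretched back to 300x300
ax = plt.subplot(1, 2, 2)
ax.set_title('Downscaled')
plt.imshow(downscaled_img)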

Finally, we will also create new y variables for our model to use as a ground truth against which to compare its predictions. These will just be the original 300x300 pixel images:

In [None]:
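# Instead of "horse" or "human" labels,
# our y variable will now be the original
# 300x300 pixel images, rescaled to [0, 1]
# so they share the value range of X_train_2
y_train_2 = tf.cast(X_train, tf.float32) / 255.
y_test_2 = tf.cast(X_test, tf.float32) / 255.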

You should reference our [Autoencoders Lab](https://github.com/wetherc/data-2000/blob/main/labs/11-16_autoencoders.ipynb) for guidance on how to structure your model. Importantly, remember:

  - Your model's input should have a shape that matches the input's pixel size (300x300 pixels; remember, we downscaled the images and then stretched them back to their original dimensions);
  - Your model's output should have a shape that matches the output's pixel size (300x300 pixels);
  - Your convolutional and deconvolutional layers should be careful to evenly divide your images so that you don't have rounding issues from fractional pixels;
  - For your model's final layer, you should use TensorFlow's [UpSampling2D layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/UpSampling2D) followed by one or more Conv2D layers
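
The point about fractional pixels can be checked with a little arithmetic before building anything. The sketch below assumes an encoder made of 2x2 pooling steps with `padding='same'` (which rounds odd sizes up) and a decoder made of 2x upsampling steps; the function `round_trip_size` is hypothetical and exists only for this illustration.

```python
import math

def round_trip_size(size: int, depth: int, factor: int = 2) -> int:
    """Image size after `depth` downsampling steps followed by `depth` upsampling steps."""
    for _ in range(depth):
        size = math.ceil(size / factor)  # 'same' padding rounds odd sizes up
    for _ in range(depth):
        size *= factor
    return size

print(round_trip_size(300, depth=2))  # 300 -> 150 -> 75 -> 150 -> 300
print(round_trip_size(300, depth=3))  # 75 is odd: 75 -> 38, so the output is 304, not 300
```

Under these assumptions, two halving steps return a 300x300 output, while a third produces a 304x304 output that no longer matches the 300x300 targets.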

## Model Building

Build a convolutional autoencoder that upscales each downsampled image back to its original 300x300 pixel resolution.

Provide a narrative explanation of your choices to accompany any code. Your narrative should be substantive and enough for someone with little to no familiarity with autoencoders to be able to understand what you are doing. By way of example, this should include discussion of what the major elements of a convolutional autoencoder are and what they do, as well as detail about your choices in parameters such as filter size and stride (or others as necessary).

## Model Evaluation

After training your model, evaluate its performance. What metric(s) did you choose to optimize on? Would you say that your model performed well or poorly? How did you evaluate its performance to arrive at that conclusion?

Minimally, you should consider evaluating:
  - Your model's reconstruction error (for example, mean squared error) on the training and testing datasets;
  - Your model's loss over time as it trained;
  - A visual comparison of your upsampled predicted images and the original 300x300 pixel images;
  - Holistically whether your model performs "well" enough for the super-resolution task, and why or why not
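
To complement the visual comparison, a numeric measure of reconstruction quality can also be reported. The sketch below computes mean squared error and peak signal-to-noise ratio (PSNR) with NumPy on small made-up arrays. PSNR is not mentioned elsewhere in this exam and is offered only as one common option; `original` and `predicted` are stand-ins, and the code assumes pixel values scaled to [0, 1].

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two images of the same shape."""
    return float(np.mean((a - b) ** 2))

def psnr(a: np.ndarray, b: np.ndarray, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in decibels; higher means a closer match."""
    error = mse(a, b)
    if error == 0:
        return float("inf")
    return float(10 * np.log10(max_value ** 2 / error))

# Made-up 4x4 RGB "images": every predicted pixel is off by 0.1.
original = np.full((4, 4, 3), 0.5)
predicted = original + 0.1

print(mse(original, predicted))   # approximately 0.01
print(psnr(original, predicted))  # approximately 20 dB
```

A useful baseline is the same metric computed between the blurry inputs and the originals: the model only adds value if its predictions score better than the inputs it was given.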