# COVisualize-19

## [COMS 4771](http://www.cs.columbia.edu/~verma/classes/ml/index.html) Final Project

Classifying COVID-19 patients from lung scans.

Anthony Krivonos

In [1]:
# from google.colab import drive
# drive.mount('/content/drive')

# % cd '/content/drive'
# % cd 'My Drive/Spring 2020/Machine Learning/AK Final'

In [2]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

### Library Imports

In [3]:
import os, sys

from os import mkdir, chdir
from os.path import join, exists
from tqdm.notebook import trange, tqdm

import matplotlib.pyplot as plt

import cv2
import numpy as np
import scipy as sp
import pandas as pd

### Import Data

In [4]:
"""
    Read CSVs into DataFrames.
"""

train_data = pd.read_csv('./train.csv')
test_data = pd.read_csv('./test.csv')

print("Feature     Count")
print("normal      %d" % train_data[train_data['label'] == 'normal'].shape[0])
print("viral       %d" % train_data[train_data['label'] == 'viral'].shape[0])
print("bacterial   %d" % train_data[train_data['label'] == 'bacterial'].shape[0])
print("covid       %d" % train_data[train_data['label'] == 'covid'].shape[0])

Feature     Count
normal      350
viral       350
bacterial   350
covid       77
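
The counts above show that `covid` is heavily underrepresented (77 of 1,127 training images). One common way to compensate is to weight each class inversely to its frequency. This is sketched here as an assumption, not as something this notebook is confirmed to do. The helper name `balanced_class_weights` is hypothetical, and the formula `n_samples / (n_classes * count)` is the usual "balanced" heuristic.

```python
def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# Class counts copied from the output above
counts = {"normal": 350, "viral": 350, "bacterial": 350, "covid": 77}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print("%-10s %.3f" % (label, weight))
```

Rare classes receive weights above 1 and common classes receive weights below 1, so a misclassified `covid` image would cost more in a weighted loss.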


### Quick Settings

Keep these updated, so we only have to do certain tasks (like preprocessing) once.

In [5]:
SETTINGS = {

    # Preprocess the images at all?
    "PREPROCESS": False,

    # Target size for the processed images
    "PREPROCESS_SIZE": 200, # -1: min, 0: max, >0: upper bound on height/width
    
    # Preprocess the images using specific methods. PREPROCESS must also be True.
    "PREPROCESS_METHOD_A": True,
    "PREPROCESS_METHOD_B": True,
    "PREPROCESS_METHOD_C": True,
    "PREPROCESS_METHOD_D": True,
    
    # Classify the images using specific methods
    "CLASSIFY_METHOD_A": False,
    "CLASSIFY_METHOD_B": False,
    "CLASSIFY_METHOD_C": False
    
}

SETTINGS

{'CLASSIFY_METHOD_A': False,
 'CLASSIFY_METHOD_B': False,
 'CLASSIFY_METHOD_C': True,
 'PREPROCESS': False,
 'PREPROCESS_METHOD_A': True,
 'PREPROCESS_METHOD_B': True,
 'PREPROCESS_METHOD_C': True,
 'PREPROCESS_METHOD_D': True,
 'PREPROCESS_SIZE': 200}

### Image Preprocessing

Original Image Example:
<img src="train/img-0.jpeg" style="width:400px;height:300px" />


For each of the following preprocessing methods, [CLAHE (Contrast Limited Adaptive Histogram Equalization)](https://www.kaggle.com/seriousran/image-pre-processing-for-chest-x-ray) is applied to increase the contrast of the training and testing images.
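
To give a rough idea of what the "contrast limited" part of CLAHE does, here is a simplified NumPy sketch that equalizes a single 8-bit tile with a clipped histogram. It is an illustration under assumptions, not OpenCV's implementation: real CLAHE also splits the image into a grid of tiles and interpolates between neighbouring tile mappings, and the function name `clipped_equalize` is hypothetical.

```python
import numpy as np

def clipped_equalize(tile, clip_limit=2.0):
    """Contrast-limited histogram equalization of one 8-bit grayscale tile."""
    hist = np.bincount(tile.ravel(), minlength=256).astype(float)

    # Clip bins above clip_limit times the mean bin count,
    # then spread the clipped excess evenly over all bins
    limit = clip_limit * hist.mean()
    excess = np.sum(np.maximum(hist - limit, 0))
    hist = np.minimum(hist, limit) + excess / 256

    # Map intensities through the normalized cumulative histogram
    cdf = np.cumsum(hist)
    lut = np.round(255 * cdf / cdf[-1]).astype(np.uint8)
    return lut[tile]

# A low-contrast tile whose values span only 100..115
tile = np.arange(100, 116, dtype=np.uint8).reshape(4, 4)
out = clipped_equalize(tile)
print(int(tile.max()) - int(tile.min()), int(out.max()) - int(out.min()))
```

Clipping caps how steep the intensity mapping can get, which is what keeps noise in nearly uniform regions from being amplified as strongly as plain histogram equalization would.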

#### Method A – Square images w/ padding

1. Find the training image with the smallest dimensions. This will be the standard image size for both training and testing.
2. Take this image's larger dimension and resize all images into squares with that side length. While doing this, center each image and pad the sides left empty (top and bottom for wide images, left and right for tall ones).
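
The geometry of step 2 can be sketched without any image library. The helper below is hypothetical (`padded_square_geometry` is not a function from this notebook). It only computes the resized dimensions and the padding needed to center an image in a square, assuming the longer side is scaled to the target size.

```python
def padded_square_geometry(height, width, to_size):
    """Return (new_height, new_width, (top, bottom, left, right)) for a padded square."""
    # Scale so that the longer side exactly fills the square
    scale = to_size / max(height, width)
    new_height = round(height * scale)
    new_width = round(width * scale)

    # Split the leftover space as evenly as possible on both sides
    pad_vert = to_size - new_height
    pad_horiz = to_size - new_width
    top, left = pad_vert // 2, pad_horiz // 2
    return new_height, new_width, (top, pad_vert - top, left, pad_horiz - left)

wide = padded_square_geometry(300, 400, 200)
tall = padded_square_geometry(400, 300, 200)
print(wide)
print(tall)
```

A wide image ends up with padding above and below, and a tall one with padding on the left and right.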

Example:
<img src="processed/method_a/train/img-0.jpeg" style="width:400px;height:300px" />

#### Method B – Resize images, ignoring aspect ratio

1. Find the dimensions of the smallest training image. This will be the standard image size for both training and testing.
2. Resize every image to this height and width, ignoring the aspect ratio of each image.
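
As a small illustration of resizing without regard to aspect ratio, the sketch below stretches an array to an arbitrary width and height using nearest-neighbour sampling in NumPy. This is an assumption made for brevity: the notebook itself relies on `cv2.resize`, which averages pixels rather than picking the nearest one, and `resize_nearest` is a hypothetical helper.

```python
import numpy as np

def resize_nearest(img, to_width, to_height):
    """Nearest-neighbour resize that ignores the source aspect ratio."""
    height, width = img.shape[:2]
    # For every target row/column, pick the source row/column it maps onto
    rows = np.arange(to_height) * height // to_height
    cols = np.arange(to_width) * width // to_width
    return img[rows[:, None], cols]

img = np.arange(12).reshape(3, 4)      # 3 rows, 4 columns
resized = resize_nearest(img, 2, 3)    # squeeze to 2 columns, keep 3 rows
print(resized)
```

Because rows and columns are scaled independently, shapes in the image are stretched or squashed whenever the source and target aspect ratios differ.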

Example:
<img src="processed/method_b/train/img-0.jpeg" style="width:400px;height:300px" />

#### Method C – Crop images to smallest size, maintaining aspect ratio

1. Find the dimensions of the smallest training image. This will be the standard image size for both training and testing.
2. Resize and crop every image to this height and width, maintaining the aspect ratio of each image and centering its source.
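
The resize-then-center-crop arithmetic in step 2 can be sketched as follows. The helper `center_crop_box` is hypothetical and only computes sizes and crop coordinates. It assumes, as the notebook's Method C code does, that the shorter side of the image is scaled to the larger of the two target dimensions before cropping.

```python
def center_crop_box(height, width, to_width, to_height):
    """Return ((new_width, new_height), (left, top, right, bottom)) for a centered crop."""
    aspect_ratio = width / height
    max_side = max(to_width, to_height)

    # Scale the shorter side to max_side so both target dimensions are covered
    if height < width:
        new_height = max_side
        new_width = int(aspect_ratio * new_height)
    else:
        new_width = max_side
        new_height = int(new_width / aspect_ratio)

    # Center the crop window inside the resized image
    left = (new_width - to_width) // 2
    top = (new_height - to_height) // 2
    return (new_width, new_height), (left, top, left + to_width, top + to_height)

size, box = center_crop_box(300, 400, 200, 150)
print(size, box)
```

The crop window always has exactly the target width and height, so every processed image ends up with the same shape.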

Example:
<img src="processed/method_c/train/img-0.jpeg" style="width:400px;height:300px" />

#### Method D – Crop images to square, maintaining aspect ratio

1. Find the smallest training image dimension. This will be the standard image size for both training and testing.
2. Resize and crop every image to this size, maintaining the aspect ratio of each image and centering its source.

Example:
<img src="processed/method_d/train/img-0.jpeg" style="width:400px;height:400px" />

##### Tools
- Uses OpenCV (`cv2`)

In [6]:
##
#   Directory Names
##

# Name of the directory to put the processed images in
PROCESSED_DIRECTORY = 'processed'

# Different process method directories
METHOD_A_DIRECTORY = 'method_a'
METHOD_B_DIRECTORY = 'method_b'
METHOD_C_DIRECTORY = 'method_c'
METHOD_D_DIRECTORY = 'method_d'


##
#   Image Writing Helper Function
##
def relative_imwrite(relative_filepath, img):
    """
    Call cv2.imwrite(..., img) on a relative file path.
    :param relative_filepath: The path (i.e. 'processed/train/img-2.jpeg').
    :param img: The cv2 image.
    """
    
    # Store current working directory so we can navigate back
    cwd = os.getcwd()
    
    # Get index of last slash to use it as a splitting point
    split_idx = relative_filepath.rindex("/")
    
    # Create a relative path and filename from this
    relative_path = relative_filepath[:split_idx]
    file_name = relative_filepath[(split_idx + 1):]
    
    # Extract the directory names
    directory_names = relative_path.split("/")
    top_directory_name = cwd
    
    # Create the directory if it doesn't exist and then move to it
    # Repeat this for every subdirectory
    for dir_name in directory_names:
        top_directory_name = join(top_directory_name, dir_name)
        if not exists(top_directory_name):
            mkdir(dir_name)
        chdir(dir_name)

    # Save the file at the given path
    file_path = "./" + file_name
    cv2.imwrite(file_path, img)
    chdir(cwd)


##
#   Get processed training data as dataframe
##

def get_df(process_method_directory = METHOD_A_DIRECTORY, type = "train"):
    """
    Given a preprocessing directory, returns a dataframe with images and their labels.
    Uses caching to speed up loading.
    :param process_method_directory: The preprocessing directory to load.
    :param type: "train" or "test"
    :return: The dataframe with training or testing data and labels.
    """
    pickle_name = process_method_directory + "_" + type + ".pkl"
    pickle_path = "processed/" + pickle_name
    
    try:
        # Try to read the pickle, if it exists
        data = pd.read_pickle(pickle_path)
    except FileNotFoundError:
        # The pickle doesn't exist, so we create one and return it
        images = {}
        data = train_data if type == "train" else test_data
        for _, row in tqdm(data.iterrows(), desc="Loading %s data" % type, leave = False):
            
            # Read image
            id, filename = row['id'], row['filename']
            image = cv2.imread("processed/" + process_method_directory + "/" + row['filename'], 0)
            if type == "train":
                images[id] = {
                    "data": image,
                    "label": row['label']
                }
            else:
                images[id] = {
                    "data": image
                }
        data = pd.DataFrame.from_dict(images, orient='index')
        data.to_pickle(pickle_path)

    return data


if SETTINGS["PREPROCESS"]:

    # Maps of file names to CV2 images
    train_images = {}
    test_images = {}
    
    min_train_image_width = sys.maxsize
    min_train_image_height = sys.maxsize
    max_train_image_width = 0
    max_train_image_height = 0
    
    ##
    #   Traverse train images and record smallest dimension
    ##
    
    # Find the smallest and largest training images, storing the images as they're traversed
    for _, row in tqdm(train_data.iterrows(), desc="Storing train rows for preprocessing"):
        
        # Read image
        id, filename = row['id'], row['filename']
        image = cv2.imread(row['filename'])
        train_images[id] = image
        
        # Record minimum height and width
        height, width, _ = image.shape
        min_train_image_width = min(min_train_image_width, width)
        min_train_image_height = min(min_train_image_height, height)
        max_train_image_width = max(max_train_image_width, width)
        max_train_image_height = max(max_train_image_height, height)
    
    # Instantiate the smallest and largest sizes of the two dimensions in another variable
    min_train_image_size = min(min_train_image_width, min_train_image_height)
    max_train_image_size = max(max_train_image_width, max_train_image_height)
    
    ##
    #   Traverse test images only
    ##
    
    for _, row in tqdm(test_data.iterrows(), desc="Storing test rows for pre-processing"):
        
        # Read image
        id, filename = row['id'], row['filename']
        image = cv2.imread(row['filename'])
        test_images[id] = image

    # Initialize CLAHE
    clahe = cv2.createCLAHE(clipLimit = 2.0, tileGridSize = (16, 16))

    if SETTINGS["PREPROCESS_SIZE"] == 0:
        train_image_height = max_train_image_height
        train_image_width = max_train_image_width
        train_image_size = max_train_image_size
    elif SETTINGS["PREPROCESS_SIZE"] == -1:
        train_image_height = min_train_image_height
        train_image_width = min_train_image_width
        train_image_size = min_train_image_size
    else:
        upper_bound = SETTINGS["PREPROCESS_SIZE"]
        train_image_height = upper_bound if max_train_image_height > max_train_image_width else int((max_train_image_height / max_train_image_width) * upper_bound)
        train_image_width = upper_bound if max_train_image_width > max_train_image_height else int((max_train_image_width / max_train_image_height) * upper_bound)
        train_image_size = upper_bound
    print("Resizing to (w: %d, h: %d, s: %d)" % (train_image_width, train_image_height, train_image_size))
    
    ##
    #   Delete pickles if data has already been processed
    ##
    for directory in [ METHOD_A_DIRECTORY, METHOD_B_DIRECTORY, METHOD_C_DIRECTORY, METHOD_D_DIRECTORY ]:
        for type in [ "train", "test" ]:
            try:
                pickle_path = "processed/" + directory + "_" + type + ".pkl"
                os.remove(pickle_path)
            except OSError:
                pass

#### Method A

In [7]:
def resize_and_pad(cv2_img, to_size, padding_color=0):
    """
    Resize the given CV2 image to a square with the given size length, and then pad it with the given color.
    Adapted from https://stackoverflow.com/questions/44720580/resize-image-canvas-to-maintain-square-aspect-ratio-in-python-opencv.
    :param cv2_img: The CV2 image obtained via cv2.imread(...).
    :param to_size: The desired size int.
    :param padding_color: A color int, list, tuple, or ndarray.
    :return: The padded image.
    """
    
    # Create height and width variables for better naming
    to_height = to_width = to_size

    # Get actual image dimensions
    height, width = cv2_img.shape[:2]
    aspect_ratio = width / height

    # Interpolate differently based on the image's relative size
    if height > to_height or width > to_width:
        # Shrink image via inter area as it's too large
        interp = cv2.INTER_AREA
    else:
        # Stretch image via inter cubic as it's too small
        interp = cv2.INTER_CUBIC

    is_image_horizontal = aspect_ratio > 1
    is_image_vertical = aspect_ratio < 1
    
    # Height and width we're resizing the image to
    new_height, new_width = to_height, to_width
    
    # Padding around the new image's inner edges
    pad_left, pad_right, pad_top, pad_bot = 0, 0, 0, 0

    if is_image_horizontal:
        # Image is horizontal, so requires vertical padding
        new_height = np.round(new_width / aspect_ratio).astype(int)
        pad_vert = (to_height - new_height) / 2
        pad_top, pad_bot = np.floor(pad_vert).astype(int), np.ceil(pad_vert).astype(int)

    elif is_image_vertical:
        # Image is vertical, so requires horizontal padding
        new_width = np.round(new_height * aspect_ratio).astype(int)
        pad_horiz = (to_width - new_width) / 2
        pad_left, pad_right = np.floor(pad_horiz).astype(int), np.ceil(pad_horiz).astype(int)

    # If only one color is provided and the image is RGB, then set the padding color to an array of length 3
    if len(cv2_img.shape) == 3 and not isinstance(padding_color, (list, tuple, np.ndarray)):
        padding_color = [padding_color] * 3

    # Resize the image to the newly calculated dimensions and interpolation strategy
    new_img = cv2.resize(cv2_img, (new_width, new_height), interpolation=interp)
    
    # Add the calculated borders around the image
    new_img = cv2.copyMakeBorder(new_img, pad_top, pad_bot, pad_left, pad_right, borderType=cv2.BORDER_CONSTANT, value=padding_color)

    # Make image grayscale
    new_img = cv2.cvtColor(new_img, cv2.COLOR_BGR2GRAY)
    
    return new_img

if SETTINGS["PREPROCESS"] and SETTINGS["PREPROCESS_METHOD_A"]:

    # Add a black border around the resized images
    COLOR = 0
    
    ##
    #   Resize Training Data
    ##
    
    # Resize and write training images
    for id in tqdm(train_images.keys(), desc="Method A on train data"):
        train_image = train_images[id]
        resized_train_image = resize_and_pad(train_image, train_image_size, COLOR)
        clahe_train_image = clahe.apply(resized_train_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_A_DIRECTORY + "/train/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_train_image)
    
    ##
    #   Resize Test Data
    ##
    
    # Resize and write testing images
    for id in tqdm(test_images.keys(), desc="Method A on test data"):
        test_image = test_images[id]
        resized_test_image = resize_and_pad(test_image, train_image_size, COLOR)
        clahe_test_image = clahe.apply(resized_test_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_A_DIRECTORY + "/test/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_test_image)

#### Method B

In [8]:
def resize_ignoring_aspect_ratio(cv2_img, to_width, to_height):
    """
    Resizes the image to the given size, ignoring aspect ratio.
    :param cv2_img: The image to resize.
    :param to_width: The desired width.
    :param to_height: The desired height.
    :return: A new, resized cv2 image.
    """
    
    # Resize the image
    new_img = cv2.resize(cv2_img, (to_width, to_height), interpolation = cv2.INTER_AREA)
    
    # Make image grayscale
    new_img = cv2.cvtColor(new_img, cv2.COLOR_BGR2GRAY)
    
    return new_img

if SETTINGS["PREPROCESS"] and SETTINGS["PREPROCESS_METHOD_B"]:
    
    ##
    #   Resize Training Data
    ##
    
    # Resize and write training images
    for id in tqdm(train_images.keys(), desc="Method B on train data"):
        train_image = train_images[id]
        resized_train_image = resize_ignoring_aspect_ratio(train_image, train_image_width, train_image_height)
        clahe_train_image = clahe.apply(resized_train_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_B_DIRECTORY + "/train/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_train_image)
    
    ##
    #   Resize Test Data
    ##
    
    # Resize and write testing images
    for id in tqdm(test_images.keys(), desc="Method B on test data"):
        test_image = test_images[id]
        resized_test_image = resize_ignoring_aspect_ratio(test_image, train_image_width, train_image_height)
        clahe_test_image = clahe.apply(resized_test_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_B_DIRECTORY + "/test/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_test_image)
        

#### Method C

In [9]:
def resize_maintaining_aspect_ratio(cv2_img, to_width, to_height):
    """
    Crop the given cv2 image to the desired width and height, maintaining the image's original aspect ratio.
    :param cv2_img: The image to crop.
    :param to_width: The desired width.
    :param to_height: The desired height.
    :return: The new resized and cropped image.
    """
    
    # Create height and width variables for better naming
    height, width = cv2_img.shape[:2]
    aspect_ratio = width / height
    
    # Resizing
    max_side = max(to_height, to_width)
    if height < width:
        new_height = max_side
        new_width = int(aspect_ratio * new_height)
    else:
        new_width = max_side
        new_height = int(new_width / aspect_ratio)
    resized_img = cv2.resize(cv2_img, (new_width, new_height), interpolation = cv2.INTER_AREA)
        
    # Cropping
    left_padding = int((new_width - to_width) / 2)
    right_padding = int(np.ceil((new_width - to_width) / 2))
    top_padding = int((new_height - to_height) / 2)
    bottom_padding = int(np.ceil((new_height - to_height) / 2))
    cropped_img = resized_img[top_padding:(new_height - bottom_padding), left_padding:(new_width - right_padding)]
    
    # Make image grayscale
    new_img = cv2.cvtColor(cropped_img, cv2.COLOR_BGR2GRAY)
    
    return new_img

if SETTINGS["PREPROCESS"] and SETTINGS["PREPROCESS_METHOD_C"]:
    
    ##
    #   Resize Training Data
    ##
    
    # Resize and write training images
    for id in tqdm(train_images.keys(), desc="Method C on train data"):
        train_image = train_images[id]
        resized_train_image = resize_maintaining_aspect_ratio(train_image, train_image_width, train_image_height)
        clahe_train_image = clahe.apply(resized_train_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_C_DIRECTORY + "/train/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_train_image)
    
    ##
    #   Resize Test Data
    ##
    
    # Resize and write testing images
    for id in tqdm(test_images.keys(), desc="Method C on test data"):
        test_image = test_images[id]
        resized_test_image = resize_maintaining_aspect_ratio(test_image, train_image_width, train_image_height)
        clahe_test_image = clahe.apply(resized_test_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_C_DIRECTORY + "/test/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_test_image)

        

#### Method D

In [10]:
def resize_to_square_maintaining_aspect_ratio(cv2_img, to_size):
    """
    Wrapper around resize_maintaining_aspect_ratio to produce a cropped square image.
    :param cv2_img: The image to crop.
    :param to_size: The desired square side size.
    :return: The new resized and cropped image.
    """
    
    # Resize
    return resize_maintaining_aspect_ratio(cv2_img, to_size, to_size)


if SETTINGS["PREPROCESS"] and SETTINGS["PREPROCESS_METHOD_D"]:
    
    ##
    #   Resize Training Data
    ##
    
    # Resize and write training images
    for id in tqdm(train_images.keys(), desc="Method D on train data"):
        train_image = train_images[id]
        resized_train_image = resize_to_square_maintaining_aspect_ratio(train_image, train_image_size)
        clahe_train_image = clahe.apply(resized_train_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_D_DIRECTORY + "/train/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_train_image)
    
    ##
    #   Resize Test Data
    ##
    
    # Resize and write testing images
    for id in tqdm(test_images.keys(), desc="Method D on test data"):
        test_image = test_images[id]
        resized_test_image = resize_to_square_maintaining_aspect_ratio(test_image, train_image_size)
        clahe_test_image = clahe.apply(resized_test_image)
        image_dir = PROCESSED_DIRECTORY + "/" + METHOD_D_DIRECTORY + "/test/img-" + str(id) + ".jpeg"
        relative_imwrite(image_dir, clahe_test_image)

### Heuristic Understanding of Lung X-Rays

Before proceeding, dozens of images from the **normal**, **viral**, **bacterial**, and **covid** classes were inspected. Then, one image from each class was analyzed.

| Type of Infection | Image File        | Image                                                           |
| ----------------- |:-----------------:| ---------------------------------------------------------------:|
| Normal            | train/img-0.jpeg  | <img src="train/img-0.jpeg" style="width:400px;height:300px" /> |
| Viral             | train/img-11.jpeg | <img src="train/img-11.jpeg" style="width:400px;height:300px" /> |
| Bacterial         | train/img-21.jpeg | <img src="train/img-21.jpeg" style="width:400px;height:300px" /> |
| COVID             | train/img-13.jpeg | <img src="train/img-13.jpeg" style="width:400px;height:300px" /> |

#### Normal Lungs

Normal lung images usually have the highest contrast and the least inflammation.

#### Viral Lungs

Viral lungs seem deflated and have more strain-related noise around the center.

#### Bacterial Lungs

Bacterial lungs look very similar to viral lungs, except that they have more noticeable cloudiness above the lungs.

#### COVID Lungs

COVID (COVID-19) lungs have the most prominent deformation towards the bottom of the lungs.

### Validation

Our goal is to classify every image in the testing set (`test.csv`). Thus, we will be splitting the images in `train.csv` to use for both training and validation.

#### Method – k-Fold Cross-Validation

1. Choose `k` number of folds.
2. Iterate over each fold, making that fold the **test** fold. The rest of the folds will be used for **training**.
3. Record the model accuracy as the unweighted mean of the accuracy over each iteration.
4. Compare the run of each classification algorithm using this method.
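
The fold bookkeeping in steps 1 and 2 can be sketched with plain index lists. The generator below is a hypothetical helper, not the notebook's function. It assumes contiguous, equally sized folds and, like integer division, leaves any remainder rows out of every validation fold.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_size = n_rows // k
    for i in range(k):
        start, stop = i * fold_size, (i + 1) * fold_size
        validation = list(range(start, stop))
        # Everything outside the validation fold is used for training
        train = list(range(0, start)) + list(range(stop, n_rows))
        yield train, validation

folds = list(k_fold_indices(10, 5))
for train, validation in folds:
    print(validation, len(train))
```

With the 1,127 training rows counted earlier and `k = 10`, each fold would hold 112 rows, so the last 7 rows would always stay in the training portion.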

In [11]:
def k_fold_cv(classifier, train_data, k, *classifier_args):
    """
    :param classifier: Function that takes in (training data, test data, classifier_args) and returns a testing accuracy.
    :param train_data: List of training data to validate on, with labels.
    :param k: The number of folds to train on.
    :return: The mean classifier accuracy over all folds.
    """
    total_accuracy = 0
    fold_size = int(train_data.shape[0] / k)
    for i in trange(k, desc="k-fold (k: %d)" % k):
        test_fold = train_data.iloc[i * fold_size : (i + 1) * fold_size].reset_index(drop=True)
        train_fold = pd.concat([train_data.iloc[0 : i * fold_size], train_data.iloc[(i + 1) * fold_size :]]).reset_index(drop=True)
        accuracy = classifier(train_fold, test_fold, *classifier_args)
        total_accuracy += accuracy
    mean_accuracy = total_accuracy / k
    return mean_accuracy

### Classification

We'll now lay out the pros and cons of, expectations for, and variations on each classifier that assigns a lung image to one of
the classes $ Y \in \{ \text{normal}, \text{viral}, \text{bacterial}, \text{covid} \} $.

In [12]:
LABELS = [ 'normal', 'viral', 'bacterial', 'covid' ]

#### Method A

In [13]:
def classifier_svm(train_data, test_data, epochs = 1000, C = 0.1, ε = 0.0001):

    # Learn weights vector
    def learn_weights(X, epochs, C, ε, focus_label):
        get_label = lambda l: 1 if l == focus_label else -1

        # Initialize the weight vector and the bias
        w_len = X.iloc[0].loc['data'].shape[0] * X.iloc[0].loc['data'].shape[1]
        w = np.full(w_len, 0.0)
        b = 0

        orig_ε = ε
        for _ in trange(epochs, desc="learning weights (epochs)", leave=False):
            for index, row in X.iterrows():
                x = row.loc['data'].flatten()
                y = row.loc['label']

                # Hinge condition: the margin y * (w·x + b) is below 1
                if 1 - get_label(y) * (np.dot(w, x) + b) > 0:
                    w -= ε * ((1 / epochs) * w - C * get_label(y) * x)
                    # Subgradient of the hinge term with respect to b is -C * y
                    b += ε * C * get_label(y)
                else:
                    w -= ε * (w / epochs)

                ε = orig_ε / epochs
        return w
    
    # For each label, do a one-versus-all classification and add it to the test_data frame
    for focus_label in tqdm(LABELS, desc="SVM w/SGD (epochs: %d, C: %g, ε: %g)" % (epochs, C, ε), leave=False):
        # Learn weights for the focus label
        w = learn_weights(train_data, epochs, C, ε, focus_label)

        # Create a numeric column for the classification score of the focus label
        test_data['cls_' + focus_label] = 0.0
        
        # Get weight * row dot products for each row
        # (write through .at, since rows yielded by iterrows() are copies)
        for i, test_row in tqdm(test_data.iterrows(), desc="scoring", leave=False):
            x = test_row.loc['data'].flatten()
            cls_result = np.dot(w, x)
            test_data.at[i, 'cls_' + focus_label] = cls_result
        
    # Classify each test row as the label with the highest score
    accuracy = 0
    for i, test_row in tqdm(test_data.iterrows(), desc="classifying", leave=False):
        label = test_row.loc['label']
        scores = [ test_row.loc['cls_' + l] for l in LABELS ]
        cls = LABELS[int(np.argmax(scores))]
        accuracy += int(cls == label)
    
    # Calculate accuracy
    accuracy = 0 if accuracy == 0 else accuracy / test_data.shape[0]
    
    return accuracy

if SETTINGS["CLASSIFY_METHOD_A"]:
    train_data = get_df(METHOD_A_DIRECTORY, "train")
    
    percent_train = .8
    train_temp = train_data.iloc[0:int(train_data.shape[0] * percent_train)]
    test_temp = train_data.iloc[int(train_data.shape[0] * percent_train):]

    epochs = 1000
    c = 0.01
    eps = 0.0001

    # Print the result, since a bare expression inside an if block is not displayed
    print(k_fold_cv(classifier_svm, train_data, 10, epochs, c, eps))
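
The one-versus-all SVM above is trained with per-sample updates on a regularized hinge loss. A single update under the textbook formulation can be sketched as below. This is a sketch under assumptions and may differ in detail from the cell above: the function name `hinge_sgd_step` and the regularization strength `lam` are hypothetical.

```python
import numpy as np

def hinge_sgd_step(w, b, x, y, lr, C, lam):
    """One subgradient step on lam/2 * ||w||^2 + C * max(0, 1 - y * (w·x + b))."""
    margin = y * (np.dot(w, x) + b)
    if margin < 1:
        # Sample violates the margin: both the regularizer and the hinge term contribute
        w = w - lr * (lam * w - C * y * x)
        b = b + lr * C * y
    else:
        # Sample is safely classified: only the regularizer shrinks w
        w = w - lr * lam * w
    return w, b

w, b = hinge_sgd_step(np.zeros(2), 0.0, np.array([1.0, 2.0]), y=1, lr=0.1, C=1.0, lam=0.01)
print(w, b)
```

For one-versus-all prediction, one such `(w, b)` pair is learned per class and the class with the largest score `w·x + b` is chosen.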

#### Method B

In [14]:
from enum import Enum

class NeuralNetwork:

    def __init__(self, input_size):
        self.input_size = input_size
        self.layers = [ Layer(input_size) ]
        self.weights = [ ]
        self.biases = [ ]

    def add(self, layer, verbose = False):
        self.weights.append(np.random.randn(self.layers[-1].size, layer.size))
        self.biases.append(np.random.randn(layer.size,))
        self.layers.append(layer)
        # Verbosity
        vprint(verbose, "Added '%s' layer of size %d" % (layer.activation.name, layer.size))
        vprint(verbose, "  Layers:  %s" % str([ "(%d, %s)" % (l.size, l.activation.name) for l in self.layers ]))
        vprint(verbose, "  Weights: %s" % str([ w.shape for w in self.weights ]))
        vprint(verbose, "  Biases:  %s" % str([ b.shape for b in self.biases ]))

    def feedforward(self, X):
        # Feedforward vector
        a = np.copy(X)
        # Outputs
        Z = []
        # Activated outputs
        A = [ a ]
        # Feed forward
        # Skip the input layer: weights[i] feeds into layers[i + 1]
        for weight, bias, layer in zip(self.weights, self.biases, self.layers[1:]):
            z = np.array(a.dot(weight) + bias, dtype = float)
            a = np.array(layer.activate(z), dtype = float)
            Z.append(z)
            A.append(a)
        return Z, A

    def backpropagate(self, y, Z, A):
        dCdZ = [ 0.0 ] * len(self.weights) # errors for each layer
        # Get error in the last layer
        dCdZ[-1] = np.array((y - A[-1])) * self.layers[-1].derivative(Z[-1])
        # Backpropagate (Z[i] is the pre-activation of layers[i + 1])
        for i in reversed(range(len(dCdZ) - 1)):
            dCdZ[i] = dCdZ[i + 1].dot(self.weights[i + 1].T) * self.layers[i + 1].derivative(Z[i])
        num_outputs = y.shape[0]
        dCdw = []
        dCdb = []
        for i, d in enumerate(dCdZ):
            dCdw.append(A[i].T.dot(d) / float(num_outputs))
            dCdb.append(np.ones((num_outputs, 1)).T.dot(d) / float(num_outputs))
        return dCdw, dCdb

    def train(self, X, y, batch_size = 8, epochs = 100, lr = 0.0001, verbose = False):
        # Convert input and output to float arrays
        X = np.array(X, dtype = float)
        y = np.array(y, dtype = float)
        for e in range(epochs):
            i = 0
            loss = sys.maxsize
            while i < len(y):
                X_batch = X[i : i + batch_size]
                y_batch = y[i : i + batch_size]
                Z, A = self.feedforward(X_batch)
                dCdw, dCdb = self.backpropagate(y_batch, Z, A)
                self.weights = [ w + lr * d for w, d in zip(self.weights, dCdw) ]
                self.biases = [ b + lr * d for b, d in zip(self.biases, dCdb) ]
                loss = np.linalg.norm(y_batch - A[-1])
                vprint(verbose, "Epoch %6d (loss: %1.6f) %s%s" % (e + 1, loss, progress_bar(i / len(y)), " " * 10), end='\r' if i < len(y) - 1 else '')
                i += batch_size
            vprint(verbose, "Epoch %6d (loss: %1.6f) %s%s" % (e + 1, loss, progress_bar(1), " " * 10))

    def classify(self, X):
        Z, A = self.feedforward(X)
        output = A[-1]
        return output

    @staticmethod
    def one_hot_encode(y_categorical):
        label_mapping = []
        labels_seen = {}
        label_count = 0
        for y in y_categorical:
            if y not in labels_seen:
                labels_seen[y] = label_count
                label_mapping.append(y)
                label_count += 1
        y_one_hot = [ [ 0 ] * label_count for _ in range(len(y_categorical)) ]
        for i, y in enumerate(y_categorical):
            hot_label = labels_seen[y]
            y_one_hot[i][hot_label] = 1
        return np.array(y_one_hot), label_mapping

    @staticmethod
    def one_hot_decode(y_one_hot, label_mapping):
        y_categorical = []
        for y in y_one_hot:
            hot_label = np.argmax(y)
            categorical_label = label_mapping[hot_label]
            y_categorical.append(categorical_label)
        return y_categorical

class Activation(Enum):
    sigmoid = 'sigmoid'
    relu = 'relu'
    leaky_relu = 'leaky_relu'
    noisy_relu = 'noisy_relu'
    elu = 'elu'
    linear = 'linear'
    softmax = 'softmax'


class Layer:

    def __init__(self, size, activation = Activation.linear, activation_parameter = None):
        self.size = size
        self.activation = activation
        self.activation_parameter = activation_parameter

    def activate(self, x):
        if self.activation == Activation.sigmoid:
            # expit is already the logistic sigmoid, 1 / (1 + exp(-x))
            return sp.special.expit(x)
        elif self.activation == Activation.relu:
            return np.maximum(0, x)
        elif self.activation == Activation.leaky_relu:
            assert(self.activation_parameter is not None)
            leak = self.activation_parameter
            return np.where(x > 0, x, x * leak)
        elif self.activation == Activation.noisy_relu:
            assert(self.activation_parameter is not None)
            std_dev = self.activation_parameter
            noise = np.random.normal(scale = std_dev)
            return np.maximum(0, x + noise)
        elif self.activation == Activation.elu:
            assert(self.activation_parameter is not None)
            a = self.activation_parameter
            # Clamp the exponent so the unused positive branch cannot overflow
            return np.where(x > 0, x, a * (np.exp(np.minimum(x, 0)) - 1))
        elif self.activation == Activation.softmax:
            assert(len(x) > 0)
            # Normalize over the class axis, shifting by the max for numerical stability
            shift = np.max(x, axis = -1, keepdims = True)
            exp = np.exp(x - shift)
            return exp / exp.sum(axis = -1, keepdims = True)
        return x

    def derivative(self, x):
        if self.activation == Activation.sigmoid:
            sigma = self.activate(x)
            return sigma * (np.ones(x.shape) - sigma)
        elif self.activation == Activation.relu:
            return np.where(x > 0, 1.0, 0.0)
        elif self.activation == Activation.leaky_relu:
            leak = self.activation_parameter
            return np.where(x > 0, 1.0, leak)
        elif self.activation == Activation.noisy_relu:
            # The forward-pass noise is not stored, so use the noiseless ReLU derivative
            return np.where(x > 0, 1.0, 0.0)
        elif self.activation == Activation.elu:
            a = self.activation_parameter
            return np.where(x > 0, 1.0, a * np.exp(np.minimum(x, 0)))
        elif self.activation == Activation.softmax:
            assert(len(x) > 0)
            # Element-wise (diagonal) part of the softmax Jacobian
            sigma = self.activate(x)
            return sigma * (np.ones(x.shape) - sigma)
        # Linear activation has a constant derivative of 1
        return np.ones(x.shape)
    
##
#   Static Methods
##

def progress_bar(perc, width = 30):
    assert(width > 10)
    width -= 3
    prog = int(perc * width)
    bar = "[" + "=" * prog + (">" if perc < 1 else "=") + "." * (width - prog) + "]"
    return bar


def vprint(verbose, *args, **kwargs):
    if verbose:
        print(*args, **kwargs)
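
The softmax and ELU formulas used by the layer class above can be sanity-checked in isolation. The following is a minimal NumPy sketch, not the notebook's class: `softmax_columns` and `elu` are hypothetical stand-alone helpers, and the column-wise (`axis = 0`) convention is assumed to match the `activate` method.

```python
import numpy as np

def softmax_columns(x):
    """Column-wise softmax using the max-shift trick (assumed axis = 0 convention)."""
    shifted = x - np.max(x, axis=0)   # largest entry of each column becomes 0
    exps = np.exp(shifted)            # so exp never overflows
    return exps / exps.sum(axis=0)

def elu(x, a=1.0):
    """ELU: x for x > 0, a * (exp(x) - 1) otherwise."""
    return np.where(x > 0, x, a * np.expm1(np.minimum(x, 0)))

# Two columns that differ only by a constant offset of 999
logits = np.array([[1.0, 1000.0],
                   [2.0, 1001.0],
                   [3.0, 1002.0]])
probs = softmax_columns(logits)

print(probs.sum(axis=0))                 # each column sums to 1
print(elu(np.array([-1.0, 0.0, 2.0])))
```

Because softmax is invariant to adding a constant to every logit in a column, both columns of `probs` come out identical, even though a naive `np.exp(1002.0)` would overflow.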

In [None]:
def classifier_nn(train_data, test_data):
    """
    Multi-layer perceptron neural network classifier with SGD.
    Hidden layer sizes, batch size, epochs, and learning rate are fixed in the function body.
    :param train_data: DataFrame of rows to train on ('data' and 'label' columns).
    :param test_data: DataFrame of rows to test on (same columns).
    :return: The accuracy on the test_data.
    """
    
    # since classes are strings, we need a way to represent them numerically for the NN
    def encode_label(label):
        return LABELS.index(label)
    
    # Create training rows by flattening the images and record training labels
    X = []
    y = []
    for i, train_row in train_data.iterrows():
        X.append(train_row['data'].flatten())
        y.append(encode_label(train_row['label']))
    
    # Create testing rows
    test_X = []
    test_y = []
    
    for i, test_row in test_data.iterrows():
        test_X.append(test_row['data'].flatten())
        test_y.append(encode_label(test_row['label']))
    
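    # Convert the lists to arrays and normalize the pixel values to [0, 1]
    X = np.array(X, dtype = np.float64) / 255
    test_X = np.array(test_X, dtype = np.float64) / 255
    
    # One-hot encode the training labels so that y has one column per class
    y = np.eye(len(LABELS))[y]
    test_y = np.array(test_y)
    
    input_size = X.shape[1]
    output_size = y.shape[1]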
    
    # Construct the network
    nn = NeuralNetwork(input_size = input_size)
    nn.add(Layer(512, Activation.relu))
    nn.add(Layer(512, Activation.relu))
    nn.add(Layer(output_size, Activation.softmax), verbose = True)

    # Train the network
    nn.train(X, y, batch_size = 64, epochs = 1000, verbose = True, lr = 0.0001)
    
    # Get the predictions
    y_labels = nn.classify(test_X)
    # one_hot_decode must return values in the same encoding as the entries of test_y
    y_labels = NeuralNetwork.one_hot_decode(y_labels, LABELS)
    
    # Calculate the accuracy
    accuracy = np.sum([ int(test_y[i] == y_labels[i]) for i in range(len(y_labels)) ]) / len(y_labels)
    
    return accuracy
    

if SETTINGS["CLASSIFY_METHOD_B"]:
    
    print(k_fold_cv(classifier_nn, train_data))
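
The label handling in `classifier_nn` (string label → class index → one-hot row, then accuracy from the predicted scores) can be sketched on its own. This is an illustration under assumptions: the `LABELS` order below is made up for the example, `one_hot` and `accuracy` are hypothetical helpers, and the score matrix is hand-written rather than produced by the network.

```python
import numpy as np

# Assumed label order for illustration only; the notebook defines LABELS elsewhere
LABELS = ['normal', 'viral', 'bacterial', 'covid']

def one_hot(indices, n_classes):
    """Turn class indices into one-hot rows by indexing an identity matrix."""
    return np.eye(n_classes)[np.asarray(indices)]

def accuracy(true_idx, scores):
    """Fraction of rows whose highest-scoring class equals the true class index."""
    return float(np.mean(np.argmax(scores, axis=1) == np.asarray(true_idx)))

y = [LABELS.index(label) for label in ['covid', 'normal', 'viral']]
Y = one_hot(y, len(LABELS))

# Hand-written scores: the first two rows are right, the third picks 'bacterial'
scores = np.array([[0.1, 0.1, 0.1, 0.7],
                   [0.6, 0.2, 0.1, 0.1],
                   [0.2, 0.1, 0.6, 0.1]])

print(Y.shape)
print(accuracy(y, scores))
```

Comparing class indices on both sides avoids mixing encoded integers with decoded label names when computing accuracy.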

#### Method C
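
The `ConvBlock` defined in the cell below projects its shortcut with a strided 1×1 convolution so that the shortcut and main path have the same shape before they are added. The sketch below illustrates that shape argument in plain NumPy; it is not the Keras code. `conv1x1` and `leaky_relu` are hypothetical helpers, the weights are random, biases and batch normalization are omitted, and the main path is replaced by a stand-in tensor of the right shape.

```python
import numpy as np

def conv1x1(x, w, stride=1):
    """1x1 convolution on an (H, W, C_in) map: subsample, then mix channels per pixel."""
    return x[::stride, ::stride, :] @ w   # w has shape (C_in, C_out)

def leaky_relu(x, alpha=0.3):
    """Leaky ReLU with the same 0.3 leak used by relu_leak."""
    return np.where(x > 0, x, alpha * x)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 32))            # input feature map
w_shortcut = rng.normal(size=(32, 128))    # projection to 128 channels

shortcut = conv1x1(x, w_shortcut, stride=2)
main = rng.normal(size=(4, 4, 128))        # stand-in for the three main-path convolutions

out = leaky_relu(main + shortcut)
print(shortcut.shape, out.shape)
```

Without the projection, the (8, 8, 32) input could not be added to the (4, 4, 128) main-path output, which is why only `Bottleneck` (same shape in and out) can reuse its input as the shortcut directly.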

In [None]:
from sklearn.utils import class_weight

import keras
from keras import utils
from keras.layers import Input
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from keras.models import Sequential, Model, load_model
# Activation is deliberately not imported from keras.layers: it would shadow the Activation class used by Method B
from keras.layers import Add, SeparableConv2D, Conv2D, ZeroPadding2D, MaxPooling2D, AveragePooling2D, Dropout, Dense, Flatten, LeakyReLU, BatchNormalization, UpSampling2D

##
#   CORONet Shortcuts
##

mid_layer_activation = 'relu' # outputs in [0, inf); the blocks below use LeakyReLU instead
relu_leak = 0.3

# Creates a set of three convolutions that act as a bottleneck. Speeds up learning.
# source: https://www.kaggle.com/akumaldo/resnet-from-scratch-keras
def Bottleneck(center_kernel_size, conv_filters):

    # Conform to Keras layer spec
    def bottleneck(cnn):

        cnn_shortcut = cnn

        # Bottleneck convolutions
        for i, kernel_size in enumerate([ (1, 1), (center_kernel_size, center_kernel_size), (1, 1) ]):
            cnn = Conv2D(conv_filters[i], kernel_size = kernel_size, strides = (1, 1), padding = 'same')(cnn)
            cnn = BatchNormalization(axis = 3)(cnn)
            cnn = LeakyReLU(alpha = relu_leak)(cnn)

        # Add shortcut
        cnn = Add()([ cnn, cnn_shortcut ])
        cnn = LeakyReLU(alpha = relu_leak)(cnn)

        return cnn

    return bottleneck

def ConvBlock(comp_kernel_size, conv_filters, stride):

    # Conform to Keras layer spec
    def conv_block(cnn):
        cnn_shortcut = cnn

        # Entry convolution
        cnn = Conv2D(conv_filters[0], kernel_size = (1, 1), strides = (stride, stride), padding = 'same')(cnn)
        cnn = BatchNormalization(axis = 3)(cnn)
        cnn = LeakyReLU(alpha = relu_leak)(cnn)

        # Computational convolution
        cnn = Conv2D(conv_filters[1], kernel_size = (comp_kernel_size, comp_kernel_size), strides = (1, 1), padding = 'same')(cnn)
        cnn = BatchNormalization(axis = 3)(cnn)
        cnn = LeakyReLU(alpha = relu_leak)(cnn)

        # Close bottle w/o activation
        cnn = Conv2D(conv_filters[2], kernel_size = (1, 1), strides = (1, 1), padding = 'same')(cnn)
        cnn = BatchNormalization(axis = 3)(cnn)

        # Repeat for shortcut path
        cnn_shortcut = Conv2D(conv_filters[2], kernel_size = (1, 1), strides = (stride, stride), padding = 'same')(cnn_shortcut)
        cnn_shortcut = BatchNormalization(axis = 3)(cnn_shortcut)

        # Add shortcut
        cnn = Add()([ cnn, cnn_shortcut ])
        cnn = LeakyReLU(alpha = relu_leak)(cnn)

        return cnn

    return conv_block


def classifier_cnn(train_data, test_data, epochs = 15, batch_size = 64, submission_data = None, model_name = 'cnn'):

    # since classes are strings, we need a way to represent them numerically for the NN
    encode_label = lambda label: LABELS.index(label)
    decode_label = lambda index: LABELS[int(index)]
    
    # Store TRAINING data and labels
    X = []
    y = []
    for i, train_row in train_data.iterrows():
        X.append(train_row['data'])
        y.append(encode_label(train_row['label']))
    
    # Next, store TESTING data and labels
    test_X = []
    test_y = []
    
    for i, test_row in test_data.iterrows():
        test_X.append(test_row['data'])
        test_y.append(encode_label(test_row['label']))

    # Store class weights
    # class_weights = class_weight.compute_class_weight('balanced', [ 0, 1, 2, 3 ], y)

    # Store data dimensions for later use
    row_len = len(X[0])
    col_len = len(X[0][0])
    train_data_len = len(X)
    input_shape = (row_len, col_len, 1)
    
    # Create numerical labels from the categorical classes
    y = utils.to_categorical(y, len(LABELS))
    test_y = utils.to_categorical(test_y, len(LABELS))
    
    # Reshape training and testing features to (samples, rows, cols, 1), i.e. a single grayscale channel
    X = np.array(X)
    X = X.reshape(X.shape[0], row_len, col_len, 1)
    test_X = np.array(test_X)
    test_X = test_X.reshape(test_X.shape[0], row_len, col_len, 1)

    # Create image generators
    train_gen = ImageDataGenerator(
        rotation_range = 18,
        width_shift_range = 0.3,
        height_shift_range = 0.3,
        shear_range = 0.3,
        zoom_range = 0.3,
        horizontal_flip = True
    )
    test_gen = ImageDataGenerator(
        rotation_range = 18,
        width_shift_range = 0.3,
        height_shift_range = 0.3,
        shear_range = 0.3,
        zoom_range = 0.3,
        horizontal_flip = True
    )

    seed = 1
    rounds = 2

    train_gen.fit(X, augment = True, seed = seed, rounds = rounds)
    test_gen.fit(test_X, augment = True, seed = seed, rounds = rounds)

    train_flow = train_gen.flow(X, y, batch_size = batch_size, seed = seed, shuffle = True)
    test_flow = test_gen.flow(test_X, test_y, batch_size = batch_size, seed = seed, shuffle = True)

    try:
        # Try loading the already-made model
        cnn = load_model('models/%s.h5' % model_name)
    except (IOError, OSError, ImportError, ValueError):

        # If we can't load it, we'll make a new one

        ##
        #   CORONet Layers
        ##

        inputs = Input(input_shape)

        cnn = ZeroPadding2D((3, 3))(inputs)

        ##
        #   Layer Group 1
        ##
        
        cnn = Conv2D(32, kernel_size = (9, 9), strides = (1, 1), padding = 'same')(cnn)
        cnn = BatchNormalization()(cnn)
        cnn = LeakyReLU(alpha = relu_leak)(cnn)
        cnn = MaxPooling2D((3, 3))(cnn)

        ##
        #   Layer Group 2
        ##

        filters = [ 32, 32, 128 ]
        cnn = ConvBlock(3, filters, 1)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
    
        ##
        #   Layer Group 3
        ##

        filters = [ 64, 64, 256 ]
        cnn = ConvBlock(3, filters, 2)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
    
        ##
        #   Layer Group 4
        ##

        filters = [ 128, 128, 512 ]
        cnn = ConvBlock(3, filters, 2)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
    
        ##
        #   Layer Group 5
        ##

        filters = [ 256, 256, 1024 ]
        cnn = ConvBlock(3, filters, 2)(cnn)
        cnn = Bottleneck(3, filters)(cnn)
        cnn = Bottleneck(3, filters)(cnn)

        ##
        #   Output Layer Group
        ##

        # Pool and flatten
        cnn = AveragePooling2D(pool_size = (2, 2))(cnn)
        cnn = Flatten()(cnn)
        cnn = Dense(1024)(cnn)
        cnn = LeakyReLU(alpha = relu_leak)(cnn)

        # Randomly zero 50% of the activations during training
        cnn = Dropout(0.5)(cnn)
        cnn = Dense(len(LABELS), activation = 'softmax')(cnn)

        cnn = Model(inputs = inputs, outputs = cnn, name = model_name)
        cnn.compile(loss = keras.losses.categorical_crossentropy, optimizer = keras.optimizers.Adam(learning_rate=0.1), metrics=['accuracy'])

    cnn.summary()

    # history = cnn.fit(X, y, epochs=epochs, verbose=True, validation_data=(test_X, test_y), callbacks = [ EarlyStopping() ])
    history = cnn.fit_generator(train_flow,
                                validation_data = test_flow,
                                epochs=epochs,
                                verbose = 1,
                                workers = 3,
                                use_multiprocessing = True,
                                shuffle = True,
                                callbacks = [
                                    ModelCheckpoint('models/%s.h5' % model_name, monitor = 'val_loss', verbose = True, save_best_only = True, mode='auto'),
                                    ReduceLROnPlateau(monitor = 'val_loss', patience = 3, verbose = True, mode='auto')
                                ]#,
                                #class_weight = class_weights
                                )
    cnn.save('models/%s.h5' % model_name)

    accuracy = cnn.evaluate(test_X, test_y)[1]
    
    if submission_data is not None:
        submission_X = []
        for i, submission_row in submission_data.iterrows():
            submission_X.append(submission_row['data'])
        submission_X = np.array(submission_X)
        submission_X = submission_X.reshape(submission_X.shape[0], row_len, col_len, 1)
        
        submission_rows = cnn.predict(submission_X)
        
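        submission = []
        # Look predictions up by position, since the DataFrame index need not be 0..n-1
        for pos, (i, submission_row) in enumerate(submission_data.iterrows()):
            label = np.argmax(submission_rows[pos])
            submission.append({ "Id": i, "label": decode_label(label) })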
            
        pd.DataFrame(submission).to_csv('submission.csv', index = False)
        

    return accuracy, history
    

if SETTINGS["CLASSIFY_METHOD_C"]:

    accuracy, history = classifier_cnn(train_temp, test_temp, 300, batch_size = 64, submission_data = submission_data)
    
    print(accuracy)

    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()
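
To see how large the flattened feature vector entering `Dense(1024)` is, the spatial size can be traced through the CORONet layers by hand. The sketch below assumes a square 200×200 input (taken from `PREPROCESS_SIZE`; the actual preprocessed images may differ) and uses the standard Keras shape rules: `padding = 'same'` gives `ceil(size / stride)`, and the default `'valid'` pooling gives `(size - pool) // pool + 1`. The helper names are hypothetical.

```python
import math

def same_conv(size, stride):
    """Output side length of a 'same'-padded convolution."""
    return math.ceil(size / stride)

def valid_pool(size, pool):
    """Output side length of 'valid' pooling whose stride equals the pool size."""
    return (size - pool) // pool + 1

def coronet_feature_map(side):
    side = side + 2 * 3                # ZeroPadding2D((3, 3))
    side = same_conv(side, 1)          # 9x9 convolution, stride 1
    side = valid_pool(side, 3)         # MaxPooling2D((3, 3))
    for stride in (1, 2, 2, 2):        # ConvBlock strides of layer groups 2 to 5
        side = same_conv(side, stride) # Bottleneck blocks keep the size unchanged
    side = valid_pool(side, 2)         # AveragePooling2D((2, 2))
    return side

side = coronet_feature_map(200)
flattened = side * side * 1024         # 1024 channels leave layer group 5
print(side, flattened)                 # 4 16384
```

Under these assumptions the trace is 200 → 206 → 68 → 68 → 34 → 17 → 9 → 4, so `Flatten` yields 4 × 4 × 1024 = 16384 features.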