# Parker Dunn

__Assignment for COURSERA: Introduction to Deep Learning (via CU Boulder)__  
__Assignment:__ Week 3 - CNN Cancer Detection Kaggle Mini-Project

## Section 3 - Model Architecture

#### Plan

Due to limited time and computing resources, I'll stick to a simple model. I plan to use the "building block-style" Convolution-Convolution-Pooling design pattern, probably with no more than 4 repetitions of this pattern. Since we previously experimented with developing neural network architectures, I hope to reuse a reliable dense (fully connected) structure from one of the example image classification models from the videos. In theory, the convolutional layers will extract the key features, and a dense structure borrowed from another image classification task can then be optimized for these new features.

Laid out below are my architecture plans, along with some thoughts on training my CNN.

__Design parameters and Hyperparameters__

Decisions
* I will use ReLU (hidden layers) and sigmoid (output layer) as activation functions. This is not a design parameter that I plan to vary this time.
* I will primarily use 3x3xd convolutional filters, where d is the depth (number of channels) of the layer's input.
* As an optimization method, I will stick to SGD, which I am most familiar with, and plan to incorporate momentum if the Keras API allows it.
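
To make the momentum decision concrete, here is a minimal pure-Python sketch of the SGD-with-momentum update on a toy one-dimensional loss. It follows the velocity form documented for `tf.keras.optimizers.SGD`, which accepts `learning_rate` and `momentum` arguments. The function names and the quadratic loss are assumptions made for illustration and are not part of the assignment code.

```python
def sgd_momentum_step(w, v, grad, learning_rate=0.001, momentum=0.01):
    """Return the updated (weight, velocity) pair after one update step."""
    # The velocity keeps a decaying memory of past gradients
    v_new = momentum * v - learning_rate * grad
    # The weight moves along the velocity rather than the raw gradient
    w_new = w + v_new
    return w_new, v_new


def minimize_quadratic(steps=200, learning_rate=0.1, momentum=0.1, w0=5.0):
    """Minimize the toy loss f(w) = w**2 (gradient 2*w), starting from w0."""
    w, v = w0, 0.0
    for _ in range(steps):
        w, v = sgd_momentum_step(w, v, 2.0 * w, learning_rate, momentum)
    return w


if __name__ == "__main__":
    # The weight should shrink toward the minimum at w = 0
    print(minimize_quadratic())
```

With `momentum=0.0` the update reduces to plain SGD, which is why 0.0 is a natural baseline value to include when testing momentum.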

Hyperparameters
* Learning rate
    * Test: 0.01 | 0.001 | 0.0001 (3 values)
* Momentum
    * Test: 0.0 | 0.01 | 0.1 (3 values)
* Number of epochs (i.e., how much training)

Design
* Number of [Conv-Conv-Pool] layers
    - Test: 2, 3, 4
* Number of filters to use

Potential ways to improve a struggling model
* L2 regularization
* Batch normalization

I plan to use moderate training parameters at first (e.g. learning rate -> 0.001 and momentum -> 0.01) to experiment and narrow down some viable convolution designs.
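
The search plan above can be written down as data before any training happens. The sketch below enumerates the candidate values listed in this section, first as a full grid and then as the staged search described in the previous paragraph (fix moderate training parameters, vary the design, then tune training). The names `build_grid` and `staged_search` are hypothetical helpers, not functions from the notebook.

```python
from itertools import product

LEARNING_RATES = [0.01, 0.001, 0.0001]
MOMENTA = [0.0, 0.01, 0.1]
N_BLOCKS = [2, 3, 4]  # number of [Conv-Conv-Pool] repetitions


def build_grid(learning_rates, momenta, n_blocks):
    """Return every combination of the given values as a list of dicts."""
    return [
        {"learning_rate": lr, "momentum": m, "n_blocks": b}
        for lr, m, b in product(learning_rates, momenta, n_blocks)
    ]


def staged_search(best_n_blocks):
    """Stage 1 varies only the design; stage 2 tunes training for one design."""
    stage_1 = build_grid([0.001], [0.01], N_BLOCKS)
    stage_2 = build_grid(LEARNING_RATES, MOMENTA, [best_n_blocks])
    return stage_1, stage_2


if __name__ == "__main__":
    full = build_grid(LEARNING_RATES, MOMENTA, N_BLOCKS)
    stage_1, stage_2 = staged_search(best_n_blocks=3)
    print(len(full), len(stage_1), len(stage_2))
```

The full grid needs 27 training runs, while the staged version needs 3 + 9 = 12, which matters given the limited computing resources mentioned earlier.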

#### Helper Functions

In [1]:
# HELPER FUNCTIONS

import numpy as np
from skimage import io  # ASSUMPTION: `io` is skimage.io; the original import is not shown

def partial_load_data(n):
    # n == total number of images to load
    # NOTE: load_image_info() is defined elsewhere and is not shown in this section

    train_locs, test_locs, y_train_info = load_image_info()

    # Generate n random indices as a 1-D array so the loop below visits each one
    # (np.random.randint samples with replacement, so repeats are possible)
    rand_idx = np.random.randint(0, 200000, size=n)

    X = np.zeros((n, 96, 96, 3))
    X_IDs = []

    for i, ind in enumerate(rand_idx):
        img_file = train_locs[ind]
        img = io.imread(img_file)        # NOTE: io.imread() reads images in as numpy.ndarray

        X[i,:,:,:] = img / 255.0  # NOTE: MODIFYING ALL VALUES TO 0-1 SCALE!!!

        # Drop the first 6 and last 4 characters of the file name to keep the image ID
        X_IDs.append(img_file[6:-4])

    return X, X_IDs, y_train_info

def partial_train_val_split(X, y_info, split=(0.66, 0.34)):
    # generate indices for training and validation sets based on 'split'
    sz = len(X)
    n_train = int(split[0] * sz)   # 'split' is a tuple, so scale its first fraction
    n_val = sz - n_train           # the remaining samples go to validation

    rng = np.random.default_rng()
    idx_train = rng.choice(sz, size=n_train, replace=False, shuffle=False)
    idx_val = np.setdiff1d(np.arange(sz), idx_train)

    # TODO: separate X and y_info into separate datasets using idx_train / idx_val

    # return X_tr, y_tr, X_val, y_val
    return 1, 2, 3, 4   # placeholder values until the split above is implemented

___
#### Step 3 - Part 1: Trying to find a repeatable way to create a CNN!

In [2]:
import tensorflow as tf
from tensorflow.keras import layers, models

#import numpy as np
from helperfunctions import *

In [3]:
layers_lst = ["input","conv", "maxpool","conv","conv","maxpool","flatten","dense","dense","dense"]
layer_design = [
    {"filters":24, "kernel_size":(3,3), "padding":"valid", "data_format":"channels_last", "use_bias":True, "input_shape":(96,96,3)},
    {"filters":48, "kernel_size":(3,3), "padding":"valid", "data_format":"channels_last", "use_bias":True},
    {"pool_size":(2,2)},
    {"filters":64, "kernel_size":(3,3), "padding":"valid", "data_format":"channels_last", "use_bias":True},
    {"filters":72, "kernel_size":(3,3), "padding":"valid", "data_format":"channels_last", "use_bias":True},
    {"pool_size":(2,2)},
    None,
    {"size":96, "activation":'relu'},
    {"size":48, "activation":'relu'},
    {"size":1, "activation":'sigmoid'}]
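
The two parallel lists above encode one specific two-block network. Since the plan is to compare 2, 3, and 4 Conv-Conv-Pool repetitions, a small generator for these lists may help. The sketch below is one hypothetical way to do it: the function name `conv_conv_pool_spec`, the `base_filters` argument, and the rule of doubling the filter count per block are all assumptions for illustration, not the filter counts used above.

```python
def conv_conv_pool_spec(n_blocks, base_filters=24, input_shape=(96, 96, 3),
                        dense_sizes=(96, 48)):
    """Return (layers_lst, layer_design) for n_blocks Conv-Conv-Pool repetitions."""
    layers_lst, layer_design = [], []
    for block in range(n_blocks):
        filters = base_filters * 2 ** block  # ASSUMED schedule: double per block
        for conv in range(2):
            spec = {"filters": filters, "kernel_size": (3, 3), "padding": "valid",
                    "data_format": "channels_last", "use_bias": True}
            if block == 0 and conv == 0:
                # The very first layer also declares the input image shape
                spec["input_shape"] = input_shape
                layers_lst.append("input")
            else:
                layers_lst.append("conv")
            layer_design.append(spec)
        layers_lst.append("maxpool")
        layer_design.append({"pool_size": (2, 2)})
    # Flatten takes no parameters, so its design entry is None
    layers_lst.append("flatten")
    layer_design.append(None)
    for size in dense_sizes:
        layers_lst.append("dense")
        layer_design.append({"size": size, "activation": "relu"})
    # Single sigmoid unit for the binary (cancer / no cancer) output
    layers_lst.append("dense")
    layer_design.append({"size": 1, "activation": "sigmoid"})
    return layers_lst, layer_design


if __name__ == "__main__":
    names, design = conv_conv_pool_spec(3)
    print(names)
```

Calling `conv_conv_pool_spec(2)` produces the same sequence of layer names as `layers_lst` above, so the output can be fed to the same model-building loop.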

In [None]:
## BELOW CAN BE TURNED INTO A FUNCTION THAT TAKES THE PARAMETERS ABOVE AND
#  TURNS THEM INTO A MODEL!

model = tf.keras.Sequential()
for (l, d) in zip(layers_lst, layer_design):
    if l == "input":
        # ReLU is applied to every convolutional (hidden) layer, per the plan in this section
        model.add(layers.Conv2D(d["filters"], d["kernel_size"], padding=d["padding"], use_bias=d["use_bias"], activation="relu", input_shape=d["input_shape"]))
    elif l == "conv":
        model.add(layers.Conv2D(d["filters"], d["kernel_size"], padding=d["padding"], use_bias=d["use_bias"], activation="relu"))
    elif l == "maxpool":
        model.add(layers.MaxPool2D(d["pool_size"]))
    elif l == "flatten":
        model.add(layers.Flatten())
    elif l == "dense":
        model.add(layers.Dense(d["size"], activation=d["activation"]))
    # elif l == "output":
    #     model.add(layers.Dense(d["size"], activation=d["activation"])
    else:
        raise ValueError(f"Invalid layer provided for the model: {l!r}")

model.summary()
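
As a sanity check on what `model.summary()` should report, the spatial sizes can be traced by hand. With `"valid"` padding and stride 1, a 3x3 convolution shrinks each spatial dimension by 2, and 2x2 max pooling (stride equal to the pool size) halves it, rounding down. The sketch below applies this arithmetic to the layer sequence defined above; the helper names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=1):
    """Spatial output size of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool=2):
    """Spatial output size of max pooling with stride equal to the pool size."""
    return size // pool


# (layer kind, number of filters) for the convolutional part of the model
ARCHITECTURE = [("conv", 24), ("conv", 48), ("pool", None),
                ("conv", 64), ("conv", 72), ("pool", None)]


def trace_shapes(architecture, height=96, width=96, channels=3):
    """Return the (height, width, channels) shape after each layer."""
    shapes = [(height, width, channels)]
    for kind, filters in architecture:
        if kind == "conv":
            height, width = conv_output_size(height), conv_output_size(width)
            channels = filters  # each filter produces one output channel
        elif kind == "pool":
            height, width = pool_output_size(height), pool_output_size(width)
        else:
            raise ValueError(f"Unknown layer kind: {kind!r}")
        shapes.append((height, width, channels))
    return shapes


if __name__ == "__main__":
    shapes = trace_shapes(ARCHITECTURE)
    for shape in shapes:
        print(shape)
    h, w, c = shapes[-1]
    print("flattened length:", h * w * c)
```

The trace ends at 21 x 21 x 72, so the flatten layer hands 31,752 values to the first dense layer. That first dense layer therefore holds most of the model's weights, which is worth keeping in mind when choosing the number of blocks.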

___

#### Step 3 - Part 2: Loading some image data and splitting into training and validation

I don't want to use all of the available images for preliminary testing of model designs. Therefore, I'll set up some specific functions for training and validating on a small subset of the available images.

In [None]:
X, X_ids, y_info = partial_load_data(3000)


X_tr, y_tr, X_val, y_val = partial_train_val_split(X, y_info)

print(X_tr, y_tr, X_val, y_val)
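
The core of a train/validation split is choosing two disjoint sets of row indices. The sketch below shows that step on its own, using only the standard library. `split_indices` is a hypothetical helper, and it assumes that images and labels are stored in the same order so that one set of indices can select rows from both.

```python
import random


def split_indices(n_samples, train_fraction=0.66, seed=None):
    """Shuffle 0..n_samples-1 and cut it into train and validation index lists."""
    rng = random.Random(seed)  # a seed makes the split reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


if __name__ == "__main__":
    idx_train, idx_val = split_indices(3000, train_fraction=0.66, seed=0)
    print(len(idx_train), len(idx_val))
    # With NumPy arrays, the same lists select rows directly,
    # e.g. X[idx_train] and X[idx_val].
```

Because both lists come from one shuffled sequence, no image can land in both sets, which keeps the validation score an honest estimate.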