# Model Performance Notebook
This notebook records the training epochs and performance of each relevant model.

## Modeling

## Model 4

In [86]:
history4 = model4.fit(
    train_data,
    batch_size=16,
    validation_data=valid_data,
    epochs=10,
    callbacks=[early_stop,cp_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 9: early stopping


Epoch 10:

![Screen%20Shot%202022-10-06%20at%201.55.09%20PM.png](attachment:Screen%20Shot%202022-10-06%20at%201.55.09%20PM.png)

## Model 11

In [215]:
history11 = model11.fit(
    train_data,
    batch_size=32,
    validation_data=valid_data,
    epochs=5,
    callbacks=[early_stop,model_checkpoint_callback])

Epochs 20-25

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Epochs 1-10

![Screen%20Shot%202022-10-05%20at%209.45.32%20PM.png](attachment:Screen%20Shot%202022-10-05%20at%209.45.32%20PM.png)

Epochs 10-20

![Screen%20Shot%202022-10-06%20at%201.02.22%20PM.png](attachment:Screen%20Shot%202022-10-06%20at%201.02.22%20PM.png)

In [27]:
# instantiating model
model = Sequential()

# Convolutional Input Layer
model.add(Conv2D(
    input_shape=(256,256,1), # Input Shape set to match image shape 256,256,1
    filters=32, # filters at 32, should be fine to not overfit
    kernel_size=(3,3), # 3x3x1 for grayscale images 
    activation='relu', 
    padding='same')) # same spatial dimensions as inputs padding evenly left to right

# downsampling input along spatial dimensions
model.add(MaxPool2D(
      pool_size=(2,2)))

# hidden layer with dropout of 20 percent and l2 regularization factor of 0.05
model.add(Dense(128, 
                activation='relu',
                kernel_regularizer=l2(0.05)))
model.add(Dropout(.20))

# make the multidimensional input one-dimensional
model.add(Flatten())

# binary output layer
model.add(Dense(1,
    activation='sigmoid'))
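
As a rough check on Model 1's size, the sketch below traces the tensor shape and parameter count through the layers defined above using plain arithmetic. It assumes Keras's documented behavior that `Dense` applied to a 4D tensor acts only on the last (channel) axis, so the `Dense(128)` placed before `Flatten` leaves the 128x128 spatial grid intact. The helper names `conv2d_params` and `dense_params` are hypothetical and not part of the notebook.

```python
# Plain-arithmetic sketch of Model 1's shapes and parameter counts.
# Helper names are illustrative, not Keras APIs.

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units

h, w, c = 256, 256, 1                  # input_shape=(256,256,1)

conv = conv2d_params(3, 3, c, 32)      # Conv2D(32, (3,3), padding='same')
c = 32                                 # 'same' padding keeps h and w at 256

h, w = h // 2, w // 2                  # MaxPool2D((2,2)) halves h and w

hidden = dense_params(c, 128)          # Dense(128) acts on the channel axis
c = 128                                # shape is now (128, 128, 128)

flat = h * w * c                       # Flatten()
out = dense_params(flat, 1)            # Dense(1, activation='sigmoid')

print("conv params:   ", conv)
print("hidden params: ", hidden)
print("flattened size:", flat)
print("output params: ", out)
print("total params:  ", conv + hidden + out)
```

Under these assumptions nearly all of the weights sit in the final `Dense(1)` layer, which sees a flattened vector of roughly 2.1 million features. That may be one contributor to the overfitting noted for Model 1; calling `model.summary()` would confirm the actual counts.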

In [28]:
model.compile(loss='binary_crossentropy', 
              optimizer='adam', metrics=['accuracy'])

early_stop = EarlyStopping(
    monitor='val_accuracy', # monitoring validation accuracy
    min_delta=0.03, # only count improvements greater than 0.03 (3 percentage points)
    verbose=1, # display callback message
    patience=6) # setting to 6 epochs due to consistent fluctuation 
                # and strong performance in late epochs
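
To make the interaction of `min_delta` and `patience` concrete, here is a minimal pure-Python sketch of the stopping rule. It assumes the usual Keras semantics for a maximized metric: an epoch counts as an improvement only if it beats the best value so far by more than `min_delta`, and training stops once `patience` epochs pass without one. The function name `early_stop_epoch` and the validation accuracies are made up for illustration; they are not results from this notebook.

```python
def early_stop_epoch(val_accuracies, min_delta=0.03, patience=6):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc - min_delta > best:   # improvement must exceed min_delta
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical, nearly flat validation accuracy (not real notebook output).
flat_run = [0.50, 0.51, 0.50, 0.52, 0.51, 0.50, 0.51]
print(early_stop_epoch(flat_run))        # 7

# Hypothetical run that keeps improving by more than 0.03 often enough.
improving_run = [0.50, 0.54, 0.55, 0.58, 0.62, 0.63, 0.66]
print(early_stop_epoch(improving_run))   # None
```

With `patience=6`, a run whose validation accuracy never beats the first epoch by more than 0.03 stops at epoch 7, the earliest point at which this rule can fire.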

In [29]:
history = model.fit(
    train_data,
    batch_size=32,
    validation_data=valid_data,
    epochs=10,
    callbacks=[early_stop])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 7: early stopping


### Model 1 Results:
- Pro: training loss consistently dropping while training accuracy consistently rises
- Con: val_loss all over the place, val_accuracy stagnant

Model 1 was overfit, or the loss definition was likely wrong. Measures to counter: 
- added BatchNorms after each layer 
- an additional dense layer with fewer neurons w/similar penalties
- harsher l2 penalties 
- harsher dropouts
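
The pattern described above (training metrics improving while validation metrics stall) can be checked numerically from the dictionary Keras stores in `history.history`. The sketch below assumes the standard keys produced by `metrics=['accuracy']` with validation data (`accuracy` and `val_accuracy`). The helper name `accuracy_gap` and the numbers are hypothetical, not this notebook's results.

```python
def accuracy_gap(history_dict):
    """Per-epoch difference between training and validation accuracy."""
    return [
        round(train - val, 4)
        for train, val in zip(history_dict["accuracy"],
                              history_dict["val_accuracy"])
    ]

# Made-up values shaped like `history.history`.
example = {
    "accuracy":     [0.70, 0.80, 0.88, 0.93],
    "val_accuracy": [0.62, 0.63, 0.62, 0.63],
}

gaps = accuracy_gap(example)
print(gaps)

# A gap that widens every epoch is one common sign of overfitting.
widening = all(later > earlier for earlier, later in zip(gaps, gaps[1:]))
print(widening)
```

In the notebook this would be called as `accuracy_gap(history.history)` after `model.fit` returns.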

## Model 2

In [24]:
# instantiating model
model2 = Sequential()

# Convolutional Input Layer
model2.add(Conv2D(
    input_shape=(256,256,1),
    filters=32, # keeping filters at 32 
    kernel_size=(3,3), 
    activation='relu', 
    padding='same')) 

# maintaining mean and std output close to 0 and 1
model2.add(BatchNormalization())

model2.add(MaxPool2D(
      pool_size=(2,2)))

# hidden layer with dropout of 20 percent and l2 regularization factor of 0.05
model2.add(Dense(128, 
                activation='relu',
                kernel_regularizer=l2(0.05)))
model2.add(Dropout(.20))

# adding Conv2D layer
model2.add(Conv2D(
    filters=16, # reducing filters to 16 to counter overfitting
    kernel_size=(3,3),
    activation='relu',
    padding='same'))

model2.add(BatchNormalization())

# hidden layer with dropout of 15 percent and l2 regularization factor of 0.01
model2.add(Dense(32, 
                activation='relu',
                kernel_regularizer=l2(0.01)))

# batch norm
model2.add(BatchNormalization())

model2.add(Dropout(.15))

# downsampling input along spatial dimensions
model2.add(MaxPool2D(
      pool_size=(2, 2)))

# make the multidimensional input one-dimensional
model2.add(Flatten())

# binary output layer
model2.add(Dense(1,
    activation='sigmoid'))

In [25]:
model2.compile(loss='binary_crossentropy', 
              optimizer='adam', metrics=['accuracy'])

early_stop = EarlyStopping(
    monitor='val_accuracy', # monitoring validation accuracy
    min_delta=0.03, # only count improvements greater than 0.03 (3 percentage points)
    verbose=1, # display callback message
    patience=6) # setting to 6 epochs due to consistent fluctuation 
                # and strong performance in late epochs

In [26]:
history2 = model2.fit(
    train_data,
    batch_size=32,
    validation_data=valid_data,
    epochs=10,
    callbacks=[early_stop])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


### Model 2 Results
Model 2 is better but still overfit; a potential issue could be the "complexity" of the model. Val_loss jumped on the last epoch, indicating overfitting, though the model may just need more time to run.

Model 3 will:
- reduce filters in input layer
- remove dense layer 
- drastically harsher l2 penalty
- harsher dropout
- raise learning rate from .001 to .01

## Model 3

In [27]:
# instantiating model
model3 = Sequential()

# Convolutional Input Layer
model3.add(Conv2D(
    input_shape=(256,256,1), # Input Shape 
    filters=16, # reducing filters to 16 to avoid overfitting
    kernel_size=(3,3), # 3x3x1 for grayscale images 
    activation='relu', 
    padding='same')) # same spatial dimensions as inputs padding evenly left to right

# adding batch norm 
model3.add(BatchNormalization())

# downsampling input along spatial dimensions
model3.add(MaxPool2D(
      pool_size=(2,2)))

# hidden layer with dropout of 30 percent and l2 regularization factor of 0.1 to counter overfitting
model3.add(Dense(128, 
                activation='relu',
                kernel_regularizer=l2(0.1)))
model3.add(Dropout(.30))

# another batch norm
model3.add(BatchNormalization()) # wrong spot!

# make the multidimensional input one-dimensional
model3.add(Flatten())

# binary output layer
model3.add(Dense(1,
    activation='sigmoid'))

In [28]:
opt = keras.optimizers.Adam(learning_rate=0.01) # increasing learning rate to counter overfitting

model3.compile(loss='binary_crossentropy', 
              optimizer=opt, metrics=['accuracy'])

early_stop = EarlyStopping(
    monitor='val_accuracy', # monitoring validation accuracy
    min_delta=0.03, # only count improvements greater than 0.03 (3 percentage points)
    verbose=1, # display callback message
    patience=6) # setting to 6 epochs due to consistent fluctuation 
                # and strong performance in late epochs

In [29]:
history3 = model3.fit(
    train_data,
    batch_size=32,
    validation_data=valid_data,
    epochs=10,
    callbacks=[early_stop])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Epoch 10: early stopping


### Model 3 Results
Model 3 loss is consistently decreasing, but accuracy is stagnant. Val loss and acc are stagnant as well. The learning rate may be too high, but first will try:
- conv > batch norm > activation > dropout > pool ordering
- keeping only dropout and not the regularizer
- reduce dense neurons to 64
- reduce batch_size to 16
- increasing/decreasing learning rate?

## Model 12

In [93]:
model12 = tf.keras.models.Sequential([
    # Input Layer
    tf.keras.layers.Conv2D(input_shape=(256,256,1),padding='same',filters=16,kernel_size=(3,3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.MaxPool2D(2,padding='same'),
    
    # Dense Layer
    tf.keras.layers.Dense(32),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.MaxPool2D(2,padding='same'),
    
    # Dense to Output
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Activation('sigmoid')
])

opt = keras.optimizers.Adam(learning_rate = 0.003, 
                            beta_1 = 0.9, beta_2 = 0.999, # exponential decay rates for the 1st and 2nd moment estimates
                            epsilon = 0.1, decay = 0.0)

model12.compile(loss='binary_crossentropy', 
                optimizer=opt, metrics=['accuracy'])

annealer = ReduceLROnPlateau(monitor = 'val_accuracy', # Reduce LR if val_acc is stagnant
                             factor = 0.70, # multiplier applied on each reduction: new_lr = lr * factor
                             patience = 5, # epochs to wait 
                             verbose = 1, # Notify
                             min_lr = 1e-4) # lower limit on LR

# es = EarlyStopping(
#     monitor='val_loss', 
#     mode='min',
#     baseline=.03,
#     verbose=1, 
#     patience=3)
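
For a sense of how the `annealer` settings play out, the following sketch applies the reduction rule repeatedly in plain Python. It assumes the documented `ReduceLROnPlateau` update `new_lr = max(lr * factor, min_lr)` and that every reduction is actually triggered, meaning `val_accuracy` stalls for `patience` epochs each time. The function name `lr_schedule` is hypothetical.

```python
def lr_schedule(initial_lr=0.003, factor=0.70, min_lr=1e-4):
    """List the learning rates after each successive plateau reduction."""
    rates = [initial_lr]
    lr = initial_lr
    while lr > min_lr:
        lr = max(lr * factor, min_lr)   # never drop below the floor
        rates.append(lr)
    return rates

rates = lr_schedule()
for step, lr in enumerate(rates):
    print(f"after {step} reductions: {lr:.6f}")

reductions = len(rates) - 1
print("reductions to reach min_lr:", reductions)

# With patience=5, each reduction needs at least 5 stalled epochs.
print("minimum stalled epochs needed:", reductions * 5)
```

Under these assumptions the floor of 1e-4 is reached only after 10 reductions, so a short run of 5 to 10 epochs would see only the first step or two of this schedule.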