<a href="https://colab.research.google.com/github/ashley-ferreira/PHYS449_FinalProject/blob/main/notebooks/CNN_4way_TrainTest_Outline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **PHYS 449: Final Project Notebook**
#### Reproducing results from "Morphological classification of galaxies with deep learning: comparing 3-way and 4-way CNNs" by Mitchell K. Cavanagh, Kenji Bekki and Brent A. Groves

Note: this notebook assumes 4-way classification for now.

# **Import Packages**

Begin by importing all the needed packages

In [None]:
import keras
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D, BatchNormalization
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from tqdm import tqdm



# **Define Network Structure**
We are considering two 2D CNNs, C1 and C2, which are described in the paper and outlined below

In [None]:
def C1(input_shape, unique_labels=4, dropout_rate=0.5):
    '''
    Defines the 2D Convolutional Neural Network (CNN) called C1

    Parameters:
        input_shape (tuple): shape of one input image, (height, width, channels)
        unique_labels (int): number of unique labels (output classes)
        dropout_rate (float): fraction of units dropped by the Dropout layer

    Returns:
        model (keras Sequential model): uncompiled CNN to train
    '''

    model = Sequential()

    model.add(Conv2D(filters=32, input_shape=input_shape, activation='relu', kernel_size=(5,5)))
    model.add(Conv2D(filters=64, activation='relu', kernel_size=(5,5)))
    model.add(MaxPool2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dropout(dropout_rate))
    model.add(Dense(256, activation='relu'))

    model.add(Dense(unique_labels, activation='softmax')) 

    return model

In [None]:
def C2(input_shape, unique_labels=4, dropout_rate=0.5):
    '''
    Defines the 2D Convolutional Neural Network (CNN) called C2

    Parameters:
        input_shape (tuple): shape of one input image, (height, width, channels)
        unique_labels (int): number of unique labels (output classes)
        dropout_rate (float): fraction of units dropped by the Dropout layer

    Returns:
        model (keras Sequential model): uncompiled CNN to train
    '''

    model = Sequential()

    model.add(Conv2D(filters=32, input_shape=input_shape, activation='relu', kernel_size=(7,7)))
    model.add(BatchNormalization())
    model.add(MaxPool2D(pool_size=(2,2)))

    model.add(Conv2D(filters=64, activation='relu', kernel_size=(5,5)))
    model.add(BatchNormalization())
    model.add(Conv2D(filters=64, activation='relu', kernel_size=(5,5)))
    model.add(BatchNormalization())
    model.add(MaxPool2D(pool_size=(2,2)))

    model.add(Conv2D(filters=128, activation='relu', kernel_size=(3,3)))
    model.add(BatchNormalization())
    model.add(MaxPool2D(pool_size=(2,2)))

    model.add(Flatten())
    model.add(Dropout(dropout_rate))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))

    model.add(Dense(unique_labels, activation='softmax')) 

    return model
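
Every `Conv2D` and `MaxPool2D` layer above uses Keras's default `'valid'` padding, so each one shrinks the feature map. The sketch below traces the spatial size through both stacks in plain Python, assuming the 69×69 images used later in this notebook. The helper names `conv_out`, `pool_out`, and `trace` are hypothetical and not part of Keras.

```python
def conv_out(size, kernel):
    """Output width of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

def trace(size, layers):
    """Apply ('conv', k) / ('pool', p) steps in order and return the final width."""
    for kind, k in layers:
        size = conv_out(size, k) if kind == 'conv' else pool_out(size, k)
    return size

# Layer stacks copied from C1 and C2 above (kernel or pool sizes only)
C1_LAYERS = [('conv', 5), ('conv', 5), ('pool', 2)]
C2_LAYERS = [('conv', 7), ('pool', 2), ('conv', 5), ('conv', 5),
             ('pool', 2), ('conv', 3), ('pool', 2)]

c1_side = trace(69, C1_LAYERS)
c2_side = trace(69, C2_LAYERS)

# Flatten() yields side * side * (filter count of the last conv layer) features
c1_flat = c1_side * c1_side * 64
c2_flat = c2_side * c2_side * 128
print(c1_side, c1_flat)
print(c2_side, c2_flat)
```

Under these assumptions C1 passes 57,600 features to its first dense layer against 2,048 for C2, so most of C1's parameters sit in that single `Dense(256)` layer.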

# **Load Data**


Downloaded Galaxy10 successfully to /root/.astroNN/datasets/Galaxy10.h5




# **Sample Data**
Here we check that the data files are how we expect them to be

# **Split Data**
Here we split the data into training and testing datasets (the validation split will be done by Keras during training).

In [None]:
# splitting into training and testing
X_train, X_test, y_train, y_test = train_test_split(images, labels, test_size = 0.15)
print(X_train.shape)
print(y_train.shape)

(18517, 69, 69, 3)
(18517, 10)
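
The printed row count can be checked by hand. scikit-learn rounds a fractional `test_size` up to a whole number of samples and gives the remainder to the training set. The sketch below reproduces that arithmetic, assuming the full Galaxy10 set holds 21,785 images (a figure not stated in this notebook); `split_sizes` is a hypothetical helper.

```python
import math

def split_sizes(n_samples, test_size):
    """Mirror train_test_split sizing: test count is rounded up, train gets the rest."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

# 21785 is an assumed dataset size, not a value printed in this notebook
n_train, n_test = split_sizes(21785, 0.15)
print(n_train, n_test)
```

Under that assumption the training count matches the 18,517 rows printed above. Passing `random_state` to `train_test_split` would make the split repeatable between runs.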


# **Training**
Ideally we use separate notebooks to train each network.

C2 uses Adam, whereas C1 uses Adadelta:

  https://www.aanda.org/articles/aa/full_html/2020/09/aa37963-20/aa37963-20.html


In [None]:
network_to_train = 'C1'

# define hyperparameters of training
if network_to_train == 'C1':
    n_epochs = 13
    # can't find learning rate mentioned so I'm leaving it as default for now
    opt = keras.optimizers.Adadelta()
    cn_model = C1(X_train.shape[1:])
elif network_to_train == 'C2':
    n_epochs = 20
    lr = 2e-4
    opt = keras.optimizers.Adam(learning_rate=lr)
    cn_model = C2(X_train.shape[1:])
else:
    raise ValueError("network_to_train must be 'C1' or 'C2'")

In [None]:
# show model architecture
cn_model.summary()

In [None]:
# setup W&B tracking 


In [None]:
# add early stopping (optional, if used set epochs to 100 as max)
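
Keras offers early stopping through the `keras.callbacks.EarlyStopping` callback, passed to `fit` via `callbacks=[...]`. The rule it applies can be sketched in plain Python as below. The `patience` value and the loss list are made-up illustrations, and `stopping_epoch` is a hypothetical helper, not a Keras function.

```python
def stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), 0-indexed, for patience-based early stopping.

    Training stops once `patience` epochs in a row fail to improve the best
    validation loss by more than `min_delta`. If that never happens,
    stop_epoch is the last epoch.
    """
    best_loss = float('inf')
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: improvement stalls after epoch 3
losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.60, 0.65, 0.66]
stop, best = stopping_epoch(losses, patience=3)
print(stop, best)
```

With `restore_best_weights=True` the Keras callback also rolls the model back to the best epoch, which is why the sketch tracks it.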

In [None]:
# train the model
# the loss is logged automatically, so only accuracy is listed as a metric
cn_model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

print('Model initialized and prepped, begin training...')
# hold out part of the training data for validation so the test set stays unseen;
# val_frac = 0.1 is an assumed value, not one taken from the paper
val_frac = 0.1
classifier = cn_model.fit(X_train, y_train, epochs=n_epochs, validation_split=val_frac)

Note on the cell above: should we feed the data in batches with Keras? If worst comes to worst we can do it in PyTorch, but these articles seem helpful for getting started:

1. https://meatba11.medium.com/keras-loading-and-processing-images-in-batches-1cff1b0f4aa4
2. https://machinelearningmastery.com/how-to-load-large-datasets-from-directories-for-deep-learning-with-keras/
3. https://stackoverflow.com/questions/61021025/split-data-into-batches
4. https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly

And more online, so I think we can figure it out
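
As a starting point, batching can be as simple as a generator that walks through shuffled indices in fixed-size slices; Keras `fit` accepts a Python generator that yields `(inputs, targets)` tuples. The sketch below is plain Python and only yields index lists. The batch size of 32 is an assumption rather than a value from the paper, and `batch_indices` is a hypothetical helper.

```python
import math
import random

def batch_indices(n_samples, batch_size, seed=0):
    """Yield lists of shuffled sample indices, one list per batch.

    The final batch is smaller when n_samples is not a multiple of batch_size.
    """
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

n_samples, batch_size = 18517, 32  # batch_size is an assumed value
batches = list(batch_indices(n_samples, batch_size))
steps_per_epoch = math.ceil(n_samples / batch_size)
print(len(batches), len(batches[-1]))
```

Each index list would then be used to slice `X_train` and `y_train` (or to read only those images from disk), so that a single batch is in memory at a time.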


In [None]:
# plot accuracy/loss versus epoch
fig1 = plt.figure(figsize=(10,3))

ax1 = plt.subplot(121)
ax1.plot(classifier.history['accuracy'], color='darkslategray', linewidth=2, label='training')
ax1.plot(classifier.history['val_accuracy'], linewidth=2, label='validation')
ax1.legend()
ax1.set_title('Model Accuracy')
ax1.set_ylabel('Accuracy')
ax1.set_xlabel('Epoch')

ax2 = plt.subplot(122)
ax2.plot(classifier.history['loss'], color='crimson', linewidth=2, label='training')
ax2.plot(classifier.history['val_loss'], linewidth=2, label='validation')
ax2.legend()
ax2.set_title('Model Loss')
ax2.set_ylabel('Loss')
ax2.set_xlabel('Epoch')

fig1.savefig(model_dir_name +'/plots/'+'CNN_training_history.png')

plt.show()

# **Testing**
Here we apply the model to the test set and create a confusion matrix to gauge performance
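
The labels in this notebook appear to be one-hot encoded, and `predict` returns one probability per class, so both have to be reduced to class indices before counting. A minimal sketch of that step in plain Python, using made-up 3-class data (on NumPy arrays, `np.argmax(..., axis=1)` does the same reduction); `argmax_rows` and `confusion` are hypothetical helpers.

```python
def argmax_rows(rows):
    """Index of the largest entry in each row (first index wins ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

def confusion(true_idx, pred_idx, n_classes):
    """Count matrix with true classes on rows and predicted classes on columns."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_idx, pred_idx):
        matrix[t][p] += 1
    return matrix

# Made-up one-hot labels and softmax-like outputs for four samples
y_onehot = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]]

cm_demo = confusion(argmax_rows(y_onehot), argmax_rows(probs), n_classes=3)
print(cm_demo)
```

Row sums give the number of true samples per class, and the diagonal holds the correct predictions.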

In [None]:
# make predictions on test set and compare to real labels
preds_test = cn_model.predict(X_test, verbose=1)
results = cn_model.evaluate(X_test, y_test) 
print("test loss, test acc:", results)

In [None]:
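# plot confusion matrix
from sklearn.metrics import confusion_matrix

# convert one-hot labels and softmax outputs to class indices
true_classes = np.argmax(y_test, axis=1)
pred_classes = np.argmax(preds_test, axis=1)
cm = confusion_matrix(true_classes, pred_classes)

fig2 = plt.figure()
plt.matshow(cm, fignum=fig2.number)  # draw into fig2 instead of opening a new figure

for (i, j), z in np.ndenumerate(cm):
    plt.text(j, i, '{:d}'.format(z), ha='center', va='center')
plt.title('Confusion matrix (test data)')
plt.colorbar()
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
# save before show(), which can clear the current figure
fig2.savefig(model_dir_name + '/plots/' + 'CNN_confusion_matrix.png')
plt.show()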