# Supplementary 4: Using grid search to train a sex classification model

To train a sex classification model using masks obtained from MaskRCNN, I tuned the hyperparameters with a grid search: a set of candidate values is defined for each hyperparameter, then a model is trained for every possible combination and the best-performing one is kept.

In [1]:
#load packages
library(keras)

## Model Architecture

I wrote a function that specifies the model architecture, compiles it, and trains it for up to 150 epochs. I also added early stopping: training stops once validation loss has failed to improve on its best value for more than 10 epochs in a row, which suggests the model has started to overfit.

In [None]:
#Function to train a simple CNN model:

TrainSimple <- function(ModelName = 1, CNN,Dense_1,Drop_1,Dense_2,Drop_2){ #hyper parameters as input
  #define model architecture:
    SexModel <- keras_model_sequential()%>%
    layer_conv_2d(filters=CNN, kernel_size=c(3,3), activation = 'relu',
                          input_shape=c(256,144,3), data_format="channels_last") %>%
    layer_max_pooling_2d(pool_size = c(3,3)) %>%
    layer_flatten() %>%
    layer_dense(units=Dense_1, activation = 'relu')%>%
    layer_dropout(rate=Drop_1) %>%
    layer_dense(units=Dense_2, activation='relu')%>%
    layer_dropout(rate=Drop_2)%>%
    layer_dense(units=1, activation='sigmoid') %>%
    compile(
      optimizer=optimizer_adam(lr=1e-05),
      loss='binary_crossentropy',
      metrics=c('accuracy')
    )
    
  #train model with early stopping algorithm:
  Best.Loss <- Inf #pre allocate
  
  for(i in 1:150){
    print(paste("epoch:",i))
    Fit.History <- SexModel %>% fit(TrainArray,TrainVect,epochs= 1,
                                    validation_data=list(ValArray,ValVect),
                                    shuffle=TRUE, batch_size = 16)
    Val.Loss <- Fit.History[[2]]$val_loss
    
    #if val loss improved, save the model; otherwise count epochs without improvement
    if(Val.Loss<Best.Loss){
      print("New lowest val loss")
      Best.Loss<- Val.Loss
      counter <- 0
      BestModel <- Fit.History
      BestEpoch <- i
        
      SexModel %>% save_model_hdf5(paste("../Models/Sex_M",ModelName,"_E",i, sep=""))# saves model
      
    }else{
      counter <- counter +1
      #If val loss did not improve for more than 10 epochs in a row, training will stop
      #(counter > 10 triggers on the 11th consecutive epoch without improvement)
      if(counter >10){ 
        print("Val.Loss did not drop below the best value for more than 10 epochs")
        print(paste("Best Model is Epoch", BestEpoch))
        #make a copy of best model:
        system(paste("cp ../Models/Sex_M",ModelName,"_E",BestEpoch," ../Models/BestModelSex-",ModelName, sep=""))
        #remove the per-epoch models of this run only ("_E*" avoids matching other model numbers, e.g. M1 vs M10)
        system(paste("rm ../Models/Sex_M",ModelName,"_E*", sep=""))
        break #stops looping
      }
    }
  }
  
  return(BestModel)
  
  
}
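
The stopping rule inside `TrainSimple` can be checked in isolation, without training a network. The sketch below replays the same `counter > 10` comparison on a vector of validation losses. The helper `simulate_early_stopping` and the loss values are made up for illustration and are not part of the original pipeline.

```r
# Hypothetical helper: replays the patience rule on a vector of validation losses.
simulate_early_stopping <- function(val_losses, patience = 10) {
  best_loss <- Inf
  best_epoch <- NA_integer_
  counter <- 0
  stopped_at <- length(val_losses)  # default: every epoch was run
  for (i in seq_along(val_losses)) {
    if (val_losses[i] < best_loss) {
      # new lowest validation loss: remember it and reset the counter
      best_loss <- val_losses[i]
      best_epoch <- i
      counter <- 0
    } else {
      counter <- counter + 1
      if (counter > patience) {  # same comparison as in TrainSimple
        stopped_at <- i
        break
      }
    }
  }
  list(best_epoch = best_epoch, best_loss = best_loss, stopped_at = stopped_at)
}

# Made-up losses: improve for 5 epochs, then plateau above the best value.
toy_losses <- c(0.70, 0.65, 0.60, 0.58, 0.55, rep(0.60, 20))
result <- simulate_early_stopping(toy_losses, patience = 10)
print(result)
```

With these toy values the best epoch is 5 and the loop stops at epoch 16, that is, on the 11th consecutive epoch without improvement. keras also provides `callback_early_stopping()` with `monitor` and `patience` arguments, which might replace the manual loop; whether it would stop at exactly the same epoch was not checked here.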


## Defining Hyperparameters

Hyperparameters were defined in a list, and every unique combination was then expanded into a data frame.

In [3]:
## Listing possible combinations
Params <- list(CNN = c(32,64,128),
               Drop_1 = c(0,0.1,0.2), Drop_2 = c(0.1,0.2,0.3),
               Dense_1 = c(32,64,128), Dense_2 = c(32,64,128))

ParamComb <- expand.grid(Params)
ParamComb$Acc <- NA
ParamComb$Loss <- NA
ParamComb$ValLoss <- NA
ParamComb$ValAcc <- NA
head(ParamComb)

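|   | CNN | Drop_1 | Drop_2 | Dense_1 | Dense_2 | Acc | Loss | ValLoss | ValAcc |
|---|-----|--------|--------|---------|---------|-----|------|---------|--------|
|   | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<lgl>` | `<lgl>` | `<lgl>` | `<lgl>` |
| 1 | 32  | 0.0 | 0.1 | 32 | 32 | NA | NA | NA | NA |
| 2 | 64  | 0.0 | 0.1 | 32 | 32 | NA | NA | NA | NA |
| 3 | 128 | 0.0 | 0.1 | 32 | 32 | NA | NA | NA | NA |
| 4 | 32  | 0.1 | 0.1 | 32 | 32 | NA | NA | NA | NA |
| 5 | 64  | 0.1 | 0.1 | 32 | 32 | NA | NA | NA | NA |
| 6 | 128 | 0.1 | 0.1 | 32 | 32 | NA | NA | NA | NA |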
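
As a quick check on the size of the search, the number of rows produced by `expand.grid` is the product of the number of candidate values per hyperparameter. The sketch below uses the same values as above; the names `params`, `grid`, `n_per_param` and `n_combinations` are introduced here only for illustration.

```r
# Same candidate values as in the grid defined above
params <- list(CNN = c(32, 64, 128),
               Drop_1 = c(0, 0.1, 0.2), Drop_2 = c(0.1, 0.2, 0.3),
               Dense_1 = c(32, 64, 128), Dense_2 = c(32, 64, 128))

# Number of candidate values per hyperparameter, and their product
n_per_param <- lengths(params)
n_combinations <- prod(n_per_param)

# expand.grid builds one row per combination; the first column varies fastest
grid <- expand.grid(params)
stopifnot(nrow(grid) == n_combinations)
print(n_combinations)
```

Five hyperparameters with three values each give 3^5 = 243 combinations, so the training loop fits up to 243 models.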


## Model Training

The script then loops through every hyperparameter combination and trains a model for each one. The results table is saved after each model finishes, so the combination with the lowest validation loss can be identified afterwards.

In [None]:
for (j in 1:nrow(ParamComb)){
  print(ParamComb[j,])
  #train one model per combination; the row index doubles as the model name
  History <- TrainSimple(ModelName = j, CNN = ParamComb$CNN[j],
                         Drop_1 = ParamComb$Drop_1[j], Drop_2 = ParamComb$Drop_2[j],
                         Dense_1 = ParamComb$Dense_1[j], Dense_2 = ParamComb$Dense_2[j])
  #History[[2]] holds the metrics of the best epoch, accessed by position
  ParamComb[j,"Acc"] <- History[[2]][[2]]
  ParamComb[j,"Loss"] <- History[[2]][[1]]
  ParamComb[j,"ValLoss"] <- History[[2]][[3]]
  ParamComb[j,"ValAcc"] <- History[[2]][[4]]
  #save the results table after every model so progress is not lost
  write.csv(ParamComb, file="../Data/SimpleCNN.csv")
}


The best model obtained was combination 158, with the following hyperparameters:

In [7]:
SimpleCNN <- read.csv("../../MaskPipeline/Data/SimpleCNN.csv")
print(SimpleCNN[158,])

    X.4 X.3 X.2 X.1   X CNN Drop_1 Drop_2 Dense_1 Dense_2       Acc      Loss
158 158 158 158 158 158  64    0.1    0.3     128      64 0.9196666 0.2562514
      ValLoss   ValAcc
158 0.5525219 0.772296
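
The notebook does not show how combination 158 was identified. One plausible way, sketched below on a made-up results table, is to take the row with the lowest validation loss using `which.min`. The data frame `toy_results` and its values are illustrative assumptions, not results from this study.

```r
# Made-up results table reusing column names from SimpleCNN.csv
toy_results <- data.frame(
  CNN     = c(32, 64, 128),
  ValLoss = c(0.61, 0.55, 0.58),
  ValAcc  = c(0.74, 0.77, 0.75)
)

# Row index of the lowest validation loss (which.min discards NA entries,
# so combinations that were never trained are ignored)
best_row <- which.min(toy_results$ValLoss)
print(toy_results[best_row, ])

# Ranking all combinations by validation loss gives a fuller picture
ranked <- toy_results[order(toy_results$ValLoss), ]
print(head(ranked, 2))
```

Applied to the real table, the same call would be `which.min(SimpleCNN$ValLoss)`, assuming every row of interest has a recorded `ValLoss`.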
