I. Data Preprocessing
First, the training data is divided into two sets: a validation set (10%) and a training set (90%). Then the mean image of the training set is calculated by averaging over all pixel values as follows:
mean_img = Xtr.mean()
The mean image is then used to zero-center all input data to the neural network by subtracting it from the training, validation, and testing sets.
In addition, the image values are scaled to the range 0 to 1 by dividing every pixel value by 255.0, so that the input and output data are in the same range.
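The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the original code: the function name `preprocess`, the random shuffling, and the flattened array shapes are assumptions. Only the 90/10 split, the `Xtr.mean()` call, and the division by 255.0 come from the text.

```python
import numpy as np

def preprocess(X_train_full, X_test, val_fraction=0.1, seed=0):
    """Split off a validation set, zero-center with the training mean, scale by 255."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X_train_full.shape[0])   # shuffling is an assumption
    n_val = int(round(val_fraction * len(idx)))
    val_idx, tr_idx = idx[:n_val], idx[n_val:]

    Xtr = X_train_full[tr_idx].astype(np.float64)
    Xval = X_train_full[val_idx].astype(np.float64)
    Xte = X_test.astype(np.float64)

    # Scalar mean over all training pixels, as in the text.
    # Xtr.mean(axis=0) would instead give a per-pixel mean image.
    mean_img = Xtr.mean()

    # The same training-set statistic is applied to all three sets.
    Xtr = (Xtr - mean_img) / 255.0
    Xval = (Xval - mean_img) / 255.0
    Xte = (Xte - mean_img) / 255.0
    return Xtr, Xval, Xte, tr_idx, val_idx

# Usage with random stand-in data shaped like flattened 32x32x3 images.
data_rng = np.random.default_rng(1)
X_full = data_rng.integers(0, 256, size=(100, 3072), dtype=np.uint8)
X_test = data_rng.integers(0, 256, size=(20, 3072), dtype=np.uint8)
Xtr, Xval, Xte, tr_idx, val_idx = preprocess(X_full, X_test)
```

The returned index arrays can be used to split the labels in the same way. Because subtraction and division are both linear, scaling before or after centering gives the same result.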
II. Sanity Checks
Sanity checks were applied to make sure that the neural network implementation is correct. The first check is observing the initial loss. Since the CIFAR-100 data is labeled here with 20 classes, the initial loss is expected to be -ln(1/20), approximately 2.996, which was approximately the value of the initial loss with randomly initialized weights, as shown below:
The second sanity check is overfitting. I trained the neural network on a small number of training examples until the training accuracy reached 100%.
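Both sanity checks can be illustrated with a small sketch. To keep it short, a linear softmax classifier on random stand-in data is used here in place of the actual neural network and CIFAR-100 images. The function `softmax_loss`, the weight scale, the learning rate, and the subset size are all assumptions made for illustration.

```python
import numpy as np

def softmax_loss(W, b, X, y):
    """Mean cross-entropy loss of a linear softmax classifier, with gradients."""
    scores = X @ W + b
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    n = X.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    dscores = probs.copy()
    dscores[np.arange(n), y] -= 1.0
    dscores /= n
    return loss, X.T @ dscores, dscores.sum(axis=0)

rng = np.random.default_rng(0)
num_classes, dim = 20, 3072

# Stand-in data (assumption): random inputs and random labels.
X = rng.normal(size=(256, dim))
y = rng.integers(0, num_classes, size=256)

# Check 1: with small random weights, the loss should be close to -ln(1/20).
W = 1e-4 * rng.normal(size=(dim, num_classes))
b = np.zeros(num_classes)
initial_loss, _, _ = softmax_loss(W, b, X, y)
expected_loss = -np.log(1.0 / num_classes)

# Check 2: a tiny subset should be fit perfectly by plain gradient descent.
Xs, ys = X[:20], y[:20]
for _ in range(200):
    loss, dW, db = softmax_loss(W, b, Xs, ys)
    W -= 0.01 * dW
    b -= 0.01 * db
train_acc = np.mean(np.argmax(Xs @ W + b, axis=1) == ys)

print(f"initial loss {initial_loss:.3f}, expected {expected_loss:.3f}")
print(f"training accuracy on the small subset: {train_acc:.2f}")
```

If the initial loss is far from the expected value, or the small subset cannot be fit, the loss or gradient code is the first place to look.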
III. Hyperparameter Optimization
- Min range of the learning rate: For a learning rate of 10^(-10), the loss barely changes, as shown below:
- Max range of the learning rate: For a learning rate of 10^(-4), the loss explodes, as shown below:
- Coarse search: After finding the minimum and maximum of the learning rate range, a coarse search was run for 100 iterations to tune the learning rate and regularization parameters together. The search was done in the range shown below:
- Fine search: After examining the coarse search results, the range was fixed to:
The 20 best sets of learning parameters, i.e. those that gave the highest validation accuracy, are shown below:
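For illustration only, the coarse-to-fine procedure described in this section could be sketched as a random search in log space. Whether the original search was random or grid-based is not stated, so that choice is an assumption. The regularization range, the trial counts, and the stand-in `toy_evaluate` function are hypothetical as well. Only the learning rate bounds 10^(-10) and 10^(-4) come from the text.

```python
import numpy as np

def random_search(evaluate, lr_exp_range, reg_exp_range, num_trials, seed=0):
    """Sample (learning rate, regularization) pairs log-uniformly and rank them."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(num_trials):
        lr = 10.0 ** rng.uniform(*lr_exp_range)
        reg = 10.0 ** rng.uniform(*reg_exp_range)
        results.append((evaluate(lr, reg), lr, reg))
    # Highest validation accuracy first.
    results.sort(key=lambda r: r[0], reverse=True)
    return results

def toy_evaluate(lr, reg):
    """Hypothetical stand-in for 'train briefly and return validation accuracy'."""
    return float(np.exp(-(np.log10(lr) + 6.0) ** 2 - (np.log10(reg) + 3.0) ** 2))

# Coarse search over the full learning rate range found above.
coarse = random_search(toy_evaluate, (-10, -4), (-5, 0), num_trials=100)
top20 = coarse[:20]

# Fine search: narrow both ranges to the span covered by the best coarse trials.
lr_exps = [np.log10(lr) for _, lr, _ in top20]
reg_exps = [np.log10(reg) for _, _, reg in top20]
fine = random_search(
    toy_evaluate,
    (min(lr_exps), max(lr_exps)),
    (min(reg_exps), max(reg_exps)),
    num_trials=100,
    seed=1,
)
```

Sampling the exponent uniformly, rather than the value itself, spreads the trials evenly across orders of magnitude, which suits parameters such as the learning rate.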
IV. Training with model 0
Training loss:
Validation loss:
After initially decreasing, the validation loss starts to increase because the model begins to overfit.
The history of the learned parameters (weights and biases) was saved, and the model with the highest validation accuracy was used for testing, giving a test accuracy of 38.05%.
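Keeping a parameter history and picking the best entry might look like the sketch below. The helper names `record` and `best_snapshot`, the dictionary layout, and the validation accuracies in the example are made up for illustration and are not the report's code or results.

```python
import numpy as np

def record(history, step, params, val_acc):
    """Append a snapshot; arrays are copied so later in-place updates do not alter it."""
    snapshot = {name: value.copy() for name, value in params.items()}
    history.append({"step": step, "params": snapshot, "val_acc": val_acc})

def best_snapshot(history):
    """Return the recorded entry with the highest validation accuracy."""
    return max(history, key=lambda entry: entry["val_acc"])

# Illustrative run with made-up validation accuracies.
params = {"W1": np.zeros((4, 3)), "b1": np.zeros(3)}
history = []
for step, val_acc in enumerate([0.20, 0.31, 0.35, 0.33, 0.30]):
    params["W1"] += 1.0            # stand-in for an in-place training update
    record(history, step, params, val_acc)

best = best_snapshot(history)
print(best["step"], best["val_acc"])
```

Copying the arrays matters: if the training loop updates weights in place, storing references would leave every history entry pointing at the final weights.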
The correct classification rates for the 20 classes are shown below:
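The correct classification rate of a class is the fraction of test examples of that class that are predicted correctly. A minimal sketch of this computation follows. The function name `per_class_accuracy` and the toy labels are assumptions, and the example values are not the report's results.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes=20):
    """Fraction of correctly classified examples per class (NaN if a class is absent)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    total = np.bincount(y_true, minlength=num_classes)
    correct = np.bincount(y_true[y_true == y_pred], minlength=num_classes)
    rates = np.full(num_classes, np.nan)
    present = total > 0
    rates[present] = correct[present] / total[present]
    return rates

# Toy example with 4 classes; class 3 never appears, so its rate is NaN.
rates = per_class_accuracy([0, 0, 1, 1, 2], [0, 1, 1, 1, 0], num_classes=4)
print(rates)
```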