# Assignment 7 
*<span style="float:right;">Norine NDOUDI</span>*


### Dataset Generation

In [2]:
import numpy as np
import pickle
import keras

# Load the training data
with open('cifar-100-python/cifar-100-python/train', 'rb') as fo:
    trainEntire_dataset = pickle.load(fo, encoding='bytes')

# Load the test data
with open('cifar-100-python/cifar-100-python/test', 'rb') as fo:
    testEntire_dataset = pickle.load(fo, encoding='bytes')


### Data Division & Label Prediction Requirement
We divide the training dataset into two parts: a sub-training set and a validation set.  
We focus exclusively on the "fine" labels, not the coarse ones, so we only retrieve the `b'fine_labels'` entries.

In [3]:
from sklearn.model_selection import train_test_split

# Split the training dataset randomly, holding out 1/5 of it as the validation set
train_data, valid_data, train_labels, valid_labels = train_test_split(trainEntire_dataset[b'data'], trainEntire_dataset[b'fine_labels'], test_size=1/5)

In [4]:
print(train_data.shape)

(40000, 3072)
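As a rough illustration of what a random 1/5 hold-out amounts to, the sketch below shuffles sample indices with NumPy and slices off a fifth of them. `holdout_split` and its `seed` argument are hypothetical names introduced here, not part of the notebook; `train_test_split` itself also accepts a `random_state` argument if a reproducible split is wanted.

```python
import numpy as np

def holdout_split(n_samples, holdout_fraction=0.2, seed=0):
    """Return shuffled (train_idx, valid_idx) index arrays (illustrative helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # random ordering of all sample indices
    n_valid = int(round(n_samples * holdout_fraction))
    return order[n_valid:], order[:n_valid]     # the first n_valid indices are held out

# CIFAR-100 has 50000 training images, so 1/5 held out leaves 40000 for training.
train_idx, valid_idx = holdout_split(50000, holdout_fraction=1 / 5)
print(len(train_idx), len(valid_idx))
```

The training size matches the `(40000, 3072)` shape printed above.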


The training data has a shape of (40000, 3072) instead of being a stack of 32x32x3 matrices, and the validation data is flattened in the same way. The number 3072 is $32 \times 32 \times 3$. Therefore, we reshape the data to the format (N, 32, 32, 3), which gives (40000, 32, 32, 3) for the training set, as needed to create the model.

In [5]:
train_data = train_data.reshape(-1, 32, 32, 3)
valid_data = valid_data.reshape(-1, 32, 32, 3)
print(train_data.shape)

(40000, 32, 32, 3)
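One caveat worth checking here: the CIFAR python archives are documented as storing each 3072-value row as 1024 red values, then 1024 green, then 1024 blue, each plane in row-major order. Under that layout, a direct `reshape(-1, 32, 32, 3)` gives the right shape but mixes the colour planes, whereas reshaping to `(-1, 3, 32, 32)` and moving the channel axis last keeps each pixel's three values together. The sketch below uses a synthetic row, not the real dataset, to show the difference; `fake_row`, `direct` and `channels_last` are illustrative names.

```python
import numpy as np

# Synthetic stand-in for one CIFAR row: red plane all 10, green all 20, blue all 30.
fake_row = np.concatenate([
    np.full(1024, 10, dtype=np.uint8),
    np.full(1024, 20, dtype=np.uint8),
    np.full(1024, 30, dtype=np.uint8),
]).reshape(1, 3072)

# Direct reshape: correct shape, but the three values of a "pixel" all come from one plane.
direct = fake_row.reshape(-1, 32, 32, 3)

# Channel-first reshape, then move channels to the last axis: (N, H, W, C).
channels_last = fake_row.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

print(direct.shape, channels_last.shape)
print(direct[0, 0, 0])         # three red values
print(channels_last[0, 0, 0])  # one red, one green, one blue value
```

If this applies to the data used here, the same change would be needed wherever the notebook reshapes images (training, validation and test), and the accuracies reported in this notebook would have to be measured again.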


### Model Experimentation

We need to vectorize the labels. `train_labels` contains one integer label per image, and each label corresponds to one of the 100 classes in the CIFAR-100 dataset. For example, if `train_labels[i]` is 1, the image at index i of the `train_data` array belongs to class 1 of the 100 classes.  
We therefore use one-hot encoding, via the Keras `to_categorical` function, to turn each label into an all-zero vector with a 1 at the label index.

In [6]:
from keras.utils import to_categorical
train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)
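To make the encoding concrete, here is a small NumPy sketch of one-hot encoding. It is not the Keras implementation; `one_hot` is a hypothetical helper written only to show the shape and content of the result.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0   # set the label position in each row
    return encoded

example = one_hot([1, 99, 0], num_classes=100)
print(example.shape)        # one row per label, one column per class
print(example[0].argmax())  # position of the 1 in the first row
```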


For a deep neural network, we should normalize the image pixel data. To do so, we divide the pixel values by 255, which scales them to the range 0 to 1.  
This gives small numbers (instead of values up to 255), which makes the computation easier and faster.
*<span style="float:right">[Source page](https://medium.com/analytics-vidhya/a-tip-a-day-python-tip-8-why-should-we-normalize-image-pixel-values-or-divide-by-255-4608ac5cd26a#:~:text=When%20using%20the%20image%20as,computation%20becomes%20easier%20and%20faster.)</span>*

In [7]:
# Normalize training data
train_data_normalized = train_data / 255.0

# Normalize validation data
valid_data_normalized = valid_data / 255.0
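A small sketch of what the division does to the data type, using a synthetic array rather than the CIFAR data: dividing a `uint8` array by `255.0` produces `float64` values, while casting to `float32` first keeps the result in `float32`, which uses half the memory. Whether that saving matters for this notebook is an assumption; the names below are illustrative.

```python
import numpy as np

# Synthetic uint8 pixel values standing in for the CIFAR arrays.
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

scaled64 = pixels / 255.0                     # NumPy promotes the result to float64
scaled32 = pixels.astype(np.float32) / 255.0  # result stays float32

print(scaled64.dtype, scaled32.dtype)
print(scaled32.min(), scaled32.max())         # values now lie in [0, 1]
```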

In [8]:
# Do the same for the test data
test_data = testEntire_dataset[b'data']
test_data = test_data.reshape(-1, 32, 32, 3)
test_labels = testEntire_dataset[b'fine_labels']
test_labels_encoding = to_categorical(test_labels, 100)
testEntire_normalized = test_data / 255.0

For **image classification** on CIFAR-100, we use a Convolutional Neural Network. We experiment with CNN models that differ in architecture and hyperparameters, in order to find the 3 models that work best for the problem.  
The last layer uses a **softmax** activation function, so the network outputs a probability distribution over the 100 output classes. That is, it produces a 100-dimensional output vector where output[i] is the probability that the sample belongs to class i.
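The softmax behaviour described above can be checked with a short NumPy sketch. The random `logits` are placeholders for the raw outputs of the last layer, not values from any trained model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 100))   # two samples, 100 classes
probs = softmax(logits)

print(probs.shape)        # one 100-dimensional vector per sample
print(probs.sum(axis=1))  # each vector sums to 1 (up to rounding)
```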

For the first model, I took inspiration from assignment 6, with fewer convolutional layers and a softmax activation on the last layer, as explained above.  
After several tests with different numbers of epochs, the first model starts to overfit after about 8 epochs, so we train it for 10 epochs.
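One way to locate the onset of overfitting is to look at the per-epoch validation loss, which Keras records in `history.history['val_loss']` when `validation_data` is passed to `fit`. The sketch below works on a plain list so it runs without Keras; the loss values are made up for illustration and are not the notebook's results, and both helper functions are hypothetical.

```python
def best_epoch(val_losses):
    """1-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

def stop_epoch(val_losses, patience=2):
    """1-based epoch at which a simple patience rule would stop, or None."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0   # improvement: reset the counter
        else:
            waited += 1              # no improvement this epoch
            if waited >= patience:
                return epoch
    return None

# Made-up validation losses, for illustration only.
made_up_val_loss = [3.9, 3.4, 3.1, 2.9, 2.85, 2.9, 3.0, 3.2]
print(best_epoch(made_up_val_loss))
print(stop_epoch(made_up_val_loss, patience=2))
```

Keras also provides an `EarlyStopping` callback with a `patience` argument that applies the same idea during training.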

In [9]:
### Building the neural network
from keras import models
from keras import layers
from keras import optimizers

#Model 1
model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(100, activation='softmax'))  # 100 classes for CIFAR-100

# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(train_data_normalized, train_labels_encoding, epochs=10, validation_data=(valid_data_normalized, valid_labels_encoding))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [10]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 2.8989667892456055, Test accuracy: 0.3068000078201294


For the second model, we kept the same architecture but changed the activation function of the layers (tanh instead of relu). We also increased the number of filters in the convolutional layers (64 and 128 instead of 32 and 64) and reduced the number of units in the Dense layer (64 instead of 128).  
We set a batch size of 512, train for 20 epochs, and configure the Adam optimizer by specifying its beta hyperparameters explicitly (the values used, 0.9 and 0.999, match the Keras defaults).
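To show what the two beta values control, here is a sketch of a single Adam update in its textbook form: `beta_1` sets the decay of the running mean of the gradient, and `beta_2` the decay of the running mean of the squared gradient. This is an illustration under stated assumptions, not the Keras implementation; `adam_step` is a hypothetical helper and `eps=1e-7` is an assumed value.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One textbook Adam update; t is the 1-based step count."""
    m = beta_1 * m + (1 - beta_1) * grad        # running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)               # bias correction for the zero start
    v_hat = v / (1 - beta_2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

param, m, v = adam_step(np.array([1.0]), np.array([0.5]), m=0.0, v=0.0, t=1)
print(param)  # on the first step the parameter moves by roughly lr
```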

In [11]:
# Model 2
# Get back the original data before training

train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)

train_data_normalized = train_data / 255.0
valid_data_normalized = valid_data / 255.0

model = models.Sequential()
model.add(layers.Conv2D(64, (3, 3), activation='tanh', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='tanh'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='tanh'))
model.add(layers.Dense(100, activation='softmax'))  # 100 classes for CIFAR-100

model.compile(optimizer=optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999), loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(train_data_normalized, train_labels_encoding, epochs=20, batch_size=512, validation_data=(valid_data_normalized, valid_labels_encoding))


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [12]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 2.9504668712615967, Test accuracy: 0.302700012922287


The third model has more or less the same architecture as the first one (with a larger Dense layer of 256 units), but we change the optimizer to RMSprop, with a learning rate of 0.01.

In [13]:
# Model 3 with rmsprop optimizer
train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)

train_data_normalized = train_data / 255.0
valid_data_normalized = valid_data / 255.0

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(100, activation='softmax'))  # 100 classes for CIFAR-100
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.01), loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(train_data_normalized, train_labels_encoding, epochs=10, validation_data=(valid_data_normalized, valid_labels_encoding))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [14]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 3.856492519378662, Test accuracy: 0.1266999989748001


For the fourth model, we added one convolutional layer and dropout regularization with a rate of 0.5 to counter the overfitting observed earlier.
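As a minimal sketch of what dropout does at training time, the NumPy code below zeroes each activation with probability `rate` and rescales the survivors by `1 / (1 - rate)` so that the expected activation is unchanged ("inverted" dropout). `inverted_dropout` is a hypothetical helper applied to a synthetic array, not code from the notebook.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero each unit with probability `rate` and rescale the kept units."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
acts = np.ones((1000, 512))                  # synthetic activations, all equal to 1
dropped = inverted_dropout(acts, rate=0.5, rng=rng)

print(np.unique(dropped))  # each value is either 0 or 1 / (1 - rate)
print(dropped.mean())      # stays close to the original mean of 1.0
```

At inference time dropout is disabled and the activations pass through unchanged.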

In [25]:
#model 4
from keras.layers import Dropout
train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)

train_data_normalized = train_data / 255.0
valid_data_normalized = valid_data / 255.0

model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(layers.Dense(100, activation='softmax'))
            
# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(train_data_normalized, train_labels_encoding, epochs=40, batch_size=256, validation_data=(valid_data_normalized, valid_labels_encoding))

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40


In [26]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 2.8500149250030518, Test accuracy: 0.32499998807907104


In [17]:
#model 5
train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)

train_data_normalized = train_data / 255.0
valid_data_normalized = valid_data / 255.0

model = models.Sequential()

model.add(layers.Conv2D(64, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(256, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(layers.Dense(100, activation='softmax'))
            
# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.01), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(train_data_normalized, train_labels_encoding, epochs=10, validation_data=(valid_data_normalized, valid_labels_encoding))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [18]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 4.60841178894043, Test accuracy: 0.009999999776482582
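The result above can be read against the chance baseline. With 100 classes, random guessing gives 1% accuracy, and a model that assigns probability 1/100 to every class has a cross-entropy of $\ln(100)$. The short check below computes both numbers. That the loss sits so close to this value suggests, without proving it, that model 5 ended up with near-uniform predictions; the higher learning rate of 0.01 is one plausible but unverified cause.

```python
import math

num_classes = 100
chance_accuracy = 1 / num_classes           # accuracy of random guessing
uniform_loss = -math.log(1 / num_classes)   # cross-entropy of a uniform prediction

print(chance_accuracy)
print(round(uniform_loss, 4))
```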


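The sixth model has more or less the same architecture as the second model (tanh activations), with 32 filters in the first convolutional layer, 128 units in the Dense layer, and a batch size of 256 instead of 512.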

In [30]:
#model 6
train_labels_encoding = to_categorical(train_labels, 100)
valid_labels_encoding = to_categorical(valid_labels, 100)

train_data_normalized = train_data / 255.0
valid_data_normalized = valid_data / 255.0

model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='tanh', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='tanh'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='tanh'))
model.add(layers.Dense(100, activation='softmax'))  # 100 classes for CIFAR-100

# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(train_data_normalized, train_labels_encoding, epochs=20, batch_size=256, validation_data=(valid_data_normalized, valid_labels_encoding))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [31]:
test_loss, test_accuracy = model.evaluate(testEntire_normalized, test_labels_encoding)
print(f' Test loss: {test_loss}, Test accuracy: {test_accuracy}')

 Test loss: 3.297576904296875, Test accuracy: 0.313400000333786


### Top 3 Model Selection
After testing several models that differ in architecture, learning rate, optimizer, and so on, and evaluating them on the test data, our **top three models are model 1, model 4 and model 6.** 

### Full Training

In [40]:
trainEntire_data = trainEntire_dataset[b'data']
trainEntire_labels = trainEntire_dataset[b'fine_labels']
trainEntire_data = trainEntire_data.reshape(-1, 32, 32, 3)
trainEntire_labels_encoding = to_categorical(trainEntire_labels, 100)
trainEntire_normalized = trainEntire_data / 255.0

#model 1
model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(100, activation='softmax')) 

# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(trainEntire_normalized, trainEntire_labels_encoding, epochs=10)

test_loss1, test_accuracy1 = model.evaluate(testEntire_normalized, test_labels_encoding)

model.summary()


Epoch 1/10
Epoch 2/10
Epoch 4/10
Epoch 5/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Model: "sequential_15"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_34 (Conv2D)          (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d_34 (MaxPooli  (None, 15, 15, 32)        0         
 ng2D)                                                           
                                                                 
 conv2d_35 (Conv2D)          (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_35 (MaxPooli  (None, 6, 6, 64)          0         
 ng2D)                                                           
                                                                 
 flatten_15 (Flatten)        (None, 2304)              0         
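The shapes and parameter counts in this summary can be reproduced by hand. The sketch below assumes 3x3 kernels with stride 1 and no padding, and 2x2 max pooling, as in the model definition; the helper names are hypothetical. The Dense layer counts are computed from the model code, since the summary shown here is cut off before those rows.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

size = 32
size = conv_out(size)    # 30
size = pool_out(size)    # 15
size = conv_out(size)    # 13
size = pool_out(size)    # 6
flat = size * size * 64  # length of the flattened vector

print(conv2d_params(3, 3, 3, 32))   # first Conv2D layer
print(conv2d_params(3, 3, 32, 64))  # second Conv2D layer
print(flat)                         # Flatten output
print(dense_params(flat, 128), dense_params(128, 100))
```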
                              

In [41]:
trainEntire_data = trainEntire_dataset[b'data']
trainEntire_labels = trainEntire_dataset[b'fine_labels']
trainEntire_data = trainEntire_data.reshape(-1, 32, 32, 3)
trainEntire_labels_encoding = to_categorical(trainEntire_labels, 100)
trainEntire_normalized = trainEntire_data / 255.0

#model 4
model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(layers.Dense(100, activation='softmax'))
            
# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(trainEntire_normalized, trainEntire_labels_encoding, epochs=40, batch_size=256)

test_loss2, test_accuracy2 = model.evaluate(testEntire_normalized, test_labels_encoding)

model.summary()

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40
Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_36 (Conv2D)          (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d_36 (MaxPooli  (None, 15, 15, 32)        0         
 ng2D)                                                           
                                                                 
 conv2d_37 (Conv2D)          (None, 13, 13, 

In [43]:
trainEntire_data = trainEntire_dataset[b'data']
trainEntire_labels = trainEntire_dataset[b'fine_labels']
trainEntire_data = trainEntire_data.reshape(-1, 32, 32, 3)
trainEntire_labels_encoding = to_categorical(trainEntire_labels, 100)
trainEntire_normalized = trainEntire_data / 255.0

#model 6
model = models.Sequential()

model.add(layers.Conv2D(64, (3, 3), activation='tanh', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='tanh'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='tanh'))
model.add(layers.Dense(100, activation='softmax'))  # 100 classes for CIFAR-100

# Compile the model with Adam Optimizer
model.compile(optimizer=optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(trainEntire_normalized, trainEntire_labels_encoding, epochs=20, batch_size=256)

test_loss3, test_accuracy3 = model.evaluate(testEntire_normalized, test_labels_encoding)

model.summary()

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model: "sequential_18"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_41 (Conv2D)          (None, 30, 30, 64)        1792      
                                                                 
 max_pooling2d_41 (MaxPooli  (None, 15, 15, 64)        0         
 ng2D)                                                           
                                                                 
 conv2d_42 (Conv2D)          (None, 13, 13, 128)       73856     
                                                                 
 max_pooling2d_42 (MaxPooli  (None, 6, 6, 128)         0         
 ng2D)                                                           
                    

In [48]:
predictions = model.predict(testEntire_normalized)
print("Result of the prediction for the second image: \n" + str(predictions[1]))
print("\nDimension of the second image prediction:" + str(predictions[1].shape))
print("Sum of the coefficient of this vector prediction: " + str(np.sum(predictions[1])))
print("The class with the highest probability: " + str(np.argmax(predictions[1])))

Result of the prediction for the second image: 
[4.04499497e-06 3.11825832e-04 2.03277729e-03 2.91703714e-06
 2.78368330e-04 3.95851021e-05 1.42988889e-03 7.07291008e-04
 7.19506759e-04 7.77823880e-06 1.93979125e-04 9.37384204e-04
 1.45714903e-05 9.17073976e-06 1.02241558e-03 2.76570790e-05
 1.17752228e-04 3.18500906e-06 3.92717280e-04 8.12614326e-06
 1.24783276e-06 9.33885531e-06 3.89744207e-04 3.14274221e-04
 1.24060170e-06 9.12259839e-06 2.39469427e-05 4.47547827e-05
 9.52544451e-06 3.17959674e-03 3.02218687e-05 4.30881017e-04
 1.42304704e-01 2.80868699e-04 2.05219825e-04 1.30253914e-03
 5.38144552e-04 9.64941282e-05 5.73405589e-04 1.37569441e-04
 1.11582094e-05 2.13942694e-05 2.39796583e-02 2.74498598e-04
 6.31501852e-03 6.96358038e-03 1.68743078e-04 1.49578074e-04
 5.80078949e-06 6.94251003e-06 3.21175531e-02 1.87523142e-06
 7.53472932e-06 7.67067974e-08 2.20514194e-04 1.69260195e-03
 1.55681209e-03 4.52766562e-06 1.08493634e-06 4.48138118e-02
 5.72448289e-06 1.99535978e-03 3.8847
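Since a 100-value vector is hard to read, it can help to list only the most probable classes. The sketch below does this for a made-up probability vector; `top_k` is a hypothetical helper and `fake_probs` is random, not the model's prediction.

```python
import numpy as np

def top_k(probabilities, k=5):
    """Return the k most probable (class_index, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]   # indices sorted by descending probability
    return [(int(i), float(probabilities[i])) for i in order]

# Made-up probability vector over 100 classes, for illustration only.
rng = np.random.default_rng(0)
fake_probs = rng.dirichlet(np.ones(100))

ranked = top_k(fake_probs, k=5)
for class_index, probability in ranked:
    print(class_index, round(probability, 4))
```

With the real predictions, `top_k(predictions[1])` would give the five most likely classes for the second test image.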

### Accuracy Testing

In [45]:
print(f' Model 1: Test loss: {test_loss1}, Test accuracy: {test_accuracy1}')
print(f' Model 2: Test loss: {test_loss2}, Test accuracy: {test_accuracy2}')
print(f' Model 3: Test loss: {test_loss3}, Test accuracy: {test_accuracy3}')

 Model 1: Test loss: 2.8213131427764893, Test accuracy: 0.3230000138282776
 Model 2: Test loss: 2.7572784423828125, Test accuracy: 0.33799999952316284
 Model 3: Test loss: 3.4235785007476807, Test accuracy: 0.32100000977516174


Model 2 in this output (model 4 from the experiments above) made the most correct predictions among the three models, with an accuracy of around 34%.
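The accuracy reported by `model.evaluate` is the fraction of test images whose most probable class matches the true label. A minimal sketch of that computation, on a toy array rather than the real predictions (`accuracy_from_probs` is a hypothetical helper):

```python
import numpy as np

def accuracy_from_probs(probs, true_labels):
    """Fraction of rows whose argmax equals the true label."""
    predicted = np.argmax(probs, axis=1)
    return float(np.mean(predicted == np.asarray(true_labels)))

# Toy example with 3 classes and 4 samples, for illustration only.
toy_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
toy_labels = [0, 1, 1, 0]

print(accuracy_from_probs(toy_probs, toy_labels))  # 3 of the 4 predictions are correct
```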

### Benchmarking 

Here are the results of the benchmark models:  
* Stochastic Pooling: Percentage correct 57.[…]%
* NiN (Network In Network): Percentage correct 64.3%
* DSN (Deeply-Supervised Nets): Percentage correct 6[…]%
* HD-CNN (Hierarchical Deep Convolutional Neural Network for Large Scale Visual Recognition): Percentage correct 67.4%
* Spectral Representations for Convolutional Neural Networks: Percentage correct 68.4%
* Dspike (ResNet-18): Percentage correct 7[…].24%
* WRN-28-8 (SAMix+DM): Percentage correct […]5.59%
* Dynamics 1 (Particle Swarm Optimization): Percentage correct 87.48%
* Astroformer: Percentage correct 93.36%

As we can see, my models perform worse than the benchmark models. These use more advanced architectures and training methods, which can explain their superior performance. Some of them use regularization and/or data augmentation, and some use types of neural network other than CNNs.