**The CIFAR-100 dataset** (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists of 60,000 32x32 color images. Its 100 classes are grouped into 20 superclasses. There are 600 images per class: 500 training images and 100 testing images. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).
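
The counts above fit together as simple arithmetic. Here is a minimal sketch that checks them; the variable names are chosen here for illustration and are not part of any library:

```python
# Dataset bookkeeping for CIFAR-100, using the figures quoted above.
num_classes = 100
num_superclasses = 20
train_per_class = 500
test_per_class = 100

images_per_class = train_per_class + test_per_class        # 600
total_images = num_classes * images_per_class              # 60000
train_images = num_classes * train_per_class               # 50000
test_images = num_classes * test_per_class                 # 10000
classes_per_superclass = num_classes // num_superclasses   # 5

print(total_images, train_images, test_images, classes_per_superclass)
```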

We need to import these dependencies. You can do this as follows:

In [None]:
from tensorflow.keras.datasets import cifar100
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout
from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt

Model configuration
Now, let’s set some configuration options for our model:

**The batch size** is the number of samples that are fed forward through your model at once, after which the loss value is computed and the weights are updated. You could feed the model the entire training set, one sample at a time, or a minibatch; you set this value by specifying batch_size.

**The image width, image height and number of channels**. Width and height are both 32, and the number of channels is 3, as the dataset contains RGB images.

The **loss function** used to compare predictions with ground truth during training. We use sparse categorical crossentropy loss rather than regular categorical crossentropy, because the CIFAR-100 targets are integer class indices, not one-hot vectors.

The **number of classes** and **number of epochs** (full passes over the training data), which we set to 100 and 90, respectively. We set the first to 100 because CIFAR-100 has 100 distinct fine classes. The second is set to 90 because I’m assuming that we’ll have passed maximum validation performance by then. We don’t want to keep training much longer than that, as the model increasingly overfits the training data.

The **optimizer**, or the method by which we update the weights of our neural network. We use Adam with its default settings, an adaptive optimizer that is common in today’s neural networks.

25% of our training data will be used for validation purposes; that is, used to evaluate the model during training on data it is not trained on.

**Verbosity mode** is set to 1, which makes Keras print a progress bar and the metrics for every epoch. This is good for understanding what happens during training, but you may want to set it to 0 for long runs, as printing the output adds some overhead.

In [None]:
# Configuration of our machine learning model
batch_size = 2048
img_width, img_height, img_num_channels = 32, 32, 3
loss_function = sparse_categorical_crossentropy
no_classes = 100
no_epochs = 90
optimizer = Adam()
validation_split = 0.25
verbosity = 1
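
To get a feel for what these settings mean in practice, here is a small sketch that works out how many samples are used for fitting and validation, and how many weight updates one epoch contains. It assumes the 50,000 CIFAR-100 training images and relies on the fact that Keras takes the validation samples from the end of the training arrays; the variable names are illustrative only:

```python
import math

# Values from the configuration above; 50,000 is the CIFAR-100 training set size.
num_train_total = 50_000
batch_size = 2048
validation_split = 0.25

# Keras holds out the last fraction of the training arrays for validation.
num_val = int(num_train_total * validation_split)
num_fit = num_train_total - num_val

# Each epoch runs ceil(samples / batch_size) weight updates.
steps_per_epoch = math.ceil(num_fit / batch_size)

print(num_fit, num_val, steps_per_epoch)  # 37500 12500 19
```

With a batch size this large, every epoch consists of only 19 updates, which is worth keeping in mind when choosing the number of epochs.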

**Loading & preparing CIFAR-100 data**

Now, let’s load some CIFAR-100 data. We can do so easily because Keras provides intuitive access to the dataset by design:

cifar100.load_data() loads the CIFAR-100 dataset: 50,000 32x32 color training images and 10,000 test images, labeled over 100 fine-grained classes that are grouped into 20 coarse-grained classes.

In [None]:
# Loading of CIFAR100 data set
(input_train_data, target_train_data), (input_test_data, target_test_data) = cifar100.load_data()



The next step is to **determine the shape of one sample**. This is required by Keras to understand what data it can expect in the input layer of your neural network.

In [None]:
# Determine shape of the data
input_shape = (img_width, img_height, img_num_channels)



First, we’ll convert our data into float32 format, which is the default floating-point type in Keras. Then, we normalize the data into the [0,1] range.


In [None]:
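# Parse numbers as floats
# Reassign in place so that model.fit and model.evaluate below receive the converted arrays
input_train_data = input_train_data.astype('float32')
input_test_data = input_test_data.astype('float32')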

Each color channel value lies in the range [0,255], so we normalize the images by dividing them by 255. The rescaled images then have all features in the new range [0,1], which keeps the inputs on a small, consistent scale for training.

In [None]:
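# Normalize data to the [0, 1] range
# Reassign in place so that training and evaluation below use the scaled arrays
input_train_data = input_train_data / 255
input_test_data = input_test_data / 255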
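
To see what the two preparation steps do without downloading anything, here is a minimal NumPy sketch on a synthetic stand-in for a few CIFAR-style images. The array fake_images is made up for illustration and is not the real dataset:

```python
import numpy as np

# Synthetic stand-in for a small batch of CIFAR-style images (uint8, 0..255).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Cast first, then scale: the result stays float32.
scaled = fake_images.astype('float32') / 255

# Dividing the uint8 array directly also scales, but NumPy promotes to float64.
scaled_direct = fake_images / 255

print(scaled.dtype, scaled.min(), scaled.max())
print(scaled_direct.dtype)
```

This is why the order matters: casting to float32 only has an effect if the cast array is the one that gets divided.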

We can then create the architecture of our model.

We have chosen to build a **Sequential model**. It allows you to build a model layer by layer, where the output of each layer is the input to the layer that follows it. We use the 'add()' function to add layers to our model.

Next, it’s time to stack a few layers. 

Firstly, we’ll use three convolutional blocks – which is the nickname I often use for convolutional layers with some related ones. 

In this case, the related layer that is applied every time is a MaxPooling2D one directly after the Conv2D layer. As you can see, each time, the number of feature maps increases: from 32, to 64, to 128. The idea is that the early layers learn a limited number of “generic” patterns (32), while the deeper layers learn a larger number of more specific patterns (128). Max Pooling adds a degree of translation invariance, as we discussed before.
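
The translation invariance of max pooling is only partial, and a tiny NumPy sketch makes that concrete. The helper max_pool_2x2 below is written here for illustration (it is not the Keras layer) and assumes a 2D input with even height and width:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on a 2D array with even height and width."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# A single activation at position (0, 0) ...
a = np.zeros((4, 4))
a[0, 0] = 1.0

# ... the same activation shifted to (1, 1), still inside the same 2x2 window ...
b = np.zeros((4, 4))
b[1, 1] = 1.0

# ... and shifted into a neighbouring window at (0, 2).
c = np.zeros((4, 4))
c[0, 2] = 1.0

print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(c)))  # False
```

Small shifts inside a pooling window leave the output unchanged, while larger shifts still move the activation in the pooled map.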

After the convolutional blocks, we add a Flatten layer. The Dense layers, which are responsible for generating the actual classifications, only work with one-dimensional data. Flatten makes this happen: it converts the multidimensional feature maps into one-dimensional shape. Great!

The Dense layers ensure that classification is possible. As you can see, in terms of the number of outputs per layer, we create an information bottleneck that eventually converges in no_classes (thus 100) outputs, exactly the number of unique classes in our dataset. As we’re using the Softmax activation function, we’ll get a discrete multiclass probability distribution as our output for any input. From this distribution, we can pick the class with the highest probability, which is the most likely class for our input.
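
As a rough illustration of that last step, here is a minimal NumPy sketch of Softmax followed by picking the most likely class. The logits are made-up numbers over 5 classes (the model here has 100), and the softmax helper is written for this example rather than taken from Keras:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for one input over 5 classes.
logits = np.array([1.0, 3.0, 0.5, 2.0, -1.0])
probs = softmax(logits)

# The predicted class is the index with the highest probability.
predicted_class = int(np.argmax(probs))
print(probs.sum(), predicted_class)
```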

In [None]:
# Create the model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(no_classes, activation='softmax'))
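
It can help to trace how the feature map shrinks through this stack. The sketch below redoes the arithmetic by hand, assuming the Keras defaults used above (no padding, stride 1 for Conv2D, and non-overlapping 2x2 pooling); the helper names are made up for this example. You can compare the result against model.summary().

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping pooling (floor division)."""
    return size // pool

size, channels = 32, 3
total_params = 0

# Three convolutional blocks: Conv2D(3x3) followed by MaxPooling2D(2x2).
for filters in (32, 64, 128):
    total_params += (3 * 3 * channels + 1) * filters  # kernel weights + one bias per filter
    size = pool_out(conv_out(size))                   # 32 -> 15 -> 6 -> 2
    channels = filters

# Flatten turns the final 2x2x128 feature maps into one vector.
flattened = size * size * channels

# Three Dense layers: 256, 128 and no_classes = 100 units.
inputs = flattened
for units in (256, 128, 100):
    total_params += (inputs + 1) * units              # weights + one bias per unit
    inputs = units

print(size, flattened, total_params)  # 2 512 270372
```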

Above, we created the skeleton for our model. We don’t have a usable model yet, as it must be compiled first. This can be done by calling model.compile, which involves specifying settings for the training process, such as the loss function and the optimizer.


In [None]:
# Compile the model
model.compile(loss=loss_function,
              optimizer=optimizer,
              metrics=['accuracy'])
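
The loss passed to compile above is the sparse variant of categorical crossentropy. A minimal NumPy sketch, using made-up probabilities for 2 samples over 4 classes, shows that it computes the same value as regular categorical crossentropy, only on integer targets instead of one-hot vectors:

```python
import numpy as np

# Hypothetical predicted probabilities for 2 samples over 4 classes.
probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.2, 0.5, 0.1]])

# Integer targets, the format in which the CIFAR-100 labels are loaded.
int_labels = np.array([0, 2])

# Sparse form: pick the probability of the true class by index.
sparse_loss = -np.log(probs[np.arange(len(int_labels)), int_labels])

# Categorical form: same computation, but on one-hot encoded targets.
one_hot = np.eye(probs.shape[1])[int_labels]
categorical_loss = -(one_hot * np.log(probs)).sum(axis=1)

print(np.allclose(sparse_loss, categorical_loss))  # True
```

Using the sparse variant simply saves us from one-hot encoding 100 classes ourselves.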

Once the model is compiled, we do have a model, but it’s not yet trained. We can start the training process by calling model.fit, which fits our data (in this case our training data and the corresponding targets) and specifies some settings for our training process, ones that we configured before.

In [None]:
# Fit data to model
history_fit_data = model.fit(input_train_data, target_train_data,
                             batch_size=batch_size,
                             epochs=no_epochs,
                             verbose=verbosity,
                             validation_split=validation_split)

Epoch 1/90
...
Epoch 84/90
[remaining output truncated]
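
The object returned by model.fit keeps the per-epoch metrics in its history attribute, a dictionary of lists (here history_fit_data.history). One way to check whether we have indeed passed maximum performance is to look for the epoch with the lowest validation loss. The sketch below does this with a hypothetical helper, best_epoch, and made-up numbers that are NOT taken from the run above:

```python
def best_epoch(history_dict, key='val_loss'):
    """Return the 1-based epoch with the lowest value for `key`."""
    values = history_dict[key]
    return min(range(len(values)), key=values.__getitem__) + 1

# Made-up numbers for illustration only, not results from this notebook.
example_history = {
    'loss':     [4.2, 3.6, 3.1, 2.7, 2.3],
    'val_loss': [4.0, 3.5, 3.3, 3.4, 3.6],
}

print(best_epoch(example_history))  # 3
```

In this made-up example the training loss keeps falling while the validation loss rises after epoch 3, which is the typical signature of overfitting.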

Generating evaluation metrics

Evaluation metrics are used to measure the quality of a statistical or machine learning model, and evaluating models is essential for any project. There are many different types of evaluation metrics available to test a model. Here we use accuracy, the fraction of test images that are classified correctly.

In [None]:
# Generate generalization metrics
score = model.evaluate(input_test_data, target_test_data, verbose=0)
print(f'Test loss: {score[0]} / Test accuracy: {score[1]}')

Test loss: 7.920735836029053 / Test accuracy: 0.2808000147342682
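
To make the accuracy number easier to interpret, here is a minimal NumPy sketch of how accuracy follows from predicted probabilities and integer labels. The accuracy helper and the tiny arrays are made up for this example; they are not the Keras implementation or real model output:

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels.ravel()))

# Hypothetical predictions for 4 samples over 3 classes.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6],
                  [0.5, 0.3, 0.2]])
labels = np.array([[0], [1], [1], [0]])  # shape (n, 1), like the CIFAR-100 targets

print(accuracy(probs, labels))  # 0.75

# Uniform guessing over 100 equally sized classes is right 1% of the time.
chance_level = 1 / 100
```

Against that chance level of 0.01, the test accuracy of roughly 0.28 reported above means the model has learned something, even though it is far from perfect.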
