# CNN Applied to Photos in CIFAR10

checked 27.02.24 G.Paaß

Adapted from the [TensorFlow CNN tutorial](https://www.tensorflow.org/tutorials/images/cnn).

## Tasks:
* Understand the structure of the CNN (sketch the graph of the units)
* Train the CNN (adjust the hyperparameters)
* Change the structure:
 * Adjust the number of convolution filters
 * Add a new convolutional layer (Conv layer)

In [None]:
import sys, os
import matplotlib.pyplot as plt
import tensorflow as tf
print("tf-version",tf.__version__)
from tensorflow import keras
from tensorflow.keras import datasets, layers, models
%matplotlib inline
import numpy as np

## Import Data
The [CIFAR10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) contains 60,000 color images in 10 classes, with 6,000 images in each class. The dataset is divided into 50,000 training images and 10,000 testing images. The classes are mutually exclusive and there is no overlap between them.

Here are the classes in the dataset, as well as 10 random images from each:

<img src="img/cifar10.png" style="width:600px">

In [None]:
# Import CIFAR10 data
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0


Print the shapes of the data arrays.

In [None]:
print("train_images.shape",train_images.shape,"\ttrain_labels.shape",train_labels.shape)
print("test_images.shape",test_images.shape,"\ttest_labels.shape",test_labels.shape)

In [None]:
print("train_images[0].shape", train_images[0].shape)
print("RGB values of one pixel")
print("train_images[0][0][0]", train_images[0][0][0])

To verify that the dataset looks correct, let's plot the first 25 images from the training set and display the class name below each image.

In [None]:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
print("class_names =\n",class_names[:5],"\n",class_names[5:])
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    # The CIFAR labels happen to be arrays,
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()

## Create the Model

The code below defines the convolutional base using a common pattern: a stack of Conv2D and MaxPooling2D layers. Here a Dropout layer follows each pooling layer; with `dropProb=0.0` it leaves the activations unchanged.

As input, a CNN takes tensors of shape (image_height, image_width, color_channels), ignoring the batch size. If you are new to these dimensions, color_channels refers to the three color values (R, G, B). In this example, the CNN is configured to process inputs of shape (32, 32, 3), which is the format of CIFAR images. You do this by passing the argument `input_shape` to the first layer.

In [None]:
dropProb=0.0 # fraction of units to drop
nfilter_1 = 32
nfilter_2 = 64
nfilter_3 = 64
nhid=64

model = models.Sequential()
model.add(layers.Conv2D(filters=nfilter_1, kernel_size=(3, 3),
                        activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Dropout(rate=dropProb))
model.add(layers.Conv2D(filters=nfilter_2, kernel_size=(3, 3), activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Dropout(rate=dropProb))
model.add(layers.Conv2D(filters=nfilter_3, kernel_size=(3, 3), activation='relu'))
model.summary()

Consider the first Conv2D layer, whose input has 3 channels (R, G, B). With kernel size `(3,3)` and `nfilter_1 = 32` output feature maps, it has 896 trainable parameters in total:
* 28 parameters per filter (896/32)
* 1 bias parameter per filter
* 27 kernel parameters per filter. Hence each filter is a $3\times3\times3$ cube with weights for all three input channels.

The second Conv2D layer has 32 input channels, `nfilter_2 = 64` filters, and 18496 parameters:
* 289 parameters per filter (18496/64)
* 1 bias parameter per filter
* 288 kernel parameters per filter. Hence each filter is a $3\times3\times32$ cube with weights for all 32 input channels.
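
The counts above follow a simple rule: each filter has one weight per kernel position and input channel, plus one bias. A minimal sketch of that arithmetic; the helper name `conv2d_param_count` is made up for this illustration and is not part of Keras:

```python
def conv2d_param_count(kernel_height, kernel_width, in_channels, filters):
    """Trainable parameters of a Conv2D layer that uses a bias (hypothetical helper)."""
    weights_per_filter = kernel_height * kernel_width * in_channels
    return (weights_per_filter + 1) * filters  # +1 for the bias of each filter

first = conv2d_param_count(3, 3, in_channels=3, filters=32)
second = conv2d_param_count(3, 3, in_channels=32, filters=64)
third = conv2d_param_count(3, 3, in_channels=64, filters=64)
print(first, second, third)  # 896 18496 36928
```

The printed values should agree with the `Param #` column of `model.summary()` for the three Conv2D layers.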

Above, you can see that the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height shrink as you go deeper into the network. The number of output channels of each Conv2D layer is set by its `filters` argument (here 32 or 64). Typically, as the width and height shrink, you can afford (computationally) to add more output channels in each Conv2D layer.
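
How quickly the width and height shrink can be traced by hand. The sketch below assumes the Keras defaults used in this notebook: Conv2D with `'valid'` padding and stride 1, and MaxPooling2D with a stride equal to the pool size. The helper names `conv_out` and `pool_out` are invented for this illustration:

```python
def conv_out(size, kernel=3):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool

size = 32  # CIFAR images are 32 x 32 pixels
trace = [size]
# Same order as the convolutional base: conv, pool, conv, pool, conv
for step in (conv_out, pool_out, conv_out, pool_out, conv_out):
    size = step(size)
    trace.append(size)
print(trace)  # [32, 30, 15, 13, 6, 4]
```

The Dropout layers are left out because they do not change the tensor shape.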

### Add Fully Connected Layers on Top
To complete the model, you feed the last output tensor from the convolutional base (of shape (4, 4, 64)) into one or more Dense layers to perform classification. Dense layers take vectors as input (which are 1D), while the current output is a 3D tensor. First, you flatten (or unroll) the 3D output to 1D, then add one or more Dense layers on top. CIFAR has 10 output classes, so the final Dense layer has 10 outputs. It has no activation and returns raw scores (logits); the softmax is applied inside the loss function, which is configured with `from_logits=True`.
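
The effect of flattening and the size of the Dense layers can also be checked by hand. A small sketch, assuming `nhid = 64` as set earlier; `dense_param_count` is a made-up helper, not a Keras function:

```python
def dense_param_count(inputs, units):
    """Trainable parameters of a Dense layer: one weight per input and unit, plus one bias per unit."""
    return inputs * units + units

flat_size = 4 * 4 * 64                             # length of the vector produced by Flatten
hidden_params = dense_param_count(flat_size, 64)   # Dense(nhid) with nhid = 64
output_params = dense_param_count(64, 10)          # final Dense(10)
print(flat_size, hidden_params, output_params)     # 1024 65600 650
```

Most parameters of this small model sit in the first Dense layer, not in the convolutions.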

In [None]:
model.add(layers.Flatten())
model.add(layers.Dense(nhid, activation='relu'))
model.add(layers.Dense(10))
model.summary()


## Training the Model
Training takes about 80 seconds; the exact time depends on the hardware.

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))


## Evaluate the Model

In [None]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')


In [None]:
train_loss, train_acc = model.evaluate(train_images,  train_labels, verbose=2)
test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
print("train_acc",train_acc)
print(" test_acc",test_acc)

In [None]:
test_images[:10]

In [None]:
test10=test_images[:10,:,:,:]
testlabels10=test_labels[:10]

In [None]:
# The model returns raw scores (logits), not probabilities
logits10 = model.predict(test10, verbose=2)
logits10

In [None]:
plt.figure(figsize=(10,10))
for i in range(10):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(test_images[i], cmap=plt.cm.binary)
    # The CIFAR labels happen to be arrays,
    # which is why you need the extra index
    plt.xlabel(class_names[test_labels[i][0]])
plt.show()

In [None]:
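np.set_printoptions(precision=4, suppress=True)  # set once, before the loop
scores = model.predict(test_images[0:10])  # raw scores (logits), one row per image
for ir in range(scores.shape[0]):
    # Softmax converts the logits of image ir into class probabilities
    prb = np.exp(scores[ir, :]) / np.sum(np.exp(scores[ir, :]))
    label = test_labels[ir][0]
    # Print probabilities, true label, class name, and whether the prediction is correct
    print(prb, label, "{:<10}".format(class_names[label]), "\t", label == np.argmax(scores[ir, :]))
    #print(label, ['{:.4f}'.format(p) for p in prb])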
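
The loop above turns the raw scores of the model into probabilities with a softmax. A minimal standalone sketch of that conversion in plain Python, using made-up scores instead of model output. Subtracting the maximum before calling `exp` is a common precaution against overflow and is an addition here, not part of the notebook code:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
probs = softmax(example_logits)
predicted_class = probs.index(max(probs))  # same index as the largest logit
print([round(p, 4) for p in probs], predicted_class)
```

Because the softmax keeps the order of the scores, taking `argmax` of the logits and of the probabilities gives the same predicted class.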