

## 1. Introduction

We will achieve the following objectives in this lab:

    1. An understanding of the practical limitations of using dense networks in complex tasks
    2. Hands-on experience in building a deep learning neural network to solve a relatively complex task.
    

Each step may take a long time to run. You and your partner may want to work out how to do things simultaneously, but please do not miss out on any learning opportunities.


## 2. Submission Instructions

Please work together as a team of 2 to complete this lab. You will need to submit ONE copy of this notebook per team, but please fill in the names of both team members above. This lab is worth 55 marks.

**DO NOT SUBMIT MORE THAN ONE COPY OF THIS LAB!**

## 3. Creating a Dense Network for CIFAR-10

We will now begin building a neural network for the CIFAR-10 dataset. The CIFAR-10 dataset consists of 50,000 32x32x3 (32x32 pixels, RGB channels) training images and 10,000 testing images (also 32x32x3), divided into the following 10 categories:

    1. Airplane
    2. Automobile
    3. Bird
    4. Cat
    5. Deer
    6. Dog
    7. Frog
    8. Horse
    9. Ship
    10. Truck
    
In the first two parts of this lab we will create a classifier for the CIFAR-10 dataset.

### 3.1 Loading the Dataset

We begin by creating a dense neural network for CIFAR-10. The code below shows how we load the CIFAR-10 dataset:


In [1]:
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import cifar10

def load_cifar10():
    (train_x, train_y), (test_x, test_y) = cifar10.load_data()
    train_x = train_x.reshape(train_x.shape[0], 3072) # Question 1
    test_x = test_x.reshape(test_x.shape[0], 3072) # Question 1
    train_x = train_x.astype('float32')
    test_x = test_x.astype('float32')
    train_x /= 255.0
    test_x /= 255.0
    ret_train_y = to_categorical(train_y,10)
    ret_test_y = to_categorical(test_y, 10)
    
    return (train_x, ret_train_y), (test_x, ret_test_y)


(train_x, train_y), (test_x, test_y) = load_cifar10()

----

#### Question 1

Explain what the following two statements do, and where the number "3072" came from (2 MARKS):

```
  train_x = train_x.reshape(train_x.shape[0], 3072) # Question 1
  test_x = test_x.reshape(test_x.shape[0], 3072) # Question 1
```

***Both statements reshape the image data of the training and test sets from 4-D arrays of shape (N, 32, 32, 3) into 2-D arrays of shape (N, 3072), with one flat row per image, which is the input format a dense layer expects. Each CIFAR-10 image is 32x32 pixels with 3 RGB channels, so 32 x 32 x 3 = 3072 values per image. The result is 50,000 training images and 10,000 test images, each with 3072 feature columns.***

*FOR GRADER: _______ / 2*
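
As a quick check of the arithmetic in Question 1, the sketch below computes the flattened length of one image and shows where a given pixel value lands in the flat vector. It assumes row-major (C-order) flattening, which is NumPy's default for `reshape`; the helper `flat_index` is a hypothetical name used only for illustration.

```python
# Dimensions of one CIFAR-10 image: height, width, colour channels.
HEIGHT, WIDTH, CHANNELS = 32, 32, 3

# Number of values per image after flattening.
NUM_FEATURES = HEIGHT * WIDTH * CHANNELS


def flat_index(row, col, channel):
    """Position of pixel (row, col, channel) in the flattened vector,
    assuming row-major (C-order) flattening."""
    return (row * WIDTH + col) * CHANNELS + channel


print(NUM_FEATURES)            # 3072
print(flat_index(0, 0, 0))     # 0
print(flat_index(0, 1, 0))     # 3
print(flat_index(31, 31, 2))   # 3071
```

The three channel values of a pixel sit next to each other in the flat vector, so the 2-D neighbourhood structure of the image is no longer explicit once the data is flattened.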

### 3.2 Building the MLP Classifier

In the code box below, create a new fully connected (dense) multilayer perceptron classifier for the CIFAR-10 dataset. To begin with, create a network with one hidden layer of 1024 neurons, using the SGD optimizer. You should output the training and validation accuracy at every epoch, and train for 50 epochs:


In [5]:
""" 
Write your code to build an MLP with one hidden layer of 1024 neurons,
with an SGD optimizer. Train for 50 epochs, and output the training and
validation accuracy at each epoch.
"""


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

# Create the neural network: one hidden layer of 1024 neurons
# and one output neuron per class
nn = Sequential()
nn.add(Dense(1024, activation='relu'))
nn.add(Dense(10, activation='softmax'))

# Create our optimizer ('lr' is deprecated in favour of 'learning_rate')
sgd = SGD(learning_rate=0.1, momentum=0.1)

# 'Compile' the network to associate it with a loss function,
# an optimizer, and what metrics we want to track
nn.compile(loss='squared_hinge', optimizer=sgd,
           metrics=['accuracy'])

# Train, reporting training and validation accuracy at every epoch
nn.fit(train_x, train_y, shuffle=True, epochs=50,
       validation_data=(test_x, test_y))


Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7fa82be73a50>

#### Question 2

Complete the following table on the design choices for your MLP (3 MARKS):

| Hyperparameter       | What I used | Why?                  |
|:---------------------|:------------|:----------------------|
| Optimizer            | SGD         | Specified in question |
| # of hidden layers   | 1           | Specified in question |
| # of hidden neurons  | 1024        | Specified in question |
| Hid layer activation | relu        | Mitigates vanishing gradients |
| # of output neurons  | 10          | One per image class (10 classes) |
| Output activation    | softmax     | Multiclass classification; softmax outputs a probability for each class |
| learning rate        | 0.1         | To balance convergence speed against unstable gradient jumps |
| momentum             | 0.1         | Accelerates gradient descent in the relevant direction and dampens oscillations |
| decay                | 0           | Default |
| loss                 | squared_hinge | Classification problem |

*FOR GRADER:*<br>
*Table: ___ / 3* <br>
*Code:  ___ / 5* <br>
**TOTAL: ____ / 8** <br>
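
To make the learning rate and momentum rows concrete, here is a minimal sketch of the momentum update described in the Keras `SGD` documentation (`velocity = momentum * velocity - learning_rate * gradient`, then `w = w + velocity`). It is applied to the toy function f(w) = w², which is an assumption chosen purely for illustration, and `sgd_momentum_path` is a hypothetical helper, not part of the lab code.

```python
def sgd_momentum_path(w0, learning_rate=0.1, momentum=0.1, steps=5):
    """Minimise f(w) = w**2 with SGD plus momentum; return the visited w values."""
    w, velocity = w0, 0.0
    path = [w]
    for _ in range(steps):
        gradient = 2.0 * w  # derivative of w**2
        velocity = momentum * velocity - learning_rate * gradient
        w = w + velocity
        path.append(w)
    return path


path = sgd_momentum_path(1.0)
print(path[:3])  # approximately 1.0, 0.8, 0.62
```

A larger learning rate takes bigger steps per update, while the momentum term carries part of the previous step forward.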

#### Question 3:

What was your final training accuracy? Validation accuracy? Is there overfitting / underfitting? Explain your answer (5 MARKS)

***The final training accuracy is 68% and the validation accuracy is 53%. The model is underfitting: accuracy was still increasing with the epoch count and could be improved much further.***

*FOR GRADER: ______ / 5*

### 3.3 Experimenting with the MLP

Cut and paste your code from Section 3.2 to the box below (you may need to rename your MLP). Experiment with the number of hidden layers, the number of neurons in each hidden layer, the optimization algorithm, etc. See [Keras Optimizers](https://keras.io/optimizers) for the types of optimizers and their parameters. **Train for 100 epochs.**


In [11]:
"""
Cut and paste your code from Section 3.2 below, then modify it to get
much better results than what you had earlier. E.g. increase the number of
nodes in the hidden layer, increase the number of hidden layers,
change the optimizer, etc. 

Train for 100 epochs.

"""
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

# Create the neural network: three hidden layers
nn2 = Sequential()
nn2.add(Dense(2048, activation='relu'))
nn2.add(Dense(1024, activation='relu'))
nn2.add(Dense(1024, activation='relu'))
nn2.add(Dense(10, activation='softmax'))

# Create our optimizer
# (note: 0.1 is far above Adam's default learning rate of 0.001)
adam = Adam(learning_rate=0.1)

# 'Compile' the network to associate it with a loss function,
# an optimizer, and what metrics we want to track
nn2.compile(loss='squared_hinge', optimizer=adam,
            metrics=['accuracy'])

nn2.fit(train_x, train_y, epochs=100,
        validation_data=(test_x, test_y))

Epoch 1/100
...
Epoch 58/100

KeyboardInterrupt: 

----

#### Question 4:

Complete the following table with your final design (you may add more rows for the # neurons (layer1) etc. to detail how many neurons you have in each hidden layer). Likewise you may replace the lr, momentum etc rows with parameters more appropriate to the optimizer that you have chosen. (3 MARKS)


| Hyperparameter       | What I used | Why?                  |
|:---------------------|:------------|:----------------------|
| Optimizer            | Adam        | Adaptive variant of stochastic gradient descent |
| # of hidden layers   | 3           | To improve accuracy |
| # neurons(layer1)    | 2048        | More capacity to improve accuracy |
| Hid layer1 activation| relu        | To mitigate vanishing gradients |
| # neurons(layer2)    | 1024        | More capacity to improve accuracy |
| Hid layer2 activation| relu        | To mitigate vanishing gradients |
| # neurons(layer3)    | 1024        | More capacity to improve accuracy |
| Hid layer3 activation| relu        | To mitigate vanishing gradients |
| # of output neurons  | 10          | One per image class (10 classes) |
| Output activation    | softmax     | Multiclass classification; softmax outputs a probability for each class |
| learning rate        | 0.1         | To balance convergence speed against unstable gradient jumps |
| beta_1, beta_2       | 0.9, 0.999  | Keras defaults for Adam, which takes these instead of a momentum argument |
| decay                | 0           | Default |
| loss                 | squared_hinge | Classification problem |

*FOR GRADER:* <br>
*TABLE: _____ / 3* <br>
*CODE: ______ / 5*<br>

***TOTAL: ______ / 8***

#### Question 5

What are the final training and validation accuracies that you obtained after 100 epochs? Is there considerable improvement over Section 3.2? Are there still signs of underfitting or overfitting? Explain your answer (5 MARKS)

***Because of the time limitation, I stopped training at 58 epochs: the model was stuck in a local minimum and the accuracy was not improving, so it would have been the same after 100 epochs. The neural network did not perform as expected. The accuracy was 10%, which is chance level for 10 classes, so there is no improvement over Section 3.2. The model is not fitting the data: the accuracy is stuck, and the local minimum has to be avoided.***

*FOR GRADER: ______ / 5*

#### Question 6

Write a short reflection on the practical difficulties of using a dense MLP to classify images in the CIFAR-10 dataset. (3 MARKS)

***Training can get stuck in a local minimum, so that the accuracy, loss, and validation accuracy stop changing. Performance with the Adam optimizer was very poor: it resulted in 10% accuracy. Adding multiple layers with more neurons made the network more complex, but instead of increasing the accuracy, it decreased it.***

*FOR GRADER: _______ /3*
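
One practical difficulty can be quantified directly: the number of trainable parameters. The sketch below counts weights and biases for the two MLPs built above, with the layer sizes taken from the code in Sections 3.2 and 3.3. `dense_param_count` is a hypothetical helper written for this illustration.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a fully connected network.

    layer_sizes lists the input size followed by each layer's neuron count.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


mlp_3_2 = dense_param_count([3072, 1024, 10])
mlp_3_3 = dense_param_count([3072, 2048, 1024, 1024, 10])
print(mlp_3_2)  # 3157002
print(mlp_3_3)  # 9451530
```

Most of these parameters sit in the first layer, because every one of the 3072 input values connects to every neuron in the first hidden layer.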

----

## 4. Creating a CNN for the MNIST Dataset

In this section we will now create a convolutional neural network (CNN) to classify images in the MNIST dataset that we used in the previous lab. Let's go through each part to see how to do this.

### 4.1 Loading the MNIST Dataset

As always, we will load the MNIST dataset, scale the inputs to between 0 and 1, and convert the Y labels to one-hot vectors. However, unlike before, we will not flatten the 28x28 image into a 784-element vector, since CNNs can inherently handle 2D data.
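
The two preprocessing steps described above can be sketched in plain Python. This mirrors what dividing by 255 and calling `to_categorical` do in the loader, but the helper names `scale_pixel` and `one_hot` are hypothetical and exist only for this illustration.

```python
def scale_pixel(value):
    """Map an 8-bit pixel intensity (0-255) to the range [0, 1]."""
    return value / 255.0


def one_hot(label, num_classes=10):
    """Return a list with 1.0 at position `label` and 0.0 elsewhere."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


print(scale_pixel(255))  # 1.0
print(one_hot(3))        # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```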

In [12]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

def load_mnist():
    (train_x, train_y),(test_x, test_y) = mnist.load_data()
    train_x = train_x.reshape(train_x.shape[0], 28, 28, 1)
    test_x = test_x.reshape(test_x.shape[0], 28, 28, 1)

    train_x=train_x.astype('float32')
    test_x = test_x.astype('float32')
    
    train_x /= 255.0
    test_x /= 255.0
        
    train_y = to_categorical(train_y, 10)
    test_y = to_categorical(test_y, 10)
        
    return (train_x, train_y), (test_x, test_y) 

### 4.2 Building the CNN

We will now build the CNN. Unlike before we will create a function to produce the CNN. We will also look at how to save and load Keras models using "checkpoints", particularly "ModelCheckpoint" that saves the model each epoch.

Let's begin by creating the model. We call os.path.exists to see if a model file exists, and call "load_model" if it does. Otherwise we create a new model.



In [13]:
# load_model loads a model from a hd5 file.
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
import os

MODEL_NAME = 'mnist-cnn.hd5'

def buildmodel(model_name):
    if os.path.exists(model_name):
        model = load_model(model_name)                                                                                             
    else:
        model = Sequential()
        model.add(Conv2D(32, kernel_size=(5,5),
        activation='relu',
        input_shape=(28, 28, 1), padding='same')) # Question 7

        model.add(MaxPooling2D(pool_size=(2,2), strides=2)) # Question 8
        model.add(Conv2D(64, kernel_size=(5,5), activation='relu'))
        model.add(Conv2D(128, kernel_size=(5,5), activation='relu'))
        model.add(Conv2D(64, kernel_size=(5,5), activation='relu'))
        model.add(MaxPooling2D(pool_size=(2,2), strides=2))
        model.add(Flatten()) # Question 9
        model.add(Dense(1024, activation='relu'))
        model.add(Dropout(0.1))
        model.add(Dense(10, activation='softmax'))

    return model



----

#### Question 7

The first layer in our CNN is a 2D convolution kernel, shown here:

```
        model.add(Conv2D(32, kernel_size=(5,5),
        activation='relu',
        input_shape=(28, 28, 1), padding='same')) # Question 7
```

Why is the input_shape set to (28, 28, 1)? What does this mean? What does "padding = 'same'" mean? (4 MARKS)

***The MNIST images are 28x28 pixels. (28, 28, 1) means that each image is 28 pixels high, 28 pixels wide, and has 1 channel, i.e. it is grayscale rather than RGB, which has 3 channels. padding='same' pads the borders of the input with zeros so that, with a stride of 1, the output feature map has the same height and width as the input (28x28 here). "SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it adds the extra column to the right (the same logic applies vertically: there may be an extra row of zeros at the bottom).***

*FOR GRADER: ______ / 4*
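
The effect of `padding='same'` can be checked with a little arithmetic. The sketch below follows the rule given in the TensorFlow documentation for "SAME" padding, applied along one spatial dimension; the function names are hypothetical.

```python
import math


def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one spatial dimension of a convolution."""
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1


def same_padding(n, kernel, stride=1):
    """Zeros added (before, after) along one dimension for 'same' padding."""
    out = math.ceil(n / stride)
    total = max((out - 1) * stride + kernel - n, 0)
    before = total // 2
    return before, total - before  # any odd extra zero goes at the end


print(conv_output_size(28, 5, padding="same"))   # 28
print(conv_output_size(28, 5, padding="valid"))  # 24
print(same_padding(28, 5))                       # (2, 2)
print(same_padding(28, 4))                       # (1, 2)
```

For the 5x5 kernel used in this lab the padding is symmetric (two zeros on each side); the uneven case only arises when the total padding is odd, as with the 4-wide kernel in the last line.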

#### Question 8

The second layer is the MaxPooling2D layer shown below:

```
        model.add(MaxPooling2D(pool_size=(2,2), strides=2)) # Question 8
```

What other types of pooling layers are available? What does 'strides = 2' mean? (3 MARKS)

***
- MaxPooling1D layer
- MaxPooling2D layer
- MaxPooling3D layer
- AveragePooling1D layer
- AveragePooling2D layer
- AveragePooling3D layer
- GlobalMaxPooling1D layer
- GlobalMaxPooling2D layer
- GlobalMaxPooling3D layer
- GlobalAveragePooling1D layer
- GlobalAveragePooling2D layer
- GlobalAveragePooling3D layer

----
Strides = 2 means that the pooling window moves 2 pixels at a time as it scans the image, so with a 2x2 window the height and width of the output are halved.***

*FOR GRADER: _____ / 3*
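
A small worked example may help with the meaning of the stride. The sketch below applies 2x2 max pooling to a 4x4 grid in plain Python, once with a stride of 2 and once with a stride of 1. `max_pool_2d` is a hypothetical helper, not the Keras implementation.

```python
def max_pool_2d(image, pool=2, stride=2):
    """Max-pool a 2-D list of numbers (no padding)."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - pool + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(pool) for j in range(pool)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


image = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool_2d(image))            # [[6, 5], [7, 9]]
print(max_pool_2d(image, stride=1))  # [[6, 6, 5], [7, 9, 9], [7, 9, 9]]
```

With a stride of 2 the windows do not overlap and the 4x4 grid shrinks to 2x2; with a stride of 1 the windows overlap and the output is 3x3.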


#### Question 9

What does the "Flatten" layer here do? Why is it needed? (2 MARKS)

```
        model.add(Flatten()) # Question 9
```

***Flattening converts the data into a 1-dimensional array so that it can be fed to the next layer. The convolutional and pooling layers output a stack of 2-dimensional feature maps, whereas a Dense layer expects a single feature vector per image, so Flatten converts the pooled feature maps into one long vector. This flattened vector is the input to the fully connected layers, which classify the image.***

*FOR GRADER: ____ / 2*
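
To see what `Flatten` receives in this model, the sketch below traces the feature-map size through the layers defined in `buildmodel`. It assumes the Keras defaults of stride 1 and `padding='valid'` for the convolutions that do not specify them; the helper names are hypothetical.

```python
def conv_size(n, kernel, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return n if padding == "same" else n - kernel + 1


def pool_size(n, pool=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (n - pool) // stride + 1


size = 28
size = conv_size(size, 5, padding="same")   # Conv2D(32), same -> 28
size = pool_size(size)                      # MaxPooling2D     -> 14
size = conv_size(size, 5)                   # Conv2D(64)       -> 10
size = conv_size(size, 5)                   # Conv2D(128)      -> 6
size = conv_size(size, 5)                   # Conv2D(64)       -> 2
size = pool_size(size)                      # MaxPooling2D     -> 1

channels = 64                               # filters in the last Conv2D
flattened_length = size * size * channels
print(size, flattened_length)               # 1 64
```

Under these assumptions, `Flatten` turns a 1x1x64 feature map into a 64-element vector for the first Dense layer.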




----

### 4.3 Training the CNN

Let's now train the CNN. In this example we introduce the idea of a "callback", which is a routine that Keras calls at the end of each epoch. Specifically we look at two callbacks:

    1. ModelCheckpoint: When called, Keras saves the model to the specified filename.
    
    2. EarlyStopping: When called, Keras checks if it should stop the training prematurely.
    

Let's look at the code to see how training is done, and how callbacks are used.

In [14]:
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

def train(model, train_x, train_y, epochs, test_x, test_y, model_name):

    model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.7), 
                  loss='categorical_crossentropy', metrics=['accuracy'])

    savemodel = ModelCheckpoint(model_name)
    stopmodel = EarlyStopping(min_delta=0.001, patience=10) # Question 10

    print("Starting training.")

    model.fit(x=train_x, y=train_y, batch_size=32,
    validation_data=(test_x, test_y), shuffle=True,
    epochs=epochs, 
    callbacks=[savemodel, stopmodel])

    print("Done. Now evaluating.")
    loss, acc = model.evaluate(x=test_x, y=test_y)
    print("Test accuracy: %3.2f, loss: %3.2f"%(acc, loss))

Notice that nothing very unusual is going on: we compile the model with our loss function and optimizer, then call fit, and finally call evaluate to look at the final accuracy on the test set. The only unusual thing is the "callbacks" parameter in the fit function call:

```
    model.fit(x=train_x, y=train_y, batch_size=32,
    validation_data=(test_x, test_y), shuffle=True,
    epochs=epochs, 
    callbacks=[savemodel, stopmodel])
```

----

#### Question 10.

What do the min_delta and patience parameters do in the EarlyStopping callback, as shown below? (2 MARKS)

```
    stopmodel = EarlyStopping(min_delta=0.001, patience=10) # Question 10
```

***EarlyStopping stops training when a monitored metric has stopped improving. min_delta: the minimum change in the monitored quantity that qualifies as an improvement; an absolute change of less than min_delta counts as no improvement.
patience: the number of epochs with no improvement after which training is stopped.***

*FOR GRADER: ______ / 2*
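
The interaction between `min_delta` and `patience` can be illustrated with a simplified model of the stopping rule. This is a sketch of the behaviour described above, not the Keras source; it assumes the monitored quantity is a loss (lower is better), which matches the `EarlyStopping` default of monitoring `val_loss`. The function name and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=10):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improved by more than min_delta: record it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation losses: large gains first, then tiny ones.
losses = [1.0, 0.5, 0.4, 0.3999, 0.3998, 0.3997]
print(early_stopping_epoch(losses, min_delta=0.001, patience=3))   # 6
print(early_stopping_epoch(losses, min_delta=0.001, patience=10))  # None
```

The last three epochs each improve the loss by only 0.0001, which is below `min_delta`, so they count as epochs without improvement.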

---

### 4.4 Putting it together.

Now let's run the code and see how it goes (Note: To save time we are training for only 5 epochs; we should train much longer to get much better results):

In [15]:
    (train_x, train_y),(test_x, test_y) = load_mnist()
    model = buildmodel(MODEL_NAME)
    train(model, train_x, train_y, 5, test_x, test_y, MODEL_NAME)
    

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Starting training.
Epoch 1/5
INFO:tensorflow:Assets written to: mnist-cnn.hd5/assets
Epoch 2/5
INFO:tensorflow:Assets written to: mnist-cnn.hd5/assets
Epoch 3/5
INFO:tensorflow:Assets written to: mnist-cnn.hd5/assets
Epoch 4/5
INFO:tensorflow:Assets written to: mnist-cnn.hd5/assets
Epoch 5/5
INFO:tensorflow:Assets written to: mnist-cnn.hd5/assets
Done. Now evaluating.
Test accuracy: 0.99, loss: 0.03


----

#### Question 11.

Compare the relative advantages and disadvantages of the CNN vs. the Dense MLP that you built in sections 3.2 and 3.3. What makes CNNs better (or worse)? (3 MARKS)

***
- CNNs are mostly used for image data, whereas MLPs are better suited to structured data.
- A CNN has fewer parameters, because its convolution kernels are shared across the image and its pooling layers reduce the image dimensions; in an MLP, the number of parameters depends on the size of the input data.
- A CNN is more complex in structure, whereas an MLP is relatively simple.
- A CNN uses special convolution and pooling layers, whereas an MLP is just a network of fully connected neurons.
- CNNs are generally used for larger, bulkier data than MLPs.

The accuracy is almost 100%: CNNs perform very well for image classification.***

*FOR GRADER: ______ / 3*
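
The claim about parameter counts can be checked with a short calculation. The sketch below counts the trainable parameters of the MNIST CNN from Section 4.2 and compares its first convolution with a dense layer applied directly to the 784 pixels. The helper names are hypothetical, and the flattened length of 64 assumes the Keras defaults of stride 1 and `padding='valid'` for the later convolutions.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out


first_conv = conv2d_params(5, 1, 32)
dense_on_pixels = dense_params(28 * 28, 1024)

cnn_total = (
    conv2d_params(5, 1, 32)       # Conv2D(32)
    + conv2d_params(5, 32, 64)    # Conv2D(64)
    + conv2d_params(5, 64, 128)   # Conv2D(128)
    + conv2d_params(5, 128, 64)   # Conv2D(64)
    + dense_params(64, 1024)      # Dense(1024) on the flattened 64 values
    + dense_params(1024, 10)      # Dense(10)
)

print(first_conv)       # 832
print(dense_on_pixels)  # 803840
print(cnn_total)        # 538698
```

A convolution's parameter count depends on the kernel size and channel counts, not on the image size, which is why the first convolutional layer is so much smaller than a dense layer on the raw pixels.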

## 5. Creating a CNN for the CIFAR-10 Dataset

Now comes the fun part: using the example above for creating a CNN for the MNIST dataset, create a CNN in the box below for the CIFAR-10 dataset. At the end of each epoch save the model to a file called "cifar.hd5" (note: the .hd5 is added automatically for you).

---

#### Question 12.

Summarize your design in the table below (the actual coding cell comes after this):

| Hyperparameter       | What I used | Why?                  |
|:---------------------|:------------|:----------------------|
| Optimizer            | adam                          | Adaptive variant of stochastic gradient descent |
| Input shape          | (32,32,3)                     | height = 32, width = 32, channels = 3 (RGB) |
| First layer          | Conv2D(64,(3,3))              |                       |
| Second layer         | Conv2D(64,(3,3))              |                       |
| Maxpool layer        | MaxPooling2D(pool_size=(2,2)) |                       |
| Third layer          | Conv2D(128,(3,3))             |                       |
| Fourth layer         | Conv2D(128,(3,3))             |                       |
| Maxpool layer        | MaxPooling2D(pool_size=(2,2)) |                       |
| Fifth layer          | Conv2D(256,(3,3))             |                       |
| Maxpool layer        | MaxPooling2D(pool_size=(2,2)) |                       |
| Dropout layers       | Dropout(0.25) after each Maxpool layer |              |
| Flatten layer        | Flatten()                     | To flatten the feature maps into a vector for the dense layers |
| Dense layer 1        | 128                           |                       |
| Dense layer 2        | 100                           |                       |
| Dense layer 3        | 80                            |                       |
| Dense layer 4        | 10                            | Number of output classes |


*FOR GRADER:* <br>
*TABLE: ________ / 3* <br>
*CODE: _________/ 7* <br>
**TOTAL: _______ / 10** <br>

---

***TOTAL: _______ / 55***

In [27]:
"""
Write your code for your CNN for the CIFAR-10 dataset here. 

Note: train_x, train_y, test_x, test_y were changed when we called 
load_mnist in the previous section. You will now need to call load_cifar10
again.

"""
from tensorflow import keras
from tensorflow.keras import models, layers
from tensorflow.keras.layers import Dropout
from tensorflow.keras.utils import to_categorical

# Loading the dataset
Cifar10=keras.datasets.cifar10 
(X_train,y_train),(X_test,y_test)= Cifar10.load_data()


class_names =['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

# One hot Encoding
y_train=to_categorical(y_train)
y_test=to_categorical(y_test)
print('---------------------------------------------------------------------------')
# After one hot Encoding
print('Shapes of training and test sets are:')
print((y_train.shape, y_train[0]))
print((y_test.shape, y_test[1]))    


print('---------------------------------------------------------------------------')

# Creating the convolutional neural network
# Start from an empty sequential model
model=models.Sequential()
# Add convolutional layers with the ReLU activation function;
# only the first layer needs input_shape
model.add(layers.Conv2D(64,(3,3),input_shape=(32,32,3),activation='relu'))
model.add(layers.Conv2D(64,(3,3),activation='relu'))
# Max pooling layer
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
# Second Convolution 
model.add(layers.Conv2D(128,(3,3),activation='relu'))
model.add(layers.Conv2D(128,(3,3),activation='relu'))
# Max pooling layer
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
# Third convolutional 
model.add(layers.Conv2D(256,(3,3),activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
# Flatten layer (the input shape is inferred from the previous layer)
model.add(layers.Flatten())
# Classification segment 
model.add(layers.Dense(128, activation='relu')) 
model.add(layers.Dense(100, activation='relu'))
model.add(layers.Dense(80, activation='relu')) 

# Adding final output layer to the neural network
model.add(layers.Dense(10, activation='softmax')) 

# Compiling the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) 
#model.summary()

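# Training the convolutional neural network and evaluating the accuracy.
# The loaded arrays already have shape (N, 32, 32, 3), so no reshape is needed;
# scale the pixel values to [0, 1] as load_cifar10 does.
X_train2 = X_train.astype('float32') / 255.0
X_test2 = X_test.astype('float32') / 255.0

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

# Save the model at the end of each epoch, as the lab instructions require.
from tensorflow.keras.callbacks import ModelCheckpoint
savemodel = ModelCheckpoint('cifar.hd5')
model.fit(X_train2, y_train, epochs=40, batch_size=56, verbose=True,
          validation_data=(X_test2, y_test), callbacks=[savemodel])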

train_loss, training_accuracy = model.evaluate(X_train2, y_train)
test_loss, test_accuracy = model.evaluate(X_test2, y_test)
print('-' * 75)
print("Accuracy of the convolutional neural network on the training set:", training_accuracy*100, '%')
print('-' * 75)
print("Accuracy of the convolutional neural network on the test set:", test_accuracy*100, '%')


---------------------------------------------------------------------------
Shapes of training and test sets are:
((50000, 10), array([0., 0., 0., 0., 0., 0., 1., 0., 0., 0.], dtype=float32))
((10000, 10), array([0., 0., 0., 0., 0., 0., 0., 0., 1., 0.], dtype=float32))
---------------------------------------------------------------------------
(50000, 32, 32, 3)
(10000, 32, 32, 3)
(50000, 10)
(10000, 10)
Epoch 1/40
Epoch 2/40
Epoch 3/40

KeyboardInterrupt: 