# Bonus assignment

## Instructions
- Your submission should be the `.ipynb` file with your name,
  like `FirstNameLastName.ipynb`. It should include the answers to the questions in
  markdown cells.
- You are expected to follow the best practices for code writing and model
training. Poor coding style will be penalized.
- You are allowed to discuss ideas with your peers, but no sharing of code.
Plagiarism in the code will result in failing. If you use code from the
internet, cite it. 
- Read each instruction carefully and provide complete answers to each question/task.
- You are allowed to use Keras or PyTorch.

> **_NOTE:_**  Write your email address in the cell below

m.rabotiaev@innopolis.university

### I- Open questions (3 points)

Read [this article](https://link.springer.com/referenceworkentry/10.1007/978-0-387-73003-5_304) and answer the following questions:

1. What is incremental learning?
    - Incremental learning is an approach in which input data is continuously used to train the model and extend its knowledge of the data. It is considered a dynamic technique of supervised and unsupervised learning that can be applied when the data becomes available gradually.
2. Why is it important for us to create neural networks that would someday be able to learn incrementally?
    - There are five main reasons. First, it might not be possible to store all the information needed to train the model. Second, even if all the necessary data can be collected, the model may not be able to process it all at once, so the data has to be fed to it sequentially. Third, when new examples become available, learning from scratch wastes time and computational resources. Fourth, if example generation is time-dependent, it naturally suits the incremental learning style. Last but not least, there is "concept shift": the learner should be able to adapt itself to a changing learning environment.
3. What is catastrophic forgetting?
    - Catastrophic forgetting is the tendency of a neural network to abruptly and drastically forget previously learned information when it learns new information. One goal of incremental learning is therefore to avoid such catastrophic forgetting.
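
To make the idea of learning from gradually arriving data concrete, here is a minimal sketch of an incremental learner. It is not a neural network and is not taken from the article: the class `IncrementalClassMeans` and its `partial_fit` method are hypothetical names chosen for illustration, and the data is synthetic. The model keeps only a running sum and count per class, so each batch can be discarded after it has been processed.

```python
import numpy as np


class IncrementalClassMeans:
    """Nearest-class-mean classifier updated one batch at a time (illustrative)."""

    def __init__(self):
        self.sums = {}    # class label -> running sum of feature vectors
        self.counts = {}  # class label -> number of samples seen so far

    def partial_fit(self, x, y):
        """Fold a new batch into the running statistics without revisiting old data."""
        for label in np.unique(y):
            batch = x[y == label]
            key = int(label)
            self.sums[key] = self.sums.get(key, 0.0) + batch.sum(axis=0)
            self.counts[key] = self.counts.get(key, 0) + len(batch)
        return self

    def predict(self, x):
        """Assign each sample to the class whose running mean is closest."""
        labels = sorted(self.sums)
        means = np.stack([self.sums[c] / self.counts[c] for c in labels])
        distances = np.linalg.norm(x[:, None, :] - means[None, :, :], axis=2)
        return np.array(labels)[distances.argmin(axis=1)]


# Synthetic 2-D data standing in for real features: three well-separated classes.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
y_all = np.repeat(np.arange(3), 50)
x_all = centers[y_all] + rng.normal(scale=0.5, size=(150, 2))

# Classes 0 and 1 arrive first; class 2 arrives later as a separate batch.
first = y_all < 2
model = IncrementalClassMeans()
model.partial_fit(x_all[first], y_all[first])
model.partial_fit(x_all[~first], y_all[~first])

accuracy = float((model.predict(x_all) == y_all).mean())
print("accuracy on all three classes:", accuracy)
```

Because the statistics of each class are stored separately, adding class 2 leaves the means of classes 0 and 1 untouched. A neural network shares its weights across all classes, so an update for new classes can overwrite what was learned for old ones, which is where catastrophic forgetting comes from.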

### II- Train simple CNN model for digit classification (5 points)

Instructions:
- Load the MNIST dataset and split it into a **Tr**aining set (`Tr`) and a **Te**sting set (`Te`), 80% and 20% respectively.
- Train a simple CNN for digit classification on the training set.
- After fine-tuning your CNN, evaluate the `overall` and the `class-wise` performances on `Te`.
>**NOTE:** For the class-wise performance, you should plot (e.g., bar plots) the performance of your model on each class.
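
One way to obtain the 80%/20% split is to shuffle all samples once and cut the shuffled order at the 80% mark. The sketch below is only an illustration under assumptions: `split_80_20` is a hypothetical helper, and small zero-filled arrays stand in for the images so that it runs without downloading anything. With real data, the inputs would be the concatenation of the two parts returned by `mnist.load_data()`.

```python
import numpy as np


def split_80_20(x, y, train_fraction=0.8, seed=0):
    """Shuffle the samples once, then cut them into training and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = int(round(train_fraction * len(x)))
    train_idx, test_idx = order[:cut], order[cut:]
    return (x[train_idx], y[train_idx]), (x[test_idx], y[test_idx])


# Stand-in arrays with MNIST-like shapes (assumption: 100 samples, labels 0-9).
x_all = np.zeros((100, 28, 28), dtype=np.uint8)
y_all = np.arange(100) % 10

(x_tr, y_tr), (x_te, y_te) = split_80_20(x_all, y_all)
print(x_tr.shape, x_te.shape)
```

Fixing the seed keeps the split reproducible between runs, which matters when results on `Te` are compared across experiments.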

In [1]:
import tensorflow as tf
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape and normalize the input data
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

# Convert the labels to one-hot encoding
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Split the data into training and validation sets
x_val = x_train[:12000]
y_val = y_train[:12000]
x_train = x_train[12000:]
y_train = y_train[12000:]

# Define the model architecture
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          verbose=1,
          validation_data=(x_val, y_val))

# Evaluate the model on the test set
score = model.evaluate(x_test, y_test, verbose=0)

# Print the overall performance
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# Predict class labels on the test set (needed for the class-wise performance)
y_pred = model.predict(x_test)
y_pred = np.argmax(y_pred, axis=1)

2022-12-11 18:01:44.020838: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test loss: 0.030797217041254044
Test accuracy: 0.9912999868392944
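
The cell above computes `y_pred` but reports only the overall score. The class-wise performance could be derived from the predicted labels roughly as sketched below. This is an illustration, not the author's evaluation: `class_wise_accuracy` is a hypothetical helper and the labels are toy values. With the cell above, `y_true` would be `np.argmax(y_test, axis=1)`, since `y_test` is one-hot encoded.

```python
import numpy as np


def class_wise_accuracy(y_true, y_pred, num_classes=10):
    """Return the fraction of correctly classified samples for each class."""
    accuracies = np.full(num_classes, np.nan)  # NaN marks classes with no samples
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            accuracies[c] = np.mean(y_pred[mask] == c)
    return accuracies


# Toy integer labels for illustration (three classes only).
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 2])

per_class = class_wise_accuracy(y_true, y_pred, num_classes=3)
# class 0: 1 of 2 correct, class 1: 2 of 2 correct, class 2: 3 of 4 correct
print(per_class)
```

The resulting array can be passed to `matplotlib.pyplot.bar(range(len(per_class)), per_class)` to produce the bar plot requested in the note.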


### III- Create different tasks from the MNIST dataset (2 points)

Split `Tr` into 3 datasets (tasks) according to the following distribution.

- Task 1 contains digits of classes 0, 1, and 2. 
- Task 2 contains classes 3, 4, and 5. 
- Task 3 contains classes 6, 7, 8, and 9.
 
*The following picture showcases the general scheme*
<center>
<img src='https://drive.google.com/uc?id=1vdDgdN9BGQ2Jl3Yg4YiPvfb5fcAeJZJ-' style="width:500px;"> 
</center>
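
The class groups listed above can also be selected with boolean masks instead of a Python loop. The following is a sketch under assumptions: `TASK_CLASSES` and `split_into_tasks` are hypothetical names, the labels are assumed to be integers (not one-hot), and zero-filled arrays stand in for `Tr`.

```python
import numpy as np

# Class groups taken from the task description above.
TASK_CLASSES = {1: [0, 1, 2], 2: [3, 4, 5], 3: [6, 7, 8, 9]}


def split_into_tasks(x, y, task_classes=TASK_CLASSES):
    """Return {task_id: (x_task, y_task)} holding only that task's classes."""
    tasks = {}
    for task_id, classes in task_classes.items():
        mask = np.isin(y, classes)  # True where the label belongs to this task
        tasks[task_id] = (x[mask], y[mask])
    return tasks


# Stand-in data with MNIST-like shapes: 20 samples, two per digit.
x = np.zeros((20, 28, 28, 1), dtype=np.float32)
y = np.arange(20) % 10

tasks = split_into_tasks(x, y)
for task_id, (x_task, y_task) in tasks.items():
    print(task_id, x_task.shape, sorted(set(y_task.tolist())))
```

Applying the masks to arrays that are already reshaped and normalized means every task inherits the same preprocessing as the full training set.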


In [8]:
import tensorflow as tf
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K


# load the data from mnist 
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# create our three tasks
dataset1_x = []
dataset1_y = []
dataset2_x = []
dataset2_y = []
dataset3_x = []
dataset3_y = []

# append to different datasets
for i in range(len(y_train)):
    if y_train[i] in [0, 1, 2]:
        dataset1_x.append(x_train[i])
        dataset1_y.append(y_train[i])
    elif y_train[i] in [3, 4, 5]:
        dataset2_x.append(x_train[i])
        dataset2_y.append(y_train[i])
    else:
        dataset3_x.append(x_train[i])
        dataset3_y.append(y_train[i])


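# Reshape and normalize the input data
# (this affects only x_train and x_test; the task lists built above keep the raw 28x28 uint8 images)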
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
