<a href="https://colab.research.google.com/github/qidopox/Deep_Learning_for_image_processing_practice/blob/main/DL_practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Deep Learning for image processing practice

Convolutional neural networks (CNNs) are commonly used for image processing. Common applications include image classification, segmentation and denoising. In this practice, we will train a neural network to classify handwritten digits (MNIST dataset). We will then use two different trained neural networks for two separate tasks: image segmentation and denoising.

## Practice 1: Train a neural network working on MNIST dataset for classifying handwritten numbers

We first import the relevant Python packages. In this case, we use TensorFlow and Keras to build and train neural networks. The matplotlib package is used for plotting figures. The datetime package provides access to the date and time, which we use later to timestamp the TensorBoard log directory.

In [None]:
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt
import datetime


The MNIST dataset is included in the Keras datasets and can therefore be loaded directly from Keras.

As the network works with float32 inputs, we convert the input images, which are in uint8 format, to float32. Dividing by 255 also rescales the pixel values from the range 0-255 to the range 0-1.

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = tf.cast(x_train, tf.float32) / 255
x_test = tf.cast(x_test, tf.float32) / 255
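
As a small illustration of the division by 255 in the cell above, the sketch below maps a few 8-bit pixel values to floats between 0 and 1 in plain Python. The list `raw_pixels` is made up for this example and is not taken from MNIST.

```python
# Hypothetical 8-bit pixel intensities (0 = black, 255 = white).
raw_pixels = [0, 51, 128, 255]

# Dividing by 255 rescales each value into the range [0, 1].
scaled = [p / 255 for p in raw_pixels]

print(scaled)
```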

We print the shapes of the training and test datasets. There are 60,000 examples in the training dataset and 10,000 in the test dataset. Each input image is 28 by 28 pixels.

In [None]:
print('training data input shape:',x_train.shape,'\n training data label shape',y_train.shape,'\n test data input shape:',x_test.shape,'\n test data label shape:',y_test.shape)

We plot an example from the training dataset and check that its label is correct.

In [None]:
i = 0
plt.imshow(x_train[i,:,:])
plt.show()
print('label of the figure is ', y_train[i])

We construct a neural network whose input has the same size as the images and whose output has 10 units, corresponding to the 10 different digits, 0-9.

In [None]:
inputs = keras.Input(shape=(x_train.shape[1], x_train.shape[2]))
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(128, activation=tf.nn.relu)(x)
outputs = keras.layers.Dense(10, activation=tf.nn.softmax)(x)
model= keras.Model(inputs=inputs, outputs=outputs,name='mnist_classification_fully_corrected')

We can print out a summary of the neural network.

In [None]:
model.summary()
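
The parameter counts reported by the summary can be checked by hand. The following is a minimal sketch, assuming the two Dense layers defined above (128 and 10 units) and a flattened 28 by 28 input. The helper name `dense_params` is introduced here only for illustration.

```python
def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: one weight per input-output pair plus one bias per output."""
    return n_in * n_out + n_out

flat_size = 28 * 28                      # Flatten turns each 28x28 image into 784 values
hidden = dense_params(flat_size, 128)    # 784 * 128 + 128 = 100480
output = dense_params(128, 10)           # 128 * 10 + 10 = 1290
total = hidden + output                  # 101770

print('hidden layer:', hidden)
print('output layer:', output)
print('total:', total)
```

The Flatten layer itself has no trainable parameters, so the total comes entirely from the two Dense layers.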

We select "Adam" as the optimisation algorithm for training the neural network. The loss function is "SparseCategoricalCrossentropy". We also use TensorBoard to monitor the training process.

For details of "Adam", see https://arxiv.org/abs/1412.6980

For details of "SparseCategoricalCrossentropy", see https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy

In [None]:
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              # The output layer already applies softmax, so the loss receives probabilities, not logits.
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],)
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
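
To give a feel for what the loss measures, here is a minimal plain-Python sketch of sparse categorical cross-entropy for a single example. It assumes the loss is evaluated on class probabilities (such as a softmax output) and an integer label. The probability vector `probs` and the function name are made up for illustration and are not produced by the model above.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one example: negative log of the probability assigned to the true class."""
    return -math.log(probs[label])

# Hypothetical output for a 10-class problem: most of the probability mass is on digit 3.
probs = [0.01] * 10
probs[3] = 0.91

loss_if_label_is_3 = sparse_categorical_crossentropy(probs, 3)   # small: confident and correct
loss_if_label_is_7 = sparse_categorical_crossentropy(probs, 7)   # large: confident and wrong

print(round(loss_if_label_is_3, 4), round(loss_if_label_is_7, 4))
```

The label is given as a single integer rather than a one-hot vector, which is what "sparse" refers to in the name of the loss.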

We pass x_train as the training inputs and y_train as the training labels, and train for 30 epochs. We pass x_test and y_test as the validation inputs and labels.

By running this cell, you will start to train your neural network.

In [None]:
model.fit(
    x=x_train,y=y_train,
    epochs=30,
    validation_data=(x_test,y_test),
    callbacks=[tensorboard_callback],
)
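
The progress bar printed during training counts batches rather than images. The sketch below works out how many weight updates that implies, assuming the Keras default batch size of 32 (the call above does not set `batch_size`) and the 60,000 training examples mentioned earlier.

```python
import math

n_train = 60000      # number of training examples in MNIST
batch_size = 32      # assumption: Keras default when batch_size is not given
epochs = 30

steps_per_epoch = math.ceil(n_train / batch_size)   # batches processed in one pass over the data
total_updates = steps_per_epoch * epochs            # weight updates over the whole run

print('steps per epoch:', steps_per_epoch)
print('total updates:', total_updates)
```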

Open tensorboard to monitor the network training.

In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/fit

The trained network model is saved in the folder "models" as an h5 file.

In [None]:
model.save('./models/'+model.name+'_.h5')

We load the trained model and use it to classify a handwritten digit from the test dataset.

In [None]:
model = keras.models.load_model('./models/'+model.name+'_.h5', compile=False)
i = 1
# x_test is a TensorFlow tensor, so we slice with i:i+1 instead of calling .reshape;
# this keeps the batch dimension and gives an input of shape (1, 28, 28).
y_pred = model.predict(x_test[i:i+1])

print('label:',y_test[i],'\n prediction:',y_pred)
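
The prediction is a vector of 10 values, one per digit, and the predicted digit is the index of the largest value. A minimal plain-Python sketch of that step follows. The list `y_pred_row` is a made-up stand-in for one row of the model output (for example `y_pred[0]`), not an actual result.

```python
# Hypothetical softmax output for one image: one probability per digit 0-9.
y_pred_row = [0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

# The predicted digit is the index with the highest probability (argmax).
predicted_digit = max(range(len(y_pred_row)), key=lambda d: y_pred_row[d])

print('predicted digit:', predicted_digit)
```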

## Practice 2: Train a convolutional neural network working on MNIST dataset for classifying handwritten numbers

Let's now construct a CNN for the same task. Please compare the architectures and the performances of the two types of networks.

In [None]:
inputs = keras.Input(shape=(x_train.shape[1], x_train.shape[2],1))
x = keras.layers.Conv2D(4,(3, 3),activation=tf.nn.relu,
                  kernel_initializer="glorot_uniform",
                  padding="same",name="conv1",)(inputs)
x = keras.layers.MaxPooling2D((2, 2), strides=(2, 2), name="pool1")(x)
x = keras.layers.Conv2D(8,(3, 3),activation=tf.nn.relu,
                  kernel_initializer="glorot_uniform",
                  padding="same",name="conv2",)(x)
x = keras.layers.MaxPooling2D((2, 2), strides=(2, 2), name="pool2")(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense(16, activation=tf.nn.relu)(x)
outputs = keras.layers.Dense(10, activation=tf.nn.softmax)(x)
model_CNN = keras.Model(inputs=inputs, outputs=outputs)


In [None]:
model_CNN.summary()
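
When comparing the two architectures, it can help to work out the feature-map sizes and parameter counts of the CNN by hand. The sketch below assumes the layers defined above: 3x3 convolutions with "same" padding (which keep the spatial size) and 2x2 max pooling with stride 2 (which halves it). The helper names are introduced here only for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """One kernel_h x kernel_w x in_channels kernel per filter, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(n_in, n_out):
    """One weight per input-output pair plus one bias per output."""
    return n_in * n_out + n_out

size = 28                               # input is 28 x 28 x 1
conv1 = conv2d_params(3, 3, 1, 4)       # "same" padding keeps 28 x 28, now with 4 channels
size //= 2                              # pool1: 14 x 14 x 4
conv2 = conv2d_params(3, 3, 4, 8)       # 14 x 14, now with 8 channels
size //= 2                              # pool2: 7 x 7 x 8
flat_size = size * size * 8             # Flatten: 392 values
dense1 = dense_params(flat_size, 16)
dense2 = dense_params(16, 10)
total = conv1 + conv2 + dense1 + dense2

print('conv1:', conv1, 'conv2:', conv2)
print('dense:', dense1, 'output:', dense2)
print('total:', total)
```

Under these assumptions the CNN has 6,794 parameters, far fewer than the fully connected network in Practice 1, because each convolution kernel is shared across all image positions.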

In [None]:
model_CNN.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              # The output layer already applies softmax, so the loss receives probabilities, not logits.
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],)
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

In [None]:
model_CNN.fit(
    x=x_train,y=y_train,
    epochs=30,
    validation_data=(x_test,y_test),
    callbacks=[tensorboard_callback],
)

Please take a look at the loss plot (epoch_loss) in TensorBoard. What are the differences between the two plots? What do you think causes these differences?
If you repeat the process and retrain the network, do you obtain the same plots in TensorBoard?

## Practice 3: Train a U-net for image segmentation through transfer learning