<a href="https://colab.research.google.com/github/MHadavand/Lessons/blob/master/ML/ANN/GPU_CNN/CNN-MNIST-Dataset_CPU.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<h1 align="center"><font size="5">CONVOLUTIONAL NEURAL NETWORK</font></h1>

<h2>Introduction</h2>

<p>In this section, we will use the famous <a href="http://yann.lecun.com/exdb/mnist/">MNIST</a> dataset to build a Convolutional Neural Network (CNN) capable of classifying handwritten digits. Given an input image, the CNN will say, with some associated error, which digit it represents.</p>

<hr>

<div class="alert alert-block alert-info" style="margin-top: 20px">
<font size = 3><strong>Click on the links to go to the following sections:</strong></font>
<br>
<h2>Table of Contents</h2>
<ol>
    <li><a href="#deep_learning_MNIST">Deep Learning applied on MNIST</a></li>
    <li><a href="#summary_CNN">Summary of the Deep Convolutional Neural Network</a></li>
    <li><a href="#train_model">Define functions and train the model</a></li>
    <li><a href="#evaluate_model">Evaluate the model</a></li>
</ol>    
</div>

<hr>

<h2 id="deep_learning_MNIST">Deep Learning applied on MNIST</h2>

<p>We are going to create a simple CNN to perform classification tasks on the MNIST digits dataset. If you are not familiar with the MNIST dataset, you can read more about it <a href="http://yann.lecun.com/exdb/mnist/">here</a>.</p>

<h3>What is MNIST?</h3>

<p>According to Lecun's website, the MNIST is a: "database of handwritten digits that has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from the National Institute of Standards and Technology (NIST). The digits have been size-normalized and centered in a fixed-size image".</p>

<h3>Import the MNIST dataset using TensorFlow built-in feature</h3>

<p>It is important to note that MNIST is a highly optimized dataset that stores pixel values rather than image files. You will need to write your own code if you want to see the actual digits. Another important side note is the effort that the authors invested in this dataset through normalization and centering operations.</p>

In [None]:
import time
import tensorflow.compat.v1 as tf  # use the TensorFlow 1.x API under TensorFlow 2.x
tf.disable_v2_behavior()
mnist = tf.keras.datasets.mnist

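<p>One-hot encoding means that, in contrast to a binary representation, each label is a vector in which only one bit is on for a specific digit. The legacy MNIST loader exposed this through its <code>one_hot=True</code> argument. Note that <code>tf.keras.datasets.mnist.load_data()</code> returns integer labels and 28x28 uint8 images, so the labels have to be one-hot encoded and the images flattened and scaled before they match the placeholders defined below.</p>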
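
<p>As a minimal sketch of what one-hot encoding does, the NumPy snippet below converts integer digit labels into one-hot rows. The helper name <code>to_one_hot</code> is hypothetical and used only for illustration; it is not part of TensorFlow.</p>

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Return a [len(labels), num_classes] float32 array with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Fancy indexing sets exactly one entry per row: (row i, column labels[i]).
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

digits = [3, 0, 9]
one_hot_labels = to_one_hot(digits)
print(one_hot_labels[0])  # the row for digit 3 has its only 1.0 at index 3
```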

<h3>Understanding the imported data</h3>

The imported data can be divided as follows:

<ul>
    <li>Training (mnist.train): Use the given dataset, with inputs and related outputs, to train the neural network. In our case, if you give an image that you know represents a "nine", this set will tell the neural network that we expect a "nine" as the output.
        <ul>
            <li>55,000 data points</li>
            <li>mnist.train.images for inputs</li>
            <li>mnist.train.labels for outputs</li>
        </ul>
    </li>
    <li>Validation (mnist.validation): The same as training, but now the data is used to generate model properties (classification error, for example) and from this, tune parameters like the optimal number of hidden units or determine a stopping point for the back-propagation algorithm.
        <ul>
            <li>5,000 data points</li>
            <li>mnist.validation.images for inputs</li>
            <li>mnist.validation.labels for outputs</li>
        </ul>        
    </li>
    <li>Test (mnist.test): the model does not have access to this information prior to the test phase. It is used to evaluate the performance and accuracy of the model against "real life situations". No further optimization takes place beyond this point.
        <ul>
            <li>10,000 data points</li>
            <li>mnist.test.images for inputs</li>
            <li>mnist.test.labels for outputs</li>
        </ul>         
    </li>
</ul>
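
<p>The <code>mnist.train</code>, <code>mnist.validation</code> and <code>mnist.test</code> objects described above follow the interface of the legacy loader, whereas <code>tf.keras.datasets.mnist</code> only returns plain arrays. The sketch below shows one possible way to rebuild that interface with NumPy. The class <code>SimpleDataSet</code> and the function <code>split_train_validation</code> are hypothetical names, the choice to hold out the first examples for validation is an assumption, and the demo uses small synthetic arrays instead of the real data.</p>

```python
import numpy as np

class SimpleDataSet:
    """Hypothetical stand-in for the legacy `mnist.train` / `mnist.test` objects."""

    def __init__(self, images, labels, seed=0):
        self.images = images
        self.labels = labels
        self._rng = np.random.default_rng(seed)
        self._order = self._rng.permutation(len(images))
        self._cursor = 0

    def next_batch(self, batch_size):
        """Return the next (images, labels) mini-batch.

        When fewer than batch_size examples remain, the data is reshuffled
        and a new epoch starts (the leftover partial batch is skipped).
        """
        if self._cursor + batch_size > len(self.images):
            self._order = self._rng.permutation(len(self.images))
            self._cursor = 0
        idx = self._order[self._cursor:self._cursor + batch_size]
        self._cursor += batch_size
        return self.images[idx], self.labels[idx]

def split_train_validation(images, labels, validation_size):
    """Hold out the first `validation_size` examples for validation (an assumption)."""
    validation = SimpleDataSet(images[:validation_size], labels[:validation_size])
    train = SimpleDataSet(images[validation_size:], labels[validation_size:])
    return train, validation

# Synthetic stand-in data: 60 flattened "images" instead of the real 60,000.
images = np.random.default_rng(1).random((60, 784)).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[np.arange(60) % 10]  # one-hot rows

train, validation = split_train_validation(images, labels, validation_size=5)
batch_images, batch_labels = train.next_batch(16)
print(len(train.images), len(validation.images), batch_images.shape, batch_labels.shape)
```

<p>With the real data, the same split with <code>validation_size=5000</code> would give the 55,000 / 5,000 division listed above.</p>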

<h3>Creating an interactive session</h3>

You have two basic options when using TensorFlow to run your code:
<ul>
    <li>[Build graphs and run session] Do all the set-up and THEN execute a session to evaluate tensors and run operations (ops).</li>
    <li>[Interactive session] Write your code and run it on the fly.</li>
</ul>

For this first part, we will use the interactive session, which is more suitable for environments like Jupyter notebooks.

In [None]:
sess = tf.InteractiveSession()

<h3>Creating placeholders</h3>

<p>It's a best practice to create placeholders before variable assignments when using TensorFlow. Here we'll create placeholders for inputs ("Xs") and outputs ("Ys").</p>

<b>Placeholder 'X':</b> represents the "space" allocated for the input, i.e. the images.
<ul>
    <li>Each input has 784 pixels arranged in a matrix 28 pixels wide by 28 pixels high.</li>
    <li>The 'shape' argument defines the tensor size by its dimensions.</li>
    <li>1st dimension = None. Indicates that the batch can be of any size.</li>
    <li>2nd dimension = 784. Indicates the number of pixels in a single flattened MNIST image.</li>
</ul>

<b>Placeholder 'Y':</b> represents the final output or the labels.  
<ul>
    <li>10 possible classes (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).</li>
    <li>The 'shape' argument defines the tensor size by its dimensions.</li>
    <li>1st dimension = None. Indicates that the batch can be of any size.</li>
    <li>2nd dimension = 10. Indicates the number of targets/outcomes.</li> 
</ul>

<p><b>dtype for both placeholders:</b> if you are not sure, use tf.float32. The limitation here is that the softmax function presented later only accepts float32 or float64 dtypes. For more dtypes, check TensorFlow's documentation <a href="https://www.tensorflow.org/api_docs/python/tf/dtypes/DType">here</a>.</p>

In [None]:
x  = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

<h3>Convolutional neural networks (CNNs)</h3>

<p>A convolutional neural network (CNN) is a type of feed-forward neural network consisting of multiple layers of neurons that have learnable weights and biases. Each neuron in a layer receives some input, processes it, and optionally follows it with a non-linearity. The network has several kinds of layers, such as convolution, max pooling, dropout and fully connected layers. In each layer, small groups of neurons process portions of the input image. The outputs of these groups are then tiled so that their input regions overlap, to obtain a higher-resolution representation of the original image; this is repeated for every such layer. The important point here is that CNNs are able to break complex patterns down into a series of simpler patterns through multiple layers.</p>

<h3>CNN architecture</h3>
<p>In the first part, we learned how to use a simple CNN to classify MNIST. Now we are going to expand our knowledge using a Deep Neural Network.</p>


The architecture of our network is:
<ul> 
    <li>(Input) -> [batch_size, 28, 28, 1]  >> Apply 32 filters of [5x5]</li>
    <li>(Convolutional layer 1)  -> [batch_size, 28, 28, 32]</li>
    <li>(ReLU 1)  -> [?, 28, 28, 32]</li>
    <li>(Max pooling 1) -> [?, 14, 14, 32]</li>
    <li>(Convolutional layer 2)  -> [?, 14, 14, 64]</li>
    <li>(ReLU 2)  -> [?, 14, 14, 64]</li>
    <li>(Max pooling 2)  -> [?, 7, 7, 64]</li>
    <li>[fully connected layer 3] -> [1x1024]</li>
    <li>[ReLU 3]  -> [1x1024]</li>
    <li>[Drop out]  -> [1x1024]</li>
    <li>[fully connected layer 4] -> [1x10]</li>
</ul>
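
<p>The shapes in the list above can be checked with a little arithmetic. The sketch below assumes, as in the cells that follow, stride-1 convolutions and 2x2 max pooling with stride 2, both with 'SAME' padding, for which the spatial output size is ceil(input size / stride). The helper name <code>same_padding_output</code> is hypothetical.</p>

```python
import math

def same_padding_output(size, stride):
    """Spatial output size for 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

h, w, c = 28, 28, 1
shapes = [("input", (h, w, c))]

for layer, filters in (("conv1", 32), ("conv2", 64)):
    # Convolution with stride 1 and 'SAME' padding keeps height and width.
    h, w = same_padding_output(h, 1), same_padding_output(w, 1)
    c = filters
    shapes.append((layer, (h, w, c)))
    # 2x2 max pooling with stride 2 halves height and width.
    h, w = same_padding_output(h, 2), same_padding_output(w, 2)
    shapes.append((layer + "_pool", (h, w, c)))

flattened = h * w * c  # number of inputs to the first fully connected layer
for name, shape in shapes:
    print(name, shape)
print("flattened", flattened)
```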

The next cells will explore this new architecture.

<h3>Initial parameters</h3>

Create general parameters for the model

In [None]:
width = 28 # width of the image in pixels 
height = 28 # height of the image in pixels
flat = width * height # number of pixels in one image 
class_output = 10 # number of possible classifications for the problem

<h3>Input and output</h3>

Create placeholders for inputs and outputs

In [None]:
x  = tf.placeholder(tf.float32, shape=[None, flat])
y_ = tf.placeholder(tf.float32, shape=[None, class_output])

<h4>Converting images of the data set to tensors</h4>

<p>The input image is 28 pixels by 28 pixels with 1 channel (grayscale). The first dimension of the reshaped tensor is the <b>batch size</b>, which can be of any size (so we set it to -1). The second and third dimensions are height and width, and the last one is the number of image channels.</p>

In [None]:
x_image = tf.reshape(x, [-1,28,28,1])  
x_image
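
<p>To see what the reshape does to the pixel order, here is a small NumPy sketch on fake data. It assumes row-major ordering, which is what both <code>np.reshape</code> and <code>tf.reshape</code> use by default; the array names are illustrative only.</p>

```python
import numpy as np

# Three fake flattened images; each "pixel" value encodes its own position.
batch = np.arange(3 * 784, dtype=np.float32).reshape(3, 784)
images = batch.reshape(-1, 28, 28, 1)  # -1 lets NumPy infer the batch dimension

print(images.shape)
# Row-major order: pixel k of the flat vector lands at row k // 28, column k % 28.
print(images[0, 1, 2, 0] == batch[0, 1 * 28 + 2])
```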

<h3>Convolutional Layer 1</h3>

<h4>Defining kernel weight and bias</h4>
<p>We define a kernel here. The size of the filter/kernel is 5x5; the number of input channels is 1 (grayscale); and we need 32 different feature maps (here, 32 feature maps means that 32 different filters are applied to each image, so the output of the convolution layer is 28x28x32). In this step, we create a filter/kernel tensor of shape <code>[filter_height, filter_width, in_channels, out_channels]</code>.</p>

In [None]:
W_conv1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
b_conv1 = tf.Variable(tf.constant(0.1, shape=[32])) # need 32 biases for 32 outputs

<img src="https://ibm.box.com/shared/static/vn26neef1nnv2oxn5cb3uueowcawhkgb.png" style="width:800px; height:400px;" alt="HTML5 Icon" >

<h4>Convolve with weight tensor and add biases.</h4>

<p>To create convolutional layers, we use <b>tf.nn.conv2d</b>. It computes a 2-D convolution given 4-D input and filter tensors.</p>

Inputs:
<ul>
    <li>Tensor of shape [batch, in_height, in_width, in_channels]. x of shape [batch_size,28 ,28, 1].</li>
    <li>A filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels]. W is of size [5, 5, 1, 32].</li>
    <li>Strides of [1, 1, 1, 1]. The convolutional layer slides the "kernel window" across the input tensor. Because the input tensor has 4 dimensions, [batch, height, width, channels], the convolution operates on a 2-D window over the height and width dimensions. <b>strides</b> determines how far the window shifts in each of the dimensions. The first and last dimensions relate to batch and channels, so we set their stride to 1. For the second and third dimensions we could set other values, e.g. [1, 2, 2, 1].</li>
</ul>
    
Process:
<ul>
    <li>Flattens the filter to a 2-D matrix with shape [5*5*1, 32].</li>
    <li>Extracts image patches from the input tensor to form a <i>virtual</i> tensor of shape <code>[batch, 28, 28, 5*5*1]</code>.</li>
    <li>For each patch, right-multiplies the filter matrix and the image patch vector.</li>
</ul>

Output:
<ul>
    <li>A <code>Tensor</code> (a 2-D convolution) of size (?, 28, 28, 32).</li>
    <li>Notice: the output of the first convolution layer is 32 [28x28] images. Here 32 is considered as volume/depth of the output image.</li>
</ul>
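
<p>The three process steps above can be sketched in NumPy for illustration. This is not TensorFlow's actual implementation; it is a minimal version that assumes stride 1, 'SAME' zero padding and an odd kernel size, and the function name <code>conv2d_same</code> is hypothetical.</p>

```python
import numpy as np

def conv2d_same(images, kernel):
    """Stride-1 'SAME' convolution via patch extraction and one matrix multiply.

    images: [batch, height, width, in_channels]
    kernel: [k_height, k_width, in_channels, out_channels], odd k_height and k_width
    """
    batch, height, width, in_ch = images.shape
    k_h, k_w, _, out_ch = kernel.shape
    pad_h, pad_w = k_h // 2, k_w // 2
    # Zero-pad height and width so the output keeps the input's spatial size.
    padded = np.pad(images, ((0, 0), (pad_h, pad_h), (pad_w, pad_w), (0, 0)))

    # Virtual tensor of patches: [batch, height, width, k_h * k_w * in_ch].
    patches = np.empty((batch, height, width, k_h * k_w * in_ch), dtype=images.dtype)
    for i in range(height):
        for j in range(width):
            patches[:, i, j, :] = padded[:, i:i + k_h, j:j + k_w, :].reshape(batch, -1)

    # Filter flattened to a 2-D matrix: [k_h * k_w * in_ch, out_ch].
    kernel_matrix = kernel.reshape(-1, out_ch)
    # Each patch vector is right-multiplied by the filter matrix.
    return patches @ kernel_matrix

rng = np.random.default_rng(0)
x_demo = rng.random((2, 28, 28, 1)).astype(np.float32)
w_demo = rng.normal(scale=0.1, size=(5, 5, 1, 32)).astype(np.float32)
out = conv2d_same(x_demo, w_demo)
print(out.shape)  # (2, 28, 28, 32)
```

<p>Note that, like <code>tf.nn.conv2d</code>, this sketch slides the kernel without flipping it, so strictly speaking it computes a cross-correlation.</p>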

In [None]:
convolve1= tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1

<img src="https://ibm.box.com/shared/static/iizf4ui4b2hh9wn86pplqxu27ykpqci9.png" style="width:800px;height:400px;" alt="HTML5 Icon">


<h4>Apply the ReLU activation Function</h4>

<p>In this step, we go through all outputs of the convolution layer, <b>convolve1</b>, and wherever a negative number occurs, we swap it out for a 0. This is called the ReLU activation function.</p>
<p>The ReLU activation function is defined as $f(x) = \max(0, x)$.</p>

In [None]:
h_conv1 = tf.nn.relu(convolve1)

<h4>Apply the max pooling</h4>

<p><b>Max pooling</b> is a form of non-linear down-sampling. It partitions the input image into a set of rectangles and then finds the maximum value for each region.</p>

<p>Let's use the <b>tf.nn.max_pool</b> function to perform max pooling. 
<b>Kernel size:</b> 2x2 (each 2x2 window results in one output pixel).</p>
    
<p><b>Strides:</b> dictate the sliding behaviour of the kernel. In this case it moves 2 pixels every time, so the windows do not overlap. The input is a matrix of size 28x28x32, and the output is a matrix of size 14x14x32.</p>

<img src="https://ibm.box.com/shared/static/kmaja90mn3aud9mro9cn8pbbg1h5pejy.png" alt="HTML5 Icon" style="width:800px; height:400px;"> 

In [None]:
conv1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') #max_pool_2x2
conv1
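
<p>For intuition, the ReLU and max pooling steps can be reproduced on a tiny 4x4 example with NumPy. This is a sketch only: the helper names <code>relu</code> and <code>max_pool_2x2</code> are hypothetical, and the pooling helper assumes even height and width so that no padding is needed.</p>

```python
import numpy as np

def relu(values):
    """Element-wise max(0, x)."""
    return np.maximum(values, 0.0)

def max_pool_2x2(feature_maps):
    """2x2 max pooling with stride 2 on [batch, height, width, channels]."""
    batch, height, width, channels = feature_maps.shape
    # Split height and width into (block index, offset inside the 2x2 block).
    blocks = feature_maps.reshape(batch, height // 2, 2, width // 2, 2, channels)
    return blocks.max(axis=(2, 4))

demo = np.array([[-1.0, 2.0, 0.5, -3.0],
                 [4.0, -2.0, 1.5, 0.0],
                 [0.0, 0.0, -1.0, -2.0],
                 [3.0, 1.0, -4.0, -0.5]]).reshape(1, 4, 4, 1)

pooled = max_pool_2x2(relu(demo))
print(pooled[0, :, :, 0])  # 2x2 result: [[4.0, 1.5], [3.0, 0.0]]
```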

First layer completed

<h3>Convolutional Layer 2</h3>
<h4>Weights and Biases of kernels</h4>

We apply the convolution again in this layer. Let's look at the second layer kernel:  
<ul>
    <li>Filter/kernel: 5x5 (25 pixels).</li>
    <li>Input channels: 32 (from the 1st Conv layer, we had 32 feature maps).</li>
    <li>64 output feature maps.</li>
</ul>
    
<p><b>Notice:</b> here, the input image is [14x14x32] and the filter is [5x5x32]. We use 64 filters of size [5x5x32], so the output of the convolutional layer is 64 convolved images, [14x14x64].</p>

<p><b>Notice:</b> the convolution result of applying a filter of size [5x5x32] on image of size [14x14x32] is an image of size [14x14x1], that is, the convolution is functioning on volume.</p>

In [None]:
W_conv2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1))
b_conv2 = tf.Variable(tf.constant(0.1, shape=[64])) #need 64 biases for 64 outputs

<h4>Convolve image with weight tensor and add biases.</h4>

In [None]:
convolve2= tf.nn.conv2d(conv1, W_conv2, strides=[1, 1, 1, 1], padding='SAME')+ b_conv2

<h4>Apply the ReLU activation Function</h4>

In [None]:
h_conv2 = tf.nn.relu(convolve2)

<h4>Apply the max pooling</h4>

In [None]:
conv2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') #max_pool_2x2
conv2

Second layer completed. So, what is the output of the second layer?
<ul>
    <li>It is 64 matrices of size [7x7].</li>
</ul>

<h3>Fully Connected Layer</h3>

<p>You need a fully connected layer to use the Softmax and create the probabilities in the end. Fully connected layers take the high-level filtered images from previous layer, that is all 64 matrices, and convert them to a flat array.</p>

<p>So, each [7x7] matrix will be converted to a matrix of [49x1], and then all 64 matrices will be concatenated, which makes an array of size [3136x1]. We will connect it to another layer of size [1024x1]. So, the weight matrix between these 2 layers will be [3136x1024].</p>
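
<p>The flattening and the size of the weight matrix can be illustrated with NumPy on random stand-in data. This is only a sketch of the shapes involved; the array names are illustrative and the values carry no meaning.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
pooled = rng.random((2, 7, 7, 64)).astype(np.float32)  # stand-in for the second pooling output
flat_demo = pooled.reshape(-1, 7 * 7 * 64)             # one row of 3136 values per image

weights = rng.normal(scale=0.1, size=(7 * 7 * 64, 1024)).astype(np.float32)
biases = np.full(1024, 0.1, dtype=np.float32)
hidden = np.maximum(flat_demo @ weights + biases, 0.0)  # fully connected layer followed by ReLU

print(flat_demo.shape, weights.shape, hidden.shape)
print("trainable parameters in this layer:", weights.size + biases.size)
```

<p>With 3136 x 1024 weights plus 1024 biases, this single layer holds 3,212,288 trainable parameters.</p>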


<img src="https://ibm.box.com/shared/static/pr9mnirmlrzm2bitf1d4jj389hyvv7ey.png" alt="HTML5 Icon" style="width:800px; height:400px;">

<h4>Flattening Second Layer</h4>

In [None]:
layer2_matrix = tf.reshape(conv2, [-1, 7*7*64])

<h4>Weights and Biases between layer 2 and 3</h4>

The input size is the size of each feature map from the last layer (7x7) multiplied by the number of feature maps (64); the layer has 1024 outputs, which feed the readout (softmax) layer.

In [None]:
W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1, shape=[1024])) # need 1024 biases for 1024 outputs

<h4>Matrix Multiplication (applying weights and biases)</h4>

In [None]:
fcl=tf.matmul(layer2_matrix, W_fc1) + b_fc1

<h4>Apply the ReLU activation Function</h4>

In [None]:
h_fc1 = tf.nn.relu(fcl)
h_fc1

Third layer completed

<h4>Dropout Layer, Optional phase for reducing overfitting</h4>

<p>It is a phase where the network "forgets" some features. At each training step on a mini-batch, some units are switched off at random so that they do not interact with the rest of the network. That is, their weights are not updated, nor do they affect the learning of the other network nodes. This can be very useful for very large neural networks to prevent overfitting.</p>

In [None]:
keep_prob = tf.placeholder(tf.float32)
layer_drop = tf.nn.dropout(h_fc1, keep_prob)
layer_drop
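
<p>The behaviour of dropout can be sketched in NumPy. The version below keeps each unit with probability <code>keep_prob</code> and scales the surviving units by 1 / keep_prob so that the expected activation stays unchanged, which is also how <code>tf.nn.dropout</code> behaves. The function name <code>dropout</code> and the demo values are illustrative assumptions.</p>

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Keep each unit with probability keep_prob and scale survivors by 1 / keep_prob."""
    mask = rng.random(activations.shape) < keep_prob
    return np.where(mask, activations / keep_prob, 0.0)

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))

dropped = dropout(activations, keep_prob=0.5, rng=rng)
print("fraction kept:", (dropped != 0).mean())
print("mean activation:", dropped.mean())  # stays close to 1.0 because of the rescaling

# With keep_prob=1.0 (as used for evaluation) nothing is dropped.
unchanged = dropout(activations, keep_prob=1.0, rng=rng)
```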

<h4>Readout Layer (Softmax Layer)</h4>

Type: Softmax, Fully Connected Layer.

<h4>Weights and Biases</h4>

<p>In the last layer, the CNN takes the high-level filtered images and translates them into votes using softmax.
Input channels: 1024 (neurons from the 3rd layer); 10 output features.</p>

In [None]:
W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1)) #1024 neurons
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10])) # 10 possibilities for digits [0,1,2,3,4,5,6,7,8,9]

<h4>Matrix Multiplication (applying weights and biases)</h4>

In [None]:
fc=tf.matmul(layer_drop, W_fc2) + b_fc2

<h4>Apply the Softmax activation Function</h4>
<b>softmax</b> allows us to interpret the outputs of <b>fc</b> as probabilities. So, <b>y_CNN</b> is a tensor of probabilities.

In [None]:
y_CNN= tf.nn.softmax(fc)
y_CNN
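
<p>As a minimal NumPy sketch of what softmax computes: each row of scores is exponentiated and normalized so that it sums to 1. Subtracting the row maximum first is a common trick to avoid overflow and does not change the result. The function name and the sample scores are illustrative.</p>

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 1000.0, 1000.0]])  # large scores would overflow a naive exp
probabilities = softmax(logits)
print(probabilities.sum(axis=1))     # each row sums to 1
print(probabilities.argmax(axis=1))  # index of the most likely class per row
```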

<hr>

<h2 id="summary_CNN">Summary of the Deep Convolutional Neural Network</h2>
Now it is time to recall the structure of our network.

<ol start="0">
    <li>Input - MNIST dataset</li>
    <li>Convolutional and Max-Pooling</li>
    <li>Convolutional and Max-Pooling</li>
    <li>Fully Connected Layer</li>
    <li>Processing - Dropout</li>
    <li>Readout layer - Fully Connected</li>
    <li>Outputs - Classified digits</li>
</ol>


<hr>

<h2 id="train_model">Define functions and train the model</h2>

<h4>Define the loss function</h4>

<p>We need to compare our output, the <b>y_CNN</b> tensor, with the ground truth for every item in the mini-batch. We can use <b>cross entropy</b> to see how badly our CNN is performing, that is, to measure the error at the softmax layer.</p>

<p>The following code shows a toy sample of cross-entropy for a mini-batch of size 2 whose items have been classified. You can run it (first change the cell type to <b>code</b> in the toolbar) to see how cross entropy changes.</p>

import numpy as np
layer4_test = np.array([[0.9, 0.05, 0.05], [0.9, 0.05, 0.05]])  # each row sums to 1
y_test = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
np.mean(-np.sum(y_test * np.log(layer4_test), axis=1))

<p><b>reduce_sum</b> computes the sum of the elements of <b>y_ * tf.log(y_CNN)</b> across the second dimension of the tensor, and <b>reduce_mean</b> computes the mean of all elements in the resulting tensor.</p>

In [None]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_CNN), reduction_indices=[1]))
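
<p>One caveat with this formula: if the softmax output for the true class underflows to exactly 0, the logarithm becomes -inf and the loss is no longer finite. A common remedy, sketched below in NumPy, is to clip the probabilities before taking the log. The clipping floor <code>eps=1e-10</code> is an assumed value and the function name is hypothetical.</p>

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-10):
    """Mean over the batch of -sum(y_true * log(y_pred)); eps is an assumed clipping floor."""
    clipped = np.clip(y_pred, eps, 1.0)
    return np.mean(-np.sum(y_true * np.log(clipped), axis=1))

y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
wrong = np.array([[0.0, 0.5, 0.5],   # assigns zero probability to the true class
                  [0.1, 0.8, 0.1]])

loss_confident = cross_entropy(y_true, confident)
loss_wrong = cross_entropy(y_true, wrong)
print(loss_confident)
print(loss_wrong)  # large but finite thanks to the clipping
```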

<h4>Define the optimizer</h4>

<p>We want to minimize the error of our network, which is measured by the cross_entropy metric. To do so, we have to compute the gradients of the loss with respect to the variables and apply them to the variables. This is done by an optimizer such as GradientDescent or Adagrad; the cell below uses Adam.</p>

In [None]:
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
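
<p>To give a feeling for what the Adam optimizer does at each step, here is a NumPy sketch of the commonly stated Adam update rule applied to a toy one-parameter loss. The toy loss, the larger learning rate of 0.1 and the function name <code>adam_minimize</code> are assumptions made for illustration; TensorFlow's implementation may differ in small details, such as where epsilon is applied.</p>

```python
import numpy as np

def adam_minimize(grad_fn, w0, learning_rate, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam updates on a parameter array and return the final value."""
    w = np.asarray(w0, dtype=np.float64).copy()
    m = np.zeros_like(w)  # running mean of gradients
    v = np.zeros_like(w)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialisation
        v_hat = v / (1 - beta2 ** t)
        w = w - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy loss (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=[0.0], learning_rate=0.1, steps=1000)
print(w_final)
```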

<h4>Define prediction</h4>
Do you want to know how many of the cases in a mini-batch have been classified correctly? Let's count them.

In [None]:
correct_prediction = tf.equal(tf.argmax(y_CNN,1), tf.argmax(y_,1))

<h4>Define accuracy</h4>
It makes more sense to report accuracy using average of correct cases.

In [None]:
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
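
<p>The two cells above can be mirrored in NumPy on a hand-made mini-batch of four items, which makes the argmax comparison and the averaging easy to follow. The probabilities and labels below are made-up values.</p>

```python
import numpy as np

probabilities = np.array([[0.1, 0.7, 0.2],
                          [0.8, 0.1, 0.1],
                          [0.3, 0.3, 0.4],
                          [0.2, 0.5, 0.3]])
one_hot_labels = np.array([[0, 1, 0],
                           [1, 0, 0],
                           [0, 1, 0],
                           [0, 1, 0]], dtype=np.float32)

# Compare the predicted class with the true class for every item.
correct = np.argmax(probabilities, axis=1) == np.argmax(one_hot_labels, axis=1)
# Casting booleans to floats and averaging gives the fraction of correct cases.
accuracy_value = correct.astype(np.float32).mean()
print(correct)         # [ True  True False  True]
print(accuracy_value)  # 0.75
```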

<h4>Run session, train</h4>

In [None]:
sess.run(tf.global_variables_initializer())

<div class="alert alert-warning alertsuccess" style="margin-top: 20px">
<font size="3"><strong>Warning! Each step in the following loop takes around 1.2 seconds, so the 5,000 steps take roughly 100 minutes in total on a CPU. So, you can run this cell if you REALLY have time to wait, or if you are running it using PowerAI.</strong></font>
<br>
<br>

What is PowerAI? 

<p>Running deep learning programs usually needs a high performance platform. PowerAI speeds up deep learning and AI. Built on IBM's Power Systems, PowerAI is a scalable software platform that accelerates deep learning and AI with blazing performance for individual users or enterprises. The PowerAI platform supports popular machine learning libraries and dependencies including TensorFlow, Caffe, PyTorch, and Theano. You can download a <a href="https://cocl.us/ML0120EN_PAI">free version of PowerAI</a>.</p>
</div>

In [None]:

for i in range(5000):
    start = time.time()
    batch = mnist.train.next_batch(512)
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    end = time.time()
    if i%100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
        test_accuracy = accuracy.eval(feed_dict={x:mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
        print("step", str(i), ", training accuracy", "{:.3f}".format(train_accuracy),"test accuracy", "{:.3f}".format(test_accuracy),", B_time=" , "{:.3f}".format(end - start) )

<i>PS. If you have problems running this notebook, please shutdown all your Jupyter running notebooks, clear all cells outputs and run each cell only after the completion of the previous cell.</i>

<hr>

<h2 id="evaluate_model">Evaluate the model</h2>

Print the evaluation to the user

In [None]:
print("test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

<h3>Visualization</h3>

Do you want to look at all the filters?

In [None]:
kernels = sess.run(tf.reshape(tf.transpose(W_conv1, perm=[2, 3, 0,1]),[32,-1]))

In [None]:
!wget --output-document utils1.py http://deeplearning.net/tutorial/code/utils.py
import utils1
from utils1 import tile_raster_images
import matplotlib.pyplot as plt
from PIL import Image
%matplotlib inline
image = Image.fromarray(tile_raster_images(kernels, img_shape=(5, 5) ,tile_shape=(4, 8), tile_spacing=(1, 1)))
### Plot image
plt.rcParams['figure.figsize'] = (18.0, 18.0)
imgplot = plt.imshow(image)
imgplot.set_cmap('gray')  

Do you want to see the output of an image passing through first convolution layer?


In [None]:
import numpy as np
plt.rcParams['figure.figsize'] = (5.0, 5.0)
sampleimage = mnist.test.images[1]
plt.imshow(np.reshape(sampleimage,[28,28]), cmap="gray")

In [None]:
ActivatedUnits = sess.run(convolve1,feed_dict={x:np.reshape(sampleimage,[1,784],order='F'),keep_prob:1.0})
filters = ActivatedUnits.shape[3]
plt.figure(1, figsize=(20,20))
n_columns = 6
n_rows = int(np.ceil(filters / n_columns)) + 1  # np.math is not available in recent NumPy versions
for i in range(filters):
    plt.subplot(n_rows, n_columns, i+1)
    plt.title('Filter ' + str(i))
    plt.imshow(ActivatedUnits[0,:,:,i], interpolation="nearest", cmap="gray")

What about second convolution layer?

In [None]:
ActivatedUnits = sess.run(convolve2,feed_dict={x:np.reshape(sampleimage,[1,784],order='F'),keep_prob:1.0})
filters = ActivatedUnits.shape[3]
plt.figure(1, figsize=(20,20))
n_columns = 8
n_rows = int(np.ceil(filters / n_columns)) + 1  # np.math is not available in recent NumPy versions
for i in range(filters):
    plt.subplot(n_rows, n_columns, i+1)
    plt.title('Filter ' + str(i))
    plt.imshow(ActivatedUnits[0,:,:,i], interpolation="nearest", cmap="gray")

In [None]:
sess.close() #finish the session

In [None]:
%%javascript
// Shutdown kernel
Jupyter.notebook.session.delete()


<h3>Thanks for completing this lesson!</h3>



<h4>Author: <a href="https://ca.linkedin.com/in/saeedaghabozorgi">Saeed Aghabozorgi</a></h4>
<p><a href="https://ca.linkedin.com/in/saeedaghabozorgi">Saeed Aghabozorgi</a>, PhD is a Data Scientist in IBM with a track record of developing enterprise level applications that substantially increases clients’ ability to turn data into actionable knowledge. He is a researcher in data mining field and expert in developing advanced analytic methods like machine learning and statistical modelling on large datasets.</p>
</article>

<hr>

<p>Copyright &copy; 2018 <a href="https://cocl.us/DX0108EN_CC">Cognitive Class</a>. This notebook and its source code are released under the terms of the <a href="https://bigdatauniversity.com/mit-license/">MIT License</a>.</p>