<a href="https://colab.research.google.com/github/Daniel2291/BCNN/blob/main/BCNN_Development.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Introduction**

In this lab, you will learn to use Google Colab hardware resources (CPU, GPU and TPU) for Python-based neural networks created with the TensorFlow deep learning framework.

# **Selecting hardware device, dataset and neural network model**

The variables "hw_device", "input_dataset" and "nn_model" are used in this notebook for internal decision-making, which simplifies things for students of this lab. They are not standard configuration parameters of Google Colab or TensorFlow. **Hence, make sure that the runtime in "Runtime > Change runtime type" is the same as the value of the "hw_device" variable before running the notebook. You must do this manually.**

Also, when generating results for different hardware devices, datasets or neural network models by changing the variables "hw_device", "input_dataset" or "nn_model", **always use "Runtime > Restart and run all". Otherwise, training will resume from the point where you previously stopped, which is not correct because training must start fresh after every change.**

If you get the error: "**Failed to assign a backend. No backend with GPU (or TPU) available. Would you like to use a runtime with no accelerator?**" while using Runtime > Change runtime type<br>
Reason: No GPU/TPU is free at the moment.<br>
Solution: Wait and try again after some time.

In [None]:
# hw_device, possible options: "CPU", "GPU", "TPU"
hw_device = "GPU"

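# input_dataset, possible options: "MNIST", "FMNIST", "CIFAR10"
# Note: the preprocessing cell in this notebook only handles "MNIST"; any other value raises an exception.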
input_dataset = "MNIST"

# nn_model, possible options: "FC", "Lenet5", "VGG16"
nn_model = "Lenet5"

# **Setting parameters related to neural network training**

We now set the variables which control the neural network training.

In [None]:
# Batch size (No. of images processed by the neural net at a time)
batch_size = 256

# Training epochs (No. of training iterations over the entire training set)
epochs = 10

# **About TensorFlow and Keras**

https://www.tensorflow.org/<br>
TensorFlow is an open-source deep learning framework developed by Google. It provides many optimized building blocks, such as layers and optimizers, for building neural network models. However, programming against it directly is not very easy. <br>

https://keras.io/<br>
Keras is a high-level API built on top of different backend frameworks, e.g. TensorFlow and Theano. It is more user-friendly than using TensorFlow directly and covers different deep learning aspects, e.g. model definition and training.<br>

https://stackoverflow.com/questions/55178230/what-is-the-difference-between-keras-and-tf-keras<br>
tf.keras configures the Keras API to use the TensorFlow backend and thus allows the use of TensorFlow-specific features, e.g. tf.data.Dataset as input objects. From TensorFlow 2.0 onwards, tf.keras is the default, and it is highly recommended to start working with tf.keras.

# **Importing necessary libraries**
1.   tensorflow:
<br> Explained above.
2.   pandas:
<br> https://pandas.pydata.org/<br>
It is a data analysis and manipulation library.<br>
We will use its dataframe feature (2D tabular data) later.
3. numpy:
<br>https://numpy.org/<br>
It is a fast C-based library that offers many useful optimized functions for scientific computing.<br>
We use it to preprocess our datasets.
4. matplotlib:<br>
https://matplotlib.org/<br>
Matplotlib is a library for creating static, animated, and interactive visualizations in Python. <br> We use its pyplot module to visualize the input images before training and the predictions after the training.<br>
https://stackoverflow.com/questions/43027980/purpose-of-matplotlib-inline<br>
With %matplotlib inline, output of plotting commands is displayed in the Jupyter notebook directly below the code cell that produced it. The resulting plots will then also be stored in the notebook document.

**Note:** The parts of the code labelled as "reproducibility settings" need not always be a part of your deep learning code. <br>Interested students can refer to the following links for more insights:<br>
https://suneeta-mall.github.io/2019/12/22/Reproducible-ml-tensorflow.html <br>
https://towardsdatascience.com/reproducible-models-with-weights-biases-415776c4cbb7 <br>
https://stackoverflow.com/questions/58433555/why-are-my-results-still-not-reproducible <br>
https://stackoverflow.com/questions/60041860/reproduce-same-results-on-each-run-keras-google-colab <br>
https://stackoverflow.com/questions/59075244/if-keras-results-are-not-reproducible-whats-the-best-practice-for-comparing-mo/59075958




In [None]:
!pip uninstall -y tensorflow keras

!pip install --upgrade tensorflow==2.14.0 keras==2.14.0
!pip install --upgrade larq

Found existing installation: tensorflow 2.14.0
Uninstalling tensorflow-2.14.0:
  Successfully uninstalled tensorflow-2.14.0
Found existing installation: keras 2.14.0
Uninstalling keras-2.14.0:
  Successfully uninstalled keras-2.14.0
Collecting tensorflow==2.14.0
  Using cached tensorflow-2.14.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Collecting keras==2.14.0
  Using cached keras-2.14.0-py3-none-any.whl.metadata (2.4 kB)
Using cached tensorflow-2.14.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (489.9 MB)
Using cached keras-2.14.0-py3-none-any.whl (1.7 MB)
Installing collected packages: keras, tensorflow
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-decision-forests 1.11.0 requires tensorflow==2.18.0, but you have tensorflow 2.14.0 which is incompatible.
tensorflow-text 2.18.1 requires tensorf

In [None]:
#===== Reproducibility Settings (Before TensorFlow Import) =====#
import os
#*IMPORTANT*: This line must run *before* importing tensorflow
os.environ['PYTHONHASHSEED']=str(1)
# Set GPU execution to be deterministic
# os.environ['TF_CUDNN_DETERMINISTIC']='true'
# os.environ['TF_DETERMINISTIC_OPS']='true'
#===============================================================#


# Magic function to switch between tensorflow versions 1.x and 2.x
%tensorflow_version 2.x
# Import the libraries discussed in the text above.
import tensorflow as tf
import pandas as pd
import numpy as np

import larq as lq
%matplotlib inline
import matplotlib.pyplot as plt
# Check the tensorflow version: should be 2.x
print("Tensorflow version " + tf.__version__)


#===== Reproducibility Settings (After TensorFlow Import) =====#
# Import initializers to set initial weights for neural network layers.
from tensorflow.keras import initializers
# Limit tensorflow multithreading to a single thread.
# tf.config.threading.set_inter_op_parallelism_threads(1)
# tf.config.threading.set_intra_op_parallelism_threads(1)
import random
import tensorflow.keras.backend as K

def reset_random_seeds():
   #os.environ['PYTHONHASHSEED']=str(1)
   tf.random.set_seed(1)
   np.random.seed(1)
   random.seed(1)

def reset_graph(reset_graph_with_backend=None):
    if reset_graph_with_backend is not None:
        K = reset_graph_with_backend
        K.clear_session()
        tf.compat.v1.reset_default_graph()
        print("KERAS AND TENSORFLOW GRAPHS RESET")

reset_random_seeds()
reset_graph(K)
# K.set_floatx('float32')
#==============================================================#

Colab only includes TensorFlow 2.x; %tensorflow_version has no effect.
Tensorflow version 2.14.0
KERAS AND TENSORFLOW GRAPHS RESET


# **Setting Up TensorBoard**
https://medium.com/ydata-ai/how-to-use-tensorflow-callbacks-f54f9bb6db25
<br>TensorBoard is a toolkit from TensorFlow which creates visualizations of the internal states of the neural network model.<br>The TensorBoard callback is a set of functions that record these internal states in a specified log directory during training. These logs are turned into visualizations by invoking TensorBoard. <br>If TensorBoard is invoked before training, the visualization is updated in parallel with data logging (dynamic plot). If TensorBoard is invoked after training, the visualization is a static plot.

https://sarbashis.github.io/installation/2019/How-to-configure-tensorboard-jupyter-inline/ <br>
There are several ways to launch TensorBoard.
1. Starting the TensorBoard local server and opening it in the browser. (Most common way)
2. Using "%load_ext tensorboard", which enables TensorBoard visualizations within the Jupyter notebook. (Used in this lab.)

Different parameters used in TensorBoard callback are described here: https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard.


In [None]:
## CREATING AND SETTING UP TENSORBOARD

# To use tensorboard within a Jupyter Notebook or Google’s Colab.
%load_ext tensorboard

# Import TensorBoard callback
from tensorflow.keras.callbacks import TensorBoard

# Remove old logs
!rm -rf logs

# Include date and time in the log folder title to keep logs for different runs separate.
import datetime
log_folder = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

# Setup the callback configuration
callbacks = [TensorBoard(log_dir=log_folder, histogram_freq=1,write_graph=True,
            write_images=True, update_freq='epoch', profile_batch=2, embeddings_freq=1)]

# **Setting Up Colab's CPU/GPU/TPU for usage**

https://www.tensorflow.org/guide/distributed_training

tf.distribute.Strategy is a TensorFlow API to distribute training across multiple GPUs or TPUs.
We will later create our neural network models within the scope of the strategy which allows the training to use that strategy.

CPUs: **Default distribution strategy** <br>
Obtained using tf.distribute.get_strategy()<br>
It provides no actual distribution and you can think of it as a "no-op" strategy.

GPUs: **tf.distribute.MirroredStrategy** <br>
Provides distributed training on multiple GPUs.

TPUs: **tf.distribute.TPUStrategy**<br>
Provides distributed training across multiple TPU cores. <br>
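
One practical consequence of these strategies is that Keras `model.fit` treats `batch_size` as the global batch and divides it among the replicas (devices). The sketch below shows only that arithmetic, assuming an even split; `num_replicas` is a hypothetical stand-in for the value TensorFlow reports as `strategy.num_replicas_in_sync`, and the replica counts are examples, not measurements from Colab.

```python
def per_replica_batch_size(global_batch_size, num_replicas):
    """Samples each replica processes per training step, assuming an even split."""
    if num_replicas < 1:
        raise ValueError("num_replicas must be at least 1")
    if global_batch_size % num_replicas != 0:
        # This sketch only covers the evenly divisible case.
        raise ValueError("global batch size is not divisible by the number of replicas")
    return global_batch_size // num_replicas

# Example replica counts: 1 (CPU or a single GPU) and 8 (e.g. a TPU with 8 cores).
for num_replicas in (1, 8):
    print(num_replicas, "replica(s):", per_replica_batch_size(256, num_replicas), "images each")
```

With the default (no-op) strategy or a single GPU there is one replica, so the whole batch runs on one device.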




# **Additional information for Colab GPU/TPU usage**

**GPU Types in Colab:**<br>
https://research.google.com/colaboratory/faq.html#gpu-availability <br>
The types of GPUs that are available in Colab vary over time. This is necessary for Colab to be able to provide access to these resources for free. The GPUs available in Colab often include Nvidia K80s, T4s, P4s and P100s. There is no way to choose what type of GPU you can connect to in Colab at any given time. Users who are interested in more reliable access to Colab’s fastest GPUs may try Colab Pro.<br>

**TPU API Links:**<br>
https://www.tensorflow.org/guide/distributed_training <br>
The TPUClusterResolver instance helps locate the TPUs. In Colab, you don't need to specify any arguments to it.

https://www.tensorflow.org/api_docs/python/tf/config/experimental_connect_to_cluster <br> It will connect to the given cluster and make devices on the cluster available to use.

https://www.tensorflow.org/api_docs/python/tf/tpu/experimental/initialize_tpu_system <br> Initializes the TPU devices.<br>

**TPU Set Up tutorials:**<br>
https://colab.research.google.com/notebooks/tpu.ipynb <br>
https://www.tensorflow.org/guide/tpu


In [None]:
if hw_device == "TPU":
  # Locate the TPUs.
  try:
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU cluster detection
    print('Running on TPU ', resolver.cluster_spec().as_dict()['worker'])
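  except ValueError as err:
    # Raise a regular exception type instead of BaseException and keep the original cause.
    raise RuntimeError('ERROR: Not connected to a TPU runtime!') from err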

  # Connect to the given TPU cluster and make TPU devices available for use.
  tf.config.experimental_connect_to_cluster(resolver)

  # Initialize the TPU devices.
  tf.tpu.experimental.initialize_tpu_system(resolver)

  # Create the strategy.
  strategy = tf.distribute.TPUStrategy(resolver)

elif hw_device == "GPU":
  # Directly create the strategy.
  strategy = tf.distribute.MirroredStrategy()

elif hw_device == "CPU":
  # Directly create the strategy.
  strategy = tf.distribute.get_strategy()

else:
  raise Exception("Not a valid hardware device.")

# **Getting details of Colab hardware devices being used**

In [None]:
# # Function to print the details of hardware devices being used
# def detailed_info_hw_devices():
#   print("-------------------------------------------")
#   # print("CPU Info")
#   print("-------------------------------------------")
#   # !cat /proc/cpuinfo
#   print("===========================================")
#   if hw_device == "GPU":
#     print("GPU Info")
#     print("-------------------------------------------")
#     !nvidia-smi
#     print("===========================================")
#   if hw_device == "TPU":
#     print("TPU Info")
#     print("-------------------------------------------")
#     print(tf.config.list_logical_devices('TPU'))
#     print("===========================================")

# # Print the details of hardware devices being used
# detailed_info_hw_devices()

Another way of getting the hardware info is shown below. Running this cell is not required; it is included only for additional information.

In [None]:
## Get the details of all devices being used.
# from tensorflow.python.client import device_lib
# print(device_lib.list_local_devices())

## Extract details of the GPUs
# def check_gpus():
#     gpus = [x.physical_device_desc.split(",")[1] for x in device_lib.list_local_devices() if x.device_type == 'GPU']
#     if len(gpus) == 0:
#       print('No GPU devices found')
#     else:
#       print("GPUs:")
#       print(gpus)
#     print("-------------------------------------------")

# **Preprocessing the datasets**
TensorFlow provides built-in support for some datasets that can be loaded directly using the load_data API.<br>
For example, https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist/load_data <br>
It returns a tuple of NumPy arrays for the training and test sets: (x_train, y_train), (x_test, y_test).<br>
The data in both of these sets is of uint8 datatype.
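
Because the raw pixels are uint8 values in [0, 255], the preprocessing cell rescales them to [-1, 1] and then binarizes them to -1 or +1. The following is a minimal NumPy sketch of that mapping on a hand-made 2x3 "image"; the pixel values are illustrative, and `np.where` is used here in place of the `tf.sign` plus zero-replacement steps of the notebook.

```python
import numpy as np

# Hypothetical uint8 pixel values, as returned by load_data.
pixels = np.array([[0, 100, 127],
                   [128, 200, 255]], dtype=np.uint8)

# Rescale [0, 255] -> [0, 1] -> [-1, 1]; the division promotes the data to float64.
scaled = pixels / 255.0 * 2.0 - 1.0

# Binarize: negative values become -1, everything else (including 0) becomes +1.
binary = np.where(scaled >= 0, 1.0, -1.0)

print(scaled.dtype)
print(binary)
```

Pixels up to 127 end up as -1 (background) and pixels from 128 upwards as +1.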

https://towardsdatascience.com/guide-to-coding-a-custom-convolutional-neural-network-in-tensorflow-bec694e36ad3 <br>
A TensorFlow model expects the input shape to be [batch_size, height, width, color_channels]. Since the images in MNIST and Fashion MNIST are grayscale (single-channel), they have shape [60000,28,28], so we need to add a dummy color channel dimension to make the shape [60000,28,28,1].
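
The shape changes can be tried on a small dummy batch. This sketch uses five all-ones arrays in place of real images (an assumption for illustration) and also applies the 2-pixel padding with -1 that the preprocessing cell uses to reach 32x32.

```python
import numpy as np

# Five dummy 28x28 "images" filled with +1 (stand-ins for binarized MNIST digits).
batch = np.ones((5, 28, 28), dtype=np.float32)

# Add the trailing color-channel dimension: (5, 28, 28) -> (5, 28, 28, 1).
with_channel = np.expand_dims(batch, axis=-1)

# Pad height and width by 2 pixels on each side with the background value -1.
pad_width = ((0, 0), (2, 2), (2, 2), (0, 0))
padded = np.pad(with_channel, pad_width, mode="constant", constant_values=-1)

print(with_channel.shape)  # (5, 28, 28, 1)
print(padded.shape)        # (5, 32, 32, 1)
```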

In [None]:
if input_dataset == "MNIST":
  (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
  # actual names of the classes
  txt_labels = ['0','1','2','3','4','5','6','7','8','9']
  out_classes = 10
  # normalize pixel values from [0, 255] to [0, 1] (the channel dimension is added further below)
  x_train = x_train / 255.0
  x_test = x_test / 255.0
  # Scale to [-1, 1]
  x_train = x_train * 2.0 - 1.0
  x_test = x_test * 2.0 - 1.0
  # Binarize with tf.sign
  x_train_bin = tf.sign(x_train)
  x_test_bin = tf.sign(x_test)
  # Replace zeros (if any) with +1
  x_train_bin = tf.where(tf.equal(x_train_bin, 0), tf.ones_like(x_train_bin), x_train_bin)
  x_test_bin = tf.where(tf.equal(x_test_bin, 0), tf.ones_like(x_test_bin), x_test_bin)
  # Convert back to numpy arrays if needed
  x_train_bin = x_train_bin.numpy()
  x_test_bin = x_test_bin.numpy()
  x_train_bin = np.expand_dims(x_train_bin, axis=-1)  # shape (60000, 28, 28, 1)
  x_test_bin = np.expand_dims(x_test_bin, axis=-1)    # shape (10000, 28, 28, 1)
  # Pad to (N, 32, 32, 1) with constant = -1 (background)
  pad_width = ((0, 0),  # no pad on batch
               (2, 2),  # pad 2 pixels top/bottom
               (2, 2),  # pad 2 pixels left/right
               (0, 0))  # no pad on channel
  x_train_bin = np.pad(x_train_bin,
                         pad_width,
                         mode='constant',
                         constant_values=-1)
  x_test_bin  = np.pad(x_test_bin,
                         pad_width,
                         mode='constant',
                         constant_values=-1)
  print(x_train_bin.min(), x_train_bin.max())  # should print -1 and 1
  print(x_train_bin.shape)
  print(x_train.shape)


else:
    raise Exception("Not a valid dataset.")

-1.0 1.0
(60000, 32, 32, 1)
(60000, 28, 28)


# **Dataset distribution statistics and visualization**

In [None]:
# # Extract information about the dataset distribution statistics.
print('Distribution of train and test set:')
print('Number of training images:', x_train.shape[0])
print('Number of test images:', x_test.shape[0])
print('--------------------------------------------------')

# print('Distribution of digits in the dataset:')
# # Find the unique elements of an array and also return the number of times each unique item appears in array.
# train_labels_count = np.unique(y_train, return_counts=True)
# # Data structure with two-dimensional tabular data with labeled axes (rows and columns).
# dataframe_train_labels = pd.DataFrame({'Label':train_labels_count[0], 'Count':train_labels_count[1]})
# display(dataframe_train_labels)
# print('==================================================')


# ## Both data visualization functions below are inspired from: https://colab.research.google.com/notebooks/tpu.ipynb
# ## The same functions display training data now and will display the network predictions later in "Performing inference" section.
# # Function to display one image with label.
# def display_one_sample(image, title, subplot, color):
#   plt.subplot(subplot)
#   #plt.axis('off')
#   plt.axis('on')
#   ax = plt.gca()
#   ax.axes.xaxis.set_visible(False)
#   ax.axes.yaxis.set_visible(False)
#   # plt.grid(True)
#   plt.imshow(image,cmap=plt.cm.gray_r)
#   plt.title(title, fontsize=16, color=color)

# # Function to display a batch of 9 images with labels.
# def display_nine_samples(images, titles, nn_model_outputs=None, infer=False):
#   subplot = 331
#   plt.figure(figsize=(9,9))
#   for i in range(9):
#     if infer == True:
#       predicted_label = txt_labels[np.argmax(nn_model_outputs[i])]
#       predicted_probability = np.max(nn_model_outputs[i])
#       actual = txt_labels[np.squeeze(titles[i])]
#       img_title = 'Actual:'+ actual + '\n' +'Prediction:' + str(predicted_label) +'\n' +'Probability' + str(predicted_probability)
#       color = 'black'
#       display_one_sample(images[i], img_title, 331+i, color)
#     else:
#       title = txt_labels[np.squeeze(titles[i])]
#       color = 'black'
#       display_one_sample(images[i], title, 331+i, color)
#   plt.tight_layout()
#   plt.subplots_adjust(wspace=0.8, hspace=0.1)
#   plt.show()

# # Display one image
# print('Displaying a single image')
# image = np.squeeze(x_train_bin[44])
# label = np.squeeze(y_train[44])
# display_one_sample(image, txt_labels[label], 111, 'black')

# # Display a batch of images
# print('Displaying a batch of 9 images')
# images = np.squeeze(x_test[:9])
# labels = np.squeeze(y_test[:9])
# display_nine_samples(images, labels)
# print('==================================================')

Distribution of train and test set:
Number of training images: 60000
Number of test images: 10000
--------------------------------------------------


# **Creating the neural network model**

**Model creation API:**<br>
https://www.tensorflow.org/api_docs/python/tf/keras/Sequential<br>
**Sequential** groups a stack of layers into a neural network model object.<br>
The **add** method (e.g. model.add()) adds a layer to the neural network model.
Sequential connects the neural network layers in the same sequence in which they appear in the code, either in the list passed to **tf.keras.Sequential()** or in successive .add calls.

**Clarification about VGG16 model:**<br>
Original VGG16 source - https://github.com/keras-team/keras-applications/blob/master/keras_applications/vgg16.py<br>
The original VGG16 is too big for any of our datasets, so we commented out some convolution layers and changed the size of the last two fully connected layers of the original model. These changes can be seen as comments in the VGG16 model code in the cell below. We also added BatchNorm layers to the original VGG16 model for better accuracy.<br>
All these changes can be seen in the code for VGG16 model below.

**kernel_initializer=initializers.glorot_uniform(seed=0)**:<br>
https://www.tensorflow.org/api_docs/python/tf/keras/initializers/GlorotUniform <br>
It initializes convolution layers and fully-connected layers (also called dense layers) to the same weights for each run. This provides reproducibility and is optional. You can remove or modify the initializer later for your own use.
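
The Glorot uniform scheme draws weights from a uniform distribution on [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The sketch below reproduces that rule in NumPy to show why a fixed seed gives identical starting weights. It is an illustration only: Keras uses its own random number generator, so the actual values differ from the ones NumPy produces, and the 392x30 shape is just an example layer size.

```python
import numpy as np

def glorot_limit(fan_in, fan_out):
    """Half-width of the Glorot uniform range."""
    return np.sqrt(6.0 / (fan_in + fan_out))

def glorot_uniform(fan_in, fan_out, seed):
    """Draw a (fan_in, fan_out) weight matrix with a seeded NumPy generator."""
    rng = np.random.default_rng(seed)
    limit = glorot_limit(fan_in, fan_out)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

w_a = glorot_uniform(392, 30, seed=0)
w_b = glorot_uniform(392, 30, seed=0)
w_c = glorot_uniform(392, 30, seed=1)

print("same seed, same weights:", np.array_equal(w_a, w_b))
print("different seed, same weights:", np.array_equal(w_a, w_c))
```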

**Training the model on GPU/TPU:**<br>
https://colab.research.google.com/notebooks/tpu.ipynb<br>
Creating the model in the TPU/GPU Strategy scope means we will train the model on the TPU/GPU. Apart from the strategy, there is no other setting specific to GPU/TPU and you can train a model with Keras fit/compile APIs like you would normally do.

**Model Compilation API:**<br>
https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile <br>
The **model.compile** API sets up the training configuration, e.g. which optimizer to use, which loss function to minimize and which output metric (e.g. accuracy) to evaluate.
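
To make the loss and metric concrete, here is a small NumPy sketch of what sparse categorical cross-entropy and sparse categorical accuracy compute: labels are integer class indices, and the model outputs one probability per class. The three-class probabilities are made up for illustration, and the clipping constant is an assumption rather than the exact Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

def sparse_categorical_accuracy(labels, probs):
    """Fraction of samples whose highest-probability class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

labels = np.array([0, 2, 1])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.5, 0.4, 0.1]])

loss = sparse_categorical_crossentropy(labels, probs)
accuracy = sparse_categorical_accuracy(labels, probs)
print(f"loss={loss:.4f}, accuracy={accuracy:.4f}")
```

The third sample is misclassified (class 0 is predicted, class 1 is correct), so it contributes the largest loss term and lowers the accuracy to 2/3.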

**Meaning of "None" in model summary shapes:**<br>
https://pgaleone.eu/tensorflow/2018/07/28/understanding-tensorflow-tensors-shape-static-dynamic/<br>
None denotes a partially known shape: we know the rank, but the size of one or more dimensions is unknown. When we define the input, we specify only the feature shape and leave the batch dimension as None, e.g. (None, 28, 28, 1). The batch size is set later, when we actually train the model using the model.fit API.
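
The remaining entries of each shape, and the parameter counts, follow from simple arithmetic. The sketch below works this out for the layer stack used in this notebook (32x32x1 input, a 4x4 convolution with 2 filters and no bias, 2x2 max pooling with stride 2, batch normalization without scale, and a bias-free dense layer with 30 units), assuming the default "valid" padding.

```python
def valid_output_size(size, window, stride=1):
    """Output length of a convolution or pooling window without padding."""
    return (size - window) // stride + 1

conv_out = valid_output_size(32, 4)                 # 4x4 kernel, stride 1
pooled = valid_output_size(conv_out, 2, stride=2)   # 2x2 pooling, stride 2
flat = pooled * pooled * 2                          # 2 feature maps flattened

conv_params = 4 * 4 * 1 * 2   # kernel height * width * input channels * filters, no bias
bn_params = 3 * 2             # beta, moving mean, moving variance per channel (scale=False)
dense_params = flat * 30      # one weight per input/unit pair, no bias

print((None, conv_out, conv_out, 2), conv_params)
print((None, pooled, pooled, 2), bn_params)
print((None, flat), dense_params)
```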



In [None]:
# Function that creates the predefined model.
def create_model():
  # Common keyword arguments for binary layers
  kwargs = dict(input_quantizer="ste_sign",     # Quantize inputs using sign function
              kernel_quantizer="ste_sign",      # Quantize weights using sign function
              kernel_constraint="weight_clip",  # Clip weights to [-1, 1]
              use_bias=False)                   # No bias term in these layers

  model = tf.keras.models.Sequential([

     # First binary convolutional layer
    lq.layers.QuantConv2D(2, 4,                              # 2 filters, 4x4 kernel size
                          kernel_quantizer="ste_sign",
                          kernel_constraint="weight_clip",
                          use_bias=False,
                          input_shape=(32, 32, 1)),          # Input shape: 32x32 grayscale images, maybe look into 28x28

     # Max pooling layer to downsample feature maps
    tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2)),
     # Batch normalization to stabilize training
    tf.keras.layers.BatchNormalization(momentum=0.999, scale=False),
     # Flatten the 2D output to 1D for dense layer
    tf.keras.layers.Flatten(),
     # Binary dense (fully connected) layer with 30 units
    lq.layers.QuantDense(30, **kwargs),
     # Another batch normalization layer
    tf.keras.layers.BatchNormalization(momentum=0.999, scale=False),
     # Final activation layer for classification
    tf.keras.layers.Activation("softmax")

    # lq.layers.QuantConv2D(16, 3, padding="same", **kwargs, input_shape=(32, 32, 1)),
    # tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2)),
    # tf.keras.layers.BatchNormalization(momentum=0.999, scale=False),

    # tf.keras.layers.Flatten(),

    # lq.layers.QuantDense(128, **kwargs),
    # tf.keras.layers.BatchNormalization(momentum=0.999, scale=False),

    # _ _ _ _ _ _ _
  ])
  '''
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Conv2D(filters=1, kernel_size=(5, 5), activation=tf.sign, input_shape=x_train_bin.shape[1:],kernel_initializer=initializers.glorot_uniform(seed=0)))
  model.add(tf.keras.layers.MaxPool2D(pool_size=(2,2),strides=1))
  #model.add(tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu',kernel_initializer=initializers.glorot_uniform(seed=0)))
  #model.add(tf.keras.layers.AveragePooling2D(pool_size=(2,2),strides=(2,2)))
  model.add(tf.keras.layers.Flatten())
  #model.add(tf.keras.layers.Dense(units=84, activation='relu',kernel_initializer=initializers.glorot_uniform(seed=0)))
  model.add(tf.keras.layers.Dense(units=out_classes, activation = 'softmax',kernel_initializer=initializers.glorot_uniform(seed=0)))'''
  return model

# To run the model on CPU/GPU/TPU, create and compile the model in the scope of CPU/GPU/TPU.
with strategy.scope():
    model = create_model()
    model.compile(
      # Optimizer
      #optimizer=tf.keras.optimizers.SGD(),
      optimizer=tf.keras.optimizers.Adam(),
      # Loss function to minimize
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      # List of metrics to monitor
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],)

# Display the summary of model layers, layer output dimensions and no. of parameters.
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 quant_conv2d_1 (QuantConv2  (None, 29, 29, 2)         32        
 D)                                                              
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 14, 14, 2)         0         
 g2D)                                                            
                                                                 
 batch_normalization_2 (Bat  (None, 14, 14, 2)         6         
 chNormalization)                                                
                                                                 
 flatten_1 (Flatten)         (None, 392)               0         
                                                                 
 quant_dense_1 (QuantDense)  (None, 30)                11760     
                                                      

# **Training the neural network**

**API for training:**
https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit <br>
The neural network model is trained using **model.fit** API.
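
For the values used in this lab, the amount of work done by one call to model.fit can be estimated up front. The sketch below assumes the 60000 MNIST training images, the batch size and epoch count set earlier in this notebook, and the notebook's choice of passing the epoch count as `validation_freq`, which means validation runs only every that many epochs.

```python
import math

num_train_images = 60000
batch_size = 256
epochs = 10

# Number of weight updates per epoch; the last batch is smaller than the rest.
steps_per_epoch = math.ceil(num_train_images / batch_size)
last_batch_size = num_train_images - (steps_per_epoch - 1) * batch_size

# With validation_freq equal to the epoch count, only the final epoch is validated.
validation_freq = epochs
validated_epochs = [e for e in range(1, epochs + 1) if e % validation_freq == 0]

print(steps_per_epoch, last_batch_size, validated_epochs)
```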

**TensorBoard visualizations for TPU:**<br>
We have included TensorBoard visualizations with CPU and GPU in this notebook by passing TensorBoard callback to **model.fit** if **hw_device is set to CPU or GPU.** <br>
To use TensorBoard with a TPU, follow the instructions at: https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/profiling_tpus_in_colab.ipynb#scrollTo=N6ZDpd9XzFeN <br> However, this requires a Cloud Storage bucket for the TensorBoard logs, which in turn requires signing up for a free trial on Google Cloud Platform with billing information. Hence, we **do not include TensorBoard visualizations** with TPU in this notebook and do not pass the callback to model.fit when **hw_device is set to TPU**. Interested people can try this at their own financial risk; we are not responsible for any consequences.


In [None]:
# TensorBoard callback is used with CPU/GPU.
if hw_device == "GPU" or hw_device == "CPU":
  model.fit(
    x_train_bin.astype(np.float32), y_train.astype(np.float32),
    epochs= epochs,
    batch_size = batch_size,
    validation_data=(x_test_bin.astype(np.float32), y_test.astype(np.float32)),
    validation_freq=epochs,shuffle=False, callbacks=callbacks)

# TensorBoard callback is NOT used with TPU.
# Note: TPU doesn't support float64, CPU and GPU do.
if hw_device == "TPU":
  # Use the binarized, padded 32x32 inputs so the shapes match the model's input_shape.
  model.fit(
    x_train_bin.astype(np.float32), y_train.astype(np.float32),
    epochs= epochs,
    batch_size = batch_size,
    validation_data=(x_test_bin.astype(np.float32), y_test.astype(np.float32)),
    validation_freq=epochs,shuffle=False )

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


# **TensorBoard visualizations for CPU/GPU**
"%tensorboard" invokes TensorBoard for GPU/CPU.

In [None]:
# # Launch TensorBoard.
# if hw_device == "GPU" or hw_device == "CPU":
#   %tensorboard --logdir logs

# **Saving trained model.**

**API for model saving:**<br>
https://www.tensorflow.org/api_docs/python/tf/keras/models/save_model <br>
Use **model.save** to save the keras model. Keras model includes the network structure, weights and other internal variables. <br>
https://keras.io/api/models/model_saving_apis/#saveweights-method<br>
Use save_weights method if you want to save the model weights without the other internal variables.

**File format for model saving**<br>
https://www.geeksforgeeks.org/hdf5-files-in-python/<br>
We store the model in an .h5 (HDF5) file. HDF5 stands for Hierarchical Data Format version 5. It is an open-source file format which comes in handy for storing large amounts of data. It stores data in a hierarchical structure within a single file, so we can quickly access a particular part of the file without reading the whole file. Plain text files do not offer this functionality, and hence HDF5 is becoming increasingly popular.

**include_optimizer flag:**<br>
https://stackoverflow.com/questions/44258739/trained-and-loaded-keras-sequential-model-is-giving-different-result<br>
Internal variables need to be restored so that the accuracy achieved on the test set at the end of training is exactly the same as the accuracy obtained on the test set after loading the saved trained model for inference. This can be achieved using **include_optimizer = True.**


In [None]:
# # Create h5 file to save the full model.
# trained_model = input_dataset + '_' + nn_model + '_trained' + '.h5'

# # Save the model including internal variables.
# model.save(trained_model,overwrite=True,include_optimizer=True)

# # Create h5 file to save the weights only.
# trained_model_weights_only = input_dataset + '_' + nn_model + '_trained'+ '_weights_only' + '.weights.h5'
# print(trained_model_weights_only)

# # Save the model weights, excluding internal variables.
# # Note: During your mini-project, you can train the model using this notebook, export the weights from save_weights and
# # use those weights for your RTL implementation.
# model.save_weights(trained_model_weights_only, overwrite=True)

# weights_file = 'MNIST_Lenet5_trained_weights_only.h5'

# # Define the output CSV file
# output_csv_file = 'model_weights.csv'

# binary_weights = {}
# for layer in model.layers:
#     # Only process layers that actually have weights
#     weights = layer.get_weights()
#     if not weights:
#         continue

#     # For each weight tensor (e.g. kernel, bias), apply sign quantization:
#     #   sign(w) gives -1 for w<0, +1 for w>=0
#     bin_tensors = [np.where(w >= 0, 1, -1).astype(np.int8) for w in weights]
#     print(bin_tensors)
#     binary_weights[layer.name] = bin_tensors

# # Now you can save `binary_weights` however you like, e.g. to CSV:
# with open(output_csv_file, 'w') as f:
#     f.write("layer,tensor_index,weight_value\n")
#     for lname, tensors in binary_weights.items():
#         for idx, t in enumerate(tensors):
#             for val in t.flatten():
#                 f.write(f"{lname},{idx},{val}\n")


# '''# Open the CSV file for writing
# with open(output_csv_file, 'w') as f:
#     # Write a header row (optional but good practice)
#     f.write("Layer_Name,Tensor_Index,Weight_Value\n")

#     # Iterate through the layers in the model
#     for layer in model.layers:
#         layer_name = layer.name
#         weights = layer.get_weights()

#         if weights:
#             # Iterate through the weight tensors for the current layer
#             for tensor_index, weight_tensor in enumerate(weights):
#                 # Flatten the weight tensor to a 1D array
#                 flattened_weights = weight_tensor.flatten()

#                 # Write each weight value to the CSV file
#                 for weight_value in flattened_weights:
#                     f.write(f"{layer_name},{tensor_index},{weight_value}\n")'''

# print(f"Weights converted and saved to {output_csv_file}")
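
The sign quantization and CSV export outlined in the commented code above can be tried without a trained model. The sketch below is a minimal version under stated assumptions: `dummy_kernel` and the layer name `"quant_dense"` are made up for illustration, and the rows are written to an in-memory buffer instead of a file.

```python
import csv
import io

import numpy as np

def binarize(weights):
    """Sign quantization: -1 for negative weights, +1 for zero or positive weights."""
    return np.where(weights >= 0, 1, -1).astype(np.int8)

def weights_to_rows(layer_name, tensors):
    """Flatten each binarized tensor into (layer, tensor_index, weight_value) rows."""
    rows = []
    for tensor_index, tensor in enumerate(tensors):
        for value in binarize(tensor).flatten():
            rows.append((layer_name, tensor_index, int(value)))
    return rows

# Hypothetical latent (real-valued) kernel of a small layer.
dummy_kernel = np.array([[0.3, -0.7],
                         [0.0, -0.1]])
rows = weights_to_rows("quant_dense", [dummy_kernel])

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["layer", "tensor_index", "weight_value"])
writer.writerows(rows)
print(buffer.getvalue())
```

Sign quantization is meant for the kernels of the binary layers; batch-normalization parameters are real-valued and would probably need to be exported separately without binarization.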

# **Performing inference**

**Inference API:**<br>
https://www.tensorflow.org/api_docs/python/tf/keras/Model#predict<br>
model.predict: Generates output predictions for given input samples.
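
The output of model.predict is one row of class scores per input sample; the predicted class is the index of the largest score. A short NumPy sketch of that post-processing step follows, using two made-up 10-class probability rows instead of real model outputs.

```python
import numpy as np

txt_labels = [str(digit) for digit in range(10)]

# Hypothetical softmax outputs for two images (each row sums to 1).
outputs = np.full((2, 10), 0.05)
outputs[0, 7] = 0.55
outputs[1] = 0.01
outputs[1, 3] = 0.91

predicted_idx = np.argmax(outputs, axis=1)   # index of the most likely class per row
predicted_prob = np.max(outputs, axis=1)     # probability assigned to that class
predicted_labels = [txt_labels[i] for i in predicted_idx]

for label, prob in zip(predicted_labels, predicted_prob):
    print(f"Prediction: {label} (probability {prob:.2f})")
```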

https://www.tensorflow.org/api_docs/python/tf/keras/Model#evaluate<br>
model.evaluate: Returns the loss value & accuracy for the model over the entire test set.<br>
**Note: Run the next cell (below) 2-3 times to record the time for inference. Take the value from the last run. There is no need to restart and run all; just click the play button on the left edge of the cell.**
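
If you prefer to measure the time in code rather than reading it from the cell's progress output, a pattern like the following can be used. It is a sketch only: `dummy_inference` is a placeholder for the real evaluation call, and keeping the last of several runs mirrors the note above, presumably because the first run includes one-time setup overhead.

```python
import time

def time_last_run(fn, repeats=3):
    """Call fn several times and return the wall-clock duration of the last call."""
    elapsed = None
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
    return elapsed

def dummy_inference():
    # Placeholder workload standing in for the model evaluation.
    return sum(i * i for i in range(100_000))

print(f"Last run took {time_last_run(dummy_inference):.4f} s")
```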

In [None]:
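# Cast to float32 as in training, and pass the batch size by keyword for clarity.
score = model.evaluate(x_test_bin.astype(np.float32), y_test, batch_size=batch_size)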

In [None]:
# print('Displaying a small batch of 9 images')
# images = np.squeeze(x_test_bin[:9])
# labels = np.squeeze(y_test[:9])
# nn_outputs = model.predict(x_test_bin[:9])
# display_nine_samples(images, labels, nn_outputs, True)
# print('==================================================')