<a href="https://colab.research.google.com/github/rupeshthapa123/NotebookProject/blob/main/Rupesh_Thapa_Lab01_Math.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 2. Math for Deep Learning

In this lab, you will learn and apply the basic tensor math operations available in TensorFlow and Keras. These functions and datatypes are commonly used when implementing deep neural networks, and familiarity with them speeds up the prototyping and design of solutions. Note the differences between TensorFlow/Keras, which are streamlined for tensor operations, and libraries such as NumPy, which target general-purpose array computing and linear algebra.

INSTRUCTIONS:
- Solve the exercises as required using NumPy, TensorFlow or Keras.
- Refer to the lecture slides for code and explanations.
- Upload your completed notebook to the corresponding dropbox.

## 2.1 Data representations for neural networks

1. Provide an example of a scalar (rank-0 tensor) and find its dimension, shape and datatype.

In [None]:
import numpy as np

In [None]:
x = np.array(15)
x

array(15)

In [None]:
x.ndim

0

In [None]:
x.shape

()

In [None]:
x.dtype

dtype('int64')

2. Provide an example of a vector (rank-1 tensor) and find its dimension, shape and datatype.

In [None]:
x = np.array([15,6,9,13,8])
x

array([15,  6,  9, 13,  8])

In [None]:
x.ndim

1

In [None]:
x.shape

(5,)

In [None]:
x.dtype

dtype('int64')

3. Provide an example of a matrix (rank-2 tensor) and find its dimension, shape and datatype.

In [None]:
x = np.array([[6,15,16,2,4],
              [8,2,7,11,6],
              [9,3,17,13,10]])
x

array([[ 6, 15, 16,  2,  4],
       [ 8,  2,  7, 11,  6],
       [ 9,  3, 17, 13, 10]])

In [None]:
x.ndim

2

In [None]:
x.shape

(3, 5)

In [None]:
x.dtype

dtype('int64')

4. Provide an example of a rank-3 tensor and find its dimension, shape and datatype.

In [None]:
x = np.array([[[6,3,5,16,0],
               [3,1,16,17,3],
               [8,2,11,19,17]],
              [[4,20,14,7,3],
               [6,21,27,3,4],
               [4,3,8,9,14]],
              [[5,9,2,26,4],
               [4,30,31,4,6],
               [7,31,36,24,9]]])
x

array([[[ 6,  3,  5, 16,  0],
        [ 3,  1, 16, 17,  3],
        [ 8,  2, 11, 19, 17]],

       [[ 4, 20, 14,  7,  3],
        [ 6, 21, 27,  3,  4],
        [ 4,  3,  8,  9, 14]],

       [[ 5,  9,  2, 26,  4],
        [ 4, 30, 31,  4,  6],
        [ 7, 31, 36, 24,  9]]])

In [None]:
x.ndim

3

In [None]:
x.shape

(3, 3, 5)

In [None]:
x.dtype

dtype('int64')
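
The four inspections above follow the same pattern, so they can be gathered into one helper. The sketch below is illustrative only: `describe` is a hypothetical name rather than a NumPy function, and the dtype is pinned to `int64` explicitly because NumPy's default integer type depends on the platform.

```python
import numpy as np

def describe(x):
    """Return (rank, shape, dtype name) of a NumPy array."""
    return x.ndim, x.shape, x.dtype.name

# One example per rank; the dtype is fixed so the result is platform independent.
examples = {
    "scalar": np.array(15, dtype=np.int64),
    "vector": np.array([15, 6, 9, 13, 8], dtype=np.int64),
    "matrix": np.zeros((3, 5), dtype=np.int64),
    "rank-3": np.zeros((3, 3, 5), dtype=np.int64),
}

for name, tensor in examples.items():
    print(name, describe(tensor))
```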

## 2.2 Tensor operations

1. Create a routine that returns the ReLU function of the argument passed.

In [None]:
def naive_relu(x):
  # Ensure the input is a 2D tensor
  assert len(x.shape) == 2
  # Create a copy of the input tensor to avoid modifying the original tensor
  x = x.copy()
  # Iterate over each element in the 2D tensor
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      # Apply the ReLU function: max(x, 0)
      x[i,j] = max(x[i,j],0)
  # Return the modified tensor
  return x

naive_relu(np.array([[-1.0, 2.0], [-0.5, 3.0]]))

array([[0., 2.],
       [0., 3.]])

2. Create a routine that returns the sum of the two arguments passed.

In [None]:
def naive_add(x, y):
  # Ensure both inputs are 2D tensors
  assert len(x.shape) == 2
  assert x.shape == y.shape # Ensure both tensors have the same shape
  # Create a copy of the first input tensor to avoid modifying the original tensor
  x = x.copy()
  # Iterate over each element in the 2D tensors
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      # Add corresponding elements from y to x
      x[i,j] += y[i,j]
  # Return the modified tensor
  return x

naive_add(np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[5.0, 6.0], [7.0, 8.0]]))

array([[ 6.,  8.],
       [10., 12.]])

3. Time how long it takes to compute the sum followed by the ReLU of two random tensors x and y, both of shape (20, 100), 1000 times, using the functions created in steps 1 and 2.

In [None]:
import time
# Generate two random 2D arrays (20x100) with values between 0 and 1
x = np.random.random((20, 100))
y = np.random.random((20, 100))

In [None]:
# Measure the time taken to perform the operations using NumPy functions
t0 = time.time() # Record the current time before starting the operations
for _ in range(1000):
  # Perform element-wise addition using NumPy
  z = x + y
  # Apply the ReLU function using NumPy's maximum function
  z = np.maximum(z, 0.)
print("Took: {0:.2f} s".format(time.time() - t0)) # Calculate and print the elapsed time

Took: 0.01 s


In [None]:
# Measure the time taken to perform the operations using the naive implementations
t0 = time.time() # Record the current time before starting the operations
for _ in range(1000):
  # Perform element-wise addition using the naive_add function
  z = naive_add(x, y)
  # Apply the ReLU function using the naive_relu function
  z = naive_relu(z)
print("Took: {0:.2f} s".format(time.time() - t0)) # Calculate and print the elapsed time

Took: 5.12 s
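
A single `time.time()` measurement is fine for a rough comparison, but individual runs are noisy. As an alternative sketch, the standard-library `timeit` module can repeat the measurement and keep the best run. The helper names `vectorized` and `naive` are hypothetical, the loop version fuses steps 1 and 2 into one function, and the repetition count is reduced from 1000 to keep the example quick.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((20, 100))
y = rng.random((20, 100))

def vectorized(x, y):
    # Element-wise sum followed by ReLU, both executed in compiled NumPy code
    return np.maximum(x + y, 0.0)

def naive(x, y):
    # Same computation with explicit Python loops over every element
    out = x.copy()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = max(out[i, j] + y[i, j], 0.0)
    return out

# Each repeat runs the call `number` times; the minimum is the least noisy estimate
t_vec = min(timeit.repeat(lambda: vectorized(x, y), number=20, repeat=3))
t_naive = min(timeit.repeat(lambda: naive(x, y), number=20, repeat=3))
print(f"vectorized: {t_vec:.4f} s, naive: {t_naive:.4f} s")
```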


4. Repeat the previous exercise using NumPy arrays and functions.

In [None]:
def relu_numpy(x):
    return np.maximum(x, 0)

# Example usage
input_array = np.array([[-1.0, 2.0], [-0.5, 3.0]])
output_array = relu_numpy(input_array)
print(output_array)

[[0. 2.]
 [0. 3.]]


In [None]:
def add_numpy(x, y):
    assert x.shape == y.shape
    return x + y

# Example usage
x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0, 6.0], [7.0, 8.0]])
result = add_numpy(x, y)
print(result)

[[ 6.  8.]
 [10. 12.]]
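
One way to gain confidence that the vectorized routine matches the loop-based one is to compare both on random input. The sketch below redefines the two ReLU variants so it runs on its own, and it samples from a standard normal distribution (an assumption made here, since values drawn from [0, 1) would never be negative and ReLU would leave them unchanged).

```python
import numpy as np

def naive_relu(x):
    # Loop-based ReLU over a 2D array, as in step 1
    x = x.copy()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x

def relu_numpy(x):
    # Vectorized ReLU, as in step 4
    return np.maximum(x, 0)

rng = np.random.default_rng(42)
# Standard-normal samples include negative values, so ReLU actually changes the input
sample = rng.standard_normal((20, 100))

same = np.array_equal(naive_relu(sample), relu_numpy(sample))
print("outputs identical:", same)
```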


## 2.3 Basic Model

Implement a Sequential Neural Network for the MNIST dataset using TensorFlow predefined functions and structures.

In [None]:
from tensorflow import keras

In [None]:
from tensorflow.keras.datasets import mnist
(train_images1, train_labels1), (test_images1, test_labels1) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
from tensorflow.keras import layers
keras_model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])

keras_model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

In [None]:
train_images1 = train_images1.reshape((60000,28 * 28))
train_images1 = train_images1.astype("float32") / 255
test_images1 = test_images1.reshape((10000, 28 * 28))
test_images1 = test_images1.astype("float32") / 255
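
The preprocessing above flattens each 28×28 image into a 784-element vector and rescales the pixel values from the 0–255 range to [0, 1]. A minimal sketch of the same two steps on synthetic data (the array `fake_images` is a made-up stand-in, not MNIST):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for MNIST: 4 grayscale images of 28x28 pixels, values 0..255
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each image into a vector of 784 pixels, then rescale to [0, 1]
flat = fake_images.reshape((4, 28 * 28)).astype("float32") / 255

print(flat.shape, flat.dtype)
print(flat.min() >= 0.0, flat.max() <= 1.0)
```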

In [None]:
keras_model.fit(train_images1, train_labels1, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x787a6deec910>

## 2.4 Basic Model from scratch

Implement a Sequential Neural Network for the MNIST dataset from scratch by defining your own functions.

1. Create a Dense class

In [None]:
import tensorflow as tf

class NaiveDense:
  def __init__(self, input_size, output_size, activation):
    self.activation = activation

    w_shape = (input_size, output_size)
    w_initial_value = tf.random.uniform(w_shape, minval=0, maxval=1e-1)
    self.W = tf.Variable(w_initial_value)

    b_shape = (output_size,)  # trailing comma makes this a one-element tuple
    b_initial_value = tf.zeros(b_shape)
    self.b = tf.Variable(b_initial_value)

  def __call__(self, inputs):
    return self.activation(tf.matmul(inputs, self.W) + self.b)

  @property
  def weights(self):
    return [self.W, self.b]
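
The layer computes `activation(inputs · W + b)`. To make the shapes concrete, here is a NumPy-only sketch of that forward pass. `dense_forward` and `relu` are hypothetical helper names, and the batch is random data rather than real images.

```python
import numpy as np

def dense_forward(inputs, W, b, activation):
    # Affine transform followed by an element-wise activation
    return activation(inputs @ W + b)

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
batch = rng.random((128, 784)).astype("float32")              # 128 flattened images
W = rng.uniform(0, 1e-1, size=(784, 512)).astype("float32")   # same range as NaiveDense
b = np.zeros((512,), dtype="float32")                         # one bias per output unit

out = dense_forward(batch, W, b, relu)
print(out.shape)  # (128, 512)
```

The bias has shape `(512,)` and is broadcast across the 128 rows of the matrix product.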

2. Create the Sequential class

In [None]:
class NaiveSequential:
    def __init__(self, layers):
      self.layers = layers

    def __call__(self, inputs):
      x = inputs
      for layer in self.layers:
        x = layer(x)
      # Return only after every layer has been applied
      return x

    @property
    def weights(self):
      weights = []
      for layer in self.layers:
        weights += layer.weights
      return weights
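
Calling a sequential container amounts to function composition: each layer receives the previous layer's output, and the result is returned once the loop has finished. A minimal sketch with plain Python callables standing in for layers (`apply_sequentially`, `double` and `add_one` are hypothetical placeholders):

```python
def apply_sequentially(layers, inputs):
    x = inputs
    for layer in layers:
        x = layer(x)   # feed each layer the output of the previous one
    return x           # returned after the loop, so every layer has run

double = lambda v: 2 * v
add_one = lambda v: v + 1

result = apply_sequentially([double, add_one], 5)
print(result)  # 11, i.e. (2 * 5) + 1
```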

3. Create a model using your created classes

In [None]:
model = NaiveSequential([
    NaiveDense(input_size=28 * 28, output_size=512, activation = tf.nn.relu),
    NaiveDense(input_size=512, output_size=10, activation = tf.nn.softmax)
])
assert len(model.weights) == 4

4. Create a batch generator for your model

In [None]:
import math

class BatchGenerator:
    def __init__(self, images, labels, batch_size=128):
        assert len(images) == len(labels)
        self.index = 0
        self.images = images
        self.labels = labels
        self.batch_size = batch_size
        self.num_batches = math.ceil(len(images) / batch_size)

    def next(self):
        images = self.images[self.index : self.index + self.batch_size]
        labels = self.labels[self.index : self.index + self.batch_size]
        self.index += self.batch_size
        return images, labels
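
The number of batches is `ceil(n / batch_size)`, and the last batch is shorter whenever `n` is not a multiple of the batch size. A small sketch using a plain Python generator (`iterate_batches` is a hypothetical name, and the list of ten integers is toy data):

```python
import math

def iterate_batches(data, batch_size):
    # Yield consecutive slices; the final slice may be shorter than batch_size
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(10))
batch_sizes = [len(batch) for batch in iterate_batches(data, batch_size=4)]
print(batch_sizes)                   # [4, 4, 2]
print(math.ceil(len(data) / 4))      # 3, the same count num_batches would hold
```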

5. Create the update_weights function to apply the gradients to the model's weights. Use a learning rate of $10^{-3}$.

In [None]:
learning_rate = 1e-3

def update_weights(gradients, weights):
    for g, w in zip(gradients, weights):
        if g is not None:
            w.assign_sub(g * learning_rate)
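
The update is plain gradient descent, $w \leftarrow w - \eta\, g$. As a sketch of the rule in isolation, the example below minimizes the toy function $f(w) = w^2$, whose gradient is $2w$. Both the function and the larger learning rate are assumptions chosen so the effect is visible within a few steps.

```python
learning_rate = 0.1  # larger than the lab's 1e-3 so the effect shows in a few steps

def gradient(w):
    # Derivative of f(w) = w**2
    return 2.0 * w

w = 1.0
history = [w]
for _ in range(5):
    w -= learning_rate * gradient(w)   # same rule as w.assign_sub(g * learning_rate)
    history.append(w)

print(history)
```

Each step multiplies `w` by `1 - 2 * learning_rate = 0.8`, so the value shrinks toward the minimum at 0.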

6. Create the one_training_step function that performs one forward pass and one backward pass for a given batch.

In [None]:
def one_training_step(model, images_batch, labels_batch):
    with tf.GradientTape() as tapes:
        predictions = model(images_batch)
        per_sample_losses = tf.keras.losses.sparse_categorical_crossentropy(
                labels_batch, predictions)
        average_loss = tf.reduce_mean(per_sample_losses)
    gradients = tapes.gradient(average_loss, model.weights)
    update_weights(gradients, model.weights)
    return average_loss
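
For integer labels, the per-sample cross-entropy is the negative log of the probability the model assigns to the true class, and the batch loss is the mean of those values. The NumPy sketch below shows only that arithmetic; it reuses the Keras function's name for readability but is not the Keras implementation, it omits the numerical safeguards a library would add, and the probabilities are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probabilities):
    # Pick, for every sample, the probability assigned to its true class
    true_class_probs = probabilities[np.arange(len(labels)), labels]
    return -np.log(true_class_probs)

probabilities = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.8, 0.1]])
labels = np.array([0, 2])

per_sample = sparse_categorical_crossentropy(labels, probabilities)
average_loss = per_sample.mean()
print(per_sample, average_loss)
```

The first sample is classified confidently and correctly, so its loss is small; the second assigns only 0.1 to the true class and dominates the average.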

7. Create the function fit that can train the model given a dataset of images/labels over the defined epochs. Set the default batch size to 128.

In [None]:
def fit(model, images, labels, epochs, batch_size=128):
  for epoch_counter in range(epochs):
    print(f"Epoch {epoch_counter}")

    batch_generator = BatchGenerator(images, labels, batch_size)
    for batch_counter in range(batch_generator.num_batches):
        images_batch, labels_batch = batch_generator.next()
        loss = one_training_step(model, images_batch, labels_batch)
        if batch_counter % 100 == 0:
            print(f"loss at batch {batch_counter}: {loss:2f}")

8. Train the model using the MNIST dataset. Use 10 epochs and a batch size of 128.

In [None]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255

fit(model, train_images, train_labels, epochs=10, batch_size=128)

Epoch 0
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 1
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 2
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 3
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 4
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 5
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss at batch 400: 6.238325
Epoch 6
loss at batch 0: 6.238325
loss at batch 100: 6.238325
loss at batch 200: 6.238325
loss at batch 300: 6.238322
loss a
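
The plateau in the recorded output is informative. A classifier that spreads its probability evenly over $K$ classes has a cross-entropy of $\ln K$, and the logged value 6.238325 equals $\ln 512$ rather than $\ln 10 \approx 2.303$. That would be consistent with a 512-wide prediction, the width of the first `NaiveDense` layer, reaching the loss instead of the 10-way softmax output. The check below covers only this arithmetic and is not a reproduction of the training run.

```python
import math

uniform_loss_512 = math.log(512)   # cross-entropy of a uniform guess over 512 classes
uniform_loss_10 = math.log(10)     # what a uniform guess over the 10 digits would give

print(f"{uniform_loss_512:.6f}")   # 6.238325
print(f"{uniform_loss_10:.6f}")    # 2.302585
```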

9. Manually compute the accuracy of the model as the mean of the matches between the predicted and the true test labels.

In [None]:
predictions = model(test_images)
predictions = predictions.numpy()
predicted_labels = np.argmax(predictions, axis=1)
matches = predicted_labels == test_labels
print(f"accuracy: {matches.mean():.2f}")

accuracy: 0.00
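
Accuracy computed this way is the fraction of samples whose highest-probability class equals the true label. A toy example with made-up probabilities for four samples and three classes:

```python
import numpy as np

# Hypothetical class probabilities for 4 samples and 3 classes
predictions = np.array([[0.1, 0.8, 0.1],
                        [0.6, 0.3, 0.1],
                        [0.2, 0.2, 0.6],
                        [0.5, 0.4, 0.1]])
labels = np.array([1, 0, 2, 1])

predicted_labels = np.argmax(predictions, axis=1)   # [1, 0, 2, 0]
matches = predicted_labels == labels                # [True, True, True, False]
print(matches.mean())                               # 0.75
```

Taking the mean of a boolean array treats `True` as 1 and `False` as 0, which is why it yields the proportion of correct predictions.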


10. Compare the accuracy of the two methods

In [None]:
def compute_accuracy(model, images, labels):
    predictions = model(images)
    predictions = predictions.numpy()
    predicted_labels = np.argmax(predictions, axis=1)
    matches = predicted_labels == labels
    accuracy = matches.mean()
    return accuracy

In [None]:
custom_model_accuracy = compute_accuracy(model, test_images, test_labels)
print(f"Custom Model Accuracy: {custom_model_accuracy:.2f}")

Custom Model Accuracy: 0.00


In [None]:
keras_model_accuracy = keras_model.evaluate(test_images1, test_labels1, verbose=0)[1]
print(f"Keras Model Accuracy: {keras_model_accuracy:.2f}")

Keras Model Accuracy: 0.98
