## Assignment #1

* Release date: 2023.09.27 Wed
* Due date: **2023.10.06 Fri 23:59** (late submissions will not be accepted)
* Submission format: a notebook file that can be executed in the Colab environment
* Weighting: 5% (total 50 pts)

1. **(10pts)** Calculate `rotation*x` and `x*rotation`. Explain how each computation is performed and why the two results are the same.

  ```python
    import numpy as np

    x = np.array([[2, 0]])
    rotation = np.array([ [0, -1],
                          [1,  0] ])
  ```

In [3]:
import numpy as np

x = np.array([[2, 0]])
rotation = np.array([[0, -1], [1, 0]])

# result1 and result2 are named to match the labels printed below
result1 = rotation * x
result2 = x * rotation

print("rotation * x:")
print(result1)

print("\nx * rotation:")
print(result2)

# x and rotation are NumPy arrays, so `*` performs element-wise multiplication, not matrix multiplication (np.dot or @).
# Each element is multiplied only by the element at the same position in the other array.
# x has shape (1, 2) and rotation has shape (2, 2); broadcasting repeats the single row of x along axis 0, giving [[2, 0], [2, 0]].
# x * rotation is [[2*0, 0*(-1)], [2*1, 0*0]] = [[0, 0], [2, 0]].
# rotation * x is [[0*2, (-1)*0], [1*2, 0*0]] = [[0, 0], [2, 0]].
# Element-wise multiplication is commutative, so the order of the operands does not affect the outcome.
# Therefore, rotation * x and x * rotation produce the same result.


rotation * x:
[[0 0]
 [2 0]]

x * rotation:
[[0 0]
 [2 0]]
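
To make the broadcasting step concrete, the following minimal sketch expands `x` explicitly with `np.broadcast_to` and contrasts the element-wise product with true matrix products computed with `@`. The variable names `x_expanded`, `elementwise`, `row_times_matrix`, and `matrix_times_column` are illustrative and not part of the assignment.

```python
import numpy as np

x = np.array([[2, 0]])            # shape (1, 2)
rotation = np.array([[0, -1],
                     [1,  0]])    # shape (2, 2)

# Broadcasting stretches x to the shape of rotation before multiplying.
x_expanded = np.broadcast_to(x, rotation.shape)
elementwise = rotation * x

# Matrix products, for contrast: these depend on operand order and shape.
row_times_matrix = x @ rotation         # (1, 2) @ (2, 2) -> (1, 2)
matrix_times_column = rotation @ x.T    # (2, 2) @ (2, 1) -> (2, 1)

print(x_expanded)
print(elementwise)
print(row_times_matrix)
print(matrix_times_column)
```

Unlike `*`, the `@` operator is not commutative here: the two matrix products have different shapes and values, which is why the choice of operator matters when a rotation is actually intended.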


2. **(5pts)** Suppose we have the following 2D tensor (i.e., a matrix). How can its values be rearranged into a 1D tensor (i.e., a vector) in column-major order?
```python
x = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])
```

In [4]:
x = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])

# flatten(order='F') reads the matrix column by column, i.e. in column-major (Fortran) order
result = x.flatten(order='F')

print(result)

[ 1  5  9  2  6 10  3  7 11  4  8 12]
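
As a cross-check, the sketch below compares `flatten(order='F')` with two other ways of obtaining the same column-major vector. The names `via_flatten`, `via_ravel`, and `via_transpose` are illustrative only.

```python
import numpy as np

x = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])

via_flatten = x.flatten(order='F')     # always returns a copy
via_ravel = np.ravel(x, order='F')     # returns a view when possible, otherwise a copy
via_transpose = x.T.flatten()          # row-major read of the transpose

print(via_flatten)
print(np.array_equal(via_flatten, via_ravel))
print(np.array_equal(via_flatten, via_transpose))
```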


3. **(5pts)** Compute the transpose of the matrix `x` from Problem 2 using only the `np.reshape` function.

In [5]:
import numpy as np

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

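# Step 1: read x column by column (order='F') into a 1D vector
x_flat = np.reshape(x, (-1,), order='F')

# Step 2: refill row by row (default order='C') into a (columns, rows) matrix
x_transposed = np.reshape(x_flat, (x.shape[1], x.shape[0]))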

print(x_transposed)

[[ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]
 [ 4  8 12]]
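
The two-step reshape can be checked against NumPy's own transpose. The sketch below wraps the approach in a hypothetical helper, `transpose_via_reshape` (not part of the assignment), and compares its result with `x.T`.

```python
import numpy as np

def transpose_via_reshape(a):
    """Transpose a 2D array using only np.reshape (hypothetical helper)."""
    # Read the entries column by column into a flat vector ...
    flat = np.reshape(a, (-1,), order='F')
    # ... then write them back row by row with the two dimensions swapped.
    return np.reshape(flat, (a.shape[1], a.shape[0]))

x = np.array([[1,  2,  3,  4],
              [5,  6,  7,  8],
              [9, 10, 11, 12]])

print(np.array_equal(transpose_via_reshape(x), x.T))
```

The check works because reading in column-major order and writing in row-major order turns each column of `x` into a row of the result.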


4. **(5pts)** Perform vector arithmetic to create a `primes_squared_minus_one` vector, where the `i`th element is equal to the `i`th element of `primes` squared, minus 1. For example, the second element of `primes_squared_minus_one` would be equal to `3^2 - 1 = 8`. Note that using a `for` loop is not allowed.
```python
import numpy as np
primes = np.array([2, 3, 5, 7, 11, 13])
primes_squared_minus_one = ?
```

In [6]:
import numpy as np

primes = np.array([2, 3, 5, 7, 11, 13])
primes_squared_minus_one = primes ** 2 - 1

print(primes_squared_minus_one)

[  3   8  24  48 120 168]
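
As a sanity check on the vectorized expression, the sketch below compares it with the factored form $p^2 - 1 = (p - 1)(p + 1)$, which is also computed without a loop. The names `vectorized` and `factored` are illustrative.

```python
import numpy as np

primes = np.array([2, 3, 5, 7, 11, 13])

vectorized = primes ** 2 - 1               # element-wise square, then subtract 1
factored = (primes - 1) * (primes + 1)     # same values via (p - 1)(p + 1)

print(np.array_equal(vectorized, factored))
```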


5. **(10pts)** Given two random matrices of the same shape, compute their element-wise product using a naive Python implementation and a NumPy built-in function, respectively. Compare the wall-clock times of the two implementations as the matrix size increases.



In [7]:
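# For matrices of size 10x10 and larger, the NumPy implementation is faster than the naive Python implementation.
# The gap widens as the matrix size increases: about 5x at 10x10 and roughly 140x at 100x100 and 1000x1000 in the measurements below.
# Only at 1x1 is the naive version faster, most likely because NumPy's fixed per-call overhead dominates for a single element.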

In [8]:
import numpy as np
import time

In [9]:
# naive element-wise matrix multiplication
def naive_matrix_multiply(matrix1, matrix2):
    result = []
    for i in range(len(matrix1)):
        row = []
        for j in range(len(matrix1[0])):
            row.append(matrix1[i][j] * matrix2[i][j])
        result.append(row)
    return result


In [10]:
# measure wall-clock time for matrix multiplication
def measure_matrix_multiplication_time(matrix_size, num_tests=10):
    np.random.seed(0)

    # Generate random matrices of the specified size
    matrix1 = np.random.rand(matrix_size, matrix_size)
    matrix2 = np.random.rand(matrix_size, matrix_size)

    # Measure time for NumPy element-wise multiplication
    # (time.perf_counter is a monotonic, high-resolution clock suited to short intervals)
    numpy_start_time = time.perf_counter()
    for _ in range(num_tests):
        result = np.multiply(matrix1, matrix2)
    numpy_end_time = time.perf_counter()
    numpy_time = (numpy_end_time - numpy_start_time) / num_tests

    # Convert matrices to lists for naive implementation
    matrix1_list = matrix1.tolist()
    matrix2_list = matrix2.tolist()

    # Measure time for naive Python element-wise multiplication
    naive_start_time = time.perf_counter()
    for _ in range(num_tests):
        result = naive_matrix_multiply(matrix1_list, matrix2_list)
    naive_end_time = time.perf_counter()
    naive_time = (naive_end_time - naive_start_time) / num_tests

    return numpy_time, naive_time

In [11]:
# Test different matrix sizes and compare execution times
matrix_sizes = [1, 10, 100, 1000]

for size in matrix_sizes:
    numpy_time, naive_time = measure_matrix_multiplication_time(size)
    print(f"Matrix Size: {size}x{size}")
    print(f"NumPy Time: {numpy_time:.6f} seconds")
    print(f"Naive Python Time: {naive_time:.6f} seconds")
    print("=" * 40)

Matrix Size: 1x1
NumPy Time: 0.000003 seconds
Naive Python Time: 0.000001 seconds
Matrix Size: 10x10
NumPy Time: 0.000004 seconds
Naive Python Time: 0.000019 seconds
Matrix Size: 100x100
NumPy Time: 0.000007 seconds
Naive Python Time: 0.000956 seconds
Matrix Size: 1000x1000
NumPy Time: 0.000998 seconds
Naive Python Time: 0.145879 seconds
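
An alternative way to take these measurements is the standard-library `timeit` module, which repeats the call several times and lets us keep the best round to reduce noise. The sketch below is one possible setup, not the method used for the numbers above; `naive_elementwise`, `best_time`, and the chosen matrix sizes are assumptions made for illustration.

```python
import timeit
import numpy as np

def naive_elementwise(a, b):
    """Element-wise product of two nested lists of equal shape."""
    return [[u * v for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def best_time(fn, repeat=3, number=5):
    """Best average seconds per call over `repeat` rounds of `number` calls."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number

rng = np.random.default_rng(0)
for size in (10, 100, 300):
    a = rng.random((size, size))
    b = rng.random((size, size))
    a_list, b_list = a.tolist(), b.tolist()

    numpy_time = best_time(lambda: np.multiply(a, b))
    naive_time = best_time(lambda: naive_elementwise(a_list, b_list))
    print(f"{size}x{size}: naive/NumPy time ratio = {naive_time / numpy_time:.1f}")
```

The printed ratios depend on the machine and on the NumPy build, so they should be read as a trend rather than as fixed values.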


6. **(15pts)** Consider the MNIST classification problem covered in class. For details, please refer to the course material. In that example, we used a multilayer perceptron composed of a hidden layer with 512 nodes and an output layer that produces predicted probabilities over 10 classes. In class, we used a GPU as a hardware accelerator to train our model.

  Here, let's verify the actual benefit of using a GPU for training. To do so, compare the wall-clock times of training the MNIST classifier 1) on a CPU and 2) on a GPU.


In [1]:
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import time

In [2]:
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

model = models.Sequential()
model.add(layers.Flatten(input_shape=(28, 28)))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [14]:
# 1) Train on CPU
start_time_cpu = time.time()
model.fit(train_images, train_labels, epochs=15, validation_data=(test_images, test_labels))
end_time_cpu = time.time()
training_time_cpu = end_time_cpu - start_time_cpu

print(f"Training on CPU took {training_time_cpu} seconds")

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15
Training on CPU took 112.47414231300354 seconds


In [6]:
# Wrapping the training call in `with tf.device('/GPU:0'):` gave a similar execution time,
# so I assumed the code was not actually running on the GPU.
# I therefore changed the Colab runtime type to GPU manually at this point.
# The cells before this one ran on the CPU; this cell and the ones after it ran on the GPU.



# 2) Train on GPU
start_time_gpu = time.time()
model.fit(train_images, train_labels, epochs=15, validation_data=(test_images, test_labels))
end_time_gpu = time.time()
training_time_gpu = end_time_gpu - start_time_gpu

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


In [7]:
print(f"Training on GPU took {training_time_gpu} seconds")

Training on GPU took 83.97867560386658 seconds
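
To put the two measurements side by side, the sketch below computes the speedup and the time saved per epoch from the wall-clock times reported above. It only performs arithmetic on those two numbers; the variable names `speedup` and `saved_per_epoch` are illustrative, and a single run per device is a noisy basis for comparison.

```python
# Wall-clock times copied from the two runs reported above
training_time_cpu = 112.47414231300354   # seconds, CPU runtime
training_time_gpu = 83.97867560386658    # seconds, GPU runtime
epochs = 15                              # both runs trained for 15 epochs

speedup = training_time_cpu / training_time_gpu
saved_per_epoch = (training_time_cpu - training_time_gpu) / epochs

print(f"GPU speedup: {speedup:.2f}x")
print(f"Time saved per epoch: {saved_per_epoch:.2f} s")
```

With these numbers the GPU run is about 1.34 times faster, a modest gain that is plausible for a model this small, where per-batch overhead can account for a large share of the training time.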
