# **Working with NumPy**
This optional assignment lets you practise using the NumPy library and learn how vectorization speeds up computations in comparison to iterative approaches.

You only need to fill in the code between the ```< START >``` and ```< END >``` tags; the rest has been implemented for you!

```python
# < START >
"YOUR CODE GOES HERE!"
# < END >

```

<!---Need to include clarification about arrays?-->

**Start by running the cell below to import the NumPy library.**

In [None]:
import numpy as np

### **Initializing Arrays**
NumPy offers multiple methods to create and populate arrays.
- Create a $2\times3$ array identical to
$\begin{bmatrix}
1 & 2 & 4\\
7 & 13 & 21\\
\end{bmatrix}$, and assign it to a variable `arr`.  

In [3]:
# < START > 
import numpy as np
arr = np.array([[1,2,4],[7,13,21]])
print(arr)
print(f"Shape: {arr.shape}")
# < END >

[[ 1  2  4]
 [ 7 13 21]]
Shape: (2, 3)


You should be able to see that the `shape` property of an array lets you access its dimensions.  
For us, this is a handy way to ensure that the dimensions of an array are what we expect, allowing us to easily debug programs.
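
As a small illustration of using `shape` for debugging, the sketch below also prints two related attributes, `ndim` and `size`. The array name `demo` is only a placeholder for this example.

```python
import numpy as np

demo = np.array([[1, 2, 4], [7, 13, 21]])

print(demo.shape)  # (2, 3): 2 rows, 3 columns
print(demo.ndim)   # 2: number of dimensions (axes)
print(demo.size)   # 6: total number of elements

# A quick sanity check of the kind you might add while debugging
n_rows, n_cols = demo.shape
assert n_rows * n_cols == demo.size
```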

- Initialize a NumPy array `x` of dimensions $2\times3$ with random values.  
Do not use the values of the dimensions directly; instead, use the variables provided as arguments.
<details>
  <summary>Hint</summary>
  <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html#numpy-random-randn">np.random.randn()</a>
</details>

In [4]:


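# <START>
import numpy as np
# Keep the dimensions in variables instead of hard-coding them in the call
# (the names n_rows and n_cols are assumed here; the cell did not provide any)
n_rows, n_cols = 2, 3
x = np.random.rand(n_rows, n_cols)  # uniform random values in [0, 1)
print(x)
print(f"Shape: {x.shape}")

# <END>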



[[0.98330809 0.74678655 0.76567102]
 [0.61631965 0.60772271 0.2082465 ]]
Shape: (2, 3)


A few more basic methods to initialize arrays exist.
Feel free to read up online to complete the code snippets.
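
For reference, here is a short sketch of a few other initializers you may come across while reading up: `np.full`, `np.eye`, `np.arange` and `np.linspace`. The variable names are placeholders chosen for this example and are not part of the assignment.

```python
import numpy as np

filled = np.full((2, 3), 7)       # 2x3 array where every element is 7
identity = np.eye(3)              # 3x3 identity matrix
steps = np.arange(0, 10, 2)       # [0 2 4 6 8], similar to Python's range
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values from 0 to 1 inclusive

for name, a in [("filled", filled), ("identity", identity),
                ("steps", steps), ("grid", grid)]:
    print(name, a.shape, a.dtype)
```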

In [5]:
# < START >
import numpy as np
# Initialize an array ZERO_ARR of dimensions (4, 5, 2) whose every element is 0
ZERO_ARR = np.zeros((4,5,2))
print(ZERO_ARR)

# < END >



# < START >
import numpy as np
# Initialize an array ONE_ARR of dimensions (4, 5, 2) whose every element is 1
ONE_ARR = np.ones((4,5,2))
print(ONE_ARR)

# < END >



[[[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]]
[[[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]]


You can also transpose arrays (just as with matrices), but a more general and commonly used method is `array.reshape()`.

$$
\begin{bmatrix}
a & d\\
b & e\\
c & f\\
\end{bmatrix}
\xleftarrow{\text{.T}}
\begin{bmatrix}
a & b & c\\
d & e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(3, 2)}}
\begin{bmatrix}
a & b\\
c & d\\
e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(6,1)}}
\begin{bmatrix}
a\\b\\c\\d\\e\\f\\
\end{bmatrix}
$$
`reshape` is commonly used to flatten data stored in multi-dimensional arrays (e.g., a 2D array representing a black-and-white image).
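
The diagram above can be reproduced with numbers in place of letters. This is a minimal sketch; the array `m` is a placeholder, and passing `-1` to `reshape` asks NumPy to infer that dimension from the number of elements.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.T)              # [[1 4] [2 5] [3 6]]: rows and columns swapped
print(m.reshape(3, 2))  # [[1 2] [3 4] [5 6]]: same reading order, new shape
print(m.reshape(6, 1))  # a single column holding 1 to 6
print(m.reshape(-1))    # [1 2 3 4 5 6]: -1 lets NumPy infer the length

# Transposing and reshaping to the same shape generally give different results
assert not np.array_equal(m.T, m.reshape(3, 2))
```

Note that `.T` and `.reshape(3, 2)` both produce a $3\times2$ array, but the elements end up in different positions.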

- Try it out yourself:

In [13]:


# < START >
import numpy as np
# Given the following matrix arr
arr = np.array([[1, 2, 3],[4, 5, 6]])
# Create a new array arr_transpose that is the transpose of matrix arr
arr_transpose = arr.T
print(arr_transpose)
# Reshape arr into a single row of 6 elements
arr_reshape = arr.reshape((1,6))
print(arr_reshape)


# < END >


# < START >
# Create a new array y_flat that contains the same elements as y but has been flattened to a column array
# < END >



[[1 4]
 [2 5]
 [3 6]]
[[1 2 3 4 5 6]]


- Create a `y` with dimensions $3\times1$ (column matrix), with elements $4,7$ and $11$.  
$$y = \begin{bmatrix}
4\\
7\\
11
\end{bmatrix}$$  

In [20]:
# <START>
import numpy as np
# Initialize the column matrix here
y = np.array([[4],[7],[11]])
print(y)
# Reshape y to a 1D array
y_flat = y.reshape(-1)
# <END>

# Note: an assert statement, such as `assert y.shape == (3, 1)`, halts the program if its condition evaluates to False.
# Assert statements are frequently used in neural network programs to ensure our matrices are of the right dimensions.


# <START>
import numpy as np
# Initialize the second array here
y2 = np.array([3/4,5/7])
# Multiply the two arrays element-wise (only the first two elements of y_flat, to match the length of y2)
result = y_flat[:2] * y2
print(result)
# <END>




[[ 4]
 [ 7]
 [11]]
[3. 5.]
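
The comments in the cell above mention assert statements as a way to check dimensions. The following is a minimal sketch of that idea; the names `col` and `flat` are placeholders, and the last check fails on purpose to show what a failing assert looks like.

```python
import numpy as np

col = np.array([[4], [7], [11]])   # column matrix, shape (3, 1)
flat = col.reshape(-1)             # 1D array, shape (3,)

# Each assert passes silently when its condition is True
assert col.shape == (3, 1)
assert flat.shape == (3,)

# A failing assert raises AssertionError; the message after the comma is optional
try:
    assert flat.shape == (3, 1), f"expected (3, 1), got {flat.shape}"
except AssertionError as err:
    print("Shape check failed:", err)
```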


### **Indexing & Slicing**
Just like with nested Python lists, you can access the element at position `(i, j)` using `array[i][j]`.  
However, NumPy also lets you write `array[i, j]`, which is more efficient and simpler to use.
<details>
<summary><i>(Optional) Why is it more efficient?</i></summary>
The former is less efficient because indexing with i first creates a new temporary array, which is then indexed by j.
</details>

```python
x = np.array([[1,3,5],[4,7,11],[5,10,20]])

x[1][2] #11
x[1,2]  #11 <-- Prefer this
```

Slicing is another important feature of NumPy arrays. The syntax is the same as for slicing Python lists.
We pass the slice as:

```python
  sliced_array = array[start:end:step]
  # The second colon (:) is only needed
  # if you want to use a step other than 1
```
By default, `start` is 0, `end` is the array length (in that dimension), and `step` is 1.
Remember that `end` is not included in the slice.
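
To make the defaults concrete, here is a short sketch on a 1D array. The array `v` is a placeholder for this example only.

```python
import numpy as np

v = np.arange(10)   # [0 1 2 3 4 5 6 7 8 9]

print(v[2:5])     # [2 3 4]: index 5 is excluded
print(v[:3])      # [0 1 2]: start defaults to 0
print(v[7:])      # [7 8 9]: end defaults to the array length
print(v[1:8:3])   # [1 4 7]: every third element, starting at index 1
```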

Implement array slicing as instructed in the following examples.



In [18]:

# <START>
import numpy as np
x = np.array([[1,3,5],[4,7,11],[5,10,20]])
# Create a new array y with the middle 3 elements of x
y = x[1, :]  # select the middle row by indexing instead of retyping its values
print(y)
# <END>



# <START>
import numpy as np
z = np.array([1,2,3,4,5,6,7,8,9])
# Create a new array w with alternating elements of z
w = z[::2]  # a step of 2 picks every other element, starting at index 0
print(w)
# <END>



[ 4  7 11]
[1 3 5 7 9]


A combination of indexing and slicing can be used to access rows, columns and sub-arrays of 2D arrays.

```python
arr = np.array([
          [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]])

print(arr[0])       #[1 2 3]
print(arr[:,2])     #[3 6 9]
print(arr[0:2,0:2]) #[[1 2]
                    # [4 5]]
```

In [19]:


# <START>
import numpy as np
x= np.array([[5,2,3],[7,9,11],[4,5,15]])
# Create a 2D array sliced_arr_2d that is of the form [[5, 2], [7, 9], [4, 5]]
sliced_arr_2d = x[:, :2]
print(sliced_arr_2d)
# <END>



[[5 2]
 [7 9]
 [4 5]]


### **Broadcasting**

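Broadcasting lets NumPy apply element-wise operations to arrays of different but compatible shapes, for example adding a scalar to a matrix or scaling each row by a different factor. The smaller array is treated as if it were stretched to the larger shape without actually being copied, which lets us implement highly efficient algorithms with minimal use of memory.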
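
As a rough sketch of how shapes combine under broadcasting, the example below adds a column to a row. The names `col` and `row` are placeholders, and `np.broadcast_shapes` assumes NumPy 1.20 or newer.

```python
import numpy as np

col = np.array([[1], [2], [3]])     # shape (3, 1)
row = np.array([10, 20, 30, 40])    # shape (4,)

# Shapes are compared starting from the trailing dimension; a dimension of
# size 1 (or a missing dimension) is stretched to match the other array.
total = col + row
print(total.shape)   # (3, 4)
print(total)

# np.broadcast_shapes reports the resulting shape without doing the arithmetic
print(np.broadcast_shapes(col.shape, row.shape))   # (3, 4)

# Incompatible shapes raise a ValueError
try:
    np.ones((2, 3)) + np.ones((2,))
except ValueError as err:
    print("Cannot broadcast:", err)
```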

In [22]:

# <START>
import numpy as np
arr1 = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = 5     
# Implement broadcasting to add b to each element of arr1
arr1_broadcast = arr1 + b
print(arr1_broadcast)
# <END>



# <START>
import numpy as np
arr2 = np.array([[1,2,3],[4,5,6]])
# Multiply each element of the first row of arr2 by 4 and each element of the second row by 5, using only arr2 and arr3
arr3 = np.array([[4],[5]])
result = arr2 * arr3
print(result)
# <END>



[[ 6  7  8]
 [ 9 10 11]
 [12 13 14]]
[[ 4  8 12]
 [20 25 30]]


### **Vectorization**

From what we've covered so far, it might not be clear as to why we need to use vectorization. To understand this, let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of the 2D arrays by 3. Implement this using both non-vectorized and vectorized approaches.

In [25]:
import time



# Non-vectorized approach
import numpy as np
large_arr = np.random.rand(1000, 1000)
start_time = time.time()
rows, cols = large_arr.shape
for i in range(rows):
    for j in range(cols):
        large_arr[i, j] *= 3
end_time = time.time()
print(large_arr)
print(f"Time taken in non-vectorized approach: {(end_time - start_time) * 1000:.4f} ms")


# uncomment and execute the below line to convince yourself that both approaches are doing the same thing


# Vectorized approach
# <START>
import numpy as np
large_arr = np.random.rand(1000, 1000)
start_time = time.time()
large_arr_vectorized = large_arr * 3
end_time = time.time()
print(large_arr_vectorized)
print(f"Time taken in vectorized approach: {(end_time - start_time) * 1000:.4f} ms")
# <END>



# uncomment and execute the below line to convince yourself that both approaches are doing the same thing


[[2.83383546 2.05413071 2.76711458 ... 1.90492789 1.0544208  1.25365687]
 [1.84754774 1.41352551 0.43436096 ... 1.91810951 0.55031033 1.39242605]
 [2.21455749 0.46885093 2.53404665 ... 0.185304   1.3688449  2.08470701]
 ...
 [2.59469767 2.31615358 1.90939896 ... 1.66956805 1.34053905 0.10853017]
 [2.12610976 1.75958473 1.48569286 ... 0.79382905 1.41438915 1.59085568]
 [0.54018063 2.62979352 2.11422878 ... 0.34975839 0.84184409 2.8975411 ]]
Time taken in non-vectorized approach: 519.7027 ms
[[0.73364209 1.25454019 2.20801501 ... 1.95221402 1.1661069  0.23326617]
 [2.38166593 0.24380451 0.07019154 ... 0.89861161 1.0567759  0.57537677]
 [1.36483532 0.231731   2.6135404  ... 2.88710424 0.36183488 0.0175764 ]
 ...
 [0.39268373 0.28649297 2.23411211 ... 1.57966709 2.44650909 1.92825129]
 [1.985804   2.207336   0.07936658 ... 1.32723249 0.91836882 2.20890094]
 [2.74757673 0.99764434 0.04180844 ... 2.20087761 2.47127594 0.46022617]]
Time taken in vectorized approach: 4.5612 ms


Try playing around with the dimensions of the array. You'll find that there isn't much difference in the execution times when the dimensions are small. But in neural networks, we often deal with very large datasets and so vectorization is a very important tool.
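
One way to experiment with the dimensions is sketched below. The helper names `scale_loop` and `scale_vectorized` are hypothetical and chosen for this example. Both approaches run on the same input so that their results can be compared directly, and `time.perf_counter` is used here in place of `time.time` because it is intended for measuring short durations. Exact timings will vary from machine to machine.

```python
import time
import numpy as np

def scale_loop(a, factor):
    """Multiply every element by factor using explicit Python loops."""
    out = np.empty_like(a)
    rows, cols = a.shape
    for i in range(rows):
        for j in range(cols):
            out[i, j] = a[i, j] * factor
    return out

def scale_vectorized(a, factor):
    """Multiply every element by factor in a single NumPy operation."""
    return a * factor

for n in (10, 100, 300):
    data = np.random.rand(n, n)

    t0 = time.perf_counter()
    looped = scale_loop(data, 3)
    t1 = time.perf_counter()
    vectorized = scale_vectorized(data, 3)
    t2 = time.perf_counter()

    # Both approaches should agree on the same input
    assert np.allclose(looped, vectorized)
    print(f"{n}x{n}: loop {(t1 - t0) * 1000:.3f} ms, "
          f"vectorized {(t2 - t1) * 1000:.3f} ms")
```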