# **Working with NumPy**
In this assignment, you will get familiar with the NumPy library and learn how vectorization speeds up computations compared to iterative approaches.

Write or modify code only between consecutive `# < START >` and `# < END >` comments. DO NOT modify other parts of the notebook; otherwise, your assignment will not be graded.
```python
"Don't modify any code here"

# < START >
"YOUR CODE GOES HERE!"
# < END >

"Don't modify any code here"
```

<!---Need to include clarification about arrays?-->

**Start by running the cell below to import the NumPy library.**

In [1]:
import numpy as np

### **Initializing Arrays**
NumPy offers multiple methods to create and populate arrays.
- Create a $2\times3$ array identical to
$\begin{bmatrix}
1 & 2 & 4\\
7 & 13 & 21\\
\end{bmatrix}$, and assign it to a variable `arr`.  

In [2]:
# < START >
arr = np.array([[1, 2, 4], [7, 13, 21]])
# < END >

print(arr)
print("Shape:", arr.shape)

[[ 1  2  4]
 [ 7 13 21]]
Shape: (2, 3)


You should be able to see that the `shape` property of an array lets you access its dimensions.  
For us, this is a handy way to ensure that the dimensions of an array are what we expect, allowing us to easily debug programs.
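
As a small sketch of this kind of check (the array name `demo` is only illustrative), `shape` can be combined with the related `ndim` and `size` properties and an `assert`:

```python
import numpy as np

demo = np.array([[1, 2, 4],
                 [7, 13, 21]])

print(demo.shape)  # (2, 3): 2 rows, 3 columns
print(demo.ndim)   # 2: number of dimensions
print(demo.size)   # 6: total number of elements

# Fail early if the array does not have the dimensions we expect
assert demo.shape == (2, 3)
```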

- Initialize a NumPy array `x` of dimensions $2\times3$ with random values.  
Do not hard-code the dimensions; pass the provided variables as arguments instead.
<details>
  <summary>Hint</summary>
  <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html#numpy-random-randn">np.random.randn()</a>
</details>

In [4]:
n_rows = 2
n_columns = 3

# <START>
x = np.random.sample((n_rows, n_columns))
# <END>

print(x)

[[0.81624948 0.60994854 0.17561999]
 [0.06144893 0.74695068 0.51016041]]
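
The hint points to `np.random.randn`, while the cell above uses `np.random.sample`. Both produce an array of the requested dimensions, but they draw from different distributions and take their arguments differently. A minimal sketch of the difference, reusing `n_rows` and `n_columns` (the names `x_normal` and `x_uniform` are illustrative only):

```python
import numpy as np

n_rows = 2
n_columns = 3

# randn takes each dimension as a separate argument and samples
# from the standard normal distribution (mean 0, variance 1)
x_normal = np.random.randn(n_rows, n_columns)

# sample takes the dimensions as a single tuple and samples
# uniformly from the half-open interval [0.0, 1.0)
x_uniform = np.random.sample((n_rows, n_columns))

print(x_normal.shape)   # (2, 3)
print(x_uniform.shape)  # (2, 3)
```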


A few more basic methods to initialize arrays exist.
Feel free to read up online to complete the code snippets.

In [5]:
# < START >
# Initialize an array ZERO_ARR of dimensions (4, 5, 2) whose every element is 0
ZERO_ARR = np.zeros((4, 5, 2))
# < END >

print(ZERO_ARR)

# < START >
# Initialize an array ONE_ARR of dimensions (4, 5, 2) whose every element is 1
ONE_ARR = np.ones((4, 5, 2))
# < END >

print(ONE_ARR)

[[[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]]
[[[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]]
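
Beyond `np.zeros` and `np.ones`, a few other constructors come up often. The following is a sketch of some of them rather than an exhaustive list; the variable names are illustrative:

```python
import numpy as np

full_arr = np.full((2, 3), 7)        # 2x3 array where every element is 7
range_arr = np.arange(0, 10, 2)      # evenly spaced integers: [0 2 4 6 8]
lin_arr = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values from 0 to 1, inclusive
identity = np.eye(3)                 # 3x3 identity matrix
zeros_like_full = np.zeros_like(full_arr)  # zeros with the same shape and dtype as full_arr

print(full_arr.shape, range_arr, lin_arr, identity.shape, zeros_like_full.shape)
```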


You can also transpose arrays (as with matrices), but a more general and commonly used method is `array.reshape()`.

$$
\begin{bmatrix}
a & d\\
b & e\\
c & f\\
\end{bmatrix}
\xleftarrow{\text{.T}}
\begin{bmatrix}
a & b & c\\
d & e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(3, 2)}}
\begin{bmatrix}
a & b\\
c & d\\
e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(6,1)}}
\begin{bmatrix}
a\\b\\c\\d\\e\\f\\
\end{bmatrix}
$$
`reshape` is commonly used to flatten data stored in multi-dimensional arrays (e.g., a 2D array representing a B/W image).
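
The diagram above can be reproduced with a short sketch that uses the numbers 1 to 6 in place of the letters a to f (the name `m` is illustrative). Passing `-1` for one dimension asks NumPy to infer it from the total number of elements:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.T)               # rows [1 4], [2 5], [3 6]: rows and columns swapped
print(m.reshape(3, 2))   # rows [1 2], [3 4], [5 6]: same element order, new shape
print(m.reshape(6, 1))   # a single column holding 1 through 6
print(m.reshape(-1, 1))  # the same column; -1 lets NumPy infer the 6
```

Note that `.T` and `.reshape(3, 2)` both return a $3\times2$ array but arrange the elements differently.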

- Try it out yourself:

In [13]:
y = np.array([[1, 2, 3],
              [4, 5, 6]])

# < START >
# Create a new array y_transpose that is the transpose of matrix y
y_transpose = np.transpose(y)
# < END >

print(y_transpose)

# < START >
# Create a new array y_flat that contains the same elements as y but has been flattened to a column array
y_flat = y.reshape(-1, 1)
# < END >

print(y_flat)

[[1 4]
 [2 5]
 [3 6]]
[[1]
 [2]
 [3]
 [4]
 [5]
 [6]]


- Create an array `y` with dimensions $3\times1$ (a column matrix), with elements $4$, $7$, and $11$.  
$$y = \begin{bmatrix}
4\\
7\\
11
\end{bmatrix}$$  

In [6]:
# <START>
# Initialize the column matrix here
y = np.array([[4], [7], [11]])
# <END>

assert y.shape == (3, 1)
# The above line is an assert statement, which halts the program if the given condition evaluates to False.
# Assert statements are frequently used in neural network programs to ensure our matrices are of the right dimensions.

print(y)

# <START>
# Multiply both the arrays here
# (assumes x is the 2x3 array created earlier, so the matrix product has shape (2, 1))
z = np.dot(x, y)
# <END>

assert z.shape == (2, 1)

print(z)

[[ 4]
 [ 7]
 [11]]

The values printed for `z` depend on the random entries of `x`.
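
A one-dimensional array of shape `(3,)` and a column matrix of shape `(3, 1)` are different objects, and transposing a one-dimensional array leaves it unchanged. The sketch below illustrates this and shows how shapes combine under matrix multiplication. The matrix `a` is a made-up $2\times3$ example, not part of the assignment:

```python
import numpy as np

v = np.array([4, 7, 11])
print(v.shape)                 # (3,)
print(np.transpose(v).shape)   # (3,): transposing a 1-D array has no effect

col = v.reshape(-1, 1)         # equivalent to np.array([[4], [7], [11]])
print(col.shape)               # (3, 1)

a = np.array([[1, 0, 2],
              [0, 1, 1]])      # hypothetical 2x3 matrix
product = a @ col              # (2, 3) @ (3, 1) -> (2, 1)
print(product)                 # a column holding 26 and 18
```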

### **Indexing & Slicing**
Just as with nested Python lists, you can access the element at position `(i, j)` using `array[i][j]`.  
However, NumPy also supports `array[i, j]`, which is simpler and more efficient.
<details>
<summary><i>(Optional) Why is it more efficient?</i></summary>
The former is less efficient because the first index <code>i</code> creates a new temporary array, which is then indexed by <code>j</code>.
</details>

```python
x = np.array([[1,3,5],[4,7,11],[5,10,20]])

x[1][2] #11
x[1,2]  #11 <-- Prefer this
```

Slicing is another important feature of NumPy arrays. The syntax is the same as for slicing Python lists. We pass the slice as:

```python
  sliced_array = array[start:end:step]
  # The second colon (:) is only needed
  # if you want to use a step other than 1
```
By default, `start` is 0, `end` is the array length (in that dimension), and `step` is 1.
Remember that `end` is not included in the slice.

Implement array slicing as instructed in the following examples.



In [14]:
x = np.array([4, 1, 5, 6, 11])

# <START>
# Create a new array y with the middle 3 elements of x
y = x[1:-1]
# <END>

print(y)

z = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# <START>
# Create a new array w with alternating elements of z
w = z[::2]
# <END>

print(w)

[1 5 6]
[1 3 5 7 9]
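
The slice defaults and negative steps can be sketched as follows. One detail worth knowing is that a basic slice of a NumPy array is a view onto the same data rather than a copy, unlike a slice of a Python list. The array names here are illustrative:

```python
import numpy as np

s = np.array([4, 1, 5, 6, 11])

print(s[:3])     # [4 1 5]: start defaults to 0
print(s[2:])     # [ 5  6 11]: end defaults to the array length
print(s[::-1])   # [11  6  5  1  4]: a negative step walks backwards

view = s[1:4]    # a view: shares memory with s
view[0] = 100
print(s)         # [  4 100   5   6  11]: the original changed too

independent = s[1:4].copy()  # an explicit copy is independent of s
independent[0] = -1
print(s[1])      # 100: unchanged by the edit to the copy
```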


A combination of indexing and slicing can be used to access rows, columns and sub-arrays of 2D arrays.

```python
arr = np.array([
          [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]])

print(arr[0])       #[1 2 3]
print(arr[:,2])     #[3 6 9]
print(arr[0:2,0:2]) #[[1 2]
                    # [4 5]]
```

In [16]:
arr_2d = np.array([[4, 5, 2],
          [3, 7, 9],
          [1, 4, 5],
          [6, 6, 1]])

# <START>
# Create a 2D array sliced_arr_2d that is of the form [[5, 2], [7, 9], [4, 5]]
sliced_arr_2d = arr_2d[:-1,1:]
# <END>

print(sliced_arr_2d)

[[5 2]
 [7 9]
 [4 5]]


### **Broadcasting**

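Broadcasting lets NumPy apply element-wise operations to arrays of different but compatible shapes by treating the smaller array as if it were stretched to match the larger one, without actually copying its data. This lets us implement highly efficient algorithms with minimal use of memory.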

In [18]:
arr1 = np.array([1, 2, 3, 4])
b = 1

# <START>
# Implement broadcasting to add b to each element of arr1
arr1 += b
# <END>

print(arr1)

arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr3 = np.array([[4],
                 [5]])

# <START>
# Multiply each element of the first row of arr2 by 4 and each element of the second row by 5, using only arr2 and arr3
arr2 *= arr3
# <END>

print(arr2)

[2 3 4 5]
[[ 4  8 12]
 [20 25 30]]
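
Broadcasting compares shapes from the trailing dimension backwards; two dimensions are compatible when they are equal or when one of them is 1. A minimal sketch with illustrative arrays:

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# (3, 1) and (4,) broadcast to (3, 4): each array is stretched
# along the dimension where its size is 1 (or missing)
grid = col + row
print(grid.shape)  # (3, 4)
print(grid)

# Incompatible shapes raise a ValueError instead of broadcasting
try:
    np.ones((2, 3)) + np.ones((2, 4))
except ValueError as err:
    print("Cannot broadcast:", err)
```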


### **Vectorization**

From what we've covered so far, it might not be clear why we need vectorization. To understand this, let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of the 2D arrays below by 3. Implement this using both non-vectorized and vectorized approaches.

In [19]:
import time

arr_nonvectorized = np.random.rand(1000, 1000)
arr_vectorized = np.array(arr_nonvectorized) # making a deep copy of the array

start_nv = time.time()

# Non-vectorized approach
# <START>

arr_nonvectorized = [[3*val for val in row] for row in arr_nonvectorized]  

# <END>

end_nv = time.time()
print("Time taken in non-vectorized approach:", 1000*(end_nv-start_nv), "ms")

# uncomment and execute the below line to convince yourself that both approaches are doing the same thing
# print(arr_nonvectorized)

start_v = time.time()

# Vectorized approach
# <START>
arr_vectorized *= 3
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000*(end_v-start_v), "ms")

# uncomment and execute the below line to convince yourself that both approaches are doing the same thing
# print(arr_vectorized)

Time taken in non-vectorized approach: 94.7573184967041 ms
Time taken in vectorized approach: 0.7672309875488281 ms


Try varying the dimensions of the array. You'll find that there isn't much difference in the execution times when the dimensions are small. In neural networks, however, we often deal with very large datasets, so vectorization is an essential tool.
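
One way to run this experiment is to time both approaches over several sizes. The sketch below uses `time.perf_counter`, and the helper name `time_both` is hypothetical. The actual timings will vary by machine, so no expected numbers are shown:

```python
import time
import numpy as np

def time_both(n):
    """Return (loop_ms, vectorized_ms) for tripling an n x n array."""
    data = np.random.rand(n, n)

    start = time.perf_counter()
    looped = [[3 * val for val in row] for row in data]  # element by element
    loop_ms = 1000 * (time.perf_counter() - start)

    start = time.perf_counter()
    vectorized = data * 3  # one vectorized operation
    vec_ms = 1000 * (time.perf_counter() - start)

    # Both approaches should produce the same values
    assert np.allclose(looped, vectorized)
    return loop_ms, vec_ms

for n in (10, 100, 1000):
    loop_ms, vec_ms = time_both(n)
    print(f"n={n}: loop {loop_ms:.3f} ms, vectorized {vec_ms:.3f} ms")
```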