# **Working with NumPy**
This optional assignment lets you practise using the NumPy library and learn how vectorization speeds up computations in comparison to iterative approaches.

You only need to fill in the code between the ```< START >``` and ```< END >``` tags; the rest has been implemented for you!

```python
# < START >
"YOUR CODE GOES HERE!"
# < END >

```

<!---Need to include clarification about arrays?-->

**Start by running the below cell, to import the NumPy library.**

In [1]:
import numpy as np

### **Initializing Arrays**
NumPy offers multiple methods to create and populate arrays.
- Create a $2\times3$ array identical to
$\begin{bmatrix}
1 & 2 & 4\\
7 & 13 & 21\\
\end{bmatrix}$, and assign it to a variable `arr`.  

In [5]:
# < START >
arr = np.array([[1,2,4],[7,13,21]])
print(arr)
print(arr.shape)
# < END >

[[ 1  2  4]
 [ 7 13 21]]
(2, 3)


You should be able to see that the `shape` attribute of an array lets you access its dimensions.  
For us, this is a handy way to check that the dimensions of an array are what we expect, which makes programs easier to debug.
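
As a minimal sketch of that kind of check (the names `n_rows` and `n_cols` are illustrative, not part of the assignment), `shape` can be unpacked and compared against what you expect, alongside the related `ndim` and `size` attributes:

```python
import numpy as np

arr = np.array([[1, 2, 4], [7, 13, 21]])

# shape is a tuple with one entry per dimension: (rows, columns) for a 2D array
n_rows, n_cols = arr.shape

# ndim is the number of dimensions, size is the total number of elements
assert arr.shape == (2, 3)
assert arr.ndim == 2
assert arr.size == n_rows * n_cols

print(n_rows, n_cols)  # 2 3
```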

- Initialize a NumPy array `x` of dimensions $2\times3$ with random values.  
Do not use the values of the dimensions directly; instead, pass the provided variables as arguments.
<details>
  <summary>Hint</summary>
  <a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.randn.html#numpy-random-randn">np.random.randn()</a>
</details>
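
The hint links to `np.random.randn`, which draws from the standard normal distribution, while the closely related `np.random.rand` draws uniformly from $[0, 1)$. Both accept the dimensions as separate arguments. The sketch below assumes hard-coded `rows` and `cols` instead of `input()`, and the seed is only there to make the run repeatable:

```python
import numpy as np

rows, cols = 2, 3  # hard-coded here instead of being read with input()

np.random.seed(0)  # optional: makes the random values repeatable

uniform = np.random.rand(rows, cols)   # uniform samples from [0, 1)
normal = np.random.randn(rows, cols)   # samples from the standard normal distribution

print(uniform.shape, normal.shape)     # (2, 3) (2, 3)
```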

In [7]:
rows = int(input("enter the number of rows"))
cols = int(input("enter the number of columns"))

# <START>
x = np.random.rand(rows,cols)
print(x)
# <END>



enter the number of rows 2
enter the number of columns 3


[[0.40080008 0.21459589 0.8847126 ]
 [0.38496968 0.00614362 0.52084153]]


A few more basic methods to initialize arrays exist.
Feel free to read up online to complete the code snippets.
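
For orientation, here is a short sketch of a few common initializers. The variable names are illustrative only, and `np.full`, `np.arange` and `np.eye` go slightly beyond what the exercise below asks for:

```python
import numpy as np

shape = (4, 5, 2)

zero_arr = np.zeros(shape)      # every element is 0.0
one_arr = np.ones(shape)        # every element is 1.0
seven_arr = np.full(shape, 7)   # every element is 7
counting = np.arange(6)         # [0 1 2 3 4 5]
identity = np.eye(3)            # 3x3 identity matrix

print(zero_arr.shape, one_arr.shape, seven_arr.shape)
```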

In [9]:
# < START >
# Initialize an array ZERO_ARR of dimensions (4, 5, 2) whose every element is 0
ZERO_ARR = np.zeros([4,5,2])
print(ZERO_ARR)
# < END >



# < START >
# Initialize an array ONE_ARR of dimensions (4, 5, 2) whose every element is 1
ONE_ARR = np.ones([4,5,2])
print(ONE_ARR)
# < END >



[[[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]]
[[[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]
  [1. 1.]]]


You can also transpose arrays (just as with matrices) using `.T`. A more general and commonly used method is `array.reshape()`, which rearranges the same elements into a new shape.

$$
\begin{bmatrix}
a & d\\
b & e\\
c & f\\
\end{bmatrix}
\xleftarrow{\text{.T}}
\begin{bmatrix}
a & b & c\\
d & e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(3, 2)}}
\begin{bmatrix}
a & b\\
c & d\\
e & f\\
\end{bmatrix}
\xrightarrow{\text{.reshape(6,1)}}
\begin{bmatrix}
a\\b\\c\\d\\e\\f\\
\end{bmatrix}
$$
`reshape` is commonly used to flatten data stored in multi-dimensional arrays (e.g. a 2D array representing a B/W image).
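
Because `.T` and `.reshape(3, 2)` both produce a $3\times2$ result, they are easy to confuse. The sketch below (with an illustrative array `m`) shows that they order the elements differently, and that `-1` can be passed to let NumPy infer one dimension:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

transposed = m.T             # rows become columns
reshaped = m.reshape(3, 2)   # same elements, re-read row by row into 3 rows

print(transposed)  # [[1 4]
                   #  [2 5]
                   #  [3 6]]
print(reshaped)    # [[1 2]
                   #  [3 4]
                   #  [5 6]]

# -1 asks NumPy to infer that dimension from the total number of elements
column = m.reshape(-1, 1)
print(column.shape)  # (6, 1)
```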

- Try it out yourself:

In [10]:
y = np.array([[1,2,3],[4,5,6]])
print(y)
print()

# < START >
# Create a new array y_transpose that is the transpose of matrix y
y_transpose = y.T
print(y_transpose)
print()
# < END >


# < START >
# Create a new array y_flat that contains the same elements as y but has been flattened to a column array
y_flat = y.reshape(6,1)
print(y_flat)
# < END >



[[1 2 3]
 [4 5 6]]

[[1 4]
 [2 5]
 [3 6]]

[[1]
 [2]
 [3]
 [4]
 [5]
 [6]]


- Create a `y` with dimensions $3\times1$ (column matrix), with elements $4,7$ and $11$.  
$$y = \begin{bmatrix}
4\\
7\\
11
\end{bmatrix}$$  

In [40]:
# <START>
# Initialize the column matrix here
y = np.array([[4],[7],[11]])
print(y)
# <END>

# <START>
# Multiply both the arrays here
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

assert a.shape[1] == b.shape[0], 'dimensions of the matrices are not matching'
# The above line is an assert statement, which halts the program if the given condition evaluates to False.
# Assert statements are frequently used in neural network programs to ensure our matrices are of the right dimensions.

c = np.matmul(a,b)
print(c)
# <END>




[[ 4]
 [ 7]
 [11]]
[[19 22]
 [43 50]]
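
The shape rule behind the assert statement can be sketched as follows. The operands `p` and `q` are hypothetical and chosen only for their shapes: a matrix product is defined when the number of columns of the first operand equals the number of rows of the second.

```python
import numpy as np

p = np.ones((2, 3))   # hypothetical operands, chosen only for their shapes
q = np.ones((3, 4))

# (2, 3) @ (3, 4) is valid because the inner dimensions (3 and 3) agree
assert p.shape[1] == q.shape[0], 'inner dimensions do not match'
product = p @ q       # the @ operator is equivalent to np.matmul for 2D arrays
print(product.shape)  # (2, 4)

# Swapping the operands gives (3, 4) @ (2, 3), which NumPy rejects
try:
    q @ p
except ValueError as err:
    print('shape mismatch:', err)
```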


### **Indexing & Slicing**
Just as with nested Python lists, you can access the element at position `(i,j)` using `array[i][j]`.  
However, NumPy also lets you write `array[i, j]`, which is both more efficient and simpler to use.
<details>
<summary><i>(Optional) Why is it more efficient?</i></summary>
The former is less efficient because indexing with i first creates a new temporary array, which is then indexed by j.
</details>

```python
x = np.array([[1,3,5],[4,7,11],[5,10,20]])

x[1][2] #11
x[1,2]  #11 <-- Prefer this
```
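
The two forms agree for single elements, but they can give different results once slices are involved. A small sketch using the same `x`:

```python
import numpy as np

x = np.array([[1, 3, 5], [4, 7, 11], [5, 10, 20]])

# x[:, 1] applies both indices at once: every row, column 1
col = x[:, 1]
print(col)  # [ 3  7 10]

# x[:][1] first takes all rows (x[:]) and then indexes that result with 1,
# which selects ROW 1 rather than column 1
row = x[:][1]
print(row)  # [ 4  7 11]
```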

Slicing is another important feature of NumPy arrays. The syntax is the same as that of slicing in Python lists.
  We pass the slice as

```python
  sliced_array = array[start:end:step]
  # The second colon (:) is only needed
  # if you want to use a step other than 1
```
By default, `start` is 0, `end` is the array length (in that dimension), and `step` is 1.
Remember that `end` is not included in the slice.
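
A few illustrative slices on a 1D array (the array `z` here is only an example) show these defaults in action. One difference from Python lists worth knowing: a basic NumPy slice is a view onto the original data, not a copy.

```python
import numpy as np

z = np.arange(10)   # [0 1 2 3 4 5 6 7 8 9]

print(z[2:5])       # [2 3 4]   (end index 5 is excluded)
print(z[:3])        # [0 1 2]   (start defaults to 0)
print(z[7:])        # [7 8 9]   (end defaults to the array length)
print(z[1:8:3])     # [1 4 7]   (every third element from index 1)
print(z[::-1])      # [9 8 7 6 5 4 3 2 1 0]   (negative step reverses)

# Unlike a list slice, a NumPy slice is a view: writing to it changes z
first_three = z[:3]
first_three[0] = 100
print(z[0])         # 100
```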

Implement array slicing as instructed in the following examples.



In [12]:


# <START>
x = np.array([[1,3,5],[4,7,11],[5,10,20]])
# Create a new array y with the middle 3 elements of x
y = x[1,:]
print(y)

# <END>



# <START>
z = np.array([1,2,3,4,5,6,7,8,9])
# Create a new array w with alternating elements of z
w = z[::2]
print(w)
# <END>



[ 4  7 11]
[1 3 5 7 9]


A combination of indexing and slicing can be used to access rows, columns and sub-arrays of 2D arrays.

```python
arr = np.array([
          [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]])

print(arr[0])       #[1 2 3]
print(arr[:,2])     #[3 6 9]
print(arr[0:2,0:2]) #[[1 2]
                    # [4 5]]
```

In [28]:
arr = np.array([
          [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]])

# <START>
# Create a 2D array sliced_arr_2d that is of the form [[5, 2], [7, 9], [4, 5]]
sliced_arr_2d = np.array([arr[1::-1,1], arr[2,0::2], arr[1,0:2]])
print(sliced_arr_2d)
# <END>



[[5 2]
 [7 9]
 [4 5]]


### **Broadcasting**

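Broadcasting lets NumPy apply element-wise operations to arrays of different but compatible shapes, which allows for flexibility in array operations. Because the smaller array is not physically copied to match the larger one, it lets us implement highly efficient algorithms with minimal use of memory.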
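
As a rough sketch of the rule: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The arrays below are illustrative and not part of the exercise:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])             # shape (2, 3)

plus_ten = grid + 10                      # a scalar is applied to every element

col_scale = np.array([[2], [3]])          # shape (2, 1): one factor per row
row_scaled = grid * col_scale
print(row_scaled)   # [[ 2  4  6]
                    #  [12 15 18]]

row_offset = np.array([100, 200, 300])    # shape (3,): one offset per column
col_shifted = grid + row_offset
print(col_shifted)  # [[101 202 303]
                    #  [104 205 306]]
```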

In [47]:

# <START>
# Implement broadcasting to add b to each element of arr1
arr1 = np.array([['a','b'],['c','d']])
print(arr1 + 'b')
print()
# <END>



# <START>
# Multiply each element of the first row of arr2 by 4 and each element of the second row by 5, using only arr2 and arr3
arr2 = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr2 * [[4],[1],[1]])
print()

arr3 = np.array([[2,4,6],[1,3,5]])
print(arr3 * [[1],[5]])
# <END>



[['ab' 'bb']
 ['cb' 'db']]

[[ 4  8 12]
 [ 4  5  6]
 [ 7  8  9]]

[[ 2  4  6]
 [ 5 15 25]]


### **Vectorization**

From what we've covered so far, it might not be clear as to why we need to use vectorization. To understand this, let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of a 2D array by 3. Implement this using both non-vectorized and vectorized approaches.

In [60]:
import time


# Non-vectorized approach
# <START>
li = arr.tolist()
start = time.time()

# Multiply every element by 3 with explicit loops (li*3 would only repeat the list three times)
result = [[value * 3 for value in row] for row in li]
print(result)

end = time.time()
print(f"Execution time: {end - start} seconds")
# <END>


# uncomment and execute the below line to convince yourself that both approaches are doing the same thing


# Vectorized approach
# <START>
#arr = np.random.rand(5,5)
#start = time.time()
#print(arr*3)
#end = time.time()
#print(f"Execution time: {end - start} seconds")
# <END>



# uncomment and execute the below line to convince yourself that both approaches are doing the same thing


In [59]:
# Vectorized approach
# <START>
arr = np.random.rand(100,100)
start = time.time()
print(arr*3)
end = time.time()
print(f"Execution time: {end - start} seconds") 
# <END>


[[0.75067741 0.48445237 0.27363842 ... 2.15869354 1.33584063 2.65240877]
 [0.81137124 1.24202392 1.07906602 ... 1.14594518 0.82247783 0.80071334]
 [2.71752827 1.88264603 0.74213758 ... 2.46200744 0.34343198 2.75629908]
 ...
 [0.63873559 0.00657145 1.80783072 ... 1.31828994 1.93283091 2.00345913]
 [1.23947428 0.65577828 1.95483038 ... 1.7644662  1.88812412 0.87529646]
 [1.85227618 0.50231424 2.37637567 ... 0.10981081 2.11531931 0.39644454]]
Execution time: 0.0012428760528564453 seconds


Try playing around with the dimensions of the array. You'll find that there isn't much difference in the execution times when the dimensions are small. But in neural networks, we often deal with very large datasets and so vectorization is a very important tool.
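
One way to run that experiment is sketched below. It assumes a $500\times500$ array and uses `time.perf_counter()`, a higher-resolution timer than `time.time()`. Printing is kept outside the timed regions so that only the multiplication is measured. The actual timings depend on your machine, so none are shown here.

```python
import time
import numpy as np

data = np.random.rand(500, 500)   # try changing these dimensions
as_list = data.tolist()

# Non-vectorized: explicit Python loops over a nested list
start = time.perf_counter()
looped = [[value * 3 for value in row] for row in as_list]
loop_time = time.perf_counter() - start

# Vectorized: a single NumPy operation on the whole array
start = time.perf_counter()
vectorized = data * 3
vector_time = time.perf_counter() - start

print(f"Loop:       {loop_time:.6f} seconds")
print(f"Vectorized: {vector_time:.6f} seconds")

# Both approaches should produce the same values
assert np.allclose(looped, vectorized)
```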