# Vectorization

Vectorization means replacing explicit `for` loops in your code with operations on whole arrays.
Vectorized code usually runs much faster, which is crucial when training on large datasets, and it is often shorter and easier to read. This is why vectorization has become a key skill in the era of deep learning.

## So, what exactly is vectorization?

Let's start with an example.
Suppose you have two one-dimensional vectors of size N = 1,000,000:

In [4]:
import numpy as np

N = 1000000
a = np.random.rand(N)
b = np.random.rand(N)

And let's say that you want to compute the following expression:
$$z = \sum_{i=1}^{N} a_i b_i$$
A non-vectorized implementation might look like this:

In [5]:
z = 0
for i in range(N):
    z += a[i] * b[i]
print(z)

249834.64666162737


In contrast, a vectorized implementation computes the expression above directly. <br/>
To do that in NumPy, all you have to do is:

In [6]:
z = np.dot(a,b)
print(z)

249834.64666162012


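Note that the two cells print values that agree to about 14 significant digits, which shows that they compute the same expression. The last few digits differ, most likely because the loop and `np.dot` add the products in a different order, and floating-point addition rounds slightly differently depending on that order. <br/>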
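
Because the two results can differ in their last digits, comparing them with `==` is fragile. The following is a minimal sketch of a tolerance-based comparison; the seed, the array size, and the names `z_loop` and `z_dot` are illustrative choices, not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an arbitrary choice) for repeatability
a = rng.random(1000)
b = rng.random(1000)

# Explicit loop: accumulates the products strictly from left to right.
z_loop = 0.0
for i in range(a.size):
    z_loop += a[i] * b[i]

# Vectorized: NumPy is free to accumulate in a different order.
z_dot = np.dot(a, b)

# Compare with a tolerance rather than with ==.
print(np.isclose(z_loop, z_dot))
```

`np.isclose` uses a relative and an absolute tolerance, so it reports the two values as equal even when their last bits differ.
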
Now remember that we said vectorization will make your code run much faster. Let's see that in action:

In [7]:
import time

# time.perf_counter() is a monotonic, high-resolution clock meant for timing
tic = time.perf_counter()
z = 0
for i in range(N):
    z += a[i] * b[i]
toc = time.perf_counter()
print('non-vectorized version computed {} and took {}ms'.format(z, (toc-tic) * 1000))


tic = time.perf_counter()
z = np.dot(a,b)
toc = time.perf_counter()
print('vectorized version computed {} and took {}ms'.format(z, (toc-tic) * 1000))

non-vectorized version computed 249834.64666162737 and took 670.4468727111816ms
vectorized version computed 249834.64666162012 and took 2.0160675048828125ms


As you can see, the vectorized version is much quicker than the loop. In this run it is roughly 300 times faster (about 670 ms versus about 2 ms); the exact factor depends on the hardware and on the operation. A speed-up of that size is the difference between your code taking **1 minute** to run and taking **5 hours** to run. <br/>
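
A single `tic`/`toc` measurement can be noisy. As a rough sketch of a more repeatable measurement, the standard-library `timeit` module can run each version several times and keep the best result. The helper `dot_loop`, the smaller `N`, and the repeat counts below are assumptions made for illustration.

```python
import timeit

import numpy as np

N = 100_000  # smaller than the N used above so the loop finishes quickly
rng = np.random.default_rng(0)
a = rng.random(N)
b = rng.random(N)


def dot_loop(x, y):
    """Dot product computed with an explicit Python loop."""
    total = 0.0
    for i in range(x.size):
        total += x[i] * y[i]
    return total


# Keep the best of several repeats to reduce noise from other processes.
t_loop = min(timeit.repeat(lambda: dot_loop(a, b), number=1, repeat=3))
t_dot = min(timeit.repeat(lambda: np.dot(a, b), number=10, repeat=3)) / 10

print(f"loop:   {t_loop * 1000:.2f} ms")
print(f"np.dot: {t_dot * 1000:.4f} ms")
print(f"speed-up: {t_loop / t_dot:.0f}x")
```

The printed speed-up will vary from machine to machine, so no expected output is shown here.
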
This speed-up is due to the optimized implementation of the NumPy library:
- NumPy arrays are densely packed arrays of a homogeneous type. Python lists, however, are arrays of pointers to objects that can be scattered in memory. So NumPy arrays benefit from *locality of reference*.
- NumPy operations are implemented in C and can use *SIMD* (single instruction, multiple data) instructions to process several elements at once.
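
The first point can be inspected directly. The sketch below, with illustrative variable names, prints the attributes that describe how a NumPy array is stored and contrasts them with a Python list that holds objects of mixed types.

```python
import numpy as np

arr = np.arange(5, dtype=np.float64)

print(arr.dtype)                  # float64: every element has the same type
print(arr.itemsize)               # 8: bytes per element
print(arr.nbytes)                 # 40: 5 elements * 8 bytes, stored back to back
print(arr.flags['C_CONTIGUOUS'])  # True: one contiguous block of memory

# A Python list stores references, so its elements may have different types.
lst = [1.0, 'two', 3]
print([type(x).__name__ for x in lst])  # ['float', 'str', 'int']
```
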

So, the key takeaway is:
**<center>Whenever possible, avoid explicit for-loops.</center>**


## Exercise
As an exercise, try to find the vectorized implementation of each of the following non-vectorized pieces of code.
When delivering this project, you should have the completed version of this notebook with you. Add your code only in the area marked *WRITE YOUR CODE HERE*, and do **NOT** change anything else.

In [12]:
v = np.random.rand(10)
u = np.empty(10)

# non-vectorized implementation:
for i in range(10):
    u[i] = v[i] ** 2
print('non-vectorized:', u)

# vectorized implementation:
u = np.power(v, 2)


print('vectorized:', u)

non-vectorized: [6.32639487e-01 5.28714341e-04 1.62393324e-03 8.36479750e-01
 3.48053553e-01 2.18144918e-01 1.73457432e-04 4.81057520e-01
 8.62597302e-02 9.19410548e-01]
vectorized: [6.32639487e-01 5.28714341e-04 1.62393324e-03 8.36479750e-01
 3.48053553e-01 2.18144918e-01 1.73457432e-04 4.81057520e-01
 8.62597302e-02 9.19410548e-01]
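
Element-wise squaring can be written in more than one way. As a small sketch with made-up input values, the arithmetic operator `**` and the functions `np.square` and `np.power` all apply to the whole array at once and give the same result.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])

print(v ** 2)          # [1. 4. 9.]
print(np.square(v))    # [1. 4. 9.]
print(np.power(v, 2))  # [1. 4. 9.]
```
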


In [16]:
v = np.random.rand(10)

# non-vectorized implementation:
u = float('-inf')
for i in range(10):
    if v[i] > u:
        u = v[i]
print('non-vectorized:', u)



# vectorized implementation:
u = np.amax(v)


print('vectorized:', u)

non-vectorized: 0.8439285585162813
vectorized: 0.8439285585162813
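
Reductions such as `np.amax` also work on multi-dimensional arrays, where an `axis` argument selects the direction of the reduction. The following sketch uses a small hand-written matrix `m` (an illustrative example, not data from this notebook).

```python
import numpy as np

m = np.array([[1.0, 5.0, 2.0],
              [4.0, 3.0, 6.0]])

print(np.amax(m))          # 6.0: maximum over all elements
print(np.amax(m, axis=0))  # [4. 5. 6.]: maximum of each column
print(np.amax(m, axis=1))  # [5. 6.]: maximum of each row
print(np.argmax(m[0]))     # 1: index of the maximum in the first row
```
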


Vectorization isn't limited to calling a single predefined function. You can combine different NumPy functions to build more complex expressions. <br/>
You should now know enough to write vectorized code that computes the following expression (do **NOT** use any for loops):
$$u = \frac{1}{N}\sum_{i=1}^{N}|a_i - b_i|$$

In [19]:
a = np.random.rand(10)
b = np.random.rand(10)
n = a.size  # number of elements in this cell; the global N is still 1000000

# non-vectorized implementation:
u = 0
for i in range(n):
    u += abs(a[i] - b[i])
u /= n
print('non-vectorized:', round(u, 6))

# vectorized implementation:
u = np.sum(np.abs(a - b)) / n


print('vectorized:', round(u, 6))

non-vectorized: 0.317527
vectorized: 0.317527
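
The same pattern of combining NumPy functions applies to other expressions. As one more sketch, with hand-picked vectors chosen so the result is easy to check, the Euclidean distance $\sqrt{\sum_i (a_i - b_i)^2}$ can be built from subtraction, squaring, `np.sum`, and `np.sqrt`.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# Differences are [-3, -4, 0]; their squares sum to 25.
dist = np.sqrt(np.sum((a - b) ** 2))
print(dist)  # 5.0

# NumPy also provides this combination ready-made.
print(np.isclose(dist, np.linalg.norm(a - b)))  # True
```
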


# Sources:
- [Neural Networks and Deep Learning: Coursera Course by Andrew Ng](https://www.youtube.com/watch?v=qsIrQi0fzbY)