In [2]:
import numpy as np
import time

### Vectorization

Vectorization is the art of removing explicit for loops from your code. When applying deep learning in practice, you will often train models on very large datasets, because that's when DL algorithms tend to shine. So it's important that your code is optimized; otherwise, you will have to wait a long time to get results.

In logistic regression, you have to calculate `z`, which is the result of `w.T x + b` (the dot product of `w` and `x`, plus the bias `b`). If you implement it in a non-vectorized way, you would probably write a for loop that runs `n_x` times, multiplies each individual value of `w` by the matching value of `x`, adds each product to `z`, and then adds `b` once at the end. This is a very inefficient implementation. With a vectorized implementation, the whole operation takes a single line of code and also runs far more efficiently:

`z = np.dot(w, x) + b`
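To make the comparison concrete, here is a minimal sketch that computes `z` both ways and checks that the results agree. The values of `n_x`, `w`, `x`, and `b` are made up for illustration and are not taken from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n_x = 5                         # hypothetical number of features
w = rng.random(n_x)             # weights, shape (n_x,)
x = rng.random(n_x)             # one input example, shape (n_x,)
b = 0.5                         # bias (arbitrary value)

# Non-vectorized: accumulate the products one at a time, then add the bias once
z_loop = 0.0
for i in range(n_x):
    z_loop += w[i] * x[i]
z_loop += b

# Vectorized: a single call computes the same dot product
z_vec = np.dot(w, x) + b

print(np.isclose(z_loop, z_vec))
```

Both versions compute the same number; the difference is only in how the work is expressed and how fast it runs on large inputs.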

Let's time both approaches on two arrays of one million random values each.

In [31]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

#### Vectorized example

In [32]:
tic = time.time()
c = np.dot(a, b)
toc = time.time()

print(f"{1000*(toc-tic)} milliseconds")

2.7837753295898438 milliseconds


#### Non-Vectorized example

In [33]:
tic = time.time()
c = 0
for i in range(1000000):
    c += a[i]*b[i]
toc = time.time()

print(f"{1000*(toc-tic)} milliseconds")

795.2377796173096 milliseconds


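A lot of DL implementations run on GPUs, but here we only used the CPU, from a Jupyter notebook. Both CPUs and GPUs have parallel instructions, often called SIMD (single instruction, multiple data) instructions. When you use a built-in `numpy` function such as `dot`, NumPy can take advantage of that parallelism and avoids the per-iteration overhead of a Python loop, so your computations run much faster. In the runs above, the vectorized version took about 2.8 milliseconds versus about 795 milliseconds for the loop, roughly 285 times faster.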

### More Vectorization Examples

If you want to calculate `u` as the product of matrix `A` and vector `v`, each element of `u` is defined as `u(i) = Σ_j A(i,j) * v(j)`: row `i` of `A` is multiplied element by element with `v`, and the products are summed.

#### Vectorized example

In [36]:
A = np.arange(start=0, stop=25, step=1).reshape(5, 5)
v = np.ones((5, 1), dtype='int64')

tic1 = time.time()
u = np.dot(A, v)
# u = np.matmul(A, v) would give the same result here
toc1 = time.time()

print(f"{1000*(toc1-tic1)} milliseconds")

0.0 milliseconds


#### Non-vectorized example

In [46]:
u = np.zeros((5, 1), dtype='int64')
A = np.arange(start=0, stop=25, step=1).reshape(5, 5)
v = np.ones((5, 1), dtype='int64')

tic = time.time()
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        u[i] += A[i, j]*v[j]
toc = time.time()

print(f"{1000*(toc-tic)} milliseconds")

6.014823913574219 milliseconds
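The `0.0 milliseconds` reading for the vectorized version most likely means the 5×5 product finished faster than `time.time()` could measure on this machine. A rough sketch of a more robust measurement follows, assuming it is acceptable to repeat each operation many times and average. `average_ms` and `matvec_loop` are hypothetical helper names introduced here, not part of NumPy, and the repeat count of 1000 is arbitrary.

```python
import time
import numpy as np

A = np.arange(25).reshape(5, 5)
v = np.ones((5, 1), dtype='int64')

def matvec_loop(A, v):
    """Non-vectorized product: u[i] = sum over j of A[i, j] * v[j]."""
    u = np.zeros((A.shape[0], 1), dtype='int64')
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            u[i, 0] += A[i, j] * v[j, 0]
    return u

def average_ms(func, *args, repeats=1000):
    """Hypothetical helper: average wall-clock time of func(*args) in milliseconds."""
    tic = time.perf_counter()  # higher-resolution clock than time.time()
    for _ in range(repeats):
        func(*args)
    toc = time.perf_counter()
    return 1000 * (toc - tic) / repeats

# Check that both implementations agree before comparing their speed
u_vec = np.dot(A, v)
u_loop = matvec_loop(A, v)
print(np.array_equal(u_vec, u_loop))

print(f"vectorized: {average_ms(np.dot, A, v)} milliseconds")
print(f"loop: {average_ms(matvec_loop, A, v)} milliseconds")
```

Averaging over many repetitions smooths out timer granularity and one-off system noise; the exact numbers will still vary from machine to machine.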


So, whenever you are tempted to write a for loop, first check whether there is a `numpy` function that can do the job for you.
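As one more illustration of that advice, the sketch below applies the exponential to every element of a vector, first with a loop and then with `np.exp`. The vector length of 1000 is an arbitrary choice for the example.

```python
import math
import numpy as np

v = np.random.rand(1000)

# Non-vectorized: apply exp to each element in a Python loop
u_loop = np.zeros(v.shape[0])
for i in range(v.shape[0]):
    u_loop[i] = math.exp(v[i])

# Vectorized: np.exp operates on the whole array at once
u_vec = np.exp(v)

print(np.allclose(u_loop, u_vec))
```

Many other element-wise operations follow the same pattern, for example `np.log`, `np.abs`, and `np.maximum`.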