# Vectorisation in the Logistic Regression Algorithm
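Vectorisation means replacing explicit 'for' loops in algorithm code with operations on whole arrays. This makes the code run faster because every iteration of a Python loop carries interpreter overhead, whereas an array operation such as `np.dot` runs in compiled code that can use SIMD instructions.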

## Array Initialisation

In [6]:
# Module imports
import numpy as np
import time

In [5]:
a = np.array([1,2,3,4])
print(a)

[1 2 3 4]


## Without Vectorisation

In [29]:
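# Variable initialisation
a = np.random.rand(1000000) # The loop below needs 1,000,000 elements in each array
b = np.random.rand(1000000)
c = 0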

tic = time.time() # Start timer

for i in range(1000000): # Using 'for' loop instead of vectorisation
    c += a[i] * b[i]

toc = time.time() # End timer
ltime = 1000 * (toc-tic)

print("c =", c)
print("For loop version:", str(ltime) + "ms") # Print total time taken

c = 249796.97907020326
For loop version: 929.2960166931152ms


## Using Vectorisation

In [30]:
# Variable initialisation
a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time() # Start timer

c = np.dot(a, b) # Using vectorisation instead of 'for' loops

toc = time.time() # End timer
vtime = 1000 * (toc-tic)

print("c =", c)
print("Vectorised version:", str(vtime) + "ms") # Print total time taken

c = 249662.32800948684
Vectorised version: 0.7030963897705078ms


## Comparison

In [37]:
diff = (ltime / vtime)
print("The 'for' loop took", str(round(diff)) + "x longer than Vectorisation.")

The 'for' loop took 1322x longer than Vectorisation.
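
The printed values of `c` differ between the two versions, so a like-for-like check needs both computed on the same arrays. A single `time.time()` reading can also be noisy. The following is a minimal sketch of a more careful comparison, not the method used for the figures above: the helper `loop_dot`, the fixed seed, the smaller array size, and the repeat counts are all illustrative assumptions.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable (assumption)
n = 100_000                     # smaller than above to keep the loop quick (assumption)
a = rng.random(n)
b = rng.random(n)

def loop_dot(x, y):
    """Dot product with an explicit Python loop (hypothetical helper)."""
    total = 0.0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total

# Both versions should agree up to floating-point rounding
assert np.isclose(loop_dot(a, b), np.dot(a, b))

# The best of several runs is less sensitive to background noise than one reading
loop_s = min(timeit.repeat(lambda: loop_dot(a, b), number=1, repeat=3))
vec_s = min(timeit.repeat(lambda: np.dot(a, b), number=10, repeat=3)) / 10

print(f"For loop version: {1000 * loop_s:.2f}ms")
print(f"Vectorised version: {1000 * vec_s:.4f}ms")
print(f"Ratio: {loop_s / vec_s:.0f}x")
```

The exact ratio depends on the machine, the array size, and the NumPy build, so it will probably not match the figure reported above.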


## Notes
All computation above was carried out on the CPU, whereas much large-scale deep-learning computation is done on GPUs. Both CPUs and GPUs are capable of Single Instruction, Multiple Data (SIMD) operations, but GPUs are particularly good at them, which makes them better suited to the parallel work of training deep-learning models.

Vectorisation significantly speeds up computation: the vectorised version above was roughly 1300x faster than the loop. Avoid explicit 'for' loops whenever an array operation can do the same work.
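
The same idea carries over to logistic regression, where predictions for many examples can be computed with one matrix-vector product instead of nested loops. The sketch below is only an illustration: the names `X`, `w`, `bias`, and `sigmoid`, the shapes, and the random data are assumptions and do not come from the notebook above.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n = 1000, 5            # hypothetical sizes: 1000 examples, 5 features
X = rng.random((m, n))    # one example per row
w = rng.random(n)         # weight vector
bias = 0.1                # scalar bias term

# Loop version: build each prediction one feature at a time
preds_loop = np.zeros(m)
for i in range(m):
    z = 0.0
    for j in range(n):
        z += w[j] * X[i, j]
    preds_loop[i] = sigmoid(z + bias)

# Vectorised version: one matrix-vector product covers every example
preds_vec = sigmoid(X @ w + bias)

# Both versions should agree up to floating-point rounding
print(np.allclose(preds_loop, preds_vec))
```

Storing one example per row is a design choice made here for readability; with one example per column the product would be written `w @ X` instead.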