# Vectorization

### Vectorization is the practice of replacing explicit for loops with whole-array operations to speed up code.
Let's compare the execution time of a dot product computed with a for loop against the vectorized version in Python. Libraries needed: numpy and time.

In [13]:
import numpy as np
a = np.array([1,2,3,4])     # Create a 1D array from a Python list
print(a)
# np.random.rand() creates an array of the given shape filled with random values drawn uniformly from [0, 1).
b = np.random.rand(5)       # Generate 5 random values
print(b)

[1 2 3 4]
[0.28304761 0.63807317 0.65433438 0.66577735 0.24154407]


In [14]:
import time

a = np.random.rand(10000000)       #a is 1D array of 10 Million Random Values
b = np.random.rand(10000000)       #b is 1D array of another 10 Million Random Values

### Comparing the time taken for dot product of a and b vectors in milliseconds

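Background on time.time(): it returns the current time as a floating point number of seconds since the epoch. On most systems the epoch is Unix time 0, i.e. midnight (UTC) on January 1, 1970. Subtracting two readings gives the elapsed time in seconds, which we multiply by 1000 to get milliseconds.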
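The tic/toc pattern used in the next cell can be sketched on its own. The workload (`sum(range(...))`) is only a placeholder, and `time.perf_counter()` is shown as an alternative that this notebook does not use; it is a monotonic clock intended for measuring short intervals.

```python
import time

# time.time() returns seconds since the epoch as a float.
tic = time.time()
total = sum(range(1_000_000))        # placeholder work to time
toc = time.time()

# The difference is in seconds; multiply by 1000 for milliseconds.
elapsed_ms = 1000 * (toc - tic)
print("Elapsed: " + str(elapsed_ms) + "ms")

# Alternative (not used in this notebook): a monotonic,
# high-resolution clock meant for timing short intervals.
tic = time.perf_counter()
total = sum(range(1_000_000))
toc = time.perf_counter()
perf_ms = 1000 * (toc - tic)
print("Elapsed (perf_counter): " + str(perf_ms) + "ms")
```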

In [15]:
# Vectorized version: a single call to np.dot
tic = time.time()
c   = np.dot(a,b)
toc = time.time()
print(c)
print("Vectorized Version:"+str(1000*(toc-tic))+"ms")

# For loop version: accumulate the products one element at a time
c=0
tic = time.time()
for i in range(len(a)):
    c += a[i]*b[i]
toc = time.time()
print(c)
print("For Loop Version:"+str(1000*(toc-tic))+"ms")

2498864.2366606635
Vectorized Version:5.985021591186523ms
2498864.2366605573
For Loop Version:4101.605653762817ms
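The two results above agree in their leading digits but differ in the last few, because the floating point additions happen in a different order in each version. A minimal sketch of how the two could be compared with a tolerance instead of `==`; the smaller array size and the fixed seed are assumptions made only to keep the example quick and reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
a = rng.random(100_000)
b = rng.random(100_000)

# Vectorized dot product
c_vec = np.dot(a, b)

# Explicit for loop computing the same quantity
c_loop = 0.0
for i in range(len(a)):
    c_loop += a[i] * b[i]

print(c_vec, c_loop)
# Compare with a tolerance, since the rounding differs slightly
print(np.isclose(c_vec, c_loop))
```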


# Broadcasting

### The term broadcasting refers to how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is broadcast across the larger array so that they have compatible shapes.
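As a rough illustration of those constraints: numpy compares shapes starting from the trailing dimension, and two dimensions are compatible when they are equal or one of them is 1. The array `x` below is a made-up example, not part of the salary data that follows.

```python
import numpy as np

x = np.ones((2, 3))

# A scalar is broadcast to every element.
print((x + 10).shape)                    # (2, 3)

# (2, 3) and (3,) are compatible: the trailing dimensions match.
print((x + np.array([1, 2, 3])).shape)   # (2, 3)

# (2, 3) and (2,) are not: trailing dimensions 3 and 2 differ
# and neither of them is 1, so numpy raises a ValueError.
try:
    x + np.array([1, 2])
except ValueError as err:
    print("Cannot broadcast:", err)
```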

Suppose Mr. A and Mrs. B are a couple working in two different organizations. We have their monthly salary data for FY 2019-20. We want to calculate the income tax each of them needs to pay, assuming each has saved Rs. 1.5 Lac under Section 80C. We assume that tax is 10% of the taxable income and that there is no tax on the first Rs. 5 Lac of income.

A = (50000,50000,50000,50000,60000,60000,60000,60000,70000,70000,70000,70000)

B = (60000,60000,60000,60000,60000,60000,75000,75000,75000,75000,75000,75000)

In [16]:
IncomeArray = np.array([[50000,50000,50000,50000,60000,60000,60000,60000,70000,70000,70000,70000],
              [60000,60000,60000,60000,60000,60000,75000,75000,75000,75000,75000,75000]])
print(IncomeArray)

[[50000 50000 50000 50000 60000 60000 60000 60000 70000 70000 70000 70000]
 [60000 60000 60000 60000 60000 60000 75000 75000 75000 75000 75000 75000]]


In [5]:
TotalIncome = IncomeArray.sum(axis=1)
print(TotalIncome)

[720000 810000]


In [6]:
PayableIncomeTax = 0.1*(TotalIncome-500000-150000)
print("Tax Payable for Mr. A : " + str(PayableIncomeTax[0]) + " and Tax Payable by Mrs. B : "+str(PayableIncomeTax[1]))

Tax Payable for Mr. A : 7000.0 and Tax Payable by Mrs. B : 16000.0
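In the cell above, the scalars 500000 and 150000 are broadcast across both elements of `TotalIncome`. One caveat: if someone's total income were below Rs. 6.5 Lac, the same formula would produce a negative tax. A minimal sketch of one way to guard against that with `np.maximum`; the third income value is hypothetical and added only to show the clamping.

```python
import numpy as np

# First two values are from the example; the third is hypothetical.
TotalIncome = np.array([720000, 810000, 600000])

# The scalars are broadcast across every element of TotalIncome.
TaxableIncome = TotalIncome - 500000 - 150000

# Clamp at zero so income below the threshold yields no tax.
PayableIncomeTax = 0.1 * np.maximum(TaxableIncome, 0)
print(PayableIncomeTax)
```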


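Let's find the combined salary the two of them bring home each month. Summing along axis=0 adds the two rows together, whereas axis=1 above summed each person's row across the twelve months.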

In [7]:
MonthlySalary = IncomeArray.sum(axis=0)
print(MonthlySalary)

[110000 110000 110000 110000 120000 120000 135000 135000 145000 145000
 145000 145000]
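The monthly totals can themselves be used in a broadcast operation. As a sketch, dividing the (2, 12) income array by the (12,) totals gives each person's share of the household income per month; the name `Share` is introduced here only for illustration.

```python
import numpy as np

IncomeArray = np.array([[50000,50000,50000,50000,60000,60000,60000,60000,70000,70000,70000,70000],
                        [60000,60000,60000,60000,60000,60000,75000,75000,75000,75000,75000,75000]])
MonthlySalary = IncomeArray.sum(axis=0)   # shape (12,)

# (2, 12) / (12,): the monthly totals are broadcast across both rows.
Share = IncomeArray / MonthlySalary
print(Share.round(3))

# Each column of shares adds up to (approximately) 1.
print(Share.sum(axis=0))
```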


### Some more examples of broadcasting

In [18]:
A = np.array([1,2,3,4])
print(A)
A=A.reshape(4,1)
print(A+100)

[1 2 3 4]
[[101]
 [102]
 [103]
 [104]]


In [20]:
A=np.array([1,2,3,4,5,6])
print("Initial array A = "+str(A))
A=A.reshape(2,3)
print("A = "+str(A))
B=np.array([100,200,300]).reshape(1,3)
print("B = "+str(B))
print("Sum of A+B = "+str(A+B))

Initial array A = [1 2 3 4 5 6]
A = [[1 2 3]
 [4 5 6]]
B = [[100 200 300]]
Sum of A+B = [[101 202 303]
 [104 205 306]]


In [10]:
A=np.array([1,2,3,4,5,6])
A=A.reshape(2,3)
print("A = "+str(A))
B=np.array([100,200]).reshape(2,1)
print("B = "+str(B))
print("Sum of A+B = "+str(A+B))

A = [[1 2 3]
 [4 5 6]]
B = [[100]
 [200]]
Sum of A+B = [[101 102 103]
 [204 205 206]]
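The two cases above stretch only one operand. When a column of shape (4, 1) meets a row of shape (1, 3), both are broadcast and the result has shape (4, 3). A small sketch, with the names `col` and `row` chosen here for illustration:

```python
import numpy as np

col = np.array([1, 2, 3, 4]).reshape(4, 1)         # shape (4, 1)
row = np.array([100, 200, 300]).reshape(1, 3)      # shape (1, 3)

# col is stretched across 3 columns, row across 4 rows.
result = col + row
print(result.shape)   # (4, 3)
print(result)
# [[101 201 301]
#  [102 202 302]
#  [103 203 303]
#  [104 204 304]]
```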
