# WEEK 02: 
- Vectorization: Checking timing differences between Vector and For Loop
- More Vectorization examples 
- Vectorizing Logistic Regression 
- Broadcasting in Python

# Checking timing differences between Vector and For Loop

Vectorization is basically
the art of getting rid of explicit for loops in your code.
In the deep learning era, especially in deep learning in practice,
you often find yourself training on relatively large data sets,
because that's when deep learning algorithms tend to shine.
So it's important that your code runs very quickly, because otherwise,
if it's running on a big data set,
your code might take a long time to run and you just find
yourself waiting a very long time to get the result.

In [2]:
import numpy as np
import time

a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time()
c = np.dot(a,b)
toc = time.time()

print(c)
print("Vectorized version: " + str(1000*(toc-tic)) + "ms")

d = 0
ticd = time.time()
for i in range(1000000):
    d += a[i]*b[i]
tocd = time.time()

print(d)
print("For Loop version: " + str(1000*(tocd-ticd)) + "ms")

249974.8653636595
Vectorized version: 7.999897003173828ms
249974.86536365506
For Loop version: 1308.1791400909424ms


Some of you might have heard that a lot of
scalable deep learning implementations are done on a GPU, or graphics processing unit.
But all the demos I did just now in the Jupyter notebook were actually on the CPU.
It turns out that both GPUs and CPUs have parallelization instructions.
They're sometimes called SIMD instructions,
which stands for single instruction, multiple data.
What this basically means is that
if you use built-in functions such as the
np.dot function used above, or other functions that don't require you to explicitly implement a for loop,
it enables Python NumPy to take
much better advantage of parallelism and do your computations much faster.
This is true both for computations on CPUs and computations on GPUs.
It's just that GPUs are remarkably good at
these SIMD calculations, but CPUs are actually also not too bad at them.
Maybe just not as good as GPUs.
You're seeing how vectorization can significantly speed up your code.
The rule of thumb to remember is: whenever possible,
avoid using explicit for loops.
Let's go on to the next video to see some more examples of
vectorization and also start to vectorize logistic regression.

##### "The rule of thumb to remember is: whenever possible, avoid using explicit for loops"
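As one more illustration of the rule of thumb, here is a minimal sketch of a matrix-vector product computed both ways. The matrix sizes, the seed, and the names `M`, `x`, `u_loop` and `u_vec` are assumptions chosen only for this example; they do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, only so the example is reproducible
M = rng.random((200, 300))          # hypothetical (200, 300) matrix
x = rng.random((300, 1))            # hypothetical (300, 1) column vector

# Explicit double for loop: u[i] = sum over j of M[i, j] * x[j]
u_loop = np.zeros((200, 1))
for i in range(M.shape[0]):
    for j in range(M.shape[1]):
        u_loop[i, 0] += M[i, j] * x[j, 0]

# Vectorized: a single np.dot call replaces both loops
u_vec = np.dot(M, x)

# Both versions should agree up to floating-point rounding
print(np.allclose(u_loop, u_vec))
```

The two results can differ in the last few digits, as in the timing demo above, because the additions happen in a different order.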

---

# Avoiding For loops

In [15]:
import numpy as np
import math
v = np.random.rand(100,1)
n = len(v)

# for-loop version
u = np.zeros((n,1))
for i in range(n):
    u[i, 0] = math.exp(v[i, 0])  # math.exp takes one number at a time, so index the scalar explicitly

# Vectorized
u2 = np.exp(v)


In [16]:
u

array([[1.26165099],
       [1.23052556],
       [2.41947929],
       [1.26952376],
       [1.06510585],
       [1.36090634],
       [1.85275198],
       [1.21663292],
       [1.31808408],
       [1.4861464 ],
       [2.63693543],
       [1.37223217],
       [2.07808928],
       [1.96081291],
       [1.01057697],
       [1.44440799],
       [1.06635924],
       [1.55702008],
       [1.63019028],
       [1.69233273],
       [1.75778377],
       [1.11064978],
       [2.23600201],
       [1.63382716],
       [1.17465799],
       [1.26188173],
       [1.55756661],
       [1.26845415],
       [1.53183925],
       [1.97950489],
       [2.34152837],
       [1.18652999],
       [1.86537725],
       [1.48818511],
       [1.59428117],
       [1.90872931],
       [1.10812672],
       [1.16813825],
       [1.70629478],
       [2.62673741],
       [1.14466679],
       [1.68671107],
       [1.4937197 ],
       [2.37403709],
       [1.79564247],
       [1.6266087 ],
       [1.19192693],
       [1.421

In [17]:
u2

array([[1.26165099],
       [1.23052556],
       [2.41947929],
       [1.26952376],
       [1.06510585],
       [1.36090634],
       [1.85275198],
       [1.21663292],
       [1.31808408],
       [1.4861464 ],
       [2.63693543],
       [1.37223217],
       [2.07808928],
       [1.96081291],
       [1.01057697],
       [1.44440799],
       [1.06635924],
       [1.55702008],
       [1.63019028],
       [1.69233273],
       [1.75778377],
       [1.11064978],
       [2.23600201],
       [1.63382716],
       [1.17465799],
       [1.26188173],
       [1.55756661],
       [1.26845415],
       [1.53183925],
       [1.97950489],
       [2.34152837],
       [1.18652999],
       [1.86537725],
       [1.48818511],
       [1.59428117],
       [1.90872931],
       [1.10812672],
       [1.16813825],
       [1.70629478],
       [2.62673741],
       [1.14466679],
       [1.68671107],
       [1.4937197 ],
       [2.37403709],
       [1.79564247],
       [1.6266087 ],
       [1.19192693],
       [1.421

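#### Some other functions used
- np.log(v): element-wise natural logarithm
- np.abs(v): element-wise absolute value
- np.maximum(v,0): element-wise maximum of each entry and 0
- v**2: element-wise square
- 1/v: element-wise inverse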
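A small sketch of how these element-wise operations behave on a column vector. The values in `v` are made up for illustration, and the result names (`logs`, `absolutes`, and so on) are hypothetical.

```python
import numpy as np

# Small (3, 1) column vector with hand-picked values
v = np.array([[0.5], [1.0], [2.0]])

logs = np.log(v)                  # natural log of every entry
absolutes = np.abs(-v)            # absolute value of every entry of -v
clipped = np.maximum(v - 1.0, 0)  # negative entries of v - 1 become 0
squares = v ** 2                  # every entry squared
inverses = 1 / v                  # reciprocal of every entry

for name, result in [("log", logs), ("abs", absolutes), ("maximum", clipped),
                     ("square", squares), ("inverse", inverses)]:
    print(name, result.ravel())
```

Each call returns an array with the same shape as `v`, so no explicit loop over the entries is needed.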

---

# Logistic regression implementation


### Z = wT X + b
Z = np.dot(w.T, X) + b
A = logistic(Z)
dz = A - Y
### dw = 1/m X dzT
dw = (1/m) * np.dot(X, dz.T)
db = (1/m) * np.sum(dz)

w = w - learningrate*dw
b = b - learningrate*db
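The update steps above can be sketched as runnable code. This is only an illustration under stated assumptions: the toy data, the seed, the learning rate of 0.1, the 100 iterations, and the helpers `sigmoid` and `cost` are not from the notes. The cost is the standard cross-entropy loss, used here only to check that the updates make progress.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (assumption): n_x features, m examples stored as columns of X
rng = np.random.default_rng(0)
n_x, m = 3, 50
X = rng.standard_normal((n_x, m))               # shape (n_x, m)
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)   # labels, shape (1, m)

w = np.zeros((n_x, 1))
b = 0.0
learning_rate = 0.1                              # assumed value

def cost(w, b):
    """Average cross-entropy loss over all m examples."""
    A = sigmoid(np.dot(w.T, X) + b)
    return float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

initial_cost = cost(w, b)
for _ in range(100):                 # the loop over iterations remains
    Z = np.dot(w.T, X) + b           # (1, m); b is broadcast across columns
    A = sigmoid(Z)                   # (1, m) predictions
    dZ = A - Y                       # (1, m)
    dw = np.dot(X, dZ.T) / m         # (n_x, 1)
    db = np.sum(dZ) / m              # scalar
    w = w - learning_rate * dw
    b = b - learning_rate * db
final_cost = cost(w, b)

print(initial_cost, final_cost)
```

There is no loop over the m training examples or over the n_x features; only the outer loop over gradient descent iterations is left.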


---

# Broadcasting in Python

In [19]:
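# broadcasting example
# (m,n)  [+ - * /]  (1,n)  --> (m,n)   the (1,n) row is copied m times
# (m,n)  [+ - * /]  (m,1)  --> (m,n)   the (m,1) column is copied n times
# (m,1)  [+ - * /]  real number R  --> (m,1)   R is applied to every element
# [1 2 3]' + 100 = [101 102 103]'
# [1 2 3] + 100 = [101 102 103]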
import numpy as np
A = np.array([[56.0,0.0,4.4,68.0],
              [1.2,104.0,52.0,8.0],
              [1.8,135.0,99.0,0.9]])
print(A)

[[ 56.    0.    4.4  68. ]
 [  1.2 104.   52.    8. ]
 [  1.8 135.   99.    0.9]]


In [25]:
cal = A.sum(axis=0)
print(cal)

[ 59.  239.  155.4  76.9]


In [26]:
percentage = A/cal.reshape(1,4)
print(percentage)

[[0.94915254 0.         0.02831403 0.88426528]
 [0.02033898 0.43514644 0.33462033 0.10403121]
 [0.03050847 0.56485356 0.63706564 0.01170351]]
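The broadcasting rules listed in the comments can be checked on a tiny example. The arrays `M`, `row` and `col` below are made up for illustration, and `np.tile` is used only to show the explicit copy that broadcasting effectively performs.

```python
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])            # shape (2, 3)
row = np.array([[100.0, 200.0, 300.0]])    # shape (1, 3)
col = np.array([[10.0], [20.0]])           # shape (2, 1)

plus_row = M + row       # (2,3) + (1,3): the row is added to every row of M
plus_col = M + col       # (2,3) + (2,1): the column is added to every column of M
plus_scalar = col + 100  # (2,1) + real number: 100 is added to every element

# The same result as plus_row, with the copy written out explicitly
explicit = M + np.tile(row, (2, 1))

print(plus_row)
print(plus_col)
print(plus_scalar)
print(np.array_equal(plus_row, explicit))
```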


---

# Python-Numpy vectors

In [33]:
import numpy as np
a = np.random.randn(5)  # creates a "rank 1 array": not suitable for deep learning code

In [34]:
print(a)

[-1.68563813  0.91094038 -0.30209531  1.82825684  1.1724103 ]


In [35]:
print(a.shape)   # (5,) is a "rank 1 array", neither a row nor a column vector: DO NOT USE IT

(5,)


In [36]:
print(a.T)  # transposing a rank 1 array changes nothing

[-1.68563813  0.91094038 -0.30209531  1.82825684  1.1724103 ]


In [37]:
print(np.dot(a,a.T))

8.479518834673927


In [39]:
# instead, define the rows and columns
a = np.random.randn(5,1)
print(a)

[[ 0.20036487]
 [-0.17643087]
 [-2.4666283 ]
 [ 2.11084208]
 [-0.64303632]]


In [41]:
print(a.T)  # now the transpose is a real row vector of shape (1,5)

[[ 0.20036487 -0.17643087 -2.4666283   2.11084208 -0.64303632]]


In [42]:
print(np.dot(a,a.T))  # outer product: a (5,5) matrix, not just a single value

[[ 0.04014608 -0.03535055 -0.49422566  0.4229386  -0.12884189]
 [-0.03535055  0.03112785  0.43518937 -0.3724177   0.11345145]
 [-0.49422566  0.43518937  6.08425516 -5.20666281  1.58613159]
 [ 0.4229386  -0.3724177  -5.20666281  4.45565429 -1.35734813]
 [-0.12884189  0.11345145  1.58613159 -1.35734813  0.41349571]]


# reminders
- a = np.random.randn(5,1)  --> a.shape = (5,1) (column vector)
- a = np.random.randn(1,5)  --> a.shape = (1,5) (row vector)
- assert(a.shape == (5,1)) checks the shape; if a is a rank 1 array, fix it with a = a.reshape((5,1))
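The reminders can be put together in one short sketch. The names `rank1`, `column`, `outer` and `inner` are hypothetical and used only here.

```python
import numpy as np

rank1 = np.random.randn(5)          # rank 1 array
print(rank1.shape)                  # (5,)
print(rank1.T.shape)                # (5,): the transpose has no effect

# Fix: reshape into an explicit column vector and check the shape
column = rank1.reshape((5, 1))
assert column.shape == (5, 1)

outer = np.dot(column, column.T)    # (5,1) times (1,5) gives a (5,5) matrix
inner = np.dot(column.T, column)    # (1,5) times (5,1) gives a (1,1) matrix

print(outer.shape)                  # (5, 5)
print(inner.shape)                  # (1, 1)
```

The assert costs almost nothing to run and documents the shape the following code expects.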