# Vectorizing tricks

Let's say we have to compute the gradient of a cost function J:

$$
\frac{\partial J}{\partial\theta_{j}} = \frac{1}{m} \sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})x_{j}^{(i)}
$$


$$
h_{\theta}(x) = \theta^{T}x
$$

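where lowercase x and theta are column vectors, e.g. $$x_{3 \times 1}$$.

Capital X is a matrix (array) whose rows are the example vectors, e.g. $$X_{m \times n}$$
m : number of rows (number of examples, the same m as in the sum above).
n : number of columns (number of features).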
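To make the shape conventions concrete, here is a minimal sketch. The names `x`, `theta_demo`, and `X_demo` are illustrative and are not used elsewhere in the notebook. It shows the product for one column vector next to the product for a matrix with one example per row and one feature per column.

```python
import numpy as np

# Illustrative values only; these names are not part of the notebook.
x = np.array([[1], [2], [3]])            # one example as a column vector, shape (3, 1)
theta_demo = np.array([[1], [2], [3]])   # parameters as a column vector, shape (3, 1)

# theta^T x is a (1, 3) by (3, 1) product, which gives a (1, 1) array.
single = theta_demo.T.dot(x)

# Stacking examples as rows gives a matrix with one row per example.
X_demo = np.array([[1, 2, 3],
                   [1, 5, 6]])           # shape (2, 3)

# X theta evaluates the hypothesis for every row at once, shape (2, 1).
batch = X_demo.dot(theta_demo)

print(single.shape, batch.shape)
```

With examples stored as rows, `X.dot(theta)` plays the role of applying `theta.T.dot(x)` to each example in turn.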

In [36]:
import numpy as np
import time

In [73]:
X = np.array([
    [1, 2, 3],
    [1, 5, 6],
    [1, 8, 9],
    [1, 4, 7],
])

y = np.array([
    [13],
    [30],
    [45],
    [29]
])

theta = np.array([
    [1],
    [2],
    [3]
])

# ...
print(X.shape)
print(y.shape)
print(theta.shape)

(4, 3)
(4, 1)
(3, 1)


## Hypothesis

In [39]:
def h(X):
    return X.dot(theta)

In [48]:
def h_loop(X):
    return [theta.T.dot(x) for x in X]

In [23]:
h(X)

array([[14],
       [29],
       [44],
       [30]])

In [38]:
# Row-by-row equivalent of h(X): it gives the same values, but it is slower.
x0 = theta.T.dot(X[0, :])
x1 = theta.T.dot(X[1, :])
x2 = theta.T.dot(X[2, :])
x3 = theta.T.dot(X[3, :])

print(x0)
print(x1)
print(x2)
print(x3)

[14]
[29]
[44]
[30]


In [64]:
XX = np.random.rand(10000, 3)

s0 = time.time()
h(XX)
e0 = time.time()

s1 = time.time()
h_loop(XX)
e1 = time.time()

print('Time of vectorizing : ', (e0 - s0))
print('Time of looping : ', (e1 - s1))
print('is vectorizing time slower than loop time : ', (e0 - s0) > (e1 - s1))

Time of vectorizing :  0.00844573974609375
Time of looping :  0.0724332332611084
is vectorizing time slower than loop time :  False
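A single `time.time()` measurement can be noisy, so one possible refinement is to repeat the measurement with the standard-library `timeit` module and keep the best run. The sketch below is an assumption about how that could look, not the notebook's own method. `h_vec`, `h_rows`, `theta_demo`, and `XX_demo` are illustrative names that mirror `h`, `h_loop`, `theta`, and `XX`.

```python
import timeit
import numpy as np

theta_demo = np.array([[1.0], [2.0], [3.0]])   # assumed parameter values
XX_demo = np.random.rand(10000, 3)

def h_vec(X):
    # One matrix-vector product for all rows.
    return X.dot(theta_demo)

def h_rows(X):
    # One small dot product per row, driven by a Python loop.
    return [theta_demo.T.dot(x) for x in X]

# Both versions should agree before their speed is compared.
same = np.allclose(h_vec(XX_demo).ravel(), np.array(h_rows(XX_demo)).ravel())

# Best of 3 repeats with 3 calls each; taking the minimum reduces timing noise.
t_vec = min(timeit.repeat(lambda: h_vec(XX_demo), number=3, repeat=3)) / 3
t_rows = min(timeit.repeat(lambda: h_rows(XX_demo), number=3, repeat=3)) / 3

print('same result :', same)
print('vectorized  :', t_vec)
print('loop        :', t_rows)
```

The absolute numbers depend on the machine, so only the ratio between the two timings is meaningful.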


## Gradient

$$
\frac{\partial J}{\partial\theta_{j}} = \frac{1}{m} \sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})x_{j}^{(i)}
$$
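Stacking the partial derivatives for every j gives one matrix expression. This assumes X holds one example per row and y stacks the targets as a column, so that row i of $$X\theta$$ is $$h_{\theta}(x^{(i)})$$:

$$
\nabla_{\theta} J = \frac{1}{m} X^{T}(X\theta - y)
$$

Component j of $$X^{T}(X\theta - y)$$ is $$\sum_{i=1}^{m} x_{j}^{(i)}(h_{\theta}(x^{(i)}) - y^{(i)})$$, which is the sum in the formula above. This is the form that `grad_3` computes.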


In [171]:
def grad_1(xx, yy):
    m = len(xx)
    error = h(xx) - yy
    return (1./m) * np.sum(error * xx, axis=0)
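`grad_1` relies on NumPy broadcasting: the error column is multiplied against every feature column, and the sum over axis 0 then adds up the examples. A minimal sketch with the small example from this notebook follows. The `_demo` names are illustrative, and the error values are `h(X) - y` worked out by hand.

```python
import numpy as np

X_demo = np.array([[1, 2, 3],
                   [1, 5, 6],
                   [1, 8, 9],
                   [1, 4, 7]])
# h(X) - y for the small example: [14, 29, 44, 30] - [13, 30, 45, 29].
error_demo = np.array([[1], [-1], [-1], [1]])

# (4, 1) * (4, 3): the single error column is stretched across the 3 feature
# columns, so row i of the product is error_i times example i.
weighted = error_demo * X_demo

# Summing over axis 0 adds the rows together, i.e. sums over the examples.
col_sums = weighted.sum(axis=0)

print(weighted.shape)   # (4, 3)
print(col_sums / len(X_demo))
```

Dividing the column sums by the number of examples gives the gradient as a flat array with one entry per feature.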

In [172]:
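def grad_2(xx, yy):
    m = len(xx)
    # One accumulator per feature (sized from the data instead of hard-coded to 3).
    J = [0.0] * xx.shape[1]

    for i in range(m):
        # Prediction and error for example i.
        pred = theta.T.dot(xx[i, :])
        err = pred - yy[i]
        for j in range(len(J)):
            J[j] += (err * xx[i][j])[0]

    # Average the accumulated sums over the m examples.
    J = [(1. / m) * g for g in J]
    return J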

In [173]:
X.T

array([[1, 1, 1, 1],
       [2, 5, 8, 4],
       [3, 6, 9, 7]])

In [158]:
X

array([[1, 2, 3],
       [1, 5, 6],
       [1, 8, 9],
       [1, 4, 7]])

In [170]:
def grad_3(xx, yy):
    m = len(xx)
    error = h(xx) - yy
    # X^T (X theta - y): a single matrix product yields every partial derivative, shape (n, 1).
    return (1./m) * xx.T.dot(error)

In [174]:
print(grad_1(X, y))
print(grad_2(X, y))
print(grad_3(X, y))

[ 0.   -1.75 -1.25]
[0.0, -1.75, -1.25]
[[ 0.  ]
 [-1.75]
 [-1.25]]
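The printed results agree by eye, but they come back in different shapes: a flat array, a list, and a column. The sketch below checks the agreement programmatically. It is a self-contained approximation of `grad_1` and `grad_3`; the names `grad_broadcast` and `grad_matrix` are illustrative, and `theta` is passed explicitly rather than read from a global, which is a choice made here and not in the notebook.

```python
import numpy as np

X_demo = np.array([[1, 2, 3],
                   [1, 5, 6],
                   [1, 8, 9],
                   [1, 4, 7]])
y_demo = np.array([[13], [30], [45], [29]])
theta_demo = np.array([[1], [2], [3]])

def grad_broadcast(X, y, theta):
    # Mirrors grad_1: broadcast the error over the columns, then sum over rows.
    m = len(X)
    error = X.dot(theta) - y
    return (1.0 / m) * np.sum(error * X, axis=0)   # shape (n,)

def grad_matrix(X, y, theta):
    # Mirrors grad_3: one matrix product X^T (X theta - y).
    m = len(X)
    error = X.dot(theta) - y
    return (1.0 / m) * X.T.dot(error)              # shape (n, 1)

g1 = grad_broadcast(X_demo, y_demo, theta_demo)
g3 = grad_matrix(X_demo, y_demo, theta_demo)

# Flatten the (n, 1) column first so both results have shape (n,).
print(np.allclose(g1, g3.ravel()))
```

The `ravel()` call matters: comparing a `(3,)` array with a `(3, 1)` array directly would broadcast both to `(3, 3)` and compare every entry against every other one.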


In [180]:
XX = np.random.rand(10000, 3)
yy = np.random.rand(10000, 1)


s1 = time.time()
j1 = grad_1(XX, yy)
e1 = time.time()

s2 = time.time()
j2 = grad_2(XX, yy)
e2 = time.time()


s3 = time.time()
j3 = grad_3(XX, yy)
e3 = time.time()

print('Time of grad_1 : ', (e1 - s1))
print('Time of looping : ', (e2 - s2))
print('Time of vectorizing : ', (e3 - s3))
print('is vectorizing time slower than loop time : ', (e3 - s3) > (e2 - s2))

print('\n\n',j1)
print('\n\n',j2)
print('\n\n',j3)

Time of grad_1 :  0.0025784969329833984
Time of looping :  0.19057941436767578
Time of vectorizing :  0.00023317337036132812
is vectorizing time slower than loop time :  False


 [1.34718835 1.42779259 1.50030532]


 [1.3471883509081484, 1.4277925899084303, 1.5003053187969504]


 [[1.34718835]
 [1.42779259]
 [1.50030532]]
