## What is Vectorization?

$ z = w^T x + b $ where 
$ \mathbf{w}  = \left[
\begin{matrix}
w_1\\ w_2\\ \vdots\\ w_{n_x}
\end{matrix}
\right] $ and 
$ \mathbf{x}  = \left[
\begin{matrix}
x_1\\ x_2\\ \vdots\\ x_{n_x}
\end{matrix}
\right] $

#### Non-vectorized implementation

>z = 0
>
>for i in range(n_x):
>
>     z += w[i] * x[i]
>
>z += b

#### Vectorized implementation

>z = np.dot(w, x) + b
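
As a quick sanity check, the sketch below computes $z$ both ways and confirms that the results agree. The values of `w`, `x`, and `b` are made up for illustration and do not come from these notes.

```python
import numpy as np

# Illustrative values only (assumed); any two 1-D arrays of equal length work.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
b = 0.25

# Non-vectorized: accumulate the products one element at a time.
z_loop = 0.0
for i in range(len(w)):
    z_loop += w[i] * x[i]
z_loop += b

# Vectorized: one dot product plus the bias.
z_vec = np.dot(w, x) + b

print(z_loop, z_vec)
```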


In [2]:
import numpy as np
import time

a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time()
c = np.dot(a,b)
toc = time.time()

print(c)
print("Vectorized version:" + str(1000*(toc-tic)) + "ms")

c = 0
tic = time.time()
for i in range(1000000):
    c += a[i]*b[i]
toc = time.time()

print(c)
print("For loop:" + str(1000*(toc-tic)) + "ms")

250062.17600667116
Vectorized version:1.2202262878417969ms
250062.17600667023
For loop:537.5235080718994ms


## Neural network programming guideline

Whenever possible, avoid explicit for-loops.

### matrix vector multiplication $u = Av$

$ u_i = \sum_j A_{ij} v_j$

#### Non-vectorized implementation


>
> u = np.zeros((m,1))
>
> for i in range(m):
>
>     for j in range(n):
>
>          u[i] += A[i][j] * v[j]


#### Vectorized implementation

> u = np.dot(A,v)
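
The following sketch checks that the double loop and `np.dot` give the same result. The sizes `m = 3`, `n = 4` and the random seed are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary fixed seed (assumption)
m, n = 3, 4                        # A is (m, n), v is (n, 1), so u is (m, 1)
A = rng.standard_normal((m, n))
v = rng.standard_normal((n, 1))

# Non-vectorized: u_i = sum_j A_ij * v_j
u_loop = np.zeros((m, 1))
for i in range(m):
    for j in range(n):
        u_loop[i] += A[i][j] * v[j]

# Vectorized
u_vec = np.dot(A, v)

print(u_vec.shape)
print(np.allclose(u_loop, u_vec))
```

Note that the result has one entry per row of `A`, which is why `u` is allocated with `m` rows.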


### application of a function on a matrix/vector

$ \mathbf{v}  = \left[
\begin{matrix}
v_1\\ v_2\\ \vdots\\ v_{n}
\end{matrix}
\right] $ 
$\rightarrow$ 
$ \mathbf{u}  = \left[
\begin{matrix}
e^{v_1}\\ e^{v_2}\\ \vdots\\ e^{v_n}
\end{matrix}
\right] $ 

#### Non-vectorized implementation

>
> u = np.zeros((n,1))
>
> for i in range(n):
>
>      u[i] += math.exp(v[i])
>

#### Vectorized implementation

> u = np.exp(v)

Similarly, other element-wise operations have vectorized forms:
> np.log(v)

> np.abs(v)

> np.maximum(v,0)

> v**2

> 1/v
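
A small sketch of these element-wise operations on a column vector. The vector `v` is an assumed example with positive entries so that `np.log` and `1/v` are well defined.

```python
import math
import numpy as np

# Assumed example column vector, shape (3, 1)
v = np.array([[1.0], [2.0], [4.0]])
n = v.shape[0]

# Non-vectorized exponential, one element at a time
u_loop = np.zeros((n, 1))
for i in range(n):
    u_loop[i] = math.exp(v[i, 0])

# Vectorized exponential
u_vec = np.exp(v)
print(np.allclose(u_loop, u_vec))

# Other element-wise operations return arrays of the same shape as v
log_v = np.log(v)
abs_v = np.abs(v)
relu_v = np.maximum(v - 2.0, 0)   # negative entries are clipped to 0
sq_v = v ** 2
inv_v = 1 / v
print(relu_v.ravel(), sq_v.ravel(), inv_v.ravel())
```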

## Logistic regression derivatives
 
$z^{(1)} = w^Tx^{(1)} + b$ $\Rightarrow$  $a^{(1)} = \sigma(z^{(1)})$

$z^{(2)} = w^Tx^{(2)} + b$ $\Rightarrow$  $a^{(2)} = \sigma(z^{(2)})$

$z^{(3)} = w^Tx^{(3)} + b$ $\Rightarrow$  $a^{(3)} = \sigma(z^{(3)})$

$\cdots$

Stack the $m$ training examples as the columns of an $(n_x, m)$ matrix $\mathbf{X} \in \mathbb{R}^{n_x \times m}$:

$ \mathbf{X}  =  \left[ \begin{matrix} | & | & \cdots & | \\
x^{(1)} & x^{(2)} & \cdots & x^{(m)} \\
 | & | & \cdots & | \end{matrix} \right] $ 

then

$ \left[ z^{(1)}, z^{(2)}, \cdots, z^{(m)} \right] = w^T \mathbf{X} + [b, b, \cdots, b] = [w^Tx^{(1)}+b, w^Tx^{(2)}+b, \cdots, w^Tx^{(m)}+b] $

In Python:

> z = np.dot(w.T, X) + b

where $b$ can be a scalar (or a $(1,1)$ array), because NumPy "broadcasting" expands it to shape $(1,m)$, and

$ A = \left[ a^{(1)},  a^{(2)},  \cdots  a^{(m)}  \right] = \sigma(z) $ 

> A = sigma(z)
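
A minimal sketch of this forward pass, checked against a per-example loop. The helper name `sigmoid`, the sizes, and the random data are assumptions for illustration; the notes above only refer to $\sigma$.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)       # arbitrary fixed seed (assumption)
n_x, m = 3, 5                        # 3 features, 5 examples (assumed sizes)
X = rng.standard_normal((n_x, m))    # one example per column
w = rng.standard_normal((n_x, 1))
b = 0.1                              # scalar, broadcast over all m columns

# Vectorized forward pass
Z = np.dot(w.T, X) + b               # shape (1, m)
A = sigmoid(Z)                       # shape (1, m)

# Same computation, one example at a time
A_loop = np.zeros((1, m))
for i in range(m):
    z_i = np.dot(w[:, 0], X[:, i]) + b
    A_loop[0, i] = sigmoid(z_i)

print(Z.shape, np.allclose(A, A_loop))
```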

## Vectorizing Logistic Regression

#### Without vectorization

$dz^{(1)} = a^{(1)} - y^{(1)}$,  $dz^{(2)} = a^{(2)} - y^{(2)}$, $ \cdots$

$dZ = [dz^{(1)} dz^{(2)} \cdots dz^{(m)} ] $

$ A = \left[ a^{(1)},  a^{(2)},  \cdots  a^{(m)}  \right]$

$ Y = \left[ y^{(1)},  y^{(2)},  \cdots  y^{(m)}  \right]$

$ dZ = A - Y =  \left[ a^{(1)}-y^{(1)},  a^{(2)}-y^{(2)},  \cdots  a^{(m)}-y^{(m)}  \right] $


$ dw = 0 $  
$ dw += x^{(1)}dz^{(1)} $    
$ dw += x^{(2)}dz^{(2)} $    
$ \cdots $   
$ dw  += x^{(m)}dz^{(m)}$   
$ dw /=m $

$ db = 0 $   
$ db  += dz^{(1)} $   
$ db  += dz^{(2)} $   
$ \cdots $   
$ db  += dz^{(m)} $   
$ db /=m $

#### With vectorization

```
db = 1 / m * np.sum(dZ)
dw = 1 / m * np.dot(X, dZ.T)
```
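
The sketch below compares the accumulation loop for $dw$ and $db$ with the vectorized expressions. The activations `A` and labels `Y` are random stand-ins (an assumption), since only the shapes matter for this check.

```python
import numpy as np

rng = np.random.default_rng(2)                       # arbitrary fixed seed
n_x, m = 2, 6                                        # assumed sizes
X = rng.standard_normal((n_x, m))                    # one example per column
A = rng.uniform(0.05, 0.95, size=(1, m))             # stand-in activations
Y = rng.integers(0, 2, size=(1, m)).astype(float)    # stand-in 0/1 labels

dZ = A - Y                                           # shape (1, m)

# Non-vectorized: accumulate over examples, then average
dw_loop = np.zeros((n_x, 1))
db_loop = 0.0
for i in range(m):
    dw_loop += X[:, i:i + 1] * dZ[0, i]              # x^(i) * dz^(i)
    db_loop += dZ[0, i]
dw_loop /= m
db_loop /= m

# Vectorized
dw = np.dot(X, dZ.T) / m                             # shape (n_x, 1)
db = np.sum(dZ) / m

print(np.allclose(dw, dw_loop), np.isclose(db, db_loop))
```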

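**Without vectorization**:
```
# Sketch for two features. Assumes x[i] is the i-th example (x[i][0], x[i][1]),
# y[i] is its label, and w1, w2, b, alpha, m, sigmoid are already defined.
for it in range(1000):
    J, dw1, dw2, db = 0.0, 0.0, 0.0, 0.0   # reset the accumulators every iteration
    for i in range(m):
        z_i = w1 * x[i][0] + w2 * x[i][1] + b
        a_i = sigmoid(z_i)
        J += -(y[i] * np.log(a_i) + (1 - y[i]) * np.log(1 - a_i))
        dz_i = a_i - y[i]
        dw1 += x[i][0] * dz_i
        dw2 += x[i][1] * dz_i
        db += dz_i
    J, dw1, dw2, db = J / m, dw1 / m, dw2 / m, db / m
    # gradient descent update
    w1, w2, b = w1 - alpha * dw1, w2 - alpha * dw2, b - alpha * db
```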

**With vectorization**:
```
for iter in range(1000):
    Z = np.dot(w.T, X) + b
    A = sigmoid(Z)
    dZ = A - Y
    dw = 1/m * np.dot(X,dZ.T)
    db = 1/m * np.sum(dZ)
    w = w - alpha*dw
    b = b - alpha*db
```
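
To see the vectorized loop run end to end, here is a self-contained sketch on synthetic data. The data set (labels are 1 when $x_1 + x_2 > 0$), the learning rate `alpha = 0.1`, the zero initialization, and the cost tracking are all assumptions added for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic, linearly separable data (assumed for illustration)
rng = np.random.default_rng(3)
n_x, m = 2, 200
X = rng.standard_normal((n_x, m))                  # one example per column
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)      # shape (1, m)

w = np.zeros((n_x, 1))
b = 0.0
alpha = 0.1                                        # assumed learning rate

costs = []
for it in range(1000):
    Z = np.dot(w.T, X) + b
    A = sigmoid(Z)
    # cross-entropy cost, averaged over the m examples
    costs.append(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))
    dZ = A - Y
    dw = np.dot(X, dZ.T) / m
    db = np.sum(dZ) / m
    w = w - alpha * dw
    b = b - alpha * db

accuracy = np.mean((A > 0.5) == (Y == 1))
print(costs[0], costs[-1], accuracy)
```

The outer loop over iterations remains; only the loops over examples and features are replaced by array operations.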

## Broadcasting example

Calories from Carbs, Proteins, Fats in 100g of different foods:

$ \begin{matrix}
 & \text{Apples} & \text{Beef} & \text{Eggs} & \text{Potatoes} \\
\text{Carb} & 56.0 & 0.0 & 4.4 & 68.0 \\
\text{Protein} & 1.2 & 104.0 & 52.0 & 8.0 \\
\text{Fat} & 1.8 & 135.0 & 99.0 & 0.9
\end{matrix} $

Calculate the % of calories from Carb, Protein, and Fat for each food. Can you do this without an explicit for-loop?


In [3]:
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
             [1.2, 104.0, 52.0, 8.0],
             [1.8, 135.0, 99.0, 0.9]])
print(A)

[[ 56.    0.    4.4  68. ]
 [  1.2 104.   52.    8. ]
 [  1.8 135.   99.    0.9]]


In [4]:
cal = A.sum(axis=0)
print(cal)

[ 59.  239.  155.4  76.9]


In [5]:
percentage = 100 * A / cal.reshape(1,4)  # or percentage = 100 * A / cal
print(percentage)

[[94.91525424  0.          2.83140283 88.42652796]
 [ 2.03389831 43.51464435 33.46203346 10.40312094]
 [ 3.05084746 56.48535565 63.70656371  1.17035111]]


$  \left[ \begin{matrix} 1 \\ 2 \\ 3 \\ 4 \end{matrix}  \right]  + 100 =  \left[ \begin{matrix} 101 \\ 102 \\ 103 \\ 104 \end{matrix}  \right]$

$  \left[ \begin{matrix} 1 & 2 & 3\\
4 & 5 & 6 \end{matrix}  \right]  + \left[ \begin{matrix} 100 & 200 & 300 \end{matrix}  \right]
= \left[ \begin{matrix} 101 & 202 & 303\\
104 & 205 & 306 \end{matrix}  \right]  $

$  \left[ \begin{matrix} 1 & 2 & 3\\
4 & 5 & 6 \end{matrix}  \right]  + 
\left[ \begin{matrix} 100 \\ 200  \end{matrix}  \right]
= \left[ \begin{matrix} 101 & 102 & 103\\
204 & 205 & 206 \end{matrix}  \right]  $


#### General Principle

$$ (m,n) \text{ matrix} \quad \begin{matrix} + \\ - \\ * \\ / \end{matrix} \quad \begin{matrix} (1,n) \text{ matrix} \Rightarrow \text{copied down } m \text{ rows to } (m,n) \\ (m,1) \text{ matrix} \Rightarrow \text{copied across } n \text{ columns to } (m,n) \end{matrix} $$

A scalar is broadcast to the shape of the vector it is combined with:

$ \left[ \begin{matrix} 1 \\ 2 \\3  \end{matrix} \right] + 100 = \left[ \begin{matrix} 101 \\ 102 \\ 103  \end{matrix} \right] $

$ \left[ \begin{matrix} 1& 2& 3  \end{matrix} \right] + 100 = \left[ \begin{matrix} 101 & 102 & 103  \end{matrix} \right] $
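
The matrix examples above can be reproduced directly in NumPy. The sketch also uses `np.tile` to spell out the copying that broadcasting performs implicitly; the variable names are chosen here for illustration.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])              # shape (2, 3)
row = np.array([[100, 200, 300]])      # shape (1, 3)
col = np.array([[100], [200]])         # shape (2, 1)

plus_row = M + row                     # row is reused for each of the 2 rows
plus_col = M + col                     # col is reused for each of the 3 columns
plus_scalar = M + 100                  # scalar is reused for every entry

print(plus_row)
print(plus_col)
print(plus_scalar)

# Explicit equivalents: copy the smaller array out to shape (2, 3) first
row_tiled = np.tile(row, (2, 1))
col_tiled = np.tile(col, (1, 3))
print(np.array_equal(plus_row, M + row_tiled))
print(np.array_equal(plus_col, M + col_tiled))
```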

### Python-numpy vectors


In [6]:
import numpy as np

a = np.random.randn(5)     # rank-1 array => avoid this!
print(a)

[ 0.18624956  0.28232509 -0.90260203  0.2089896  -1.16819876]


In [7]:
print(a.shape)    # rank-1 array: neither a row vector nor a column vector

(5,)


In [8]:
print(a.T)        # same as a

[ 0.18624956  0.28232509 -0.90260203  0.2089896  -1.16819876]


In [9]:
print(np.dot(a,a.T))

2.3374517776930843


In [10]:
a = np.random.randn(5,1)  # a column vector => use this !
print(a)

[[ 1.08409355]
 [-0.34657684]
 [-0.41501582]
 [ 1.34656157]
 [ 1.58396164]]


In [11]:
print(a.T)

[[ 1.08409355 -0.34657684 -0.41501582  1.34656157  1.58396164]]


In [12]:
print(np.dot(a, a.T))

[[ 1.17525883 -0.37572172 -0.44991597  1.45979871  1.7171626 ]
 [-0.37572172  0.1201155   0.14383487 -0.46668705 -0.54896442]
 [-0.44991597  0.14383487  0.17223813 -0.55884435 -0.65736914]
 [ 1.45979871 -0.46668705 -0.55884435  1.81322805  2.13290186]
 [ 1.7171626  -0.54896442 -0.65736914  2.13290186  2.50893447]]


In [26]:
assert a.shape == (5,1)
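
If a rank-1 array does turn up, `reshape` converts it to an explicit row or column vector. The sketch below also shows one way rank-1 arrays can surprise you: adding a `(5,)` array to a `(5, 1)` array broadcasts to `(5, 5)` rather than raising an error. The variable names are assumptions for illustration.

```python
import numpy as np

a = np.random.randn(5)          # rank-1 array, shape (5,)

col = a.reshape(5, 1)           # explicit column vector
row = a.reshape(1, 5)           # explicit row vector

inner = np.dot(row, col)        # (1, 5) x (5, 1) -> (1, 1)
outer = np.dot(col, row)        # (5, 1) x (1, 5) -> (5, 5)

# Pitfall: mixing a rank-1 array with a column vector silently broadcasts
surprise = a + col              # shape (5, 5), not (5,) or (5, 1)

print(col.shape, row.shape, inner.shape, outer.shape, surprise.shape)
assert col.shape == (5, 1)      # cheap shape checks catch such mistakes early
```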