* [Vectorization](#vect)
* [Vectorizing Logistic Regression](#vectlogreg)
* [Broadcasting](#broadcasting)
* [Python Numpy](#numpy)
* [Explanation of Logistic Regression Cost function](#explanation-of-logistic-regression-cost-function)

<img id="vect" src="https://i.imgur.com/TxXjiYe.png" style="width:650px;height:370px; float: left;">

In [1]:
import numpy as np

a = np.array([1,2,3,4])
print(a)
a = a.reshape(4,1)
print(a)

[1 2 3 4]
[[1]
 [2]
 [3]
 [4]]


In [2]:
import time

a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time()
c = np.dot(a,b)
toc = time.time()

print(c)
print("Vectorized version:" + str(1000*(toc-tic)) + "ms")

c = 0
tic = time.time()
for i in range(1000000):
    c += a[i]*b[i]
toc = time.time()

print(c)
print("For loop version:" + str(1000*(toc-tic)) + "ms")

250424.8350604418
Vectorized version:16.0214900970459ms
250424.83506043287
For loop version:364.59803581237793ms


<img src="https://i.imgur.com/dazXdzS.png" style="width:650px;height:380px; float: left;">

<img src="https://i.imgur.com/TAYpdTB.png" style="width:650px;height:380px; float: left;">

<img src="https://i.imgur.com/kWcoJQh.png" style="width:650px;height:380px; float: left;">

<img id="vectlogreg" src="https://i.imgur.com/H42svVk.png" style="width:650px;height:380px; float: left;">

$$Z = [z^{(1)}, \dots, z^{(m)}] = [w^{T}x^{(1)}+b, \dots, w^{T}x^{(m)}+b] = w^{T}X + b, \quad \text{in NumPy: } \texttt{np.dot(w.T, X) + b}$$

$$A = \sigma(Z) = [\sigma(w^{T}x^{(1)}+b), \dots, \sigma(w^{T}x^{(m)}+b)] = [a^{(1)}, \dots, a^{(m)}]$$
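
The two equations above can be sketched in NumPy as follows. This is a minimal illustration, not the notebook's own code: it assumes `X` has shape `(n_x, m)` with one example per column, `w` has shape `(n_x, 1)`, and `b` is a scalar. The `sigmoid` helper and the toy values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (assumed for illustration): n_x = 2 features, m = 3 examples.
X = np.array([[1.0, 2.0, -1.0],
              [0.5, -1.5, 2.0]])   # shape (2, 3), one example per column
w = np.array([[0.2], [-0.4]])      # shape (2, 1)
b = 0.1                            # scalar, broadcast across all m columns

Z = np.dot(w.T, X) + b             # all m pre-activations at once, shape (1, 3)
A = sigmoid(Z)                     # all m predictions at once, shape (1, 3)

print(Z.shape, A.shape)
```

No explicit loop over the `m` examples is needed: the matrix product handles every column of `X`, and adding the scalar `b` relies on broadcasting.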

<img src="https://i.imgur.com/Xng06L5.png" style="width:650px;height:380px; float: left;">

<img src="https://i.imgur.com/SAyvy3n.png" style="width:650px;height:380px; float: left;">

<img id="broadcasting" src="https://i.imgur.com/yKUl3eP.png" style="width:650px;height:400px; float: left;">

In [3]:
import numpy as np

A = np.array([[56.0, 0.42, 1.0, 68.0],
             [1.2,104.0,52.0,8.0],
             [1.8,135.0,99.0,0.9]])

print(A.sum(axis=0))
print(A.sum(axis=1))

[ 59.   239.42 152.    76.9 ]
[125.42 165.2  236.7 ]


In [4]:
cal = A.sum(axis=0)
print(cal)

percentage = A/cal.reshape(1,4)
print(percentage)

[ 59.   239.42 152.    76.9 ]
[[0.94915254 0.00175424 0.00657895 0.88426528]
 [0.02033898 0.43438309 0.34210526 0.10403121]
 [0.03050847 0.56386267 0.65131579 0.01170351]]
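
The `reshape(1,4)` call above makes the shape explicit before dividing. As a hedged alternative sketch, `keepdims=True` asks `sum` to keep the reduced axis, so the result already has shape `(1, 4)` and broadcasts against the `(3, 4)` matrix in the same way. The array repeats the one from cell In [3]; the names `col_totals` and `fraction` are illustrative only.

```python
import numpy as np

A = np.array([[56.0, 0.42, 1.0, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

col_totals = A.sum(axis=0, keepdims=True)  # shape (1, 4) instead of (4,)
fraction = A / col_totals                  # (3, 4) / (1, 4) broadcasts to (3, 4)

print(col_totals.shape)
print(fraction.sum(axis=0))  # each column should sum to 1, up to rounding
```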


<img src="https://i.imgur.com/W3neiIl.png" style="width:650px;height:400px; float: left;">

<img src="https://i.imgur.com/6TyoPUd.png" style="width:650px;height:400px; float: left;">

<img id="numpy" src="https://i.imgur.com/ydP5Lq3.png" style="width:620px;height:340px; float: left;">

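Do not use `a = np.random.randn(5)`: it creates a rank-1 array of shape `(5,)`, which is neither a row vector nor a column vector.
Use `a = np.random.randn(5,1)` instead, which creates a column vector of shape `(5,1)`.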

In [5]:
import numpy as np

a = np.random.randn(5)
print(a)

[ 0.81703899 -1.64804466 -1.58170196  1.33225987  0.88114986]


In [6]:
# Avoid rank-1 arrays: shape (5,) is neither a row vector nor a column vector
print(a.shape)

(5,)


In [7]:
print(a.T)

[ 0.81703899 -1.64804466 -1.58170196  1.33225987  0.88114986]


In [8]:
print(np.dot(a,a.T))

8.436726417092613


In [9]:
# column vector
a = np.random.randn(5,1)
print(a)

[[ 2.74656694]
 [ 0.90541408]
 [ 1.48085694]
 [ 0.08489925]
 [-2.11077875]]


In [10]:
print(a.T)

[[ 2.74656694  0.90541408  1.48085694  0.08489925 -2.11077875]]


In [11]:
print(np.dot(a,a.T))

[[ 7.54362995e+00  2.48678037e+00  4.06727272e+00  2.33181486e-01
  -5.79739512e+00]
 [ 2.48678037e+00  8.19774654e-01  1.34078872e+00  7.68689804e-02
  -1.91112880e+00]
 [ 4.06727272e+00  1.34078872e+00  2.19293728e+00  1.25723651e-01
  -3.12576136e+00]
 [ 2.33181486e-01  7.68689804e-02  1.25723651e-01  7.20788344e-03
  -1.79203542e-01]
 [-5.79739512e+00 -1.91112880e+00 -3.12576136e+00 -1.79203542e-01
   4.45538692e+00]]


In [12]:
assert(a.shape == (5,1))
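
The cells above can be condensed into one short comparison. This sketch uses fixed values rather than random ones so the results are reproducible; the names `r` and `c` are illustrative. It also shows why mixing the two kinds of array is risky: adding a rank-1 array to a column vector silently broadcasts to a full matrix.

```python
import numpy as np

r = np.arange(5.0)           # rank-1 array, shape (5,)
c = r.reshape(5, 1)          # explicit column vector, shape (5, 1)

print(r.T.shape)             # transposing a rank-1 array changes nothing
print(np.dot(r, r.T))        # inner product, a plain scalar
print(np.dot(c.T, c).shape)  # inner product as a 1x1 matrix
print(np.dot(c, c.T).shape)  # outer product, a 5x5 matrix
print((r + c).shape)         # (5,) + (5, 1) broadcasts to (5, 5)
```

Calling `reshape` on an array whose shape is uncertain is cheap, and an `assert` on the shape, as in cell In [12], documents the expectation.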

# Explanation of Logistic Regression Cost function
$$\hat{y}^{(i)} = \sigma(w^T x^{(i)} + b) = \sigma(z^{(i)}) = \frac{1}{1+e^{-z^{(i)}}},\quad \text{where } z^{(i)} = w^T x^{(i)} + b$$

$$\hat{y} = P(y=1|x)$$

$$\text{If } y=1:\ P(y|x) = \hat{y}$$
$$\text{If } y=0:\ P(y|x) = 1 - \hat{y}$$

$$P(y|x) = \hat{y}^{y}\cdot(1 - \hat{y})^{(1-y)}$$

$$\log P(y|x) = \log\left(\hat{y}^{y}\cdot(1 - \hat{y})^{(1-y)}\right) = y\cdot\log\hat{y} + (1 - y)\cdot\log(1-\hat{y}) = -L(\hat{y}, y)$$
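
As a quick numerical check of the identity above, the sketch below evaluates both sides for a single example. The prediction `y_hat = 0.8` is an arbitrary value chosen for illustration, and the `loss` helper is hypothetical.

```python
import numpy as np

def loss(y_hat, y):
    """Cross-entropy loss L(y_hat, y) for a single example."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y_hat = 0.8  # assumed predicted probability that y = 1

for y in (0, 1):
    p = y_hat**y * (1 - y_hat)**(1 - y)  # P(y|x) in its compact form
    # The last two printed values should agree: -log P(y|x) and L(y_hat, y).
    print(y, p, -np.log(p), loss(y_hat, y))
```

For `y = 1` the compact form reduces to `y_hat`, and for `y = 0` it reduces to `1 - y_hat`, matching the two cases listed above.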

# Cost on m examples
Find the maximum likelihood estimate: choose the parameters that maximize the probability of the labels in the training set, assuming the examples are independent.
$$\log p(\text{labels in training set}) = \log\prod_{i=1}^{m}p(y^{(i)}|x^{(i)}) = \sum_{i=1}^{m}\log p(y^{(i)}|x^{(i)})$$
$$ = -\sum_{i=1}^{m}L(\hat{y}^{(i)}, y^{(i)})$$

Maximizing this log-likelihood is the same as minimizing the sum of the losses, so the cost to minimize is the average loss:
$$J(w,b) = \frac{1}{m}\sum_{i=1}^{m}L(\hat{y}^{(i)}, y^{(i)})$$
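
A hedged NumPy sketch of the cost over m examples, taken here as the average per-example loss, which equals the negative log-likelihood divided by m. The helper names `cost` and `log_likelihood` and the toy labels and predictions are assumptions made for illustration.

```python
import numpy as np

def cost(A, Y):
    """Average cross-entropy loss over m examples; A and Y have shape (1, m)."""
    m = Y.shape[1]
    losses = -(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    return np.sum(losses) / m

def log_likelihood(A, Y):
    """Log of the product of P(y|x) over all examples, assumed independent."""
    return np.sum(np.log(A**Y * (1 - A)**(1 - Y)))

Y = np.array([[1, 0, 1, 1]])                 # toy labels, m = 4
A_good = np.array([[0.9, 0.1, 0.8, 0.7]])    # predictions close to the labels
A_bad = np.array([[0.4, 0.6, 0.5, 0.3]])     # predictions far from the labels

for A in (A_good, A_bad):
    # The cost equals the negative log-likelihood divided by m.
    print(cost(A, Y), -log_likelihood(A, Y) / Y.shape[1])
```

The predictions with the higher likelihood give the lower cost, which is why minimizing the cost maximizes the probability of the training labels.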