# Linear regression

$n=[\text{number of features}]$  
$m=[\text{sample size}]$  
$x=[\text{input value}]$  
$w=[\text{input weight}]$  
$y=[\text{target value}]$  
$\alpha=[\text{learning rate}]$

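$\displaystyle h(x_i)=\sum_{j=1}^{n}w_jx_{ij}$  
$\displaystyle l=\frac{1}{m}\sum_{i=1}^{m}(h(x_i)-y_i)^2$  
$\displaystyle l'_{w_j}=\frac{2}{m}\sum_{i=1}^{m}(h(x_i)-y_i)x_{ij}, \quad j \in \{1,\dots,n\}$  
$\displaystyle w_j=w_j - \frac{\alpha}{m}\sum_{i=1}^{m}(h(x_i)-y_i)x_{ij}, \quad j \in \{1,\dots,n\}$

Here $i$ indexes samples and $j$ indexes features. The constant factor 2 from the derivative is absorbed into the learning rate $\alpha$ in the update rule.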
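One way to gain confidence in the derivative of the loss is to compare it with a finite-difference estimate. The sketch below does that on the same toy data used in the notebook cell; the test point `w` and the helper names `analytic_grad` and `numeric_grad` are made up for illustration.

```python
import numpy as np

# Toy data: the first column of ones is the bias feature.
X = np.array([[1, 3, 2],
              [1, 4, 5],
              [1, 7, 4]], dtype=float)
y = np.array([4.0, 5.0, 2.0])
w = np.array([0.5, -0.2, 0.1])  # arbitrary test point (hypothetical)
m = len(y)

def loss(w):
    # Mean squared error: 1/m * sum((h(x_i) - y_i)^2)
    return np.sum((X.dot(w) - y) ** 2) / m

def analytic_grad(w):
    # 2/m * sum((h(x_i) - y_i) * x_ij) for every j, written as a matrix product
    return 2.0 / m * X.T.dot(X.dot(w) - y)

def numeric_grad(w, eps=1e-6):
    # Central differences, one weight at a time
    g = np.zeros_like(w)
    for j in range(len(w)):
        step = np.zeros_like(w)
        step[j] = eps
        g[j] = (loss(w + step) - loss(w - step)) / (2 * eps)
    return g

print(np.allclose(analytic_grad(w), numeric_grad(w), atol=1e-5))
```

If the two gradients disagree, the usual suspects are a sum taken over the wrong axis or a missing transpose.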

$H=X \cdot W$

$W = W - \frac{\alpha}{m}X^T(H-Y)$
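The matrix form is the per-weight sum written as a single product, with the sum over samples carried out by multiplying with the transpose of $X$. A minimal sketch of that equivalence, assuming $W$ and $Y$ are column vectors; the function names `step_loop` and `step_vectorized` are hypothetical:

```python
import numpy as np

# Same toy data as the notebook cell (first column is the bias feature).
X = np.array([[1, 3, 2],
              [1, 4, 5],
              [1, 7, 4]], dtype=float)
Y = np.array([[4], [5], [2]], dtype=float)
W = np.zeros((3, 1))
m, n = X.shape
alpha = 0.01

def step_loop(X, Y, W, alpha):
    # One update written exactly like the per-weight sum over samples.
    H = X.dot(W)
    W_new = W.copy()
    for j in range(n):
        total = 0.0
        for i in range(m):
            total += (H[i, 0] - Y[i, 0]) * X[i, j]
        W_new[j, 0] = W[j, 0] - alpha / m * total
    return W_new

def step_vectorized(X, Y, W, alpha):
    # The same update as one matrix product: X^T (H - Y) has shape (n, 1).
    return W - alpha / m * X.T.dot(X.dot(W) - Y)

W_loop = step_loop(X, Y, W, alpha)
W_vec = step_vectorized(X, Y, W, alpha)
print(np.allclose(W_loop, W_vec))  # True
```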

In [100]:
import numpy as np

# Start from zero weights; an alternative starting point is kept for reference.
#W = np.array([3,2,5])[np.newaxis].T
W = np.zeros((3, 1))
# The first column of ones is the bias feature.
X = np.array([
        [1,3,2],
        [1,4,5],
        [1,7,4]
    ])
Y = np.array([4,5,2])[np.newaxis].T

m = Y.shape[0]
alpha = 0.01

def h(X, W):
    # Hypothesis: H = X . W
    return X.dot(W)

def gradient(X, H, Y):
    # Sum over samples for every weight: X^T (H - Y), shape (n, 1)
    return X.T.dot(H - Y)

def gradient_descent(X, Y, W):
    # One update step: W - alpha / m * X^T (H - Y)
    return W - alpha * gradient(X, h(X, W), Y) / m

def loss(H, Y):
    # Mean squared error
    return np.sum((H - Y)**2) / m

# Only needed when X does not already contain the bias column:
#X = np.insert(X, 0, 1, axis=1)

def report(W):
    # Print the weights and the current loss, rounded to 4 decimals
    weights = ", ".join("%.4f" % w for w in W.ravel())
    print("W = [%s], loss = %.4f" % (weights, loss(h(X, W), Y)))

report(W)
for _ in range(3):
    W = gradient_descent(X, Y, W)
    report(W)

W = [0.0000, 0.0000, 0.0000], loss = 15.0000
W = [0.0367, 0.1533, 0.1367], loss = 7.9984
W = [0.0608, 0.2425, 0.2239], loss = 5.3956
W = [0.0773, 0.2929, 0.2811], loss = 4.4096


# Logistic regression

$\sigma(z)=\displaystyle\frac{1}{1+e^{-z}}$

$\sigma'(z)=\sigma(z)(1-\sigma(z))$
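The sigmoid and its derivative translate directly into NumPy. The sketch below also compares the closed-form derivative with a finite-difference estimate; the names `sigmoid` and `sigmoid_prime` and the sample points in `z` are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    # 1 / (1 + e^(-z)), applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-2.0, 0.0, 2.0])  # sample points (hypothetical)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print(sigmoid(0.0))        # 0.5
print(sigmoid_prime(0.0))  # 0.25
print(np.allclose(sigmoid_prime(z), numeric))
```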

$h(x_i) = \displaystyle\sigma\left(\sum_{j=1}^{n}w_{j}x_{ij}\right)$

$l = \displaystyle\frac{1}{m}\sum_{i=1}^{m}\left[-y_i\log(h(x_i)) - (1 - y_i)\log(1-h(x_i))\right]$
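To see how this loss behaves, it helps to evaluate it on a few hand-picked predictions. This is a small sketch with made-up labels and probabilities (`Y`, `H_good`, `H_bad`, and `H_half` are all hypothetical); it assumes the labels are 0 or 1 and the predictions lie strictly between 0 and 1.

```python
import numpy as np

def cross_entropy(H, Y):
    # 1/m * sum(-y_i * log(h_i) - (1 - y_i) * log(1 - h_i))
    m = Y.shape[0]
    return np.sum(-Y * np.log(H) - (1 - Y) * np.log(1 - H)) / m

Y = np.array([[1], [0], [1]])             # hypothetical binary labels
H_good = np.array([[0.9], [0.1], [0.8]])  # confident and correct
H_bad = np.array([[0.1], [0.9], [0.2]])   # confident and wrong
H_half = np.full((3, 1), 0.5)             # undecided everywhere

print(cross_entropy(H_good, Y) < cross_entropy(H_bad, Y))  # True
print(np.isclose(cross_entropy(H_half, Y), np.log(2)))     # True
```

Confident wrong predictions are penalized heavily, while an undecided model that always outputs 0.5 scores $\log 2$ regardless of the labels.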

$l_j' = \displaystyle\frac{1}{m}\sum_{i=1}^{m}(h(x_i)-y_i)x_{ij}, \quad j \in \{1,\dots,n\}$

$\displaystyle w_j = w_j - \frac{\alpha}{m}\sum_{i=1}^{m}(h(x_i)-y_i)x_{ij}, \quad j \in \{1,\dots,n\}$

### Vectorization

$H(X)=\displaystyle\frac{1}{1+e^{-XW}}$

$W = W - \frac{\alpha}{m}X^T(H(X)-Y)$
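The vectorized update can be sketched as a short gradient-descent loop. The feature matrix is the one used elsewhere in these notes, but the binary labels in `Y`, the step count of 100, and the function names are assumptions made for this example, since logistic regression needs targets in {0, 1}.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def h(X, W):
    # Hypothesis: sigmoid applied to X . W, shape (m, 1)
    return sigmoid(X.dot(W))

def loss(H, Y):
    # Cross-entropy averaged over the samples
    m = Y.shape[0]
    return np.sum(-Y * np.log(H) - (1 - Y) * np.log(1 - H)) / m

def gradient_descent(X, Y, W, alpha):
    # One step: W - alpha / m * X^T (H - Y)
    m = Y.shape[0]
    return W - alpha / m * X.T.dot(h(X, W) - Y)

X = np.array([[1, 3, 2],
              [1, 4, 5],
              [1, 7, 4]], dtype=float)
Y = np.array([[1], [1], [0]], dtype=float)  # hypothetical binary labels
W = np.zeros((3, 1))
alpha = 0.01

losses = [loss(h(X, W), Y)]
for _ in range(100):
    W = gradient_descent(X, Y, W, alpha)
    losses.append(loss(h(X, W), Y))

# With zero weights every prediction is 0.5, so the first loss is log(2).
print(np.isclose(losses[0], np.log(2)))  # True
print(losses[-1] < losses[0])            # True
```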

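In [28]:
#W = np.array([
#        [3],
#        [2],
#        [5]
#    ])

# Zero-initialized weights as a column vector, shape (3, 1)
W = np.zeros((3, 1))

X = np.array([
        [1,3,2],
        [1,4,5],
        [1,7,4]
    ])

# Targets reused from the linear example; logistic regression expects labels in {0, 1}
Y = np.array([
        [4],
        [5],
        [2]
    ])

print(W)

[[0.]
 [0.]
 [0.]]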


# Neural networks
Neural networks offer an alternative way to perform machine learning when the hypothesis is complex and depends on many features.