## Title

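>## Machine Learning Basic Process
1. Define a hypothesis -> choose a function H(x) that best represents the data
2. Define a cost function -> choose a function that measures the difference between the hypothesis output and the labels
3. Design a learning algorithm -> <b>adjust the parameters of H(x) so that the cost is minimized</b>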

>### Perceptron
>>##### Single Perceptron Structure
 - Activation function = step function, sigmoid, ReLU, etc.: a function that takes the weighted sum Σ(wx) + b as input and maps it to a defined output

$S=X\cdot W + b= \begin{bmatrix}x_{1}&x_{2}&x_{3}\end{bmatrix}\begin{bmatrix}w_{1}\\w_{2}\\w_{3}\end{bmatrix} + b = x_{1}w_{1} + x_{2}w_{2} + x_{3}w_{3} + b $
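The forward pass of a single perceptron can be sketched as follows: compute the weighted sum S, then pass it through an activation function. The function names (`step`, `sigmoid`, `relu`, `perceptron`) and the sample inputs, weights, and bias are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def step(s):
    # Step function: 1 if the weighted sum is positive, otherwise 0
    return np.where(s > 0, 1.0, 0.0)

def sigmoid(s):
    # Squashes the weighted sum into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-s))

def relu(s):
    # Passes positive values through and clips negative values to 0
    return np.maximum(0.0, s)

def perceptron(x, w, b, activation):
    # S = x . w + b, followed by the chosen activation function
    s = np.dot(x, w) + b
    return activation(s)

# Hypothetical inputs, weights, and bias chosen only for illustration
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 0.25])
b = 0.5

for f in (step, sigmoid, relu):
    print(f.__name__, perceptron(x, w, b, f))
```

With these values the weighted sum is negative, so the step and ReLU outputs are 0 while the sigmoid output falls slightly below 0.5.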

In [None]:
import numpy as np

X = np.array([1, 2, 3, 4, 5])
W = np.array([4, 5, 6, 7, 8])
B = 3

print(X*W) # element-wise product
print(np.sum(W*X) + B) # element-wise product, summed, plus the bias
print(np.matmul(W, X) + B) # same result: for 1-D arrays matmul is the inner product



>### Linear Regression
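- Uses the identity activation function f(x) = x, so H(x) = wx + b
> Cost function: Mean Squared Error (MSE), where m is the number of samples
> $\frac{1}{m}\sum_{i=1}^{m}(H(x_{i})-y_{i})^{2}$
- Cost function: a function that quantifies the difference (error) between the labels and the predictions.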

In [1]:
import numpy as np
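def Activation(x):
    # Identity activation: the linear hypothesis H(x) = W*x + B (reads the globals W and B)
    return W*x + B

def Cost():
    # Mean squared error between predictions and labels (reads the globals X and Y)
    return np.mean((Activation(X) - Y)**2)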

In [4]:
X = np.array([1,2,3,4,5,6,7,8,9,10], dtype=np.float32)
Y = np.array([3,5,7,9,11,13,15,17,19,21], dtype=np.float32)

W = 3
B = 1
print(Cost())

W = 7
B = 5
print(Cost())

# Brute-force scan over integer values of W and B; the labels follow Y = 2*X + 1
for W in range(10):
    for B in range(10):
        print(W, B, Cost())

38.5
1198.5
0 0 177.0
0 1 154.0
0 2 133.0
0 3 114.0
0 4 97.0
0 5 82.0
0 6 69.0
0 7 58.0
0 8 49.0
0 9 42.0
1 0 50.5
1 1 38.5
1 2 28.5
1 3 20.5
1 4 14.5
1 5 10.5
1 6 8.5
1 7 8.5
1 8 10.5
1 9 14.5
2 0 1.0
2 1 0.0
2 2 1.0
2 3 4.0
2 4 9.0
2 5 16.0
2 6 25.0
2 7 36.0
2 8 49.0
2 9 64.0
3 0 28.5
3 1 38.5
3 2 50.5
3 3 64.5
3 4 80.5
3 5 98.5
3 6 118.5
3 7 140.5
3 8 164.5
3 9 190.5
4 0 133.0
4 1 154.0
4 2 177.0
4 3 202.0
4 4 229.0
4 5 258.0
4 6 289.0
4 7 322.0
4 8 357.0
4 9 394.0
5 0 314.5
5 1 346.5
5 2 380.5
5 3 416.5
5 4 454.5
5 5 494.5
5 6 536.5
5 7 580.5
5 8 626.5
5 9 674.5
6 0 573.0
6 1 616.0
6 2 661.0
6 3 708.0
6 4 757.0
6 5 808.0
6 6 861.0
6 7 916.0
6 8 973.0
6 9 1032.0
7 0 908.5
7 1 962.5
7 2 1018.5
7 3 1076.5
7 4 1136.5
7 5 1198.5
7 6 1262.5
7 7 1328.5
7 8 1396.5
7 9 1466.5
8 0 1321.0
8 1 1386.0
8 2 1453.0
8 3 1522.0
8 4 1593.0
8 5 1666.0
8 6 1741.0
8 7 1818.0
8 8 1897.0
8 9 1978.0
9 0 1810.5
9 1 1886.5
9 2 1964.5
9 3 2044.5
9 4 2126.5
9 5 2210.5
9 6 2296.5
9 7 2384.5
9 8 2474.5
9 9 2566.5
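
The scan above can also be written without explicit loops. The following sketch assumes the same `X` and `Y` and uses a hypothetical helper named `cost_grid`; it evaluates the cost for every (W, B) pair with NumPy broadcasting and then picks the pair with the smallest cost.

```python
import numpy as np

X = np.array([1,2,3,4,5,6,7,8,9,10], dtype=np.float32)
Y = np.array([3,5,7,9,11,13,15,17,19,21], dtype=np.float32)

def cost_grid(ws, bs, x, y):
    # Predictions for every (w, b) pair: shape (len(ws), len(bs), len(x))
    pred = ws[:, None, None] * x[None, None, :] + bs[None, :, None]
    # Mean squared error over the sample axis: shape (len(ws), len(bs))
    return np.mean((pred - y) ** 2, axis=-1)

ws = np.arange(10, dtype=np.float32)
bs = np.arange(10, dtype=np.float32)
costs = cost_grid(ws, bs, X, Y)

# Position of the smallest cost in the 2-D grid
i, j = np.unravel_index(np.argmin(costs), costs.shape)
print(ws[i], bs[j], costs[i, j])
```

The minimum lands at W = 2, B = 1 with a cost of 0, which matches the `2 1 0.0` row in the printed table.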

>### Gradient Descent Algorithm(경사하강법)
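- Repeatedly move the parameters in the direction opposite to the gradient of the cost function, so that the cost decreases until the gradient approaches zero.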

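> Gradient -> partial derivatives<br>
> $\frac{\partial}{\partial w}cost(w, b) = \frac{2}{m}\sum_{i=1}^{m}x_{i}(x_{i}w + b - y_{i})$
<br>
> $\frac{\partial}{\partial b}cost(w, b) = \frac{2}{m}\sum_{i=1}^{m}(x_{i}w + b - y_{i})$
<br>
> Differentiating the squared error gives the constant factor 2. The `Gradient` function in this notebook omits it, which only rescales the effective learning rate.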
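
One way to sanity-check analytic gradients is to compare them with central finite differences. The sketch below assumes the MSE cost defined earlier and data following y = 2x + 1; the names `mse`, `analytic_grad`, and `numeric_grad`, as well as the test point, are illustrative assumptions. It keeps the factor 2 that comes from differentiating the square, so both results refer to the exact MSE.

```python
import numpy as np

x = np.arange(1, 11, dtype=np.float64)
y = 2 * x + 1

def mse(w, b):
    # Mean squared error of the linear hypothesis w*x + b
    return np.mean((w * x + b - y) ** 2)

def analytic_grad(w, b):
    # Exact partial derivatives of the MSE with respect to w and b
    err = w * x + b - y
    return 2 * np.mean(x * err), 2 * np.mean(err)

def numeric_grad(w, b, eps=1e-6):
    # Central finite differences in each parameter
    dw = (mse(w + eps, b) - mse(w - eps, b)) / (2 * eps)
    db = (mse(w, b + eps) - mse(w, b - eps)) / (2 * eps)
    return dw, db

w0, b0 = 0.5, -1.0  # arbitrary test point
print(analytic_grad(w0, b0))
print(numeric_grad(w0, b0))
```

The two printed pairs should agree up to floating-point rounding. A mismatch by a constant factor would point to a dropped or extra coefficient in the analytic formula.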


In [16]:
x_input = np.array([1,2,3,4,5,6,7,8,9,10], dtype=np.float32)
labels = np.array([3,5,7,9,11,13,15,17,19,21], dtype=np.float32)

# Random initial parameters drawn from a standard normal distribution
W, B = np.random.normal(), np.random.normal()

def Hypothesis(x):
    return W*x + B

def Cost():
    return np.mean((Hypothesis(x_input) - labels)**2)

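def Gradient(x, y):
    # Partial derivatives of the cost with respect to W and B (reads the globals W and B).
    # The constant factor 2 from differentiating the square is omitted; it only rescales the learning rate.
    return np.mean(x*(x*W + (B-y))), np.mean((W*x-y+B))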


In [17]:
%%time

epochs = 10000
learning_rate = 0.001

for cnt in range(0, epochs+1):
    # Log progress at every tenth of the run
    if cnt%(epochs//10) == 0:
        print(f"{cnt:5} W = {W:.3f}, B = {B:.3f} Cost = {Cost():.3f}")

    # Gradient descent step: move W and B against the gradient
    grad_w, grad_b = Gradient(x_input, labels)
    W -= learning_rate*grad_w
    B -= learning_rate*grad_b

    0 W = -0.282, B = 0.597 Cost = 210.673
 1000 W = 2.009, B = 0.941 Cost = 0.001
 2000 W = 2.007, B = 0.952 Cost = 0.000
 3000 W = 2.006, B = 0.961 Cost = 0.000
 4000 W = 2.005, B = 0.968 Cost = 0.000
 5000 W = 2.004, B = 0.974 Cost = 0.000
 6000 W = 2.003, B = 0.979 Cost = 0.000
 7000 W = 2.002, B = 0.983 Cost = 0.000
 8000 W = 2.002, B = 0.986 Cost = 0.000
 9000 W = 2.002, B = 0.989 Cost = 0.000
10000 W = 2.001, B = 0.991 Cost = 0.000
CPU times: total: 156 ms
Wall time: 217 ms
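
The training loop above relies on global variables. A self-contained variant might look like the following sketch. The function name `fit_linear`, the zero initialization, and the hyperparameter values are assumptions made for illustration. It uses the exact MSE gradient (including the factor 2), so its learning rate is not directly comparable to the run above.

```python
import numpy as np

def fit_linear(x, y, learning_rate=0.01, epochs=5000):
    # Start from zero instead of a random draw so the run is reproducible
    w, b = 0.0, 0.0
    for _ in range(epochs):
        err = w * x + b - y                         # prediction error per sample
        w -= learning_rate * 2 * np.mean(x * err)   # step against d(cost)/dw
        b -= learning_rate * 2 * np.mean(err)       # step against d(cost)/db
    return w, b

x = np.arange(1, 11, dtype=np.float64)
y = 2 * x + 1
w, b = fit_linear(x, y)
print(f"W = {w:.3f}, B = {b:.3f}")
```

Passing the data and hyperparameters as arguments makes the routine reusable on other datasets and keeps the fitted parameters from being overwritten by later cells.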

