<a href="https://colab.research.google.com/github/kangwonlee/nmisp/blob/main/15_optimization/030_Classification_Optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Classification Optimization


* Suppose there are two sets, $set_0$ and $set_1$.
* Each set has $\frac{n}{2}$ entries.
* For every entry we can measure two variables, $\textbf{x} = (x_1, x_2)$.
* Can we decide which set an entry belongs to from these two measurements alone?



In [None]:
import matplotlib.pyplot as plt
import numpy as np
import numpy.random as nr
import scipy.optimize as so



Generate two data sets of $(x_1, x_2)$ points, one scattered around each mean.



In [None]:
set_0_bar = (2, 0)
set_1_bar = (0, 2)



In [None]:
n = 2000

set_0 = nr.normal(set_0_bar, [1, 1], (n//2, 2))
set_1 = nr.normal(set_1_bar, [1, 1], (n//2, 2))
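
The samples above are drawn from NumPy's global random state, so every run produces different points. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator is acceptable here; the seed value and the names `rng`, `set_0_demo`, and `set_1_demo` are illustrative and not part of the original notebook:

```python
import numpy as np

# Seeded generator (the seed 0 is an arbitrary, illustrative choice)
rng = np.random.default_rng(0)

n = 2000
set_0_bar = (2, 0)
set_1_bar = (0, 2)

# Each row is one (x1, x2) sample; the per-column mean and standard
# deviation are broadcast across the n // 2 rows
set_0_demo = rng.normal(set_0_bar, [1, 1], (n // 2, 2))
set_1_demo = rng.normal(set_1_bar, [1, 1], (n // 2, 2))

print(set_0_demo.shape, set_0_demo.mean(axis=0))
print(set_1_demo.shape, set_1_demo.mean(axis=0))
```

Re-running this cell with the same seed gives the same two point clouds, which makes the later optimization results easier to compare between runs.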



Visualize the two data sets



In [None]:
def plot_two_sets(set_a, set_b, set_a_x_bar=set_0_bar, set_b_x_bar=set_1_bar):

    plt.plot(set_a[:, 0], set_a[:, 1], '.', label="y=0", alpha=0.5)
    plt.plot(set_b[:, 0], set_b[:, 1], '+', label="y=1", alpha=0.5)

    plt.plot(set_a_x_bar[0], set_a_x_bar[1], 'kx')
    plt.plot(set_b_x_bar[0], set_b_x_bar[1], 'kx')
    
    plt.grid(True)
    plt.axis('equal')
    plt.xlabel("$x_1$")
    plt.ylabel("$x_2$")



In [None]:
plot_two_sets(set_0, set_1, set_0_bar, set_1_bar)

plt.legend(loc=0)
plt.show()
plt.close();



## Prepare data



Collect all input and output values into one `numpy.ndarray`.

| input | output |
|:----------:|:-----------:|
| $(x_1, x_2)$ | 0 |
| $(x_1, x_2)$ | 1 |



In [None]:
y0 = np.zeros((len(set_0), 1))
y1 = np.ones((len(set_1), 1))

data_0 = np.concatenate([set_0, y0], axis=1)
data_1 = np.concatenate([set_1, y1], axis=1)

data = np.concatenate([
        data_0,
        data_1
    ], axis=0)



Check the number of rows and columns



In [None]:
data.shape



First 10 data points



In [None]:
data[:10, :]



Last 10 data points



In [None]:
data[-10:, :]



## First (naive) attempt



### Model



Estimate $y$ from $x_1$ and $x_2$



$$
\hat y = H(\textbf x)= w_1 x_1 +  x_2 + w_2
$$



In [None]:
def wx(w:np.ndarray, x_y:np.ndarray) -> np.ndarray:
    w1 = w[0]
    w2 = w[1]
    
    x1 = x_y[:, 0]
    x2 = x_y[:, 1]

    return w1 * x1 + x2 + w2



Classify an entry as 1 if its $\hat y$ is larger than 0.5, and as 0 otherwise.
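
The 0.5 threshold can be sketched as a small helper that turns continuous estimates into labels and reports the fraction classified correctly. The function names `predict_label` and `accuracy` and the sample values are illustrative assumptions, not part of the original notebook:

```python
import numpy as np


def predict_label(y_hat: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return 1.0 where y_hat exceeds the threshold and 0.0 elsewhere."""
    return (y_hat > threshold).astype(float)


def accuracy(y_hat: np.ndarray, y: np.ndarray) -> float:
    """Fraction of samples whose thresholded estimate equals the true label."""
    return float(np.mean(predict_label(y_hat) == y))


# Tiny hand-made example (values are illustrative only)
y_hat_demo = np.array([-0.3, 0.2, 0.5, 0.7, 1.4])
y_demo = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

labels_demo = predict_label(y_hat_demo)
print(labels_demo)
print(accuracy(y_hat_demo, y_demo))
```

Note that a value of exactly 0.5 is labeled 0 here, because the rule says "larger than 0.5".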


### Cost function



$$
C = \frac{1}{n}\sum_{i=1}^{n} \left( \hat y_i - y_i \right)^2
$$


Can you anticipate any issues with this cost function?



In [None]:
def cost_function_first_attempt(w:np.ndarray, x_y:np.ndarray) -> float:
    n = len(x_y)
    y_hat = wx(w, x_y)
    y = x_y[:, -1]

    error = y_hat - y
    error_sqr = error ** 2
    result = error_sqr.sum() / n

    return result
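
One possible issue with this cost: because $\hat y$ is unbounded, a point lying far on the correct side of the boundary is still penalized. A small sketch with made-up numbers (not taken from the notebook's data; the `*_demo` names are illustrative):

```python
import numpy as np

# Three samples that all belong to set 1 (y = 1); the y_hat values are made up
y_demo = np.array([1.0, 1.0, 1.0])
y_hat_demo = np.array([1.0, 3.0, 0.6])

# All three would be classified correctly with the 0.5 threshold ...
correct_demo = (y_hat_demo > 0.5) == (y_demo == 1.0)

# ... yet the second one contributes by far the largest squared error
squared_error_demo = (y_hat_demo - y_demo) ** 2

print(correct_demo)
print(squared_error_demo)
```

The optimizer is therefore rewarded for pulling confident, correct estimates back toward 1, which can move the boundary away from where it separates the sets best.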



Optimize



In [None]:
result = so.minimize(cost_function_first_attempt, x0=nr.rand(2,), args=(data,), method="Nelder-Mead")
weights, cost_value, n_iter, n_call, warning = result.x, result.fun, result.nit, result.nfev, result.message
result
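
Because this cost is quadratic in $(w_1, w_2)$, its minimizer can also be obtained in closed form, which gives a way to check the Nelder-Mead result. A minimal sketch, assuming a freshly generated, seeded data set and using `numpy.linalg.lstsq`; `rng`, `mse_cost`, and the `*_demo` names are illustrative and not part of the original notebook:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
half = 1000

# Same layout as `data`: columns are x1, x2, y
set_0_demo = rng.normal((2, 0), [1, 1], (half, 2))
set_1_demo = rng.normal((0, 2), [1, 1], (half, 2))
data_demo = np.concatenate([
    np.concatenate([set_0_demo, np.zeros((half, 1))], axis=1),
    np.concatenate([set_1_demo, np.ones((half, 1))], axis=1),
], axis=0)

x1, x2, y = data_demo[:, 0], data_demo[:, 1], data_demo[:, 2]

# w1 * x1 + w2 ~= y - x2 is an ordinary linear least-squares problem
a_matrix = np.column_stack([x1, np.ones_like(x1)])
w_lstsq, *_ = np.linalg.lstsq(a_matrix, y - x2, rcond=None)


def mse_cost(w: np.ndarray) -> float:
    """Mean squared error of the first-attempt model on data_demo."""
    return float(np.mean((w[0] * x1 + x2 + w[1] - y) ** 2))


print(w_lstsq, mse_cost(w_lstsq))
```

Weights returned by `so.minimize` on the same data should agree with `w_lstsq` to within the optimizer's tolerance.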



Decision boundary satisfying $\hat y = 0.5$, the line that separates 0 from 1



$$
\begin{align}
    \hat y &= 0.5 \\
    w_1 x_1 +  x_2 + w_2 &= 0.5 \\
     x_2 &= -w_1 x_1 -w_2 + 0.5
\end{align}
$$



In [None]:
def plot_decision_boundary(x_min, x_max, weights_array):
    x1_array = np.linspace(x_min, x_max)
    x2_array = - weights_array[0] * x1_array - weights_array[1] + 0.5

    plt.plot(x1_array, x2_array, label=r"$\hat y = 0.5$")



In [None]:
plot_two_sets(set_0, set_1)
plot_decision_boundary(data[:, 0].min(), data[:, 0].max(), weights)

plt.legend(loc=0)
plt.show()
plt.close();



## Step function



A function generating 0 or 1



$$
s(z)=
    \begin{cases}
        0, & z < 0 \\
        1, & z \geq 0
    \end{cases}
$$



In [None]:
def step(z):
    return np.heaviside(z, 1)  # value at z == 0 is 1, matching s(z) = 1 for z >= 0
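
`np.heaviside` takes the value to return at exactly $z = 0$ as its second argument, so that argument decides whether the code matches the $z \geq 0$ case of the definition. A small sketch (the array `z_demo` is illustrative):

```python
import numpy as np

z_demo = np.array([-2.0, 0.0, 3.0])

# Second argument = value assigned where z == 0
print(np.heaviside(z_demo, 0.0))  # treats z == 0 like the negative side
print(np.heaviside(z_demo, 1.0))  # treats z == 0 like the non-negative side
```

For continuous random data a sample with $z$ exactly 0 is very unlikely, so the choice rarely changes the cost, but it does decide whether the code agrees with the formula.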



In [None]:
z_array = np.linspace(-10, 10)
g_z_array = step(z_array)
plt.plot(z_array, g_z_array)
plt.grid(True)
plt.xlabel('$z$')
plt.ylabel("$s(z)$")
plt.show()



Cost function using the step function



$$
\begin{align}
    H(\textbf x) &= s(w_1 x_1 +  x_2 + w_2) \\
    C(w_1, w_2) &= \frac{1}{n} \sum_{i=1}^n \left(H(\textbf{x}_i) - y_i\right)^2
\end{align}
$$


In [None]:
def cost_function_step(w:np.ndarray, x_y:np.ndarray) -> float:
    n = len(x_y)
    y_hat = step(wx(w, x_y))
    y = x_y[:, -1]

    error = y_hat - y
    error_sqr = error ** 2
    result = error_sqr.sum() / n

    return result



Optimize



In [None]:
result = so.minimize(cost_function_step, x0=nr.rand(2,), args=(data,), method="Nelder-Mead")
weights, cost_value, n_iter, n_call, warning = result.x, result.fun, result.nit, result.nfev, result.message
result
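
The step-based cost only changes when a sample crosses the boundary, so it is piecewise constant in the weights and a small move in $w$ usually leaves it unchanged. That gives a direct-search method such as Nelder-Mead little to work with, and the result can depend strongly on the random `x0`. A minimal sketch on freshly generated, seeded data (the seed, `step_cost`, `nudge`, and all `*_demo` names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
half = 1000
x_demo = np.concatenate([
    rng.normal((2, 0), [1, 1], (half, 2)),
    rng.normal((0, 2), [1, 1], (half, 2)),
], axis=0)
y_demo = np.concatenate([np.zeros(half), np.ones(half)])


def step_cost(w: np.ndarray) -> float:
    """Fraction of misclassified samples for z = w[0] * x1 + x2 + w[1]."""
    z = w[0] * x_demo[:, 0] + x_demo[:, 1] + w[1]
    y_hat = np.heaviside(z, 1.0)
    return float(np.mean((y_hat - y_demo) ** 2))


w_demo = np.array([0.3, 0.7])    # arbitrary starting point
nudge = np.array([1e-9, -1e-9])  # tiny perturbation of both weights

# The two costs are expected to be identical: no sample changes side
print(step_cost(w_demo), step_cost(w_demo + nudge))
```

A cost that is flat almost everywhere gives the optimizer no sense of direction, which motivates replacing the step with a smooth function.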



In [None]:
plot_two_sets(set_0, set_1)
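# The step output switches at z = 0, so add 0.5 to the bias to cancel the +0.5 offset inside plot_decision_boundary
plot_decision_boundary(data[:, 0].min(), data[:, 0].max(), weights + np.array([0.0, 0.5]))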

plt.legend(loc=0)
plt.show()
plt.close();



## Sigmoid function



A function connecting 0 and 1 smoothly<br>
ref : [![youtube](https://i.ytimg.com/vi/PIjno6paszY/hqdefault.jpg)](https://youtu.be/PIjno6paszY?t=650)



$$
z = w_1 x_1 +  x_2 + w_2
$$



$$
\hat y =  g(z)=\frac{1}{1+\exp\left(-z\right)}
$$



$$
C = \frac{1}{n}\sum_{i=1}^{n} \left( \hat y_i - y_i \right)^2
$$


In [None]:
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))
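
For very large negative $z$, `np.exp(-z)` overflows and NumPy emits a `RuntimeWarning`, although the value returned (0) is still the correct limit. A minimal sketch comparing the definition above with `scipy.special.expit`, SciPy's implementation of the same logistic function; it is offered as a possible alternative rather than as the notebook's choice, and `z_demo` is illustrative:

```python
import numpy as np
from scipy.special import expit


def sigmoid(z):
    """Logistic function written directly from its definition."""
    return 1.0 / (1.0 + np.exp(-z))


z_demo = np.linspace(-10.0, 10.0, 5)

# Both implementations agree on a moderate range of z
print(sigmoid(z_demo))
print(expit(z_demo))

# expit also returns the limiting values for extreme inputs
print(expit(np.array([-1000.0, 0.0, 1000.0])))
```

Over the range of $z$ produced by this notebook's data the two are interchangeable.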



In [None]:
z_array = np.linspace(-10, 10)
g_z_array = sigmoid(z_array)
plt.plot(z_array, g_z_array)
plt.grid(True)
plt.xlabel('$z$')
plt.ylabel("$g(z)$")
plt.show()



Cost function using the sigmoid function



In [None]:
def cost_function_sigmoid(w:np.ndarray, x_y:np.ndarray) -> float:
    n = len(x_y)
    y_hat = sigmoid(wx(w, x_y))
    y = x_y[:, -1]

    error = y_hat - y
    error_sqr = error ** 2
    result = error_sqr.sum() / n

    return result



Optimize



In [None]:
result = so.minimize(cost_function_sigmoid, x0=nr.rand(2,), args=(data,), method="Nelder-Mead")
weights, cost_value, n_iter, n_call, warning = result.x, result.fun, result.nit, result.nfev, result.message
result



In [None]:
plot_two_sets(set_0, set_1)
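# sigmoid(z) = 0.5 at z = 0, so add 0.5 to the bias to cancel the +0.5 offset inside plot_decision_boundary
plot_decision_boundary(data[:, 0].min(), data[:, 0].max(), weights + np.array([0.0, 0.5]))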

plt.legend(loc=0)
plt.show()
plt.close();



## Cross entropy cost function



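With a sigmoid output, the squared-error cost is not convex in the weights and can have local minima. The cross entropy cost is convex for this model, so a minimum the optimizer finds is the global one.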



ref : (14:23)
[![youtube](https://i.ytimg.com/vi/6vzchGYEJBc/hqdefault.jpg)](https://youtu.be/6vzchGYEJBc)



$$
C = \frac{1}{n}\sum_{i=1}^{n} \left[ -y_i \log \left( \hat y_i \right) - \left(1 - y_i \right) \log \left( 1 - \hat y_i \right)\right]
$$


In [None]:
def cost_function_cross_entropy(w:np.ndarray, x_y:np.ndarray) -> float:
    n = len(x_y)
    y_hat = sigmoid(wx(w, x_y))
    y = x_y[:, -1]

    # Natural log; a different base only rescales the cost and leaves the minimizer unchanged
    cost = -y * np.log(y_hat) - (1 - y) * np.log(1 - y_hat)

    return np.mean(cost)
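
If the sigmoid saturates to exactly 0 or 1 in floating point, the logarithm returns `-inf` and the cost can become `inf` or `nan`. A common guard is to clip $\hat y$ slightly away from 0 and 1 before taking the log. A minimal sketch, where the function name `cross_entropy_clipped` and the value of `eps` are illustrative assumptions and the natural logarithm is used:

```python
import numpy as np


def cross_entropy_clipped(y_hat: np.ndarray, y: np.ndarray, eps: float = 1e-12) -> float:
    """Mean binary cross entropy with y_hat kept inside [eps, 1 - eps]."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    cost = -y * np.log(y_hat) - (1.0 - y) * np.log(1.0 - y_hat)
    return float(np.mean(cost))


y_demo = np.array([0.0, 1.0, 1.0])

# Confident and correct estimates give a small cost
good_cost = cross_entropy_clipped(np.array([0.1, 0.9, 0.8]), y_demo)

# Saturated, wrong estimates give a large but finite cost
bad_cost = cross_entropy_clipped(np.array([1.0, 0.0, 1.0]), y_demo)

print(good_cost, bad_cost)
```

Clipping changes the cost only for estimates within `eps` of 0 or 1, so the location of the minimum is essentially unaffected.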



Optimize



In [None]:
result = so.minimize(cost_function_cross_entropy, x0=nr.rand(2,), args=(data,), method="Nelder-Mead")
weights, cost_value, n_iter, n_call, warning = result.x, result.fun, result.nit, result.nfev, result.message
result



In [None]:
plot_two_sets(set_0, set_1)
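# sigmoid(z) = 0.5 at z = 0, so add 0.5 to the bias to cancel the +0.5 offset inside plot_decision_boundary
plot_decision_boundary(data[:, 0].min(), data[:, 0].max(), weights + np.array([0.0, 0.5]))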

plt.legend(loc=0)

plt.show()
plt.close();



## scikit-learn



ref :
* [[0](https://scikit-learn.org/stable/modules/lda_qda.html)] description
* [[1](https://scikit-learn.org/stable/auto_examples/classification/plot_lda_qda.html)] example


In [None]:
import sklearn.discriminant_analysis as sd
lda = sd.LinearDiscriminantAnalysis(solver="svd", store_covariance=True)

X = data[:, :2]
y = data[:, 2]

lda.fit(X, y)
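
For two classes with a shared covariance matrix, the LDA decision boundary is a straight line that can be written in closed form from the class means and the pooled covariance. A minimal NumPy sketch of that textbook formula, assuming equal class priors and freshly generated, seeded data; it does not reproduce scikit-learn's internal `svd` solver, and the seed and all `*_demo` names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
half = 1000
set_0_demo = rng.normal((2, 0), [1, 1], (half, 2))
set_1_demo = rng.normal((0, 2), [1, 1], (half, 2))

mu_0 = set_0_demo.mean(axis=0)
mu_1 = set_1_demo.mean(axis=0)

# Pooled (shared) covariance estimate of the two classes
centered = np.concatenate([set_0_demo - mu_0, set_1_demo - mu_1], axis=0)
sigma = centered.T @ centered / (len(centered) - 2)

# Class 1 is predicted where w @ x + b > 0 (equal priors assumed)
w = np.linalg.solve(sigma, mu_1 - mu_0)
b = -0.5 * (mu_1 + mu_0) @ w

x_demo = np.concatenate([set_0_demo, set_1_demo], axis=0)
y_demo = np.concatenate([np.zeros(half), np.ones(half)])
accuracy_demo = np.mean((x_demo @ w + b > 0) == (y_demo == 1))

# Boundary written as x2 = slope * x1 + intercept
slope, intercept = -w[0] / w[1], -b / w[1]
print(slope, intercept, accuracy_demo)
```

With means (2, 0) and (0, 2) and unit variances, the ideal boundary is $x_2 = x_1$, so `slope` should come out near 1 and `intercept` near 0.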



In [None]:
def plot_mesh_pred(lda, x_min, x_max, y_min, y_max, nx=100, ny=100):

    x_mesh, y_mesh = np.meshgrid(
        np.linspace(x_min, x_max, nx),
        np.linspace(y_min, y_max, ny),
    )

    xy_mesh_columns = np.c_[x_mesh.ravel(), y_mesh.ravel()]

    z_column = lda.predict_proba(xy_mesh_columns)

    z_mesh = z_column[:, 1].reshape(x_mesh.shape)

    plt.pcolor(x_mesh, y_mesh, z_mesh, shading="auto")
    plt.contour(x_mesh, y_mesh, z_mesh, [0.5], colors="white")
    
    plt.grid(True)



In [None]:
plot_two_sets(set_0, set_1)
plot_mesh_pred(lda, -10, 10, -6, 10)

plt.legend(loc=0)
plt.show()
plt.close();



## TensorFlow



The following code was generated with some help from https://chat.openai.com.



In [None]:
import numpy as np
import tensorflow as tf


X = data[:, :2]
y = data[:, 2]


# Define the model architecture
model = tf.keras.Sequential([
  tf.keras.layers.Dense(1, input_shape=(2,), activation='sigmoid',)
])


# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])


# Train the model
model.fit(X, y, epochs=50, batch_size=10)


# Evaluate the model
loss, accuracy = model.evaluate(X, y)
print("Accuracy:", accuracy)
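
A single `Dense(1, activation='sigmoid')` unit trained with `binary_crossentropy` is logistic regression with two free weights and a bias (unlike the earlier model, the coefficient of $x_2$ is not fixed to 1). A minimal NumPy sketch of the same model trained by plain batch gradient descent rather than Adam; the learning rate, iteration count, seed, and `*_demo` names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
half = 1000
x_demo = np.concatenate([
    rng.normal((2, 0), [1, 1], (half, 2)),
    rng.normal((0, 2), [1, 1], (half, 2)),
], axis=0)
y_demo = np.concatenate([np.zeros(half), np.ones(half)])

w = np.zeros(2)
b = 0.0
learning_rate = 0.1

for _ in range(2000):
    y_hat = 1.0 / (1.0 + np.exp(-(x_demo @ w + b)))
    # Gradient of the mean cross entropy with respect to w and b
    error = y_hat - y_demo
    w -= learning_rate * (x_demo.T @ error) / len(y_demo)
    b -= learning_rate * error.mean()

y_hat = 1.0 / (1.0 + np.exp(-(x_demo @ w + b)))
accuracy_demo = np.mean((y_hat > 0.5) == (y_demo == 1))
print(w, b, accuracy_demo)
```

The gradient takes the simple form $X^T(\hat y - y)/n$ because the derivative of the sigmoid cancels against the derivative of the cross entropy.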


In [None]:
def plot_decision_boundary_tf(tf_model, x_min, x_max, nx=100):

    x_array = np.linspace(x_min, x_max, nx)

    weights, bias = tf_model.get_weights()

    # w0 x0 + w1 x1 + bias
    # -> x1 = -(w0/w1) x0 - bias/w1
    slope = -weights[0] / weights[1]
    intercept = -bias / weights[1]

    decision_boundary = slope * x_array + intercept
    plt.plot(x_array, decision_boundary, 'k-')
    
    plt.grid(True)



In [None]:
plot_two_sets(set_0, set_1)
plot_decision_boundary_tf(
    model,
    *(plt.gca().get_xlim())
)



## Final Bell



In [None]:
# stackoverflow.com/a/24634221
import os
os.system("printf '\a'");

