#Logistic Regression
For every $t \in R$, the sigmoid function is defined as:
$$f(t)=\frac{1}{1+e^{-t}}$$
Let $x_i \in R^{n \times 1}$ be the $i$-th sample in the dataset, $y_i \in \{0,1\}$ the class of the $i$-th sample, and $W \in R^{n \times 1}$ the weight vector we need to find. Then:

$$\hat{y_i}=f(x_i^TW)$$
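As a minimal sketch of the prediction step above, the snippet below evaluates $\hat{y}=f(x^TW)$ for a few samples at once. The helper name `stable_sigmoid` and the toy numbers are assumptions for illustration. The piecewise form is a common way to avoid overflow in `np.exp` for large negative inputs; it is not something these notes prescribe.

```python
import numpy as np

def stable_sigmoid(t):
    """Sigmoid 1 / (1 + exp(-t)), evaluated without overflow for large |t|."""
    t = np.asarray(t, dtype=float)
    out = np.empty_like(t)
    pos = t >= 0
    # For t >= 0, exp(-t) <= 1, so the textbook formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-t[pos]))
    # For t < 0, rewrite as exp(t) / (1 + exp(t)) so exp never overflows.
    exp_t = np.exp(t[~pos])
    out[~pos] = exp_t / (1.0 + exp_t)
    return out

# Toy data (hypothetical): 3 samples with n = 2 features, stored as rows.
X_demo = np.array([[0.0, 0.0], [1.0, 2.0], [-3.0, 0.5]])
W_demo = np.array([[0.5], [-0.25]])

y_hat = stable_sigmoid(X_demo.dot(W_demo))  # shape (3, 1), values in (0, 1)
print(y_hat.ravel())
```

The first two samples have $x^TW = 0$, so their predicted probability is exactly 0.5.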

We define the loss function as follows:

$$Loss=\frac{1}{N}\sum^N\left(-Y\log[f(X^TW)]-(1-Y)\log[1-f(X^TW)]\right)$$

Here the sum runs over the $N$ samples; $X \in R^{n \times 1}$ is a sample, $Y \in \{0,1\}$ is its label, and $W \in R^{n \times 1}$.
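The loss above can be sketched in NumPy as follows. The function name `binary_cross_entropy`, the `eps` clipping, and the toy values are assumptions added for illustration; the clipping only guards against `log(0)` and is not part of the formula itself.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean of -y*log(p) - (1-y)*log(1-p) over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-y_true * np.log(p) - (1.0 - y_true) * np.log(1.0 - p)))

# With W = 0 every prediction is f(0) = 0.5, so the loss is log(2).
labels = np.array([1, 0, 1, 1])
print(binary_cross_entropy(labels, np.full(4, 0.5)))  # approximately 0.6931
```

This is why a model whose weights start at zero reports a loss of about 0.6931 before any training, whatever the labels are.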

Let $Z=f(X^TW)$. Then
$$\nabla_W Loss = -\frac{1}{N} \sum^N\left(\frac{Y}{Z}-\frac{1-Y}{1-Z}\right)\frac{\partial Z}{\partial W}$$

Since $\frac{\partial Z}{\partial W}=Z(1-Z)X$, this becomes:

$$\nabla_W Loss = -\frac{1}{N} \sum^N(Y-Z)X$$

which gives the update rule, with learning rate $lr$:
$$W:=W-lr\frac{1}{N} \sum^N(Z-Y)X$$
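One way to sanity-check the gradient formula above is to compare it with a central finite-difference approximation of the loss. The sketch below does this on random toy data and then applies one update step. The names (`loss_fn`, `analytic_grad`, `numeric_grad`), the random seed, and the data are all assumptions for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def loss_fn(W, X, Y):
    """Mean binary cross-entropy; rows of X are samples, Y has shape (N, 1)."""
    Z = sigmoid(X.dot(W))
    return np.mean(-Y * np.log(Z) - (1 - Y) * np.log(1 - Z))

def analytic_grad(W, X, Y):
    """(1/N) * sum over samples of (Z - Y) * X, written as a matrix product."""
    Z = sigmoid(X.dot(W))
    return X.T.dot(Z - Y) / X.shape[0]

def numeric_grad(W, X, Y, h=1e-6):
    """Central finite differences, one weight at a time."""
    g = np.zeros_like(W)
    for k in range(W.shape[0]):
        step = np.zeros_like(W)
        step[k, 0] = h
        g[k, 0] = (loss_fn(W + step, X, Y) - loss_fn(W - step, X, Y)) / (2 * h)
    return g

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))                        # 20 samples, 3 features
Y_demo = rng.integers(0, 2, size=(20, 1)).astype(float)  # random 0/1 labels
W_demo = rng.normal(size=(3, 1))

max_diff = np.max(np.abs(analytic_grad(W_demo, X_demo, Y_demo)
                         - numeric_grad(W_demo, X_demo, Y_demo)))
print(max_diff)  # should be tiny if the formula is right

# One gradient-descent step with learning rate lr, as in the update rule.
lr = 0.1
W_next = W_demo - lr * analytic_grad(W_demo, X_demo, Y_demo)
print(loss_fn(W_demo, X_demo, Y_demo), loss_fn(W_next, X_demo, Y_demo))
```

Writing the sum as `X.T.dot(Z - Y)` is a design choice: it computes the same quantity as the per-sample sum without an explicit Python loop.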





#Softmax Regression
We define the softmax function as follows:
$$a_i=\frac{e^{x^Tw_i}}{\sum_{j=1}^Ce^{x^Tw_j}}$$

Here $w_i$ is the weight vector of class $i$: for each class $i$, $a_i$ is the predicted probability that the sample $x$ belongs to that class. $W=[w_1,w_2,...,w_C]$ is the weight matrix to be found, $W \in R^{n \times C}$, where $C$ is the number of classes in the data.
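A minimal NumPy sketch of the softmax above, computing $a_1,\dots,a_C$ for every sample at once. The name `softmax_probs` and the toy matrices are assumptions. Subtracting the row-wise maximum before exponentiating is a standard overflow guard that leaves the result unchanged.

```python
import numpy as np

def softmax_probs(X, W):
    """Row s holds [a_1, ..., a_C] for sample s; X is (N, n), W is (n, C)."""
    scores = X.dot(W)                                    # x^T w_j for every class j
    scores = scores - scores.max(axis=1, keepdims=True)  # shift for stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Toy example (hypothetical): N = 2 samples, n = 2 features, C = 3 classes.
X_demo = np.array([[1.0, 2.0], [0.0, 0.0]])
W_demo = np.array([[0.1, 0.2, 0.3], [0.0, -0.5, 1.0]])

A = softmax_probs(X_demo, W_demo)
print(A)
print(A.sum(axis=1))  # each row sums to 1
```

The second sample is the zero vector, so all of its scores are equal and each class gets probability 1/3.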

In addition, $y$ must be converted from a scalar to a vector using one-hot encoding, that is:
$$y=[y_1,y_2,...,y_C]$$ 
with $y_i \in \{0,1\}$ and $\sum_{i=1}^Cy_i=1$
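For instance, one-hot encoding of integer labels can be sketched by indexing into an identity matrix. The helper name `one_hot` and the sample labels are assumptions for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels in {0, ..., C-1} to rows of the C x C identity matrix."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    return np.eye(num_classes, dtype=int)[labels]

Y_demo = one_hot([0, 2, 1], num_classes=3)
print(Y_demo)
# [[1 0 0]
#  [0 0 1]
#  [0 1 0]]
```

Each row contains a single 1 in the position of its class, so every row sums to 1 as required.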


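We define the loss function as follows:

$$Loss=\frac{1}{N} \sum^N\left(-\sum_{i=1}^C y_i \log\left(\frac{e^{x^Tw_i}}{\sum_{j=1}^Ce^{x^Tw_j}}\right)\right) $$

$$Loss=\frac{1}{N} \sum^N\left(-\sum_{i=1}^C \left(y_ix^Tw_i-y_i\log\left(\sum_{j=1}^Ce^{x^Tw_j}\right)\right)\right) $$

Because $\sum_{i=1}^C y_i=1$, the log term can be pulled out of the inner sum:

$$Loss=\frac{1}{N} \sum^N\left(-\sum_{i=1}^C (y_ix^Tw_i) +\log\left(\sum_{j=1}^Ce^{x^Tw_j}\right)\right) $$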
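The last form of the loss relies on the one-hot property $\sum_i y_i = 1$. As a quick numerical check, the sketch below evaluates the first and the last form on random toy data and compares them. All names, sizes, and data here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, C = 6, 4, 3                      # toy sizes (hypothetical)
X_demo = rng.normal(size=(N, n))       # rows are samples x^T
W_demo = rng.normal(size=(n, C))       # columns are w_1, ..., w_C
Y_demo = np.eye(C)[rng.integers(0, C, size=N)]  # one-hot labels

scores = X_demo.dot(W_demo)            # scores[s, i] = x_s^T w_i

# First form: -sum_i y_i * log(softmax_i), averaged over samples.
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
loss_first = np.mean(-np.sum(Y_demo * np.log(probs), axis=1))

# Last form: -sum_i y_i x^T w_i + log(sum_j exp(x^T w_j)), averaged over samples.
log_sum_exp = np.log(np.exp(scores).sum(axis=1))
loss_last = np.mean(-np.sum(Y_demo * scores, axis=1) + log_sum_exp)

print(loss_first, loss_last)  # the two values should agree
```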

Gradient:
$$\nabla_W Loss = [\frac{\partial Loss}{\partial w_1}, \frac{\partial Loss}{\partial w_2},...,\frac{\partial Loss}{\partial w_C}]$$

where $\frac{\partial Loss}{\partial w_i}=\frac{1}{N} \sum^N\left(-y_i+\frac{e^{x^Tw_i}}{\sum_{j=1}^Ce^{x^Tw_j}}\right)x$

This gives the update rule:

$$W:= W- lr\nabla_W Loss $$
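Stacking the per-class gradients column by column gives a matrix form, $\nabla_W Loss = \frac{1}{N}X^T(A - Y)$, where the rows of $X$ are the samples and row $s$ of $A$ holds the softmax outputs for sample $s$. The sketch below implements that form and checks it against finite differences. Function names, seed, and toy data are assumptions for illustration.

```python
import numpy as np

def softmax_rows(S):
    S = S - S.max(axis=1, keepdims=True)  # overflow guard, result unchanged
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def softmax_loss(W, X, Y):
    """Mean cross-entropy; X is (N, n), Y is one-hot (N, C), W is (n, C)."""
    A = softmax_rows(X.dot(W))
    return -np.mean(np.sum(Y * np.log(A), axis=1))

def softmax_grad(W, X, Y):
    """Column i is (1/N) * sum over samples of (a_i - y_i) * x."""
    A = softmax_rows(X.dot(W))
    return X.T.dot(A - Y) / X.shape[0]

rng = np.random.default_rng(2)
X_demo = rng.normal(size=(10, 4))
Y_demo = np.eye(3)[rng.integers(0, 3, size=10)]
W_demo = rng.normal(size=(4, 3))

# Finite-difference check of every entry of the gradient.
h = 1e-6
G_num = np.zeros_like(W_demo)
for r in range(W_demo.shape[0]):
    for c in range(W_demo.shape[1]):
        step = np.zeros_like(W_demo)
        step[r, c] = h
        G_num[r, c] = (softmax_loss(W_demo + step, X_demo, Y_demo)
                       - softmax_loss(W_demo - step, X_demo, Y_demo)) / (2 * h)

grad_gap = np.max(np.abs(softmax_grad(W_demo, X_demo, Y_demo) - G_num))
print(grad_gap)  # should be close to zero

lr = 0.1
W_next = W_demo - lr * softmax_grad(W_demo, X_demo, Y_demo)  # W := W - lr * grad
```

Because the softmax outputs and the one-hot labels both sum to 1 over the classes, each row of the gradient matrix sums to zero.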

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

- This dataset uses features such as age, sex, and cholesterol level to predict whether a patient has heart disease.

- target has two labels, 1 and 0, meaning the patient has or does not have the disease, respectively.

#Exercises

1. Build a logistic regression model using all the features in the heart file, and compare it with the sklearn library.
2. Build a softmax regression model on the Iris dataset (normalizing the data is recommended), and compare it with the sklearn library.

# Exercise 1

In [2]:
data = pd.read_csv("heart.csv")
data

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
298,57,0,0,140,241,0,1,123,1,0.2,1,0,3,0
299,45,1,3,110,264,0,1,132,0,1.2,1,0,3,0
300,68,1,0,144,193,1,1,141,0,3.4,1,2,3,0
301,57,1,0,130,131,0,1,115,1,1.2,1,1,3,0


In [3]:
# Features: every column except the label
X = data.drop('target',axis = 1).values
# Labels as a column vector of shape (N, 1)
y = data.loc[:,'target'].values.reshape(data.shape[0],-1)


In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=42)
y_train = y_train.reshape(-1,1)
y_test = y_test.reshape(-1,1)
# Min-max normalization using training-set statistics only
# (x_min/x_max avoid shadowing the built-in min and max)
x_min = np.min(X_train, axis=0)
x_max = np.max(X_train, axis=0)
X_train = (X_train - x_min) / (x_max - x_min)
X_test = (X_test - x_min) / (x_max - x_min)
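The min-max scaling above divides by the feature range (maximum minus minimum), which is zero for any feature that is constant in the training split. A hedged sketch of the same scaling with a guard for that case follows. The function name `min_max_scale` and the toy arrays are assumptions, and mapping constant columns to 0 is one possible convention, not the notebook's.

```python
import numpy as np

def min_max_scale(train, test):
    """Scale both splits with statistics taken from the training split only."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = hi - lo
    span[span == 0] = 1.0  # constant feature: avoid 0/0, column becomes all zeros
    return (train - lo) / span, (test - lo) / span

# Toy data (hypothetical): the second feature is constant in the training split.
train_demo = np.array([[1.0, 5.0], [3.0, 5.0], [2.0, 5.0]])
test_demo = np.array([[4.0, 5.0]])

train_scaled, test_scaled = min_max_scale(train_demo, test_demo)
print(train_scaled)
print(test_scaled)  # test values may fall outside [0, 1]
```

Reusing the training minimum and maximum for the test split keeps the two splits on the same scale, at the cost of test values occasionally leaving the [0, 1] range.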


In [5]:
# Sigmoid function
def sigmoid(W,X):
  """
  W: weights
  X: sample(s), one per row
  """
  return 1/(1+np.exp((-X.dot(W))))

In [6]:
def logictisRegression(X, y, learning_rate, epoch):
    # Prepend a column of ones so that W[0] acts as the bias term
    Xbar = np.concatenate([np.ones([X.shape[0],1]),X],axis = 1)
    W = np.zeros([Xbar.shape[1],1])  # initialize all weights to zero
    bestLoss = 1e9
    bestW = W
    preLoss = 1e9
    iStop = -1
    for i in range(epoch):
        prediction = sigmoid(W,Xbar).reshape(y.shape[0],-1) # predictions

        error = np.matmul(Xbar.T,prediction - y).reshape(W.shape[0],-1) # X^T (Z - Y)

        gradient = 1/Xbar.shape[0]*error # gradient of the loss
        W = W - learning_rate * gradient # update W
        # Loss of the predictions made before this update
        loss = np.mean(-y*np.log(prediction)-(1-y)*np.log(1-prediction))
        if(i%1000 == 0):
            print("Loss at epoch {}: {}".format(i,loss))
        # Keep track of the best W so far
        if(bestLoss > loss):
            bestLoss = loss
            bestW = W
        # Stop once the loss has stayed within 1e-5 of a reference value
        # for 1000 consecutive epochs
        if(abs(loss-preLoss)<1e-5):
            if(i-iStop >= 1000):
                break
        else:
            preLoss = loss
            iStop = i
    return bestW

In [7]:
W = logictisRegression(X_train, y_train, learning_rate=0.0001, epoch = 1000000)

Loss at epoch 0: 0.6931471805599452
Loss at epoch 1000: 0.68964080036551
Loss at epoch 2000: 0.6861905299088603
Loss at epoch 3000: 0.6827945207767044
Loss at epoch 4000: 0.6794511198519207
Loss at epoch 5000: 0.6761588374658
Loss at epoch 6000: 0.6729163209376794
Loss at epoch 7000: 0.6697223326018695
Loss at epoch 8000: 0.6665757315695962
Loss at epoch 9000: 0.6634754585977738
Loss at epoch 10000: 0.6604205235404841
Loss at epoch 11000: 0.6574099949461603
Loss at epoch 12000: 0.6544429914363821
Loss at epoch 13000: 0.651518674563121
Loss at epoch 14000: 0.6486362428921462
Loss at epoch 15000: 0.6457949271027799
Loss at epoch 16000: 0.6429939859295917
Loss at epoch 17000: 0.6402327028011338
Loss at epoch 18000: 0.6375103830554035
Loss at epoch 19000: 0.634826351632173
Loss at epoch 20000: 0.6321799511593531
Loss at epoch 21000: 0.6295705403647152
Loss at epoch 22000: 0.6269974927560542
Loss at epoch 23000: 0.6244601955226626
Loss at epoch 24000: 0.6219580486190871
...

Loss at epoch 206000: 0.43451681220868027
Loss at epoch 207000: 0.4341132111528138
Loss at epoch 208000: 0.4337123104842522
Loss at epoch 209000: 0.4333140823094494
Loss at epoch 210000: 0.4329184991105357
Loss at epoch 211000: 0.4325255337392595
Loss at epoch 212000: 0.4321351594110395
Loss at epoch 213000: 0.4317473496991267
Loss at epoch 214000: 0.4313620785288733
Loss at epoch 215000: 0.43097932017210605
Loss at epoch 216000: 0.4305990492416023
Loss at epoch 217000: 0.4302212406856668
Loss at epoch 218000: 0.4298458697828068
Loss at epoch 219000: 0.4294729121365045
Loss at epoch 220000: 0.4291023436700827
Loss at epoch 221000: 0.4287341406216645
Loss at epoch 222000: 0.4283682795392233
Loss at epoch 223000: 0.4280047372757225
Loss at epoch 224000: 0.42764349098434135
Loss at epoch 225000: 0.4272845181137877
Loss at epoch 226000: 0.4269277964036938
Loss at epoch 227000: 0.42657330388009457
Loss at epoch 228000: 0.42622101885098634
Loss at epoch 229000: 0.4258709199019649
...

Loss at epoch 407000: 0.38551546274301535
Loss at epoch 408000: 0.38536680700287884
Loss at epoch 409000: 0.3852187062391611
Loss at epoch 410000: 0.3850711571394534
Loss at epoch 411000: 0.3849241564185864
Loss at epoch 412000: 0.3847777008183486
Loss at epoch 413000: 0.3846317871072067
Loss at epoch 414000: 0.3844864120800318
Loss at epoch 415000: 0.3843415725578262
Loss at epoch 416000: 0.3841972653874558
Loss at epoch 417000: 0.38405348744138396
Loss at epoch 418000: 0.3839102356174107
Loss at epoch 419000: 0.3837675068384123
Loss at epoch 420000: 0.38362529805208706
Loss at epoch 421000: 0.3834836062307023
Loss at epoch 422000: 0.3833424283708442
Loss at epoch 423000: 0.38320176149317264
Loss at epoch 424000: 0.38306160264217637
Loss at epoch 425000: 0.3829219488859333
Loss at epoch 426000: 0.38278279731587234
Loss at epoch 427000: 0.3826441450465389
Loss at epoch 428000: 0.382505989215363
Loss at epoch 429000: 0.38236832698242934
Loss at epoch 430000: 0.3822311555302523
...

Loss at epoch 609000: 0.3635700017880194
Loss at epoch 610000: 0.36349037662581885
Loss at epoch 611000: 0.3634109570199612
Loss at epoch 612000: 0.3633317421342059
Loss at epoch 613000: 0.3632527311370239
Loss at epoch 614000: 0.36317392320156355
Loss at epoch 615000: 0.3630953175056175
Loss at epoch 616000: 0.363016913231588
Loss at epoch 617000: 0.3629387095664551
Loss at epoch 618000: 0.3628607057017423
Loss at epoch 619000: 0.3627829008334854
Loss at epoch 620000: 0.3627052941621995
Loss at epoch 621000: 0.3626278848928466
Loss at epoch 622000: 0.3625506722348043
Loss at epoch 623000: 0.36247365540183485
Loss at epoch 624000: 0.3623968336120526
Loss at epoch 625000: 0.3623202060878942
Loss at epoch 626000: 0.36224377205608754
Loss at epoch 627000: 0.3621675307476215
Loss at epoch 628000: 0.36209148139771496
Loss at epoch 629000: 0.3620156232457889
Loss at epoch 630000: 0.36193995553543445
Loss at epoch 631000: 0.36186447751438505
Loss at epoch 632000: 0.3617891884344867
...

Loss at epoch 808000: 0.35092552371770774
Loss at epoch 809000: 0.35087482026403766
Loss at epoch 810000: 0.350824218195312
Loss at epoch 811000: 0.35077371719635553
Loss at epoch 812000: 0.35072331695333847
Loss at epoch 813000: 0.35067301715376975
Loss at epoch 814000: 0.35062281748648944
Loss at epoch 815000: 0.3505727176416617
Loss at epoch 816000: 0.3505227173107675
Loss at epoch 817000: 0.35047281618659715
Loss at epoch 818000: 0.3504230139632437
Loss at epoch 819000: 0.35037331033609576
Loss at epoch 820000: 0.35032370500182997
Loss at epoch 821000: 0.3502741976584046
Loss at epoch 822000: 0.35022478800505236
Loss at epoch 823000: 0.3501754757422738
Loss at epoch 824000: 0.35012626057183016
Loss at epoch 825000: 0.35007714219673647
Loss at epoch 826000: 0.35002812032125497
Loss at epoch 827000: 0.349979194650889
Loss at epoch 828000: 0.3499303648923752
Loss at epoch 829000: 0.3498816307536777
Loss at epoch 830000: 0.34983299194398154
Loss at epoch 831000: 0.34978444817368554
...

In [8]:
X_train_bar = np.concatenate([np.ones([X_train.shape[0],1]),X_train],axis = 1)
y_pred = sigmoid(W, X_train_bar)
y_pred = np.where(y_pred>0.5,1,0)
print("Code train accuracy: {}".format(np.mean(y_train==y_pred)))

Code train accuracy: 0.8571428571428571


In [9]:
X_test_bar = np.concatenate([np.ones([X_test.shape[0],1]),X_test],axis = 1)
y_pred = sigmoid(W, X_test_bar)
y_pred = np.where(y_pred>0.5,1,0)
print("Code test accuracy: {}".format(np.mean(y_test==y_pred)))

Code test accuracy: 0.8


In [10]:
# Cross-check with sklearn
from sklearn.linear_model import LogisticRegression
# ravel() passes the labels as a 1-D array, the shape sklearn expects
clf = LogisticRegression(random_state=0).fit(X_train, y_train.ravel())
clf.score(X_test,y_test)


0.81

# Exercise 2

In [11]:
from sklearn import datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
  X, y, test_size=0.33, random_state=42)
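# Min-max normalization using training-set statistics only
# (x_min/x_max avoid shadowing the built-in min and max)
x_max = np.max(X_train, axis=0)
x_min = np.min(X_train, axis=0)
X_train = (X_train - x_min) / (x_max - x_min)
X_test = (X_test - x_min) / (x_max - x_min)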

In [12]:
#X_train bar
X_train_bar=np.concatenate([np.ones([X_train.shape[0],1]),X_train], axis=1)
#y_train onehot
y_train_onehot = np.zeros((y_train.size, y_train.max() + 1), dtype=int)
y_train_onehot[np.arange(y_train.size), y_train.reshape(-1)] = 1
#X_test_bar
X_test_bar = np.concatenate([np.ones([X_test.shape[0],1]), X_test], axis=1)
#y_test onehot
y_test_onehot = np.zeros((y_test.size, y_test.max() + 1), dtype=int)
y_test_onehot[np.arange(y_test.size), y_test.reshape(-1)] = 1

In [13]:
# Add a bias column of ones to X (repeats the step from the previous cell)
X_train_bar = np.concatenate([np.ones([X_train.shape[0],1]),X_train], axis=1)
X_test_bar = np.concatenate([np.ones([X_test.shape[0],1]),X_test], axis=1)

In [14]:
X = X_train_bar
y = y_train_onehot

In [15]:
def grad(W , X, y, a):
    # a holds (softmax output - one-hot label) per sample, shape (N, C).
    # Column i of the result is the mean over samples of a[:, i] * x,
    # i.e. the gradient of the loss with respect to w_i.
    return X.T.dot(a) / X.shape[0]

In [16]:
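def softmax(W, learning_rate, epochs):
    # Trains on the notebook-level X (with bias column) and one-hot y
    global X, y
    
    for i in range(epochs):
        expi = np.exp(X.dot(W))                 # e^{x^T w_j} for every class
        expsum = np.sum(expi, axis=1)           # normalizer per sample
        soft1 = expi / expsum.reshape(-1, 1)    # softmax probabilities
        a = -y + soft1                          # (a_i - y_i) per sample and class
        W = W - learning_rate * grad(W,X,y,a)   # gradient-descent update

    print(f'Final W: {W}')
    return W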

In [17]:
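# W starts as a (5, 1) column of ones (bias + 4 features). The gradient has
# shape (5, 3), so broadcasting turns W into a (5, 3) matrix after the first update.
W = np.array([[1],
              [1],
              [1],
              [1],
              [1]])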
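The initial `W` above has shape (5, 1), while the gradient returned by `grad` has shape (5, 3) for the three Iris classes. NumPy broadcasting therefore makes the first update produce a (5, 3) matrix. The small sketch below illustrates only that shape behaviour, using placeholder arrays (`W_col` and `G` are hypothetical stand-ins, not the notebook's variables).

```python
import numpy as np

W_col = np.ones((5, 1))             # column of ones, like the initial W
G = np.arange(15.0).reshape(5, 3)   # stand-in for a (5, 3) gradient

W_updated = W_col - 0.002 * G       # (5, 1) broadcasts against (5, 3)
print(W_updated.shape)              # (5, 3)

# For this subtraction, an explicit (5, 3) matrix of ones gives the same values.
W_full = np.ones((5, 3)) - 0.002 * G
print(np.array_equal(W_updated, W_full))  # True
```

Initialising with an explicit (5, 3) array would state the intended shape directly. Note that this only concerns the subtraction shown here: with a single-column `W`, the scores in the very first iteration also have a single column, so the first gradient is not the one a (5, 3) start would produce.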

In [18]:
W0 = softmax(W, learning_rate = 0.002, epochs = 50000)

Final W: [[ 2.52887093  2.07444114 -1.60731207]
 [-0.34476822  1.16915554  2.17383033]
 [ 3.34514848 -0.69112068  0.34413401]
 [-1.93574407  1.79475293  3.13905114]
 [-1.7577722   0.87812648  3.87778072]]


In [19]:
expi = np.exp(X_test_bar.dot(W0))
expsum = np.sum(expi, axis=1)  
soft1 = expi / expsum.reshape(-1,1)
soft1

array([[4.98851932e-02, 5.99050628e-01, 3.51064179e-01],
       [9.16088748e-01, 7.41445168e-02, 9.76673512e-03],
       [2.89649475e-04, 1.48350687e-01, 8.51359663e-01],
       [4.49346340e-02, 5.22744122e-01, 4.32321244e-01],
       [2.29703194e-02, 5.01593102e-01, 4.75436579e-01],
       [8.62330835e-01, 1.24266016e-01, 1.34031495e-02],
       [1.37553275e-01, 5.90272017e-01, 2.72174708e-01],
       [4.87819469e-03, 2.10776517e-01, 7.84345288e-01],
       [1.31709181e-02, 6.04992622e-01, 3.81836459e-01],
       [8.66827014e-02, 6.42157344e-01, 2.71159955e-01],
       [1.34738736e-02, 2.91696223e-01, 6.94829904e-01],
       [8.60280035e-01, 1.33039259e-01, 6.68070583e-03],
       [9.12626802e-01, 8.04215944e-02, 6.95160341e-03],
       [8.68172144e-01, 1.24892421e-01, 6.93543466e-03],
       [9.43454820e-01, 5.11650304e-02, 5.38014978e-03],
       [5.07768460e-02, 4.13269216e-01, 5.35953938e-01],
       [3.74012873e-03, 2.32057305e-01, 7.64202566e-01],
       ...

In [20]:
# Result: accuracy on the test set
np.mean(y_test.reshape(-1) == np.argmax(expi / expsum.reshape(-1,1), axis=1))

0.9

In [21]:
# Using sklearn
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression(random_state=0).fit(X_train, y_train)
clf.score(X_test, y_test)

0.9