## Task 2 Logistic regression

这个部分与学长给的tutorial方法类似，探究了不同学习率下的成功率可能性，但是发现学习率的调整似乎对于最终准确率没有实质的影响，猜测原因可能是样本集较小的原因；
此外，random_state选择正如他的名字而言，充满了随机性，在选择5、10的时候本样例可能可以达到当前分类的最高值，但是在其他随机种子的情况下，表现可能甚至无法超过50%，说明Logistic regression所训练出来的模型通用性可能不强，可能因为初始样本的随机化选择，而导致最终结果的较大差异！

In [66]:
import matplotlib as plt
import numpy as np
import pandas as pd
import sklearn
import sklearn.preprocessing as pre
from matplotlib.pyplot import plot
from sklearn.model_selection import train_test_split

In [67]:
glass = pd.read_csv('../../dataset/glass_ident/glass.data')
glass.columns = ['id', 'RI', 'Na', 'Mg', 'Al', 'Si', 'K','Ca','Ba','Fe','class']
glass.head()

Unnamed: 0,id,RI,Na,Mg,Al,Si,K,Ca,Ba,Fe,class
0,2,1.51761,13.89,3.6,1.36,72.73,0.48,7.83,0.0,0.0,1
1,3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0.0,0.0,1
2,4,1.51766,13.21,3.69,1.29,72.61,0.57,8.22,0.0,0.0,1
3,5,1.51742,13.27,3.62,1.24,73.08,0.55,8.07,0.0,0.0,1
4,6,1.51596,12.79,3.61,1.62,72.97,0.64,8.07,0.0,0.26,1


- there have class1,2,3,4,5,6,7, but class 4 is not in this work.
- Two sub-tasks as follows:
    1. Build a **logistic regression model to classify class 2 and not class 2**, i.e. a binary classifier to separate
class 2 from everything else. This binary classifier should be able to **get an accuracy higher than 85%**

    2. Build a **multiclass classification model** by build 6 binary classifiers. This multiclass classifier should be
able to **get an accuracy higher than 50%**

- 二分类器

In [68]:
def binClassifier(XTrain, yTrain, initRate, numEpoch=2000):
    theta = np.zeros((XTrain.shape[1]))
    lossList = []
    learningRate = initRate
    for epoch in range(numEpoch):
        # forward 
        logits = np.dot(XTrain, theta)
        hyp = 1/(1+np.exp(-logits))
        
        # uncomment this line to get an error.....
        # hyp.shape = (hyp.size, 1) 
        
        crossEntropyLoss = (-yTrain * np.log(hyp) - (1-yTrain)*np.log(1-hyp)).mean()

        # backward
        grad = (hyp - yTrain)@XTrain/yTrain.size
        theta -= learningRate*grad

        if epoch % 200 == 0:
            print('Epoch', epoch, 'loss:', crossEntropyLoss)
        if epoch % 200 == 0:
            lossList.append(crossEntropyLoss)
            
            if len(lossList) > 5:
                currentLoss = np.array(lossList[-5:-1])
                if currentLoss.std()/currentLoss.mean() < 0.01:
                    learningRate *= 0.95
                    print('almost converged, lowering learning rate')
            if learningRate/initRate < 0.5:
                print('solution already converged, exit training')
                break
    return theta

- 多分类器

In [69]:
def multiClassifier(XTrain, yTrain, initRate=0.02):
    numClass = np.unique(yTrain)
    print(len(numClass), " classes in total")
    params = np.zeros((len(numClass), XTrain.shape[1]))

    for i in numClass:
        print('\nbegin to train a binary classifer for class ', i)
        tempLabel = np.zeros_like(yTrain)
        tempLabel[yTrain == numClass[i]] = 1
        params[i,:] = binClassifier(XTrain, tempLabel, initRate)
    
    print('finish training for all classes!\n')
    return params

- 可能预测

In [70]:
def predClass(params, XTest, yTest):
    featSize = XTest.shape
    labelSize = yTest.shape
    assert(featSize[0]==labelSize[0])

    logits = np.dot(XTest, np.transpose(params)).squeeze()
    prob = 1 / (1+np.exp(-logits))

    pred = np.argmax(prob, axis=1)
    accuracy = np.sum(pred == yTest) / labelSize[0] * 100
    pred[pred>=3]+=1
    pred+=1
    return prob, pred, accuracy

In [71]:
X, y = glass.iloc[:,1:-1], pre.LabelEncoder().fit_transform(glass.iloc[:, -1])
XTrain, XTest, yTrain, yTest = train_test_split(X, y, test_size=0.3, random_state=5)
featMean, featStd = np.mean(XTrain, axis=0), np.std(XTrain, axis=0)
XTrain = (XTrain - featMean) / featStd
XTest = (XTest - featMean) / featStd

# Concatenate X with a new dimension for bias
XTrain = np.concatenate((np.ones((XTrain.shape[0], 1)), XTrain), axis=1)
XTest = np.concatenate((np.ones((XTest.shape[0], 1)), XTest), axis=1)

In [72]:
initRate = 0.003
params = multiClassifier(XTrain, yTrain, initRate)
_, preds, accu = predClass(params, XTest, yTest)
print("Prediction: {}\n".format(preds))
print("Accuracy: {:.3f}%".format(accu))

6  classes in total

begin to train a binary classifer for class  0
Epoch 0 loss: 0.69314718056
Epoch 200 loss: 0.635602231929
Epoch 400 loss: 0.597798834125
Epoch 600 loss: 0.571320764596
Epoch 800 loss: 0.551722156761
Epoch 1000 loss: 0.536585493545
Epoch 1200 loss: 0.524509817886
Epoch 1400 loss: 0.514629913154
Epoch 1600 loss: 0.506381445358
Epoch 1800 loss: 0.499379529834

begin to train a binary classifer for class  1
Epoch 0 loss: 0.69314718056
Epoch 200 loss: 0.669227108302
Epoch 400 loss: 0.653686787321
Epoch 600 loss: 0.643101211727
Epoch 800 loss: 0.635574575195
Epoch 1000 loss: 0.630028648521
Epoch 1200 loss: 0.625820747119
Epoch 1400 loss: 0.622549024969
Epoch 1600 loss: 0.619951378719
almost converged, lowering learning rate
Epoch 1800 loss: 0.617945543032
almost converged, lowering learning rate

begin to train a binary classifer for class  2
Epoch 0 loss: 0.69314718056
Epoch 200 loss: 0.598673663198
Epoch 400 loss: 0.528618247519
Epoch 600 loss: 0.476006062895
Epoch 800

- 使用循环大规模实验进行合理的学习率测试

In [73]:
def binClassifier(XTrain, yTrain, initRate, numEpoch=2000):
    theta = np.zeros((XTrain.shape[1]))
    lossList = []
    learningRate = initRate
    for epoch in range(numEpoch):
        # forward 
        logits = np.dot(XTrain, theta)
        hyp = 1/(1+np.exp(-logits))
        
        # uncomment this line to get an error.....
        # hyp.shape = (hyp.size, 1) 
        
        crossEntropyLoss = (-yTrain * np.log(hyp) - (1-yTrain)*np.log(1-hyp)).mean()

        # backward
        grad = (hyp - yTrain)@XTrain/yTrain.size
        theta -= learningRate*grad

        if epoch % 1000 == 0:
            #print('Epoch', epoch, 'loss:', crossEntropyLoss)
         if epoch % 50 == 0:
            lossList.append(crossEntropyLoss)
            if len(lossList) > 5:
                currentLoss = np.array(lossList[-5:-1])
                if currentLoss.std()/currentLoss.mean() < 0.01:
                    learningRate *= 0.95
                    #print('almost converged, lowering learning rate')
            if learningRate/initRate < 0.5:
                #print('solution already converged, exit training')
                break
    return theta

In [102]:
X, y = glass.iloc[:,1:-1], pre.LabelEncoder().fit_transform(glass.iloc[:, -1])
XTrain, XTest, yTrain, yTest = train_test_split(X, y, test_size=0.3, random_state=5)
featMean, featStd = np.mean(XTrain, axis=0), np.std(XTrain, axis=0)
XTrain = (XTrain - featMean) / featStd
XTest = (XTest - featMean) / featStd

# Concatenate X with a new dimension for bias
XTrain = np.concatenate((np.ones((XTrain.shape[0], 1)), XTrain), axis=1)
XTest = np.concatenate((np.ones((XTest.shape[0], 1)), XTest), axis=1)

In [103]:
lr_set = np.array([0.0001,0.0002,0.0005,0.001,0.002,0.003,0.004,0.005,\
             0.01,0.015,0.02,0.025,0.03,0.035,0.04,0.045,0.05])
accuList  = []
for lr in lr_set:
    print('current learing rate: ', lr)
    params = multiClassifier(XTrain, yTrain, initRate)
    _, preds, accu = predClass(params, XTest, yTest)
    predsList.append(preds)
    accuList.append(accu)
    #print("Prediction: {}\n".format(preds))
    print("Accuracy: {:.3f}%".format(accu))

current learing rate:  0.0001
6  classes in total

begin to train a binary classifer for class  0

begin to train a binary classifer for class  1

begin to train a binary classifer for class  2

begin to train a binary classifer for class  3

begin to train a binary classifer for class  4

begin to train a binary classifer for class  5
finish training for all classes!

Accuracy: 62.500%
current learing rate:  0.0002
6  classes in total

begin to train a binary classifer for class  0

begin to train a binary classifer for class  1

begin to train a binary classifer for class  2

begin to train a binary classifer for class  3

begin to train a binary classifer for class  4

begin to train a binary classifer for class  5
finish training for all classes!

Accuracy: 62.500%
current learing rate:  0.0005
6  classes in total

begin to train a binary classifer for class  0

begin to train a binary classifer for class  1

begin to train a binary classifer for class  2

begin to train a binary c

In [104]:
accuList

[62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5,
 62.5]

由此发现在一定范围内，调整学习率后，最终预测的准确率都是基本上一样敏感

多类之下，总体的分类成功率可能因为个别类别较少的分类判断错误，而拉低成功率！