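Multiclass logistic regression already has the rudiments of a neural network:
1. Input layer: takes the feature vector
2. Output layer: produces the classification result
3. Sigmoid function: squashes each linear output into the range (0, 1), acting as the activation function
4. Loss function: quadratic loss
5. Optimization algorithm: gradient descent
6. Weight matrix from the input layer to the output layer: W
7. Bias term: b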
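The components listed above combine into a single forward pass. The following is a minimal sketch with made-up numbers, assuming 2 samples, 4 features, and 3 classes; the array values and the names `X`, `W`, `b`, and `scores` are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical toy data: 2 samples with 4 features each
X = np.array([[5.1, 3.5, 1.4, 0.2],
              [6.7, 3.0, 5.2, 2.3]])

# Weight matrix W (features x classes) and bias b (1 x classes), zero-initialized
W = np.zeros((4, 3))
b = np.zeros((1, 3))

# Input layer -> linear map -> sigmoid activation -> output layer
scores = sigmoid(X @ W + b)

print(scores.shape)  # one row per sample, one column per class
print(scores)        # every entry is 0.5 because W and b are all zeros
```

Each column of `scores` plays the role of one output neuron; training only changes `W` and `b`.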

Below we implement multiclass logistic regression.

In [None]:
import numpy as np

In [22]:
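from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris data: 150 samples, 4 features, 3 classes
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 30% of the samples for testing; fix the seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)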

In [23]:
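from sklearn.preprocessing import OneHotEncoder

# One-hot encode the integer labels so each class gets its own output column
encoder = OneHotEncoder(categories='auto')
y_train_onehot = encoder.fit_transform(y_train.reshape(-1, 1)).toarray()
# Reuse the categories learned from the training labels
y_test_onehot = encoder.transform(y_test.reshape(-1, 1)).toarray()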
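To make the encoding concrete, here is a small sketch that builds the same kind of one-hot matrix with plain NumPy. It assumes the labels are the integers 0, 1, 2 (as in iris); the `labels` array is a made-up example, not data from this notebook.

```python
import numpy as np

# Hypothetical integer labels for five samples, three classes
labels = np.array([0, 2, 1, 1, 0])
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
onehot = np.eye(n_classes)[labels]

print(onehot)
# np.argmax along axis 1 recovers the original integer labels
print(np.argmax(onehot, axis=1))
```

The same `np.argmax(..., axis=1)` step is what turns the model's per-class scores back into class labels later on.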

In [None]:
y_train_onehot.shape


(105, 3)

In [25]:
def sigmoid(z):
    """Logistic function: maps any real value into (0, 1)."""
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    """Derivative of the sigmoid with respect to z."""
    s = sigmoid(z)
    return s * (1 - s)

class LogisticRegression:
    def __init__(self, learning_rate=0.01, num_iterations=1000):
        self.learning_rate = learning_rate
        self.num_iterations = num_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # X: (n_samples, n_features); y: one-hot labels, (n_samples, n_classes)
        n_samples, n_features = X.shape
        n_classes = y.shape[1]

        self.weights = np.zeros((n_features, n_classes))
        self.bias = np.zeros((1, n_classes))

        for i in range(self.num_iterations):
            # Forward propagation
            linear_output = np.dot(X, self.weights) + self.bias
            probabilities = sigmoid(linear_output)

            # Backward propagation: gradient of the quadratic loss with
            # respect to linear_output is sigmoid'(z) * (prediction - target)
            delta = sigmoid_derivative(linear_output) * (probabilities - y)
            dw = (1 / n_samples) * np.dot(X.T, delta)
            db = (1 / n_samples) * np.sum(delta, axis=0)

            # Update weights and bias
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

            # Print the cost every 1000 iterations
            if i % 1000 == 0:
                # The reported cost is the cross-entropy, used here only for
                # monitoring; the gradient above comes from the quadratic loss
                cost = (-1 / n_samples) * np.sum(y * np.log(probabilities) + (1 - y) * np.log(1 - probabilities))
                print(f"Cost after iteration {i}: {cost}")

    def predict(self, X):
        # Returns one sigmoid score per class; rows are not normalized to sum to 1
        linear_output = np.dot(X, self.weights) + self.bias
        probabilities = sigmoid(linear_output)
        return probabilities
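The update in `fit` multiplies the sigmoid derivative by the prediction error, which is the gradient of a quadratic loss. As a sanity check, the sketch below compares that expression against a finite-difference estimate. It assumes the loss is half the mean squared error summed over classes; the helper names `quadratic_loss` and `analytic_grad_W` and the random data are hypothetical and not part of the notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def quadratic_loss(X, y, W, b):
    # Half the mean (over samples) of the squared error summed over classes
    p = sigmoid(X @ W + b)
    return 0.5 * np.sum((p - y) ** 2) / X.shape[0]

def analytic_grad_W(X, y, W, b):
    # Same expression as dw in fit(): X^T [sigmoid'(z) * (p - y)] / n
    p = sigmoid(X @ W + b)
    delta = p * (1 - p) * (p - y)
    return X.T @ delta / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))            # hypothetical random inputs
y = np.eye(3)[rng.integers(0, 3, 6)]   # hypothetical one-hot targets
W = rng.normal(scale=0.1, size=(4, 3))
b = np.zeros((1, 3))

# Central finite differences, one weight at a time
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        numeric[i, j] = (quadratic_loss(X, y, W_plus, b)
                         - quadratic_loss(X, y, W_minus, b)) / (2 * eps)

max_diff = np.max(np.abs(numeric - analytic_grad_W(X, y, W, b)))
print(max_diff)  # should be tiny if the analytic gradient is right
```

A small `max_diff` suggests the backward step is consistent with the quadratic loss named in the list at the top, even though the printed cost is a cross-entropy.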
    


In [26]:
model = LogisticRegression(num_iterations=10000)
model.fit(X_train, y_train_onehot)

Cost after iteration 0: 2.079441541679836
Cost after iteration 1000: 1.2885968536649943
Cost after iteration 2000: 1.1406634723322193
Cost after iteration 3000: 1.066255657793502
Cost after iteration 4000: 1.017360265822469
Cost after iteration 5000: 0.9810968892580871
Cost after iteration 6000: 0.9523796181122189
Cost after iteration 7000: 0.9287073093842226
Cost after iteration 8000: 0.908663165612856
Cost after iteration 9000: 0.8913621204295719
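The first value in the log can be checked by hand. With zero-initialized weights and bias, every sigmoid output is 0.5, so each of the three output columns contributes -log(0.5) to the cross-entropy of every sample, whatever the target is. A short sketch of that arithmetic (the variable name `initial_cost` is illustrative):

```python
import numpy as np

# With zero weights every sigmoid output is 0.5, so each of the 3 output
# columns contributes -log(0.5) regardless of the target value
initial_cost = 3 * -np.log(0.5)
print(initial_cost)
```

This agrees with the logged cost at iteration 0 (about 2.0794, i.e. 3 ln 2).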


In [27]:
# The predicted class is the output column with the highest sigmoid score
y_train_pred = model.predict(X_train)
y_train_class = np.argmax(y_train_pred, axis=1)

from sklearn.metrics import accuracy_score
accuracy_score(y_train, y_train_class)

0.9333333333333333

In [29]:
# Same evaluation on the held-out test split
y_test_pred = model.predict(X_test)
y_test_class = np.argmax(y_test_pred, axis=1)

from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_test_class)

0.9555555555555556