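## Principles of Neural Networks

A neural network is a computational model loosely inspired by the structure of neurons in the human brain, used to capture complex patterns and relationships in data. The basic principles and building blocks are described below.

### Neurons

The basic unit of a neural network is the neuron. A neuron receives input signals, computes their weighted sum plus a bias, and passes the result through an activation function to produce its output. Mathematically: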

$$
y = f\left( \sum_{i=1}^{n} w_i x_i + b \right)
$$

where:

- $x_i$ are the input signals
- $w_i$ are the weights
- $b$ is the bias
- $f$ is the activation function
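The formula above can be sketched in a few lines of plain Python. The function name `neuron_output` and the sample values are assumptions made for illustration only; they are chosen so that the weighted sum is exactly zero.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation):
    """Return f(sum_i w_i * x_i + b) for a single neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical values: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
y = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)
print(y)  # sigmoid(0) = 0.5
```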

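### Activation Functions

Activation functions introduce non-linearity; without them, a stack of layers would collapse into a single linear map. Common choices are:

- Sigmoid: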

$$
\sigma(x) = \frac{1}{1 + e^{-x}}
$$

- ReLU:

$$
f(x) = \max(0, x)
$$

- Tanh:

$$
\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
$$
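As a minimal sketch, the three functions can be written directly from their definitions. `tanh` is spelled out here to mirror the formula, although `math.tanh` would normally be used instead.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def tanh(x):
    # Written out to match the formula; math.tanh(x) gives the same value
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), relu(x), tanh(x))
```

Sigmoid maps to (0, 1), tanh to (-1, 1), and ReLU leaves positive inputs unchanged while zeroing negative ones.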

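### Network Structure

A neural network usually consists of several layers: an input layer, one or more hidden layers, and an output layer. Each layer is made up of multiple neurons.

#### Input Layer

The input layer receives the external input data. It performs no computation and passes the data directly to the next layer.

#### Hidden Layers

Hidden layers sit between the input and output layers; there may be one or several. Each hidden neuron takes the outputs of the previous layer, computes their weighted sum, and applies an activation function.

#### Output Layer

The output layer produces the final result. Its structure and number of neurons depend on the task.

### Forward Propagation

Forward propagation computes the output of every neuron layer by layer, starting at the input layer and ending at the output layer. The output of each layer is the input to the next.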
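A layer-by-layer forward pass might look like the sketch below. The helper names and the weight layout (`weights[j]` holds the incoming weights of neuron `j`) are assumptions made for this example, and every layer uses a sigmoid activation. The parameters are all zero so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute one layer; weights[j] holds the incoming weights of neuron j."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed `inputs` through each (weights, biases) pair in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical 2-2-1 network with all-zero parameters
layers = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
    ([[0.0, 0.0]], [0.0]),
]
print(forward([1.0, -1.0], layers))  # [0.5], since every neuron outputs sigmoid(0)
```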

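### Error Computation

During training, a loss function measures the error between the predicted output and the true value. Two common choices are mean squared error and cross-entropy. In both formulas, $y_i$ is the true value and $\hat{y}_i$ is the predicted output.

- Mean squared error (MSE), written here as half the sum of squared errors, the form commonly used when deriving backpropagation: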

$$
E = \frac{1}{2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

- Cross-entropy loss:

$$
E = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)
$$
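Both losses can be evaluated directly from the formulas. In this sketch the function names and the `eps` guard against `log(0)` are assumptions added for illustration; the sample target is a one-hot vector.

```python
import math


def squared_error(targets, predictions):
    """E = 1/2 * sum_i (y_i - y_hat_i)^2"""
    return 0.5 * sum((y - p) ** 2 for y, p in zip(targets, predictions))


def cross_entropy(targets, predictions, eps=1e-12):
    """E = -sum_i y_i * log(y_hat_i); eps guards against log(0)."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(targets, predictions))


targets = [0.0, 0.0, 1.0]
predictions = [0.25, 0.25, 0.5]
print(squared_error(targets, predictions))  # 0.1875
print(cross_entropy(targets, predictions))  # -log(0.5), about 0.693
```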

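### Backpropagation

Backpropagation updates the weights and biases of the network so as to minimize the loss function. It computes the gradients with the chain rule and then applies an optimization algorithm, such as gradient descent, to update the parameters.

#### Gradient Descent

Gradient descent updates the weights and biases as follows: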

$$
w_i = w_i - \eta \frac{\partial E}{\partial w_i}
$$

$$
b = b - \eta \frac{\partial E}{\partial b}
$$

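where $\eta$ is the learning rate.

For a sigmoid output layer trained with the squared error above, the gradient with respect to the weight $w_{j,k}$ from neuron $j$ in the previous layer to output neuron $k$ is:

$$
\frac{\partial E}{\partial w_{j,k}} = -(t_k - o_k) \cdot \sigma\left(\sum_j w_{j,k} \cdot o_j\right) \cdot \left(1 - \sigma\left(\sum_j w_{j,k} \cdot o_j\right)\right) \cdot o_j
$$

Here $t_k$ is the target value, $o_k = \sigma\left(\sum_j w_{j,k} \cdot o_j\right)$ is the actual output of neuron $k$, and $o_j$ is the output of neuron $j$ in the previous layer.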
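One way to sanity-check the gradient formula for a sigmoid output neuron with squared error is to compare it against a finite-difference estimate and then take a single gradient descent step. The sketch below assumes one output neuron without a bias; all names and values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, o_prev, t):
    """Squared error of one sigmoid output neuron (no bias, for brevity)."""
    o = sigmoid(sum(wj * oj for wj, oj in zip(w, o_prev)))
    return 0.5 * (t - o) ** 2


def analytic_grad(w, o_prev, t):
    """dE/dw_j = -(t - o) * o * (1 - o) * o_j"""
    o = sigmoid(sum(wj * oj for wj, oj in zip(w, o_prev)))
    return [-(t - o) * o * (1 - o) * oj for oj in o_prev]


def numeric_grad(w, o_prev, t, h=1e-6):
    """Central finite differences, used only as a cross-check."""
    grads = []
    for j in range(len(w)):
        up = w[:j] + [w[j] + h] + w[j + 1:]
        down = w[:j] + [w[j] - h] + w[j + 1:]
        grads.append((loss(up, o_prev, t) - loss(down, o_prev, t)) / (2 * h))
    return grads


w, o_prev, t, eta = [0.2, -0.4], [1.0, 0.5], 1.0, 0.1
grad = analytic_grad(w, o_prev, t)
print(grad, numeric_grad(w, o_prev, t))

# One gradient descent step: w_j <- w_j - eta * dE/dw_j
w_new = [wj - eta * g for wj, g in zip(w, grad)]
print(loss(w, o_prev, t), loss(w_new, o_prev, t))
```

The two gradient estimates should agree closely, and the loss after the step should be lower than before.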

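### Training Process

Training a neural network consists of the following steps:

1. Initialize the weights and biases.
2. Run forward propagation to compute the output.
3. Compute the value of the loss function.
4. Run backpropagation to compute the gradients.
5. Update the weights and biases.
6. Repeat steps 2 to 5 until the loss converges or a preset number of training iterations is reached.

Through these steps the network gradually optimizes its parameters and improves its ability to model complex patterns and relationships in the data.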
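The six steps can be sketched on the smallest possible case: a single sigmoid neuron trained with squared error. The toy data set (logical OR), the learning rate, and the epoch count are assumptions chosen for this illustration, not values taken from the rest of this notebook.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def total_loss(w, b, samples):
    return sum(0.5 * (t - predict(w, b, x)) ** 2 for x, t in samples)


# Toy data (an assumption for this sketch): logical OR
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

random.seed(0)
# Step 1: initialize weights and bias
w = [random.uniform(-1, 1) for _ in range(2)]
b = random.uniform(-1, 1)
eta = 0.5
initial_loss = total_loss(w, b, samples)

for epoch in range(5000):  # Step 6: repeat
    for x, t in samples:
        o = predict(w, b, x)  # Step 2: forward pass
        # Steps 3 and 4: the loss is 0.5 * (t - o)^2; this is its gradient
        # with respect to the neuron's pre-activation
        delta = -(t - o) * o * (1 - o)
        # Step 5: update weights and bias
        w = [wi - eta * delta * xi for wi, xi in zip(w, x)]
        b = b - eta * delta

final_loss = total_loss(w, b, samples)
print(initial_loss, final_loss)
print([round(predict(w, b, x)) for x, _ in samples])
```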

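## Choosing an Activation Function


> If you cannot keep a single room in order, how can you hope to govern a family, a state, or the world?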

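## Task

Build a neural network model that classifies animals.

The network has a single hidden layer, takes 3 attributes (age, weight, body length) as input, and outputs the animal type as a one-hot vector such as (0, 0, 1).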
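A one-hot label such as (0, 0, 1) can be encoded and decoded as sketched below. The helper names `one_hot` and `decode` are hypothetical; the class order matches the `animals` list used in the code further down.

```python
ANIMALS = ["dog", "cow", "monkey"]


def one_hot(name):
    """Encode an animal name as a one-hot list in the order of ANIMALS."""
    return [1 if a == name else 0 for a in ANIMALS]


def decode(outputs):
    """Pick the class whose output value is largest."""
    return ANIMALS[outputs.index(max(outputs))]


print(one_hot("monkey"))        # [0, 0, 1]
print(decode([0.1, 0.2, 0.7]))  # monkey
```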

## Network Design

| Input layer | Hidden layer | Output layer |
| --- | --- | --- |
| 3 | 3 | 3 |


In [None]:
import random
import math
from tqdm import trange


class GenData:
    def __init__(self):
        self.animals = ["dog", "cow", "monkey"]

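    def generate_data(self):
        """Return (data, labels) with 20 samples each of dogs, cows and monkeys.

        Each sample is [age, weight, body length]; each label is one-hot
        in the order of self.animals.
        """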
        data = [
            # Dogs
            [1.2, 18.7, 30.5],
            [3.4, 45.2, 41.3],
            [5.6, 70.1, 50.2],
            [7.8, 95.5, 62.3],
            [10.0, 120.3, 70.4],
            [1.5, 22.1, 32.7],
            [2.5, 35.0, 38.4],
            [4.2, 55.6, 46.2],
            [6.3, 82.0, 58.3],
            [8.4, 105.5, 66.0],
            [1.8, 25.0, 33.5],
            [2.8, 38.4, 39.7],
            [4.8, 60.2, 48.5],
            [6.5, 85.0, 59.8],
            [9.5, 115.0, 68.0],
            [1.0, 15.0, 28.0],
            [3.0, 40.0, 42.0],
            [5.0, 65.0, 52.0],
            [7.0, 90.0, 60.0],
            [9.0, 115.0, 69.0],
            # Cows
            [1.2, 160.5, 110.5],
            [3.4, 400.0, 140.0],
            [5.6, 700.0, 160.0],
            [7.8, 950.0, 180.0],
            [10.0, 1200.0, 200.0],
            [2.0, 220.0, 120.0],
            [4.0, 480.0, 150.0],
            [6.0, 750.0, 170.0],
            [8.0, 1000.0, 190.0],
            [12.0, 1350.0, 210.0],
            [1.5, 180.0, 115.0],
            [3.5, 420.0, 145.0],
            [5.5, 680.0, 165.0],
            [7.5, 930.0, 185.0],
            [9.5, 1180.0, 205.0],
            [2.5, 250.0, 125.0],
            [4.5, 500.0, 155.0],
            [6.5, 780.0, 175.0],
            [8.5, 1050.0, 195.0],
            [11.5, 1400.0, 215.0],
            # Monkeys
            [1.2, 18.7, 50.5],
            [3.4, 30.2, 60.3],
            [5.6, 45.1, 70.2],
            [7.8, 60.5, 75.3],
            [10.0, 75.3, 80.4],
            [2.5, 22.1, 55.7],
            [4.5, 35.0, 65.4],
            [6.5, 50.6, 72.2],
            [8.5, 65.0, 78.3],
            [12.0, 80.0, 82.0],
            [3.0, 25.0, 58.5],
            [5.0, 38.4, 68.7],
            [7.0, 55.2, 74.5],
            [9.0, 70.0, 79.8],
            [11.0, 85.0, 83.0],
            [2.0, 20.0, 53.0],
            [4.0, 32.0, 63.0],
            [6.0, 48.0, 71.0],
            [8.0, 62.0, 77.0],
            [10.0, 77.0, 81.0],
        ]
        labels = [
            # Dogs
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            [1, 0, 0],
            # Cows
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            [0, 1, 0],
            # Monkeys
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
            [0, 0, 1],
        ]

        return data, labels


import matplotlib.pyplot as plt


class MyNN:
    def __init__(self, input_size, hidden_size, output_size, learning_rate):
        self.input = input_size
        self.hidden = hidden_size
        self.output = output_size
        self.lr = learning_rate
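        # Wi2h[i][j]: weight from input i to hidden neuron j
        self.Wi2h = [
            [random.uniform(-1, 1) for _ in range(hidden_size)]
            for _ in range(input_size)
        ]
        # Wh2o[i][j]: weight from hidden neuron i to output neuron j
        self.Wh2o = [
            [random.uniform(-1, 1) for _ in range(output_size)]
            for _ in range(hidden_size)
        ]
        # One bias per hidden neuron and per output neuron
        self.Bi2h = [random.uniform(-1, 1) for _ in range(hidden_size)]
        self.Bh2o = [random.uniform(-1, 1) for _ in range(output_size)]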

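    def func(self, x):
        # Sigmoid activation. Large |x| is clamped to avoid overflow in
        # math.exp and to keep Dfunc from returning exactly zero.
        if x < -25:
            return 0.000001
        if x > 10:
            return 0.9999
        return 1 / (1 + math.exp(-x))

    def Dfunc(self, x):
        # Sigmoid derivative expressed in terms of the sigmoid output
        # x = func(z), not the pre-activation z
        return x * (1 - x)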

    def forward(self, inputs):
        # Input to hidden layer
        Oi2h = [
            sum(x * w for x, w in zip(inputs, col)) + b
            for col, b in zip(zip(*self.Wi2h), self.Bi2h)
        ]
        Oh = list(map(self.func, Oi2h))

        # Hidden to output layer
        Oh2o = [
            sum(x * w for x, w in zip(Oh, col)) + b
            for col, b in zip(zip(*self.Wh2o), self.Bh2o)
        ]
        Oo = list(map(self.func, Oh2o))
        return Oh, Oo

    def backward(self, inputs, hidden_outputs, actual_outputs, expected_outputs):
        # Output layer error and delta
        Eout = [
            expected - actual
            for expected, actual in zip(expected_outputs, actual_outputs)
        ]
        Dout = [
            error * self.Dfunc(output) for error, output in zip(Eout, actual_outputs)
        ]

        # Hidden layer error and delta: hidden neuron i collects the output
        # deltas weighted by its outgoing weights Wh2o[i][j]
        Ehide = [
            sum(delta * w for delta, w in zip(Dout, row)) for row in self.Wh2o
        ]
        Dhide = [
            error * self.Dfunc(output) for error, output in zip(Ehide, hidden_outputs)
        ]

        # Update weights and biases from hidden to output layer
        for i, hidden_output in enumerate(hidden_outputs):
            for j, output_delta in enumerate(Dout):
                self.Wh2o[i][j] += self.lr * output_delta * hidden_output
        for j, output_delta in enumerate(Dout):
            self.Bh2o[j] += self.lr * output_delta

        # Update weights and biases from input to hidden layer
        for i, input_val in enumerate(inputs):
            for j, hidden_delta in enumerate(Dhide):
                self.Wi2h[i][j] += self.lr * hidden_delta * input_val
        for j, hidden_delta in enumerate(Dhide):
            self.Bi2h[j] += self.lr * hidden_delta

    def train(self, training_data, training_labels, epochs):
        for epoch in trange(epochs):
            for inputs, expected_outputs in zip(training_data, training_labels):
                hidden_outputs, actual_outputs = self.forward(inputs)
                self.backward(inputs, hidden_outputs, actual_outputs, expected_outputs)

    def predict(self, inputs):
        Oh, Oo = self.forward(inputs)
        return Oo


def test(test_data, test_label):
    # Relies on the module-level `nn` (trained network) and `g` (GenData instance)
    for sample, label in zip(test_data, test_label):
        ans = nn.predict(sample)
        predicted = ans.index(max(ans))
        print(
            "Predicted animal is",
            g.animals[predicted],
            "should be",
            g.animals[label.index(1)],
            label.index(1) == predicted,
        )


def draw(data, labels):
    # Split the data into three classes by label
    dogs = [data[i] for i in range(len(labels)) if labels[i] == [1, 0, 0]]
    cows = [data[i] for i in range(len(labels)) if labels[i] == [0, 1, 0]]
    monkeys = [data[i] for i in range(len(labels)) if labels[i] == [0, 0, 1]]
    # Dogs
    dogs_x = [d[0] for d in dogs]
    dogs_y = [d[1] for d in dogs]
    dogs_z = [d[2] for d in dogs]
    # Cows
    cows_x = [c[0] for c in cows]
    cows_y = [c[1] for c in cows]
    cows_z = [c[2] for c in cows]
    # Monkeys
    monkeys_x = [m[0] for m in monkeys]
    monkeys_y = [m[1] for m in monkeys]
    monkeys_z = [m[2] for m in monkeys]
    from mpl_toolkits.mplot3d import Axes3D

    fig = plt.figure(figsize=(10, 6))
    ax = fig.add_subplot(111, projection="3d")

    # Dogs
    ax.scatter(dogs_x, dogs_y, dogs_z, c="red", label="Dogs")
    # Cows
    ax.scatter(cows_x, cows_y, cows_z, c="green", label="Cows")
    # Monkeys
    ax.scatter(monkeys_x, monkeys_y, monkeys_z, c="blue", label="Monkeys")
    ax.set_xlabel("Age")
    ax.set_ylabel("Weight")
    ax.set_zlabel("Body length")
    ax.set_title("3D Scatter Plot of Dogs, Cows, and Monkeys")
    ax.legend()
    plt.show()


epochs = 50000

g = GenData()
data, labels = g.generate_data()
draw(data, labels)

# Shuffle, then hold out the last 10 of the 60 samples for testing
data_set = list(zip(data, labels))
random.shuffle(data_set)
train_set = data_set[:50]
test_set = data_set[50:]

data, labels = list(zip(*train_set))
test_data, test_label = list(zip(*test_set))

# 3 inputs, 5 hidden neurons, 3 outputs, learning rate 0.01
nn = MyNN(3, 5, 3, 0.01)
nn.train(data, labels, epochs)
test(test_data, test_label)

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim


# Define the neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(4, 6)
        self.fc2 = nn.Linear(6, 6)
        self.fc3 = nn.Linear(6, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.relu(self.fc3(x))
        return x


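# Custom autograd function that converts rows of binary digits to decimal
class BinaryToDecimal(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inputs):
        # inputs: float tensor of shape (batch, n_bits), most significant bit first
        bits = inputs.round().clamp(0, 1)
        n_bits = bits.shape[-1]
        weights = 2.0 ** torch.arange(n_bits - 1, -1, -1, dtype=bits.dtype)
        ctx.save_for_backward(weights)
        return (bits * weights).sum(dim=-1, keepdim=True)

    @staticmethod
    def backward(ctx, grad_output):
        # Rounding has zero gradient almost everywhere, so treat forward as the
        # plain weighted sum and pass the gradient straight through
        (weights,) = ctx.saved_tensors
        return grad_output * weights


binary_to_decimal = BinaryToDecimal.apply

# Create an instance of the network
model = SimpleNN()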

In [None]:
import numpy as np
import math
import numbers


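class Neuron:
    # Activation functions are plain functions in the class namespace so that
    # `sigmoid` can serve as a default argument below and callers can pass
    # `Neuron.liner` explicitly.
    def sigmoid(x: numbers.Number):
        return 1 / (1 + math.exp(-x))

    def liner(x: numbers.Number):
        # Identity ("linear") activation
        return x

    def __init__(self, weight: float, bias: float, func=sigmoid, activation=0):
        self.w = weight
        self.b = bias
        self.fw_connextions = []  # neurons this neuron feeds into
        self.bw_connextions = []  # neurons that feed into this neuron
        self.func = func
        self.a = activation

    def add_connextions(self, neuron):
        # Register the forward link self -> neuron and the matching backward link
        self.fw_connextions.append(neuron)
        neuron.bw_connextions.append(self)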

    def forward(self):
        for conn in self.fw_connextions:
            conn: Neuron
            conn.a += self.w * self.a + self.b

    def update(self):
        self.a = self.func(self.a)

    def backward(self):
        for conn in self.bw_connextions:
            conn: Neuron

    def __repr__(self) -> str:
        return f"Neuron[{self.w:.2f},{self.b:.2f},{self.a:.2f}]"

    def __str__(self) -> str:
        return self.__repr__()


class Layer:
    def __init__(self, *args: Neuron):
        self.neurons = args
        self.next_layer = None
        self.prev_layer = None

    def connect(self, layer: "Layer", full=True):
        if not full:
            assert len(self.neurons) == len(layer.neurons)
            for i, _ in enumerate(self.neurons):
                self.neurons[i].add_connextions(layer.neurons[i])
        else:
            for i, _ in enumerate(self.neurons):
                for j, _ in enumerate(layer.neurons):
                    self.neurons[i].add_connextions(layer.neurons[j])
        self.next_layer = layer
        layer.prev_layer = self
        return self

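    def forward(self):
        # Push each neuron's weighted activation into the next layer, then
        # apply the next layer's activation functions
        if self.next_layer is None:
            return
        for neuron in self.neurons:
            neuron.forward()
        for neuron in self.next_layer.neurons:
            neuron.update()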

    def __repr__(self) -> str:
        return ",".join(map(str, self.neurons))

    def __str__(self) -> str:
        return ",".join(map(str, self.neurons))


class Train:
    def __init__(self, layers: list[Layer]):
        return

    def fit(self, x: list[float], Y: list[float]):
        return

    def output(self):
        return

In [None]:
from tqdm import trange

i1, i2 = (
    Neuron(1, 0, activation=1, func=Neuron.liner),
    Neuron(1, 0, activation=0.5, func=Neuron.liner),
)
h11, h21 = (
    Neuron(0.9, 0),
    Neuron(0.3, 0),
)

o1, o2 = Neuron(0.2, 0), Neuron(0.8, 0)

# input_layer = [i1, i2]
# hidden_layer = [h11, h21]
# output_layer = [o1, o2]
input_layer = Layer(i1, i2)
hidden_layer = Layer(h11, h21)
output_layer = Layer(o1, o2)
network = input_layer.connect(hidden_layer.connect(output_layer), full=False)

print(input_layer)
print(hidden_layer)
input_layer.forward()
print(hidden_layer)


def train():
    # Forward pass only; the backward pass is not implemented yet
    input_layer.forward()
    print(hidden_layer)
    hidden_layer.forward()
    print(output_layer)
    # ? backward
    return


# train()

# batches = 10000
# for _ in trange(batches):
#     pass

In [None]:
from tqdm import trange

i1, i2, i3 = (
    Neuron(0, 0, activation=1),
    Neuron(0, 0, activation=1),
    Neuron(0, 0, activation=1),
)
h11, h21, n31 = Neuron(1, 1), Neuron(1, 1), Neuron(1, 1)
n12 = Neuron(1, 1)
o1 = Neuron(1, 1)
input_layer = [i1, i2, i3]
hidden_layer = [h11, h21, n31]
output_layer = [o1]

for i, neu in enumerate(input_layer):
    input_layer[i].add_connextions(hidden_layer[i])

for i, _ in enumerate(hidden_layer):
    for j, __ in enumerate(output_layer):
        hidden_layer[i].add_connextions(output_layer[j])


def train():
    # ? forward
    for i, _ in enumerate(input_layer):
        input_layer[i].forward()

    for i in hidden_layer:
        print(i, end=" ")
    print()
    for i, _ in enumerate(hidden_layer):
        hidden_layer[i].forward()
    for i in output_layer:
        print(i, end=" ")
    # ?backward
    return


train()

batches = 10000
for _ in trange(batches):
    pass

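## Standard BP and Accumulated-Error BP

Both accumulated BP and standard BP are widely used. Standard BP updates the parameters after every single training example, so updates are very frequent, and updates computed from different examples may partly cancel each other out. As a result, standard BP often needs more iterations to reach the same minimum of the accumulated error. Accumulated BP minimizes the accumulated error directly: it updates the parameters only after a full pass over the training set D, so updates are far less frequent. In many tasks, however, once the accumulated error has dropped to a certain level, further progress becomes very slow; standard BP then often reaches a good solution faster, especially when D is very large.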
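The difference in update frequency can be illustrated on a deliberately tiny model, $t = w \cdot x$ with squared error. This is only a sketch: the data, the learning rate, and the function names are assumptions, and the model has no hidden layers, so "BP" reduces to a single gradient.

```python
def grad(w, x, t):
    """Gradient of 0.5 * (t - w * x)^2 with respect to w."""
    return -(t - w * x) * x


def standard_epoch(w, data, eta):
    """Standard BP: update after every single sample."""
    updates = 0
    for x, t in data:
        w -= eta * grad(w, x, t)
        updates += 1
    return w, updates


def accumulated_epoch(w, data, eta):
    """Accumulated BP: sum the gradients over the whole set, then update once."""
    total = sum(grad(w, x, t) for x, t in data)
    return w - eta * total, 1


# Hypothetical data generated by t = 2 * x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_std, w_acc = 0.0, 0.0
for _ in range(50):
    w_std, n_std = standard_epoch(w_std, data, eta=0.05)
    w_acc, n_acc = accumulated_epoch(w_acc, data, eta=0.05)

print(w_std, n_std)  # updates per epoch: one per sample
print(w_acc, n_acc)  # updates per epoch: one
```

Both variants approach $w = 2$ here; the standard variant performs three updates per epoch while the accumulated variant performs one.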


In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms


# Define the neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x


# Load the dataset (batch_size=1 below means one update per sample, i.e. standard BP)
transform = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
)
train_dataset = datasets.MNIST("./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=1, shuffle=True)

# Initialize the network, loss function and optimizer
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Train the network; the loss printed per epoch is that of the last sample only
for epoch in range(1):
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item()}")

Epoch 1, Loss: 0.00018487652414478362
