# Basic Information
1. Experiment name: Network Optimization Experiment
2. Name: N/A
3. Student ID: N/A
4. Date: 1

---

# 1. Implementing dropout manually and with torch.nn in a multi-class classification task

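## 1.1 Task Description

1. Task requirements  
Implement dropout both manually and with torch.nn in a multi-class classification experiment.  
Investigate how different drop rates affect the results (loss curves may be used to show this).
2. Task objective  
Investigate how different drop rates affect the results.
3. Algorithm and principle  
Dropout: during training, each unit is zeroed independently with probability $p$ (the drop rate) and the surviving units are scaled by $1/(1-p)$, so the expected activation is unchanged; at test time dropout is switched off.
4. Dataset  
   Fashion-MNIST dataset (the code below loads `torchvision.datasets.FashionMNIST`, not the MNIST handwritten digits):  
     + It contains 60,000 training images and 10,000 test images.  
     + Each image has a fixed size of 28x28 pixels with values in the range 0 to 1, and is flattened into a 784-dimensional vector before it is fed to the network.  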
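
The dropout principle in item 3 can be written out as a short derivation. This is the standard inverted-dropout formulation, which matches the scaling by `1 / (1 - dropout)` used in the manual implementation in Section 1.2; the symbols $h$ (an activation), $h'$ (its value after dropout), and $p$ (the drop rate) are notation introduced here for illustration.

$$
h' =
\begin{cases}
0 & \text{with probability } p \\
\dfrac{h}{1-p} & \text{with probability } 1-p
\end{cases}
\qquad\Longrightarrow\qquad
\mathbb{E}[h'] = p \cdot 0 + (1-p)\cdot\frac{h}{1-p} = h
$$

Because the expectation is preserved during training, no rescaling is needed at test time, when dropout is simply turned off.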
        
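## 1.2 Approach and Code  

1. Build the dataset
2. Build the feedforward neural network, the loss function, and the optimizer
3. Implement dropout manually
4. Run backpropagation and update the parameters with the gradients  
5. Use the network to make predictions and compute the loss  
6. Analyze loss, accuracy, and other metrics to investigate how the drop rate affects the results  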
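
The six steps above can be sketched end to end on a tiny synthetic problem. This is a minimal illustration under stated assumptions, not the experiment itself: the random 3-class data, the layer sizes (20, 32, 3), the learning rate 0.1, and the drop rate 0.2 are all made up so that the example runs in well under a second.

```python
import torch

torch.manual_seed(0)

# Step 1: synthetic 3-class dataset (a stand-in for the image data)
X = torch.randn(256, 20)
y = (X @ torch.randn(20, 3)).argmax(dim=1)

# Step 2: one-hidden-layer network with manually managed parameters
w1 = (0.1 * torch.randn(20, 32)).requires_grad_()
b1 = torch.zeros(32, requires_grad=True)
w2 = (0.1 * torch.randn(32, 3)).requires_grad_()
b2 = torch.zeros(3, requires_grad=True)
params = [w1, b1, w2, b2]
loss_fn = torch.nn.CrossEntropyLoss()

def forward(x, p=0.0):
    h = torch.relu(x @ w1 + b1)
    if p > 0:
        # Step 3: inverted dropout on the hidden layer (training only)
        mask = (torch.rand_like(h) > p).float()
        h = mask * h / (1.0 - p)
    return h @ w2 + b2

with torch.no_grad():
    initial_loss = loss_fn(forward(X), y).item()

for step in range(200):
    loss = loss_fn(forward(X, p=0.2), y)
    # Step 4: backpropagation and a plain gradient-descent update
    loss.backward()
    with torch.no_grad():
        for param in params:
            param -= 0.1 * param.grad
            param.grad.zero_()

# Step 5: predict without dropout and compute the loss
with torch.no_grad():
    logits = forward(X)
    final_loss = loss_fn(logits, y).item()
    accuracy = (logits.argmax(dim=1) == y).float().mean().item()

# Step 6: compare the metrics before and after training
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, accuracy {accuracy:.2f}")
```

The loss is evaluated with `p=0.0` both before and after training, so the comparison is not distorted by dropout noise.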

### 1.2.0 Dataset Definition

In [1]:
import time
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torchvision
from torch.nn.functional import cross_entropy, binary_cross_entropy
from torch.nn import CrossEntropyLoss
from torchvision import transforms
from sklearn import  metrics
# Compute on the GPU when one is available to speed up training
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Current device: {device}')
# Dataset definition
# Multi-class dataset: traindataloader and testdataloader
batch_size = 128
# Build the training and testing dataset
traindataset = torchvision.datasets.FashionMNIST(root='E:\\DataSet\\FashionMNIST\\Train',
                                                  train=True,
                                                  download=True,
                                                  transform=transforms.ToTensor())
testdataset = torchvision.datasets.FashionMNIST(root='E:\\DataSet\\FashionMNIST\\Test',
                                                 train=False,
                                                 download=True,
                                                 transform=transforms.ToTensor())
traindataloader = torch.utils.data.DataLoader(traindataset, batch_size=batch_size, shuffle=True)
testdataloader = torch.utils.data.DataLoader(testdataset, batch_size=batch_size, shuffle=False)
# Plotting helper
def picture(name, trainl, testl, type='Loss'):
    plt.rcParams["font.sans-serif"]=["SimHei"] # font that can render Chinese characters
    plt.rcParams["axes.unicode_minus"]=False # render the minus sign "-" correctly
    plt.title(name) # figure title
    plt.plot(trainl, c='g', label='Train '+ type)
    plt.plot(testl, c='r', label='Test '+type)
    plt.xlabel('Epoch')
    plt.ylabel(type) # label the y-axis with the plotted metric
    plt.legend()
    plt.grid(True)
print(f'Multi-class dataset: {len(traindataset) + len(testdataset)} samples in total, {len(traindataset)} training samples, {len(testdataset)} test samples')

Current device: cuda
Multi-class dataset: 70000 samples in total, 60000 training samples, 10000 test samples
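
The two properties stated in the dataset description (pixel values between 0 and 1, and 784 features per flattened image) can be checked without downloading anything. The sketch below uses a random fake batch as a stand-in for real images and mimics the division by 255 that `transforms.ToTensor()` applies to 8-bit images; the variable names are illustrative only.

```python
import torch

# A fake batch of four 28x28 grayscale images with 8-bit pixel values (stand-in data)
raw = torch.randint(0, 256, (4, 1, 28, 28), dtype=torch.uint8)

# Mimic the scaling that transforms.ToTensor() applies to 8-bit images: divide by 255
images = raw.float() / 255.0

# Flatten each image the way the network's input layer does
flat = images.view(images.shape[0], -1)

print(flat.shape)  # torch.Size([4, 784])
print(images.min().item() >= 0.0, images.max().item() <= 1.0)  # True True
```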


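**1. Manual implementation of the feedforward neural network**  
1. `MyNet` is the manually implemented feedforward network. Its `dropout` parameter is the drop rate, which Experiment 1 varies to compare different settings.
2. The function `train_and_test` can be reused by later experiments that need a manual multi-class implementation. The loss function is `CrossEntropyLoss()` and the optimizer is the hand-written stochastic gradient descent function `mySGD()`. The remaining parameters are:
    + `epochs=40`: total number of training epochs, default 40  
    + `lr=0.01`: learning rate, default 0.01  
    + `L2=False`: whether to add the L2 penalty term, default False  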

In [1]:
# Define the feedforward neural network
import time
import numpy as np
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
class MyNet():
    def __init__(self,dropout=0):
        self.dropout = dropout  # drop rate
        self.is_train = None  # set by train() / test()
        # numbers of input, hidden, and output units
        num_inputs, num_hiddens, num_outputs = 28 * 28, 256, 10  # 10-class problem
        w_1 = torch.tensor(np.random.normal(0, 0.01, (num_hiddens, num_inputs)), dtype=torch.float32,
                           requires_grad=True)
        b_1 = torch.zeros(num_hiddens, dtype=torch.float32, requires_grad=True)
        w_2 = torch.tensor(np.random.normal(0, 0.01, (num_outputs, num_hiddens)), dtype=torch.float32,
                           requires_grad=True)
        b_2 = torch.zeros(num_outputs, dtype=torch.float32, requires_grad=True)
        self.params = [w_1, b_1, w_2, b_2]
        self.w = [w_1,w_2]
        # model structure
        self.input_layer = lambda x: x.view(x.shape[0], -1)
        self.hidden_layer = lambda x: self.my_relu(torch.matmul(x, w_1.t()) + b_1)
        self.output_layer = lambda x: torch.matmul(x, w_2.t()) + b_2
    
    def my_relu(self, x):
        return torch.max(input=x, other=torch.tensor(0.0))
    # Call train() before training and test() before evaluation to switch dropout on or off
    def train(self):
        self.is_train = True
    def test(self):
        self.is_train = False
    # forward pass
    def forward(self, x):
        x = self.input_layer(x)
        if self.is_train: # dropout is applied only during training
            x = dropout_layer(x,dropout=self.dropout) 
        x = self.hidden_layer(x)  # hidden_layer already applies ReLU
        if self.is_train:
            x = dropout_layer(x,dropout=self.dropout)
        x = self.output_layer(x)
        return x
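def dropout_layer(x, dropout):
    """
    Dropout layer (inverted dropout).
    x: input tensor
    dropout: probability of dropping each element
    """
    assert 0 <= dropout <= 1 # the drop rate must lie in [0, 1]
    # dropout == 1: every element is dropped
    if dropout == 1:
        return torch.zeros_like(x)
    # dropout == 0: every element is kept
    if dropout == 0:
        return x
    # rand_like draws from the uniform distribution on [0, 1), so each element is kept with probability 1 - dropout
    mask = (torch.rand_like(x) > dropout).float()
    # scale the kept elements by 1 / (1 - dropout) so the expected value is unchanged
    return mask * x / (1.0 - dropout)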

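# Default optimizer: hand-written mini-batch SGD
# batchsize is unused because CrossEntropyLoss already averages the loss over the batch
def mySGD(params, lr, batchsize):
    for param in params:
        param.data -= lr * param.grad

# L2 penalty term; w is the list of weight matrices, here [w_1, w_2]; batch_size = 128
def l2_penalty(w):
    cost = 0
    for i in range(len(w)):
        cost += (w[i]**2).sum()
    return cost / batch_size / 2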
def train_and_test(model=None,epochs=40,lr=0.01,L2=False,lambd=0.0):
    """
    Training function.
    model: the model to train; defaults to MyNet(0), the network without dropout
    epochs: total number of training epochs, default 40
    lr: learning rate, default 0.01
    L2: whether to add the L2 penalty term, default False
    lambd: weight of the L2 penalty term, only used when L2 is True
    The loss function is CrossEntropyLoss and the optimizer is the hand-written mySGD.
    """
    if model is None:
        model = MyNet()  # create a fresh model per call instead of sharing one default instance
    train_all_loss = []  # training loss per epoch
    test_all_loss = []  # test loss per epoch
    train_ACC, test_ACC = [], [] # accuracy per epoch
    begintime = time.time()
    optimizer=mySGD # the optimizer is the hand-written mySGD function
    criterion = CrossEntropyLoss() # loss function
    for epoch in range(epochs):
        model.train() # switch back to training mode every epoch so dropout stays enabled
        train_l,train_acc_num = 0, 0
        for data, labels in traindataloader:
            pred = model.forward(data)
            train_each_loss = criterion(pred, labels)  # loss of this batch
            # add the L2 penalty term when L2 is True
            if L2:
                train_each_loss += lambd * l2_penalty(model.w)
            train_l += train_each_loss.item()
            train_each_loss.backward()  # backpropagation
            optimizer(model.params, lr, 128)  # update the parameters with mini-batch SGD
            train_acc_num += (pred.argmax(dim=1)==labels).sum().item()
            # reset the gradients
            for param in model.params:
                param.grad.data.zero_()
        train_all_loss.append(train_l)  # sum of the batch losses over the epoch
        train_ACC.append(train_acc_num / len(traindataset)) # training accuracy
        model.test() # evaluation mode: dropout is disabled
        with torch.no_grad():
            test_l, test_acc_num = 0, 0
            for data, labels in testdataloader:
                pred = model.forward(data)
                test_each_loss = criterion(pred, labels)
                test_l += test_each_loss.item()
                test_acc_num += (pred.argmax(dim=1)==labels).sum().item()
            test_all_loss.append(test_l)
            test_ACC.append(test_acc_num / len(testdataset))   # test accuracy
        if epoch == 0 or (epoch + 1) % 4 == 0:
            print('epoch: %d | train loss:%.5f | test loss:%.5f | train acc: %.2f | test acc: %.2f'
                  % (epoch + 1, train_l, test_l, train_ACC[-1],test_ACC[-1]))
    endtime = time.time()
    print("Manual implementation, dropout = %.1f, %d epochs, total time: %.3fs" % (model.dropout, epochs, endtime - begintime))
    return train_all_loss,test_all_loss,train_ACC,test_ACC
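
A quick way to sanity-check a manual dropout layer is to apply it to a tensor of ones and look at two statistics: the fraction of zeroed elements should be close to the drop rate, and the mean should stay close to 1 because of the rescaling. The sketch below is self-contained; `inverted_dropout` is a hypothetical stand-in written for this check, following the same logic as `dropout_layer` above.

```python
import torch

def inverted_dropout(x, p):
    """Zero each element with probability p and rescale the rest by 1 / (1 - p)."""
    if p == 0:
        return x
    if p == 1:
        return torch.zeros_like(x)
    mask = (torch.rand_like(x) > p).float()
    return mask * x / (1.0 - p)

torch.manual_seed(0)
x = torch.ones(100_000)
stats = {}
for p in (0.3, 0.6, 0.9):
    out = inverted_dropout(x, p)
    dropped = (out == 0).float().mean().item()   # empirical drop fraction
    mean = out.mean().item()                     # should stay near 1.0
    stats[p] = (dropped, mean)
    print(f"p={p}: dropped fraction {dropped:.3f}, mean after dropout {mean:.3f}")
```

The mean fluctuates more for large drop rates, since fewer surviving elements each carry a larger weight.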


### 1.2.1 Manual implementation: dropout = 0

In [None]:
# dropout = 0  epoch = 40  lr = 0.01  optimizer = mySGD

model_11 = MyNet(dropout=0)
train_all_loss11,test_all_loss11,\
train_ACC11,test_ACC11 \
= train_and_test(model=model_11,epochs=40,lr=0.01)

### 1.2.2 Manual implementation: dropout = 0.3  

In [None]:
# dropout = 0.3  epoch = 40  lr = 0.01  optimizer = mySGD

model_12 = MyNet(dropout=0.3)
train_all_loss12,test_all_loss12,\
train_ACC12,test_ACC12 \
= train_and_test(model=model_12,epochs=40,lr=0.01)

### 1.2.3 Manual implementation: dropout = 0.6

In [None]:
# dropout = 0.6  epoch = 40  lr = 0.01  optimizer = mySGD

model_13 = MyNet(dropout=0.6)
train_all_loss13,test_all_loss13,\
train_ACC13,test_ACC13 \
= train_and_test(model=model_13,epochs=40,lr=0.01)

### 1.2.4 Manual implementation: dropout = 0.9

In [None]:
# dropout = 0.9  epoch = 40  lr = 0.01  optimizer = mySGD

model_14 = MyNet(dropout=0.9)
train_all_loss14,test_all_loss14,\
train_ACC14,test_ACC14 \
= train_and_test(model=model_14,epochs=40,lr=0.01)

  
**2. Implementation of the feedforward neural network with torch.nn**  

In [17]:
# Feedforward neural network with torch.nn for the multi-class task
from collections import OrderedDict
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
# Define the feedforward neural network
class MyNet_NN(nn.Module):
    def __init__(self,dropout=0.0):
        super(MyNet_NN, self).__init__()
        # numbers of input, hidden, and output units
        self.num_inputs, self.num_hiddens, self.num_outputs = 28 * 28, 256, 10  # 10-class problem
        # model structure
        self.input_layer = nn.Flatten()
        self.hidden_layer = nn.Linear(28*28,256)
        # dropout layer with the given drop rate
        self.drop = nn.Dropout(dropout)
        self.output_layer = nn.Linear(256,10)
        # ReLU activation
        self.relu = nn.ReLU()
        
    # forward pass
    def forward(self, x):
        x = self.drop(self.input_layer(x))
        x = self.drop(self.hidden_layer(x))
        x = self.relu(x)
        x = self.output_layer(x)
        return x

# Training
# Default settings: 784 inputs, 256 hidden units, 10 outputs, ReLU activation, no dropout
model = MyNet_NN()  
model = model.to(device)

# Wrap the training procedure in a function so it can be reused
def train_and_test_NN(model=model,epochs=40,lr=0.01):
    MyModel = model
    print(MyModel)
    optimizer = SGD(MyModel.parameters(), lr=lr)  # optimizer
    criterion = CrossEntropyLoss() # loss function
    train_all_loss = []  # training loss per epoch
    test_all_loss = []  # test loss per epoch
    train_ACC, test_ACC = [], []
    begintime = time.time()
    for epoch in range(epochs):
        MyModel.train()  # training mode: nn.Dropout is active
        train_l, train_epoch_count, test_epoch_count = 0, 0, 0
        for data, labels in traindataloader:
            data, labels = data.to(device), labels.to(device)
            pred = MyModel(data)
            train_each_loss = criterion(pred, labels.view(-1))  # loss of this batch
            optimizer.zero_grad()  # reset the gradients
            train_each_loss.backward()  # backpropagation
            optimizer.step()  # parameter update
            train_l += train_each_loss.item()
            train_epoch_count += (pred.argmax(dim=1)==labels).sum()
        train_ACC.append(train_epoch_count.cpu()/len(traindataset))
        train_all_loss.append(train_l)  # sum of the batch losses over the epoch
        MyModel.eval()  # evaluation mode: nn.Dropout is disabled
        with torch.no_grad():
            test_loss, test_epoch_count= 0, 0
            for data, labels in testdataloader:
                data, labels = data.to(device), labels.to(device)
                pred = MyModel(data)
                test_each_loss = criterion(pred,labels)
                test_loss += test_each_loss.item()
                test_epoch_count += (pred.argmax(dim=1)==labels).sum()
            test_all_loss.append(test_loss)
            test_ACC.append(test_epoch_count.cpu()/len(testdataset))
        if epoch == 0 or (epoch + 1) % 4 == 0:
            print('epoch: %d | train loss:%.5f | test loss:%.5f | train acc:%5f test acc:%.5f:' % (epoch + 1, train_all_loss[-1], test_all_loss[-1],
                                                                                                                     train_ACC[-1],test_ACC[-1]))
    endtime = time.time()
    print("torch.nn feedforward network, multi-class task, %d epochs, total time: %.3fs" % (epochs, endtime - begintime))
    # return the loss and accuracy on the training and test sets
    return train_all_loss,test_all_loss,train_ACC,test_ACC
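
With torch.nn, whether dropout is applied depends on the module's mode rather than on an explicit flag: `nn.Dropout` is active in training mode and acts as the identity in evaluation mode, and calling `train()` or `eval()` on a model switches all of its submodules. The short sketch below shows this on a standalone layer; the drop rate 0.5 and the tensor of ones are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(2, 8)

drop.train()            # training mode: elements are zeroed or scaled by 1 / (1 - p) = 2
train_out = drop(x)

drop.eval()             # evaluation mode: dropout is the identity
eval_out = drop(x)

print(train_out)                  # every entry is either 0.0 or 2.0
print(torch.equal(eval_out, x))   # True
```

A model that is never switched to evaluation mode keeps dropping units while it is being tested, which makes the reported test loss and accuracy noisy and pessimistic.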

### 1.3.1 torch.nn implementation: dropout = 0

In [36]:
# dropout = 0  epoch = 40  lr = 0.01  optimizer = SGD

model_15 = MyNet_NN(dropout=0)
model_15 = model_15.to(device)
train_all_loss15,test_all_loss15,train_ACC15,test_ACC15 = train_and_test_NN(model=model_15,epochs=40,lr=0.01)

MyNet_NN(
  (input_layer): Flatten(start_dim=1, end_dim=-1)
  (hidden_layer): Linear(in_features=784, out_features=256, bias=True)
  (drop): Dropout(p=0, inplace=False)
  (output_layer): Linear(in_features=256, out_features=10, bias=True)
  (relu): ReLU()
)
epoch: 1 | train loss:659.22677 | test loss:74.46372 | train acc:0.620433 test acc:0.68020:
epoch: 4 | train loss:292.03407 | test loss:48.62702 | train acc:0.792500 test acc:0.78780:
epoch: 8 | train loss:239.93499 | test loss:41.59586 | train acc:0.827450 test acc:0.81610:
epoch: 12 | train loss:220.08005 | test loss:38.91176 | train acc:0.839300 test acc:0.82600:
epoch: 16 | train loss:208.55522 | test loss:37.24038 | train acc:0.846550 test acc:0.83370:
epoch: 20 | train loss:200.48044 | test loss:35.91030 | train acc:0.851833 test acc:0.83970:
epoch: 24 | train loss:193.93721 | test loss:35.25744 | train acc:0.857167 test acc:0.83970:
epoch: 28 | train loss:188.91335 | test loss:34.40012 | train acc:0.861000 test acc:0.84520:
e

### 1.3.2 torch.nn implementation: dropout = 0.3

In [None]:
# dropout = 0.3  epoch = 40  lr = 0.01  optimizer = SGD
model_16 = MyNet_NN(dropout=0.3)
model_16 = model_16.to(device)
train_all_loss16,test_all_loss16,train_ACC16,test_ACC16 = train_and_test_NN(model=model_16,epochs=40,lr=0.01)

### 1.3.3 torch.nn implementation: dropout = 0.6

In [None]:
# dropout = 0.6  epoch = 40  lr = 0.01  optimizer = SGD
model_17 = MyNet_NN(dropout=0.6)
model_17 = model_17.to(device)
train_all_loss17,test_all_loss17,train_ACC17,test_ACC17 = train_and_test_NN(model=model_17,epochs=40,lr=0.01)

### 1.3.4 torch.nn implementation: dropout = 0.9

In [None]:
# dropout = 0.9  epoch = 40  lr = 0.01  optimizer = SGD
model_18 = MyNet_NN(dropout=0.9)
model_18 = model_18.to(device)
train_all_loss18,test_all_loss18,train_ACC18,test_ACC18 = train_and_test_NN(model=model_18,epochs=40,lr=0.01)

In [74]:

def l2_penalty(w):
    return torch.sqrt((w**2).sum())
a = torch.tensor([2,2,3,4,5],dtype=torch.float32)
print(l2_penalty(a))
print(torch.norm(a,2))

tensor(7.6158)
tensor(7.6158)


## 1.4 Analysis of Results
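
Since the task suggests comparing drop rates through loss curves, one option is to draw the curves for all drop rates on a single figure. The helper below is only a sketch: `plot_curves_by_dropout` is a hypothetical name, and the numbers passed to it are placeholders, not experimental results. The `Agg` backend line lets the sketch run without a display and can be dropped inside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt

def plot_curves_by_dropout(curves, metric="Test Loss"):
    """Plot one curve per drop rate; `curves` maps a drop rate to a per-epoch list of values."""
    fig, ax = plt.subplots()
    for rate, values in sorted(curves.items()):
        ax.plot(range(1, len(values) + 1), values, label=f"dropout = {rate}")
    ax.set_xlabel("Epoch")
    ax.set_ylabel(metric)
    ax.set_title(f"{metric} for different drop rates")
    ax.legend()
    ax.grid(True)
    return ax

# Placeholder numbers only, NOT experimental results
dummy = {0.0: [3.0, 2.0, 1.5], 0.3: [3.2, 2.3, 1.8], 0.6: [3.5, 2.8, 2.4], 0.9: [4.0, 3.8, 3.7]}
ax = plot_curves_by_dropout(dummy)
```

With the lists returned by the manual experiments in Section 1.2, the call would look like `plot_curves_by_dropout({0: test_all_loss11, 0.3: test_all_loss12, 0.6: test_all_loss13, 0.9: test_all_loss14})`.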

---

# 2. Implementing $L_2$ regularization manually and with torch.nn in a multi-class classification task

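## 2.1 Task Description

1. Task requirements  
Implement $L_2$ regularization both manually and with torch.nn in a multi-class classification task.  
2. Task objective  
Investigate how the weight of the penalty term affects the results (loss curves may be used to show this).
3. Algorithm and principle  
$L_2$ regularization: a penalty proportional to the squared $L_2$ norm of the weights is added to the loss, which discourages large weights and helps reduce overfitting.

4. Dataset  
   Fashion-MNIST dataset (the same dataset loaded in Section 1.2.0):  
     + It contains 60,000 training images and 10,000 test images.  
     + Each image has a fixed size of 28x28 pixels with values in the range 0 to 1, and is flattened into a 784-dimensional vector before it is fed to the network.  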
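
The $L_2$ principle in item 3 can be made concrete with the usual textbook formulation. The symbols below are notation introduced here: $L(w)$ is the unregularized loss, $\lambda$ the penalty weight, and $\eta$ the learning rate. Note that the manual `l2_penalty` function in Section 1.2 additionally divides by `batch_size`, so in that code the effective $\lambda$ is `lambd / batch_size`.

$$
J(w) = L(w) + \frac{\lambda}{2}\lVert w\rVert_2^2,
\qquad
\nabla_w J = \nabla_w L + \lambda w,
\qquad
w \leftarrow w - \eta\,(\nabla_w L + \lambda w) = (1-\eta\lambda)\,w - \eta\,\nabla_w L
$$

The last form shows why the penalty is also called weight decay: every update first shrinks the weights by the factor $1-\eta\lambda$ and then applies the ordinary gradient step.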
        
## 2.2 Approach and Code  
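
One common way to obtain $L_2$ regularization with the PyTorch built-ins is the `weight_decay` argument of `torch.optim.SGD`, which adds `weight_decay * w` to each gradient before the update. The sketch below is an assumption about how the two approaches in this section could be related, not the report's own code: it checks on a tiny random problem that one manual step with the penalty `(lam / 2) * ||w||^2` matches one built-in step with `weight_decay=lam`. The sizes, `lam`, and `lr` are arbitrary illustrative values.

```python
import torch

torch.manual_seed(0)
lam, lr = 0.1, 0.01
w0 = torch.randn(5, 3)
x, y = torch.randn(16, 5), torch.randint(0, 3, (16,))
loss_fn = torch.nn.CrossEntropyLoss()

# Manual L2: add (lam / 2) * ||w||^2 to the loss, then take a plain SGD step
w_manual = w0.clone().requires_grad_()
loss = loss_fn(x @ w_manual, y) + lam / 2 * (w_manual ** 2).sum()
loss.backward()
with torch.no_grad():
    w_manual -= lr * w_manual.grad

# Built-in: leave the loss unchanged and let SGD add lam * w to the gradient
w_builtin = w0.clone().requires_grad_()
optimizer = torch.optim.SGD([w_builtin], lr=lr, weight_decay=lam)
loss_fn(x @ w_builtin, y).backward()
optimizer.step()

print(torch.allclose(w_manual, w_builtin, atol=1e-6))  # True
```

Passing all of `model.parameters()` to the optimizer also decays the biases, whereas a manual penalty over the weight matrices alone does not; parameter groups can be used if only the weights should be penalized.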



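### 2.2.1 Manual implementation: penalty weight lambd = 0 (no penalty)

### 2.2.2 Manual implementation: penalty weight lambd = 2

### 2.2.3 Manual implementation: penalty weight lambd = 4

### 2.3.1 torch.nn implementation: penalty weight lambd = 0 (no penalty)

### 2.3.2 torch.nn implementation: penalty weight lambd = 2

### 2.3.3 torch.nn implementation: penalty weight lambd = 4

## 2.4 Analysis of Results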

---

---

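# A1 Reflections on the Experiment

I learned to build feedforward neural networks both manually and with torch.nn to solve regression, binary classification, and multi-class classification problems.
1. The learning rate turned out to be critical: if it is too large, accuracy tends to drop, and if it is too small, the model takes longer to converge.
2. Overfitting appeared during the experiments and was corrected by adjusting the relevant parameters.
3. I learned to write code in a modular way to avoid duplicating code.
4. I gained a clearer understanding of how to choose activation functions.
5. The number of hidden layers and the number of neurons in each hidden layer strongly affect the model.

# A2 References  
Course slides (PPT)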