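# Environment setup code
Before we start, we need to run some boilerplate code to set up the environment. You need to rerun this setup code every time you start the notebook.

First, run this cell to load the [autoreload](https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html?highlight=autoreload) extension. This lets us edit `.py` source files and re-import them into the notebook for a seamless editing and debugging experience. The same cell also loads the `tensorboard` extension.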

In [None]:
%load_ext autoreload
%autoreload 2
%load_ext tensorboard

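### Google Colab setup
Next, we need to run a few commands to set up the environment on Google Colab. If you are running this notebook on a local machine, you can skip this section.

Run the following cell to mount your Google Drive. Follow the link, sign in to your Google account (the same account you used to store this notebook!), and copy the authorization code into the text box that appears below.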

In [None]:
from google.colab import drive
drive.mount('/content/drive')

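Now recall the path in your Google Drive where you uploaded this notebook, and fill it in below. If everything is working, running the following cell should print the filenames from the assignment: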

```
['08-20-第一次.ipynb', 'get_rank_cpu.py', 'get_rank_gpu.py', '__pycache__', 'mlp.py']
```

In [None]:
import os

GOOGLE_DRIVE_PATH_AFTER_MYDRIVE = '我的笔记本电脑/任务2'
GOOGLE_DRIVE_PATH = os.path.join('drive', 'Othercomputers', GOOGLE_DRIVE_PATH_AFTER_MYDRIVE)
print(os.listdir(GOOGLE_DRIVE_PATH))

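Once you have successfully mounted your Google Drive and located the path to this assignment, run the following cell so that we can import from the assignment's `.py` files. If it works correctly, it should print the message: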

```
Hello from get_rank.py!
```

as well as the last edit time of the file `get_rank_gpu.py`.

In [None]:
import sys
sys.path.append(GOOGLE_DRIVE_PATH)

import time, os
os.environ["TZ"] = "US/Eastern"
time.tzset()

from get_rank_gpu import hello
hello()

get_rank_gpu_path = os.path.join(GOOGLE_DRIVE_PATH, 'get_rank_gpu.py')
get_rank_gpu_path_edit_time = time.ctime(os.path.getmtime(get_rank_gpu_path))
print('get_rank_gpu.py last edited on %s' % get_rank_gpu_path_edit_time)

# Observing the baseline

### Verifying the computation of rk and Rk





In [None]:
from get_rank_gpu import Effective_Ranks, get_Effective_Ranks, get_Effective_Ranks_GPU
import torch
import matplotlib.pyplot as plt
import torchvision
from torchvision.datasets import CIFAR10
from torchvision.transforms import ToTensor
import torchvision.transforms as transforms
import numpy as np

# Create a get_Effective_Ranks_GPU instance and move the data to the GPU
get_cifar10_gpu = get_Effective_Ranks_GPU(dataset_name='CIFAR10', path_to_dataset_folder='dataset_folder', my_transform=transforms.Compose([ToTensor(),]))
get_cifar10_gpu.convert_to_matrix()
get_cifar10_gpu.convert_to_rank()
get_cifar10_gpu.plot_vectors()
rk_1, Rk_2 = get_cifar10_gpu.train_rk, get_cifar10_gpu.train_Rk
print(rk_1, "\n", Rk_2)
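
As a cross-check for the values printed above, here is a minimal NumPy sketch of the two effective ranks. It assumes that `get_rank_gpu.py` follows the common definitions $r_k(\Sigma) = \sum_{i>k}\lambda_i / \lambda_{k+1}$ and $R_k(\Sigma) = (\sum_{i>k}\lambda_i)^2 / \sum_{i>k}\lambda_i^2$, where $\lambda_1 \ge \lambda_2 \ge \dots$ are the eigenvalues of the data covariance $\Sigma$. The function name `effective_ranks` and the random demo data are illustrative and not part of the assignment files.

```python
import numpy as np

def effective_ranks(X, k_max):
    """Return lists (r_k, R_k) for k = 0..k_max-1.

    X has shape (n_samples, n_features). k_max must stay below the
    number of non-zero covariance eigenvalues to avoid dividing by zero.
    """
    X = np.asarray(X, dtype=np.float64)
    Xc = X - X.mean(axis=0, keepdims=True)        # center the features
    cov = Xc.T @ Xc / X.shape[0]                  # empirical covariance
    eig = np.linalg.eigvalsh(cov)[::-1]           # eigenvalues, descending
    eig = np.clip(eig, 0.0, None)                 # remove tiny negative round-off
    r, R = [], []
    for k in range(k_max):
        tail = eig[k:]                            # lambda_{k+1}, lambda_{k+2}, ...
        s1 = tail.sum()
        s2 = (tail ** 2).sum()
        r.append(s1 / tail[0])                    # r_k
        R.append(s1 ** 2 / s2)                    # R_k
    return r, R

# Demo on random data (not CIFAR-10)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2000, 8))
r_demo, R_demo = effective_ranks(X_demo, k_max=3)
print(r_demo, R_demo)
```

Under these definitions $r_k \le R_k$ always holds, which gives a quick sanity check on any implementation.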

### Building the MLP

The MLP is defined up front so that every later code cell can use it.

In [None]:
import torch.nn as nn
import torch.nn.functional as F

''' MLP '''
class MLP(nn.Module):
    def __init__(self, channel=3, num_classes=10, im_size=(32, 32)):
        super(MLP, self).__init__()
        self.fc_1 = nn.Linear(im_size[0] * im_size[1]*channel, 128)
        self.fc_2 = nn.Linear(128, 128)
        self.fc_3 = nn.Linear(128, num_classes)

    def forward(self, x):
        out = x.view(x.size(0), -1)
        out = F.relu(self.fc_1(out))
        out = F.relu(self.fc_2(out))
        out = self.fc_3(out)
        return out
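
To make the size of this network concrete, the parameter count can be worked out by hand: each `nn.Linear(in_features, out_features)` layer holds `in_features * out_features` weights plus `out_features` biases. The sketch below only does this arithmetic for the default arguments of `MLP`; the helper name `linear_params` is illustrative.

```python
def linear_params(in_features, out_features):
    """Number of trainable parameters in a fully connected layer with bias."""
    return in_features * out_features + out_features

channel, im_size, num_classes = 3, (32, 32), 10
input_dim = im_size[0] * im_size[1] * channel  # flattened image size

# (in_features, out_features) of fc_1, fc_2, fc_3
layer_sizes = [(input_dim, 128), (128, 128), (128, num_classes)]
per_layer = [linear_params(i, o) for i, o in layer_sizes]
total_params = sum(per_layer)
print(per_layer, total_params)
```

Almost all of the parameters sit in the first layer, because it maps the full flattened image to 128 hidden units.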

### Training the model

For simplicity, the first training run applies no additional processing to the images.

In [None]:
import torch
import torch.nn as nn
from torchvision.transforms import ToTensor
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torch.nn.functional as F
from torch.utils.tensorboard import SummaryWriter
from get_rank_gpu import get_Effective_Ranks_GPU

# Select the device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Set up the TensorBoard writer
writer = SummaryWriter("logs_train")

# Data preprocessing, defined here but not used in this first run
CIFAR_MEAN = [0.49139968, 0.48215827, 0.44653124]
CIFAR_STD = [0.2023, 0.1994, 0.2010]
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(size=(32, 32), scale=(0.8, 1.0)),  # random crop, resized back to 32x32
    transforms.ToTensor(),
    transforms.Normalize(CIFAR_MEAN, CIFAR_STD)
])

# Load CIFAR-10 through get_Effective_Ranks_GPU, which also computes the covariance
get_cifar10_gpu_normal = get_Effective_Ranks_GPU(dataset_name='CIFAR10', path_to_dataset_folder='dataset_folder', my_transform=transforms.Compose([ToTensor(),]))
# Build the loaders once and reuse the result for all three splits
loaders = get_cifar10_gpu_normal.build_dataloader()
train_loader, val_loader, test_loader = loaders[0], loaders[1], loaders[2]

# Effective ranks derived from the covariance
rk, Rk = get_cifar10_gpu_normal.train_rk, get_cifar10_gpu_normal.train_Rk
rk_max_value = max(rk)  # largest value in the list
rk_max_index = rk.index(rk_max_value)  # index of the largest value
Rk_max_value = max(Rk)  # largest value in the list
Rk_max_index = Rk.index(Rk_max_value)  # index of the largest value

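For now we use the hyperparameters `momentum = 0.9`, `weight_decay = 0.0005`, `learning_rate = 0.01`, `batch_size = 512`, and `num_epochs = 5`. The trained network is saved as a state dictionary with `torch.save(model.state_dict(), 'model_normal.pth')`, so that other files or code cells can later restore it with `model.load_state_dict(torch.load('model_normal.pth'))`.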
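
Because the training cell saves `model.state_dict()`, which stores only the parameters, loading requires constructing a model of the same architecture first and then calling `load_state_dict`. A minimal round-trip sketch, using a small stand-in `nn.Linear` model and a temporary file (both are assumptions for illustration) rather than the notebook's `MLP` and `model_normal.pth`:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model; in the notebook this would be an MLP instance
model_demo = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_state.pth")
    torch.save(model_demo.state_dict(), path)        # save parameters only

    restored = nn.Linear(4, 2)                       # rebuild the same architecture
    state = torch.load(path, map_location="cpu")     # load the parameter dictionary
    restored.load_state_dict(state)                  # copy parameters into the model

restored.eval()  # switch to evaluation mode before inference

# Check that every parameter tensor survived the round trip unchanged
same = all(
    torch.equal(p, q)
    for p, q in zip(model_demo.state_dict().values(), restored.state_dict().values())
)
print(same)
```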

In [None]:
# Provisional hyperparameters
momentum = 0.9
weight_decay = 0.0005
learning_rate = 0.01
batch_size = 512  # not passed anywhere in this cell; the loaders were built above
num_epochs = 5

# Create the model and move it to the GPU (or the CPU fallback)
channel = 3
num_classes = 10
im_size = (32, 32)
model = MLP(channel=channel, num_classes=num_classes, im_size=im_size)
model.to(device)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum, weight_decay=weight_decay)

# Number of training steps so far
total_train_step = 0
# Running counters for the training accuracy (accumulated over all epochs)
correct = 0
total = 0

# Training loop
model.train()
for epoch in range(num_epochs):
    print("------------ Epoch {} started -----------".format(epoch + 1))

    for batch_idx, (imgs, targets) in enumerate(train_loader):
        # Move the batch to the device
        imgs, targets = imgs.to(device), targets.to(device)
        # imgs: tensor of shape (batch_size, channels, height, width)
        # targets: tensor of shape (batch_size,) holding the label of each image

        outputs = model(imgs)
        loss = criterion(outputs, targets)

        # Update the running accuracy
        _, predicted = torch.max(outputs, 1)
        total += targets.size(0)
        correct += (predicted == targets).sum().item()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_step = epoch * len(train_loader) + batch_idx + 1
        if total_train_step % 100 == 0:
            acc = correct / total
            print("Step {}: loss {:.4f}, accuracy {:.2f}%".format(total_train_step, loss.item(), acc * 100))
            writer.add_scalar("train_loss", loss.item(), total_train_step)
            writer.add_scalar("train_accuracy", acc, total_train_step)
    # Log the weight histograms to TensorBoard once per epoch
    for name, param in model.named_parameters():
        writer.add_histogram(name, param.detach().cpu().numpy(), epoch)
print("Training finished.")
writer.close()  # close the writer once the event log has been written
# Save the parameters after the training loop
torch.save(model.state_dict(), 'model_normal.pth')
print("Model parameters saved to model_normal.pth")


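We have now finished the first training run, with no **augmentation** at all. Next we evaluate this network on the validation set in order to find good **hyperparameters**. At present these fall into two groups:

1. ```momentum = 0.9, weight_decay = 0.0005, learning_rate = 0.01, batch_size = 512, num_epochs = 5```
the training hyperparameters, of which the learning rate, momentum, and weight decay are passed to the optimizer ```optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum, weight_decay=weight_decay)```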

2. ```
    CIFAR_MEAN = [0.49139968, 0.48215827, 0.44653124]
    CIFAR_STD = [0.2023, 0.1994, 0.2010]
    train_transform = transforms.Compose([transforms.RandomResizedCrop(size=(32, 32), scale=(0.8, 1.0)),transforms.ToTensor(),transforms.Normalize(CIFAR_MEAN, CIFAR_STD)])
  ```
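    that is, the image **augmentation** we are interested in

We must not compute the loss and accuracy directly on the test set, so 10% of the training set is split off as a validation set, and suitable hyperparameters are searched for on it.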
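
As an illustration of such a 90/10 split, here is a minimal sketch using `torch.utils.data.random_split` on a small synthetic dataset. How `get_Effective_Ranks_GPU.build_dataloader()` actually performs the split is not shown in this notebook, so the dataset, sizes, seed, and variable names below are assumptions.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Synthetic stand-in for the training set: 100 "images" with integer labels
features = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, 10, (100,))
full_train = TensorDataset(features, labels)

# Hold out 10% of the samples for validation
val_size = len(full_train) // 10
train_size = len(full_train) - val_size

# A seeded generator makes the split reproducible across runs
generator = torch.Generator().manual_seed(0)
train_subset, val_subset = random_split(
    full_train, [train_size, val_size], generator=generator
)
print(len(train_subset), len(val_subset))
```

Fixing the seed matters here: if the split were redrawn on every call, samples used for training in one run could end up in the validation set of another.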

In [None]:
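# Validation: reset the counters so training statistics do not leak in
correct = 0
total = 0
loss_sum = 0.0
model.eval()  # switch to evaluation mode
with torch.no_grad():  # no gradients are needed for validation
    for batch_idx, (imgs, targets) in enumerate(val_loader):
        # Move the batch to the device
        imgs, targets = imgs.to(device), targets.to(device)
        # imgs: tensor of shape (batch_size, channels, height, width)
        # targets: tensor of shape (batch_size,) holding the label of each image

        outputs = model(imgs)
        batch_loss = criterion(outputs, targets)  # mean loss over this batch
        loss_sum += batch_loss.item() * targets.size(0)

        # Count the correct predictions
        _, predicted = torch.max(outputs, 1)
        total += targets.size(0)
        correct += (predicted == targets).sum().item()

# Average over the whole validation set; the next cell reads `loss` and `acc`
loss = loss_sum / total
acc = correct / total
print("Validation loss {:.4f}, accuracy {:.2f}%".format(loss, acc * 100))
writer.add_scalar("val_loss", loss, num_epochs)
writer.add_scalar("val_accuracy", acc, num_epochs)
writer.close()
print("Validation finished.")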

In [None]:
normal_dict = {"max_rk": rk_max_value, "max_rk_index": rk_max_index, "model_path": "model_normal.pth", "val_loss": float(loss), "val_accuracy": acc}
print(normal_dict)

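# Searching for suitable hyperparameters

To sum up, the model has been built and the baseline has been recorded. We now start looking for a suitable augmentation.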
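
For reference, `RandomResizedCrop` samples a crop whose area is a fraction of the original image area drawn from `scale`, with an aspect ratio drawn from `ratio` (default `(3/4, 4/3)`), and then resizes the crop to `size`. Passing `scale=(s, s)` therefore fixes the area fraction at `s`. The sketch below computes the approximate side length of a square crop of a 32x32 image for a few values of `s`; the helper `square_crop_side` is illustrative and ignores the random aspect ratio.

```python
import math

def square_crop_side(scale, height=32, width=32):
    """Side length in pixels of a square crop covering `scale` of the image area."""
    return round(math.sqrt(scale * height * width))

for s in (0.1, 0.5, 0.9, 1.0):
    print(s, square_crop_side(s))
```

A small `s` means the network only ever sees a heavily zoomed-in patch of each image, so low values are a much stronger augmentation than high ones.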

In [None]:
import torch
import torch.nn as nn
from torchvision.transforms import ToTensor
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torch.nn.functional as F
from torch.utils.tensorboard import SummaryWriter
from get_rank_gpu import get_Effective_Ranks_GPU

def get_scale(scale):
  """Train and validate the MLP with RandomResizedCrop fixed at the given area scale."""
  # ---------------- Basic setup before training, and effective ranks ----------------
  # Select the device
  device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
  # Set up a TensorBoard writer with one log directory per scale
  writer = SummaryWriter("logs_train_scale={}".format(scale))

  # Data preprocessing
  CIFAR_MEAN = [0.49139968, 0.48215827, 0.44653124]
  CIFAR_STD = [0.2023, 0.1994, 0.2010]
  train_transform = transforms.Compose([
        transforms.RandomResizedCrop(size=(32, 32), scale=(scale, scale)),  # crop a fixed area fraction, resize back to 32x32
        transforms.ToTensor(),
        transforms.Normalize(CIFAR_MEAN, CIFAR_STD)
      ])

  # Load CIFAR-10 through get_Effective_Ranks_GPU, which also computes the covariance
  get_cifar10_gpu_normal = get_Effective_Ranks_GPU(dataset_name='CIFAR10', path_to_dataset_folder='dataset_folder', my_transform=train_transform)
  # Build the loaders once and reuse the result for all three splits
  loaders = get_cifar10_gpu_normal.build_dataloader()
  train_loader, val_loader, test_loader = loaders[0], loaders[1], loaders[2]

  # Effective ranks derived from the covariance
  rk, Rk = get_cifar10_gpu_normal.train_rk, get_cifar10_gpu_normal.train_Rk
  rk_max_value = max(rk)  # largest value in the list
  rk_max_index = rk.index(rk_max_value)  # index of the largest value
  Rk_max_value = max(Rk)  # largest value in the list
  Rk_max_index = Rk.index(Rk_max_value)  # index of the largest value
  # Provisional hyperparameters
  momentum = 0.9
  weight_decay = 0.0005
  learning_rate = 0.01
  batch_size = 512  # not passed anywhere in this function
  num_epochs = 5

  # Create the model and move it to the GPU (or the CPU fallback)
  channel = 3
  num_classes = 10
  im_size = (32, 32)
  model = MLP(channel=channel, num_classes=num_classes, im_size=im_size)
  model.to(device)

  # Loss function and optimizer
  criterion = nn.CrossEntropyLoss()
  optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum, weight_decay=weight_decay)

  # Number of training steps so far
  total_train_step = 0
  # Running counters for the training accuracy (accumulated over all epochs)
  correct = 0
  total = 0

  # ---------------- Training ----------------
  # Training loop
  model.train()
  for epoch in range(num_epochs):
      # print("------------ Epoch {} started -----------".format(epoch + 1))

      for batch_idx, (imgs, targets) in enumerate(train_loader):
          # Move the batch to the device
          imgs, targets = imgs.to(device), targets.to(device)
          # imgs: tensor of shape (batch_size, channels, height, width)
          # targets: tensor of shape (batch_size,) holding the label of each image

          outputs = model(imgs)
          loss = criterion(outputs, targets)

          # Update the running accuracy
          _, predicted = torch.max(outputs, 1)
          total += targets.size(0)
          correct += (predicted == targets).sum().item()

          optimizer.zero_grad()
          loss.backward()
          optimizer.step()

          total_train_step = epoch * len(train_loader) + batch_idx + 1
          if total_train_step % 100 == 0:
              acc = correct / total
              # print("Step {}: loss {:.4f}, accuracy {:.2f}%".format(total_train_step, loss.item(), acc * 100))
              writer.add_scalar("train_loss", loss.item(), total_train_step)
              writer.add_scalar("train_accuracy", acc, total_train_step)
      # Log the weight histograms to TensorBoard once per epoch
      for name, param in model.named_parameters():
          writer.add_histogram(name, param.detach().cpu().numpy(), epoch)

  print("Training finished.")
  writer.close()  # close the writer once the event log has been written
  # Save the model parameters after the training loop
  filename = "model_scale={}.pth".format(scale)
  torch.save(model.state_dict(), filename)
  print("Model parameters saved to {}".format(filename))

  # ---------------- Training done, start validation ----------------
  # Reset the counters so training statistics do not leak into validation
  correct = 0
  total = 0
  loss_sum = 0.0
  model.eval()  # switch to evaluation mode
  with torch.no_grad():  # no gradients are needed for validation
      for batch_idx, (imgs, targets) in enumerate(val_loader):
          # Move the batch to the device
          imgs, targets = imgs.to(device), targets.to(device)
          # imgs: tensor of shape (batch_size, channels, height, width)
          # targets: tensor of shape (batch_size,) holding the label of each image

          outputs = model(imgs)
          batch_loss = criterion(outputs, targets)  # mean loss over this batch
          loss_sum += batch_loss.item() * targets.size(0)

          # Count the correct predictions
          _, predicted = torch.max(outputs, 1)
          total += targets.size(0)
          correct += (predicted == targets).sum().item()

  # Average over the whole validation set
  val_loss = loss_sum / total
  val_acc = correct / total
  writer.add_scalar("val_loss", val_loss, num_epochs)
  writer.add_scalar("val_accuracy", val_acc, num_epochs)
  writer.close()
  print("Validation finished.")

  # ---------------- Record the results ----------------
  result = {"max_rk": rk_max_value, "max_rk_index": rk_max_index, "model_path": filename, "val_loss": val_loss, "val_accuracy": val_acc}
  print(result)
  return result


In [None]:
# Scale values to try: 0.1, 0.2, ..., 0.9
scale_min = 0.1
scale_max = 0.9
scale_list = [round(x * 0.1, 1) for x in range(int(round(scale_min * 10)), int(round(scale_max * 10)) + 1)]
# print(scale_list)
dic = {}
for scale in scale_list:
  print("Starting scale={}".format(scale), "\n")
  dic['scale={}'.format(scale)] = get_scale(scale=scale)

print(dic)
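
Once the sweep has finished, the best setting can be read off the collected results. A minimal sketch, assuming each entry of the results dictionary stores a validation accuracy under some key; the helper `best_entry`, the key name `val_accuracy`, and the numbers below are made-up placeholders for illustration, not measured results:

```python
def best_entry(results, metric_key):
    """Return the (name, record) pair with the highest value of metric_key."""
    return max(results.items(), key=lambda item: item[1][metric_key])

# Placeholder values for illustration only
example_results = {
    "scale=0.1": {"val_accuracy": 0.30},
    "scale=0.5": {"val_accuracy": 0.42},
    "scale=0.9": {"val_accuracy": 0.40},
}

best_name, best_record = best_entry(example_results, "val_accuracy")
print(best_name, best_record["val_accuracy"])
```

With the real sweep, `example_results` would be replaced by `dic` and `metric_key` by whichever key holds the validation accuracy in the dictionaries returned by `get_scale`.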