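Key principles of learning:
1. Learn systematically: picking up isolated fragments leaves you unable to see the forest for the trees. Only systematic study shows you how something works from beginning to end, and only then do you know how to create with it.
2. Practice a lot: the best way to learn is to teach others, because explaining something requires a clear, ordered grasp of every detail. Hands-on practice is another form of "explaining".
3. Review often: the brain forgets by design. Forgetting is not a flaw; it happens because the brain cannot tell whether a memory is important. Only through repeated recall does the brain learn that something matters.

This is the first lecture on PyTorch. I plan to write a complete series for learning PyTorch systematically from scratch. The goal by the end is to be able to reproduce any paper independently, from scratch.

Back to the point: this article is an appetizer and follows a whole-first, details-later approach. This first lecture presents the deep learning code template I have distilled. Deep learning is not that arcane; essentially all deep learning code follows this same template.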

# Seven Steps for Training a Neural Network

[Systematic PyTorch Learning, Lecture 1: A Deep Learning Template](https://zhuanlan.zhihu.com/p/599280748)

[pytorch/tutorials/beginner_source/basics/quickstart_tutorial.py](https://github.com/pytorch/tutorials/blob/main/beginner_source/basics/quickstart_tutorial.py)

In [1]:
# import libraries
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, utils



In [2]:
# set random seed
seed = 0 
np.random.seed(seed=seed)
torch.random.manual_seed(seed=seed)

<torch._C.Generator at 0x1195716f0>
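
To see what seeding buys you, the following minimal sketch re-seeds both generators and checks that the same random numbers come out twice. The helper name `draw` is hypothetical and exists only for this illustration.

```python
import numpy as np
import torch


def draw(seed: int):
    """Re-seed NumPy and PyTorch, then draw a few random numbers from each."""
    np.random.seed(seed)
    torch.manual_seed(seed)
    return np.random.rand(3), torch.rand(3)


np_a, torch_a = draw(0)
np_b, torch_b = draw(0)
np_c, torch_c = draw(1)

# Same seed gives identical draws; a different seed gives different ones.
print(np.array_equal(np_a, np_b), torch.equal(torch_a, torch_b))
print(torch.equal(torch_a, torch_c))
```

Seeding alone does not remove every source of nondeterminism (some GPU kernels, for example, are not deterministic), so treat it as a baseline rather than a guarantee.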

## Step 1: Dataset: wrap the data in a Dataset

In [3]:
# download=True fetches the files on the first run and reuses them afterwards
train_data = datasets.FashionMNIST(root="./data", train=True, transform=transforms.ToTensor(), download=True)
test_data = datasets.FashionMNIST(root="./data", train=False, transform=transforms.ToTensor(), download=True)


Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 26421880/26421880 [00:02<00:00, 11176967.32it/s]
Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 29515/29515 [00:00<00:00, 918066.81it/s]
Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 4422102/4422102 [00:00<00:00, 8813566.78it/s]
Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 5148/5148 [00:00<00:00, 14708635.55it/s]
Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
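
`datasets.FashionMNIST` is a ready-made `Dataset`. For your own data, the same interface can be implemented by hand. The sketch below is a minimal example on synthetic tensors; the class name `ToyDataset` and its random contents are assumptions made for illustration and are not part of the original notebook.

```python
import torch
from torch.utils.data import Dataset


class ToyDataset(Dataset):
    """Hypothetical map-style dataset: random 1x28x28 'images' with labels 0..9."""

    def __init__(self, n_samples: int = 100):
        generator = torch.Generator().manual_seed(0)
        self.images = torch.rand(n_samples, 1, 28, 28, generator=generator)
        self.labels = torch.randint(0, 10, (n_samples,), generator=generator)

    def __len__(self):
        # Number of samples; DataLoader uses this to know how many indices exist.
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (input, target) pair, the same layout FashionMNIST yields.
        return self.images[idx], self.labels[idx]


toy_data = ToyDataset()
image, label = toy_data[0]
print(len(toy_data), image.shape, label.dtype)
```

Anything that implements `__len__` and `__getitem__` this way can be passed to `DataLoader` in the next step.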






## Step 2: Load the dataset: use a DataLoader to build an iterator, and remember to pass batch_size

In [4]:
train_dataloader = DataLoader(train_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=False)
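
To make the role of `batch_size` concrete, here is a small sketch that batches a synthetic `TensorDataset` and records the shape of each batch. The synthetic data and the name `toy_loader` are assumptions for illustration; the notebook itself uses FashionMNIST.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 150 synthetic samples shaped like FashionMNIST images, with labels 0..9.
images = torch.rand(150, 1, 28, 28)
labels = torch.randint(0, 10, (150,))
toy_loader = DataLoader(TensorDataset(images, labels), batch_size=64, shuffle=True)

batch_shapes = []
for X, y in toy_loader:
    # X has shape [N, C, H, W]; y has shape [N].
    batch_shapes.append((tuple(X.shape), tuple(y.shape)))
    print(X.shape, y.shape)

print(len(toy_loader))  # number of batches per epoch
```

Because 150 is not a multiple of 64, the last batch is smaller than the others; passing `drop_last=True` would discard it instead.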


## Step 3: Create the model

In [5]:
# Define the network class: subclass nn.Module and implement forward
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
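
A quick sanity check before training is to push a random batch through the network and confirm the output shape. The sketch below repeats the model definition so that it runs on its own; the random input `dummy_batch` is a stand-in for real images.

```python
import torch
import torch.nn as nn


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.linear_relu_stack(self.flatten(x))


net = NeuralNetwork()
dummy_batch = torch.rand(64, 1, 28, 28)  # random stand-in for a batch of images
logits = net(dummy_batch)
n_params = sum(p.numel() for p in net.parameters())

print(logits.shape)  # one row of 10 class scores per image
print(n_params)      # total number of trainable weights and biases
```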



In [6]:
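# Create the loss function

loss_fn = nn.CrossEntropyLoss() # use a built-in loss function


# Custom loss by subclassing nn.Module; typically needed when reproducing a paper

class CustomLoss(nn.Module):
    def __init__(self):
        super(CustomLoss, self).__init__()
        # reduction="none" keeps one loss value per sample so it can be masked
        self.criterion = nn.CrossEntropyLoss(reduction="none")

    def forward(self, output, target):
        target = target.long()  # CrossEntropyLoss expects int64 class indices
        loss = self.criterion(output, target)
        mask = target == 9  # samples of class 9 ("Ankle boot") get an extra penalty
        high_cost = (loss * mask.float()).mean()
        return loss.mean() + high_cost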
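
It helps to know what `nn.CrossEntropyLoss` computes: a log-softmax over the logits, followed by the negative log-probability of the target class, averaged over the batch. The sketch below checks this by hand on made-up logits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two samples, three classes; the values are made up for illustration.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Built-in loss: takes raw logits and integer class indices.
builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, pick the target entry per row, negate, average.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(builtin.item(), manual.item())
```

Since the loss applies the softmax itself, the model should return raw logits, which is what `NeuralNetwork.forward` does.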




## Step 4: Set up the training parameters

In [7]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu")
print(f"device: {device}")

device: mps


In [8]:
# Instantiate the network and move it to the device with to(device)
model = NeuralNetwork().to(device)

print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)


In [9]:
# Create the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
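
Plain SGD applies the update `w = w - lr * grad` to every parameter. The sketch below verifies this on a two-element tensor; the toy loss and the learning rate of 0.1 are assumptions chosen to keep the arithmetic easy to follow.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
sgd = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # the gradient of this loss is 2 * w
sgd.zero_grad()        # clear any gradient left over from a previous step
loss.backward()        # fills w.grad with [2.0, 4.0]

# Hand-computed update, taken before the optimizer modifies w in place.
expected = w.detach() - 0.1 * w.grad
sgd.step()

print(w.detach(), expected)
```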

In [11]:
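# Write the train function
# 1. Switch to training mode: model.train()
# 2. Loop over the batches
# 3. Predict and compute the loss
# 4. Backpropagate and update the weights
# 5. Every 100 batches, read out the loss and progress (the print is commented out to keep the log short)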

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            #print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f}")
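
The accuracy line in `test` packs several operations into one expression. The sketch below unpacks it on a tiny handmade batch; the scores and labels are made up for illustration.

```python
import torch

# Four samples, three classes.
pred = torch.tensor([[0.1, 2.0, 0.3],
                     [1.5, 0.2, 0.1],
                     [0.3, 0.1, 0.9],
                     [0.8, 0.7, 0.1]])
y = torch.tensor([1, 0, 1, 0])

predicted_class = pred.argmax(1)                 # index of the largest score per row
hits = (predicted_class == y).type(torch.float)  # 1.0 where correct, 0.0 otherwise
accuracy = hits.sum().item() / len(y)

print(predicted_class.tolist(), accuracy)
```

In `test`, the same count is accumulated over all batches and divided by the dataset size at the end.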


## Step 5: Train the model

In [13]:
epochs = 50
for t in range(epochs):
    print(f"Epoch {t+1}:")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)

Epoch 1:
Test Error: Accuracy: 79.2%, Avg loss: 0.605012
Epoch 2:
Test Error: Accuracy: 79.5%, Avg loss: 0.595239
Epoch 3:
Test Error: Accuracy: 79.6%, Avg loss: 0.588063
Epoch 4:
Test Error: Accuracy: 80.0%, Avg loss: 0.579006
Epoch 5:
Test Error: Accuracy: 80.2%, Avg loss: 0.572682
Epoch 6:
Test Error: Accuracy: 80.1%, Avg loss: 0.566434
Epoch 7:
Test Error: Accuracy: 80.3%, Avg loss: 0.562854
Epoch 8:
Test Error: Accuracy: 80.6%, Avg loss: 0.553573
Epoch 9:
Test Error: Accuracy: 80.6%, Avg loss: 0.547929
Epoch 10:
Test Error: Accuracy: 80.9%, Avg loss: 0.544086
Epoch 11:
Test Error: Accuracy: 80.9%, Avg loss: 0.540304
Epoch 12:
Test Error: Accuracy: 81.3%, Avg loss: 0.534956
Epoch 13:
Test Error: Accuracy: 81.3%, Avg loss: 0.532731
Epoch 14:
Test Error: Accuracy: 81.3%, Avg loss: 0.527043
Epoch 15:
Test Error: Accuracy: 81.6%, Avg loss: 0.524065
Epoch 16:
Test Error: Accuracy: 81.7%, Avg loss: 0.521020
Epoch 17:
Test Error: Accuracy: 81.7%, Avg loss: 0.517351
Epoch 18:
Test Error: A

## Step 6: Save the model

In [14]:
torch.save(model.state_dict(), 'data/0_FashionMINST_NN.pth')
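
A `state_dict` is a mapping from parameter names to tensors, so saving and loading it restores the weights but not the class definition. The sketch below shows a round trip; it writes to an in-memory buffer instead of a `.pth` file and uses a small `nn.Linear` layer, both of which are assumptions made to keep the example self-contained.

```python
import io

import torch
import torch.nn as nn

source = nn.Linear(4, 2)
buffer = io.BytesIO()  # in-memory stand-in for a .pth file
torch.save(source.state_dict(), buffer)

buffer.seek(0)
restored = nn.Linear(4, 2)  # must have the same architecture as the saved model
restored.load_state_dict(torch.load(buffer))

print(list(source.state_dict().keys()))
print(torch.equal(source.weight, restored.weight))
```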

## Step 7: Run inference with the model

In [24]:
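model = NeuralNetwork()
# map_location="cpu" lets the weights load on the CPU even if they were saved from another device (e.g. mps)
model.load_state_dict(torch.load("data/0_FashionMINST_NN.pth", map_location="cpu"))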

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]


model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Ankle boot", Actual: "Ankle boot"
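
The model outputs raw logits, so `argmax` gives the predicted class but not a confidence. A softmax turns the logits into probabilities. The sketch below uses made-up logits for a single image; in a real run they would come from `model(x)`.

```python
import torch

classes = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
           "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up logits for one image (shape [1, 10]).
logits = torch.tensor([[-1.0, -2.0, 0.0, -1.5, 0.2, 1.0, 0.1, 2.0, 0.5, 4.0]])

probs = torch.softmax(logits, dim=1)[0]  # non-negative values that sum to 1
top_prob, top_idx = probs.topk(3)        # the three most likely classes

for p, i in zip(top_prob.tolist(), top_idx.tolist()):
    print(f"{classes[i]}: {p:.3f}")
```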
