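# torch4keras User Guide
[torch4keras](https://github.com/Tongjilibo/torch4keras) lets you use PyTorch the way you would use Keras. It is a trainer abstracted from [bert4torch](https://github.com/Tongjilibo/bert4torch) and is suitable for training general neural networks: you only need to implement the network architecture and do not have to write the training-engineering code.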

## 1. Training Process
```text
2022-10-28 23:16:10 - Start Training
2022-10-28 23:16:10 - Epoch: 1/5
5000/5000 [==============================] - 13s 3ms/step - loss: 0.1351 - acc: 0.9601
Evaluate: 100%|██████████████████████████████████████████████████| 2500/2500 [00:03<00:00, 798.09it/s] 
test_acc: 0.98045. best_test_acc: 0.98045

2022-10-28 23:16:27 - Epoch: 2/5
5000/5000 [==============================] - 13s 3ms/step - loss: 0.0465 - acc: 0.9862
Evaluate: 100%|██████████████████████████████████████████████████| 2500/2500 [00:03<00:00, 635.78it/s] 
test_acc: 0.98280. best_test_acc: 0.98280

2022-10-28 23:16:44 - Epoch: 3/5
5000/5000 [==============================] - 15s 3ms/step - loss: 0.0284 - acc: 0.9915
Evaluate: 100%|██████████████████████████████████████████████████| 2500/2500 [00:03<00:00, 673.60it/s] 
test_acc: 0.98365. best_test_acc: 0.98365

2022-10-28 23:17:03 - Epoch: 4/5
5000/5000 [==============================] - 15s 3ms/step - loss: 0.0179 - acc: 0.9948
Evaluate: 100%|██████████████████████████████████████████████████| 2500/2500 [00:03<00:00, 692.34it/s] 
test_acc: 0.98265. best_test_acc: 0.98365

2022-10-28 23:17:21 - Epoch: 5/5
5000/5000 [==============================] - 14s 3ms/step - loss: 0.0129 - acc: 0.9958
Evaluate: 100%|██████████████████████████████████████████████████| 2500/2500 [00:03<00:00, 701.77it/s] 
test_acc: 0.98585. best_test_acc: 0.98585

2022-10-28 23:17:37 - Finish Training
```

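## 2. Features
1. **Model training**: The training workflow closely resembles Keras. `model.compile(optimizer, loss, scheduler, metrics)` specifies the optimizer, loss, scheduler, and metrics; `model.fit(train_dataloader, epochs, steps_per_epoch)` runs the training.
2. **Highlights**: A progress bar that shows training progress; built-in and custom metrics; built-in callbacks such as Evaluator, Checkpoint, Tensorboard, and Logger, plus support for custom callbacks; multi-GPU training with DP and DDP.
3. **Design motivation**: It started out as the trainer for [bert4torch](https://github.com/Tongjilibo/bert4torch) and [rec4torch](https://github.com/Tongjilibo/rec4torch), and can be used to train all kinds of PyTorch models.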
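
For context, the following is a minimal sketch of the hand-written PyTorch loop that a Keras-style `compile` + `fit` pair stands in for. It is not the torch4keras implementation; the toy data, the `run_epoch` helper, and the hyperparameters are assumptions made only for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy data standing in for a real dataset: 64 samples, 4 features, 2 classes.
features = torch.randn(64, 4)
labels = (features.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(features, labels), batch_size=8)

net = nn.Linear(4, 2)
optimizer = optim.Adam(net.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()


def run_epoch():
    """Run one pass over the loader and return (mean loss, accuracy)."""
    total_loss, hit, seen = 0.0, 0, 0
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        logits = net(batch_x)
        loss = loss_fn(logits, batch_y)
        loss.backward()
        optimizer.step()
        # Accumulate sample-weighted loss and the number of correct predictions.
        total_loss += loss.item() * batch_y.shape[0]
        hit += (logits.argmax(dim=-1) == batch_y).sum().item()
        seen += batch_y.shape[0]
    return total_loss / seen, hit / seen


history = [run_epoch() for _ in range(5)]
for epoch, (loss, acc) in enumerate(history, start=1):
    print(f"Epoch {epoch}/5 - loss: {loss:.4f} - acc: {acc:.4f}")
```

Everything inside `run_epoch`, together with progress display, evaluation, and checkpointing, is the bookkeeping that a trainer takes over.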

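## 3. Modeling Workflow
### 3.1 Load the data
This example uses a dataset that ships with torchvision. More generally, you would read your own data, wrap it in a `Dataset`, and use a `DataLoader` to build the training set.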

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch4keras.model import BaseModel
from torch4keras.snippets import seed_everything, Checkpoint, Evaluator, EarlyStopping
from torch.utils.data import TensorDataset, DataLoader
from tqdm import tqdm

seed_everything(42)
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the data
mnist = torchvision.datasets.MNIST(root='./', download=True)
# `data` and `targets` replace the deprecated `train_data` / `train_labels` attributes
x, y = mnist.data.unsqueeze(1), mnist.targets
x, y = x.to(device), y.to(device)
x = x.float() / 255.0    # scale the pixels to [0, 1]
# First 40,000 samples for training, the remaining 20,000 for evaluation
x_train, y_train = x[:40000], y[:40000]
train_dataloader = DataLoader(TensorDataset(x_train, y_train), batch_size=8)
x_test, y_test = x[40000:], y[40000:]
test_dataloader = DataLoader(TensorDataset(x_test, y_test), batch_size=8)
```
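
For the more general case of reading your own data, a custom `Dataset` might look like the sketch below. `PairDataset` and the `raw` samples are hypothetical names and values made up for illustration; only `Dataset` and `DataLoader` come from PyTorch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PairDataset(Dataset):
    """Hypothetical dataset wrapping in-memory (features, label) pairs."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        features, label = self.samples[idx]
        return (torch.tensor(features, dtype=torch.float32),
                torch.tensor(label, dtype=torch.long))


# Made-up samples; in practice these would be read from disk or a database.
raw = [([0.0, 1.0], 0), ([1.0, 0.0], 1), ([0.5, 0.5], 0),
       ([0.9, 0.1], 1), ([0.2, 0.8], 0)]
loader = DataLoader(PairDataset(raw), batch_size=2, shuffle=False)

# Five samples with batch_size=2 give two full batches and one partial batch.
batches = list(loader)
for batch_x, batch_y in batches:
    print(tuple(batch_x.shape), batch_y.tolist())
```

A loader built this way yields `(inputs, labels)` batches, the same shape of output as the `TensorDataset` loaders used in this guide.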

### 3.2 Define the network
There are two options: subclass `BaseModel` directly to define the network (recommended), or instantiate the network and pass it to `BaseModel`.

```python
# Option 1: subclass BaseModel
class MyModel(BaseModel):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(7744, 10)
        )
    def forward(self, inputs):
        return self.model(inputs)
model = MyModel().to(device)

# Option 2: wrap an instantiated network in BaseModel
# net = torch.nn.Sequential(
#             nn.Conv2d(1, 32, kernel_size=3), nn.ReLU(),
#             nn.MaxPool2d(2, 2),
#             nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(),
#             nn.Flatten(),
#             nn.Linear(7744, 10)
#         )
# model = BaseModel(net).to(device)
```
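
The input size `7744` of the final `nn.Linear` follows from the 28×28 MNIST images and the layer settings above. The sketch below redoes that arithmetic with the standard output-size formula for unpadded convolution and pooling layers; `conv_out` is a helper name introduced here for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                                   # MNIST images are 28x28
side = conv_out(side, kernel=3)             # Conv2d(1, 32, 3)   -> 26
side = conv_out(side, kernel=2, stride=2)   # MaxPool2d(2, 2)    -> 13
side = conv_out(side, kernel=3)             # Conv2d(32, 64, 3)  -> 11
flat_features = 64 * side * side            # 64 channels * 11 * 11

print(side, flat_features)  # prints: 11 7744
```

If you change the kernel sizes, add padding, or feed images of another resolution, the `nn.Linear` input size has to be recomputed in the same way.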

### 3.3 Configure training with `compile`
Specify the optimizer, loss, scheduler, metrics, and other parameters.

```python
model.compile(optimizer=optim.Adam(model.parameters()), loss=nn.CrossEntropyLoss(), metrics=['acc'])
```
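
The call above does not pass a scheduler. As a sketch of what one could look like, the snippet below builds a standard PyTorch `StepLR` and records the learning rate it produces. The toy `net`, the learning rate, `step_size`, and `gamma` are illustrative assumptions. The commented-out `compile` call assumes the keyword is named `scheduler`, as the signature quoted in section 2 suggests; whether the trainer advances the scheduler per step or per epoch is not stated in this guide.

```python
import torch
import torch.nn as nn
import torch.optim as optim

net = nn.Linear(4, 2)  # toy network, for illustration only
optimizer = optim.Adam(net.parameters(), lr=1e-3)
# Halve the learning rate every time the scheduler is stepped.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

lrs = []
for _ in range(3):
    lrs.append(optimizer.param_groups[0]['lr'])
    optimizer.zero_grad()
    net(torch.randn(2, 4)).sum().backward()
    optimizer.step()      # the optimizer steps before the scheduler
    scheduler.step()

print(lrs)

# Assumed usage with the model from this guide (keyword name is an assumption):
# model.compile(optimizer=optimizer, loss=nn.CrossEntropyLoss(),
#               scheduler=scheduler, metrics=['acc'])
```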

### 3.4 Define callbacks

```python
class MyEvaluator(Evaluator):
    # Override the evaluation function
    def evaluate(self):
        # `total` starts at a tiny value to avoid division by zero on an empty loader
        total, hit = 1e-5, 0
        for X, y in tqdm(test_dataloader):
            pred_y = self.model.predict(X).argmax(dim=-1)
            hit += pred_y.eq(y).sum().item()
            total += y.shape[0]
        return {'test_acc': hit/total}
evaluator = MyEvaluator(monitor='test_acc',
                        checkpoint_path='./ckpt/best_model.pt',
                        optimizer_path='./ckpt/best_optimizer.pt',
                        steps_params_path='./ckpt/best_step_params.pt')
ckpt = Checkpoint('./ckpt/model_{epoch}_{test_acc:.5f}.pt',
                    optimizer_path='./ckpt/optimizer_{epoch}_{test_acc:.5f}.pt',
                    steps_params_path='./ckpt/steps_params_{epoch}_{test_acc:.5f}.pt')
early_stop = EarlyStopping(monitor='test_acc', verbose=1)
```
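
The placeholders in the checkpoint paths and the `monitor='test_acc'` argument can be pictured with plain Python. The sketch below is not the library's code: it uses `str.format` on the same template and tracks a running best over the `test_acc` values printed in the log in section 1. The epoch number passed in is an assumption; this guide does not say whether the library counts epochs from 0 or 1.

```python
template = './ckpt/model_{epoch}_{test_acc:.5f}.pt'

# Hypothetical log values for one epoch; the accuracy is taken from the log in section 1.
path = template.format(epoch=1, test_acc=0.98045)
print(path)  # prints: ./ckpt/model_1_0.98045.pt

# Running best of the monitored metric, using the five test_acc values from section 1.
test_accs = [0.98045, 0.98280, 0.98365, 0.98265, 0.98585]
running_best = []
best = float('-inf')
for acc in test_accs:
    best = max(best, acc)   # a higher accuracy replaces the previous best
    running_best.append(best)

print([f"{b:.5f}" for b in running_best])
# prints: ['0.98045', '0.98280', '0.98365', '0.98365', '0.98585']
```

The fourth entry stays at `0.98365`, matching the `best_test_acc` line for epoch 4 in the log, where the accuracy dipped.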

### 3.5 Train the model

```python
model.fit(train_dataloader, steps_per_epoch=100, epochs=5, callbacks=[evaluator, ckpt, early_stop])
```
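
To see how much data this call covers, the arithmetic below assumes that `steps_per_epoch` counts batches per epoch, as it does in Keras. Under that assumption, `steps_per_epoch=100` with `batch_size=8` touches only a small part of the 40,000 training samples in each epoch, whereas the log in section 1 shows 5000 steps per epoch, which corresponds to a full pass over the training set.

```python
import math

train_samples, batch_size = 40000, 8
steps_per_epoch, epochs = 100, 5

# Batches needed for one full pass over the training set.
batches_per_full_pass = math.ceil(train_samples / batch_size)

# Samples processed under the settings passed to fit() above.
samples_per_epoch = steps_per_epoch * batch_size
total_samples_seen = samples_per_epoch * epochs
fraction_of_one_pass = total_samples_seen / train_samples

print(batches_per_full_pass)   # prints: 5000
print(samples_per_epoch)       # prints: 800
print(total_samples_seen)      # prints: 4000
print(fraction_of_one_pass)    # prints: 0.1
```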

## 4. Recommended GitHub Repositories
- This project: [torch4keras](https://github.com/Tongjilibo/torch4keras)
- NLP: a PyTorch implementation modeled on bert4keras: [bert4torch](https://github.com/Tongjilibo/bert4torch)
- Recommendation: modeled on deepctr (just getting started): [rec4torch](https://github.com/Tongjilibo/rec4torch)