The `DataModule` class has one parent class, `HyperParameters`.
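
`HyperParameters` itself is not shown in these notes. As a rough sketch, assuming `save_hyperparameters` simply copies the arguments of the calling `__init__` onto `self` (this implementation and the `ToyData` class are assumptions for illustration, not the library's actual code), the mechanism could look like this:

```python
import inspect


class HyperParameters:
    """Hypothetical stand-in for the HyperParameters base class."""

    def save_hyperparameters(self, ignore=()):
        # Inspect the caller's frame (normally __init__) and copy its
        # local arguments onto self as attributes.
        frame = inspect.currentframe().f_back
        _, _, _, local_vars = inspect.getargvalues(frame)
        skip = set(ignore) | {"self"}
        for name, value in local_vars.items():
            if name not in skip and not name.startswith("_"):
                setattr(self, name, value)


class ToyData(HyperParameters):
    """Hypothetical subclass used only to demonstrate the behaviour."""

    def __init__(self, root="../data", num_workers=4):
        self.save_hyperparameters()


data = ToyData(num_workers=2)
print(data.root, data.num_workers)
```

Under this assumption, attributes such as `self.num_train`, `self.batch_size` or `self.lr` used later would become available once a subclass accepts them as `__init__` arguments and calls `save_hyperparameters()`.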

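The `DataModule` class defines the following methods:

- `__init__`

Takes the URL of the data or the root directory of the data, then downloads and preprocesses the data as needed.

The dataset can also be built here directly (although doing this in the `DIYDataset` class is still recommended):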
```python
self.X = torch.randn(n, len(w))
noise = torch.randn(n, 1) * noise
self.y = torch.matmul(self.X, w.reshape((-1, 1))) + b + noise 
```
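- `def train_dataloader` calls `def get_dataloader` with `train=True` and returns the data loader for the training set.
- `def val_dataloader` (optional) calls `def get_dataloader` with `train=False` and returns the data loader for the validation set.
- `def get_dataloader` calls `def get_tensorloader`, passing along whether the training set is requested, and returns the resulting DataLoader.
- `def get_tensorloader` builds the dataset that matches the `train` flag (a `Dataset` subclass such as `DIYDataset`; the code in this section uses `torch.utils.data.TensorDataset`), wraps it in a DataLoader, and returns that DataLoader.

How samples are fetched (`__getitem__`) and counted (`__len__`) is determined by the `DIYDataset` class.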
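
The notes mention `DIYDataset` without showing it. A minimal sketch of what such a map-style dataset might look like, assuming it only wraps a feature tensor and a label tensor (the class body, the synthetic values of `w`, `b`, `n` and the batch size are all illustrative assumptions):

```python
import torch
from torch.utils.data import DataLoader, Dataset


class DIYDataset(Dataset):
    """Hypothetical map-style dataset over a feature and a label tensor."""

    def __init__(self, X, y):
        assert len(X) == len(y), "features and labels must align"
        self.X, self.y = X, y

    def __len__(self):
        # Number of examples; DataLoader uses it to size one epoch.
        return len(self.X)

    def __getitem__(self, idx):
        # One (features, label) pair; DataLoader stacks pairs into a batch.
        return self.X[idx], self.y[idx]


torch.manual_seed(0)
w, b, n = torch.tensor([2.0, -3.4]), 4.2, 10  # assumed toy values
X = torch.randn(n, len(w))
y = torch.matmul(X, w.reshape((-1, 1))) + b + 0.01 * torch.randn(n, 1)

loader = DataLoader(DIYDataset(X, y), batch_size=4, shuffle=False)
shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
print(len(loader), shapes)
```

With 10 examples and a batch size of 4, the loader yields two full batches and one final batch of 2, which is why the number of batches is a ceiling division of the dataset length by the batch size.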

In [None]:
"""
Intro:
    The DataModule class is the base class for data.

    __init__ method is used to prepare the data. This includes downloading and preprocessing if needed.

    train_dataloader returns the data loader for the training dataset. A data loader is a Python iterable that yields one data batch per iteration. This batch is then fed into the training_step method of Module to compute the loss.

    There is an optional val_dataloader to return the validation dataset loader.
"""
class DataModule(HyperParameters):  #@save
    """The base class of data."""
    def __init__(self, root='../data', num_workers=4):
        """
        intro:
            read the data path.
        """
        self.save_hyperparameters()

    def train_dataloader(self):
        """
        intro:
            return training dataloader
        """
        return self.get_dataloader(train=True)

    def val_dataloader(self):
        """
        intro:
            return validation dataloader
        """
        return self.get_dataloader(train=False)
    
    def get_dataloader(self, train):
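        """
        intro:
            return the training dataloader if train is True, otherwise the
            validation dataloader. Relies on self.X, self.y and self.num_train,
            which a subclass is expected to define.
        """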
        # raise NotImplementedError
        i = slice(0, self.num_train) if train else slice(self.num_train, None)
        return self.get_tensorloader((self.X, self.y), train, i)

    # add
    def get_tensorloader(self, tensors, train, indices=slice(0, None)):
        """
        intro:
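            select the rows given by `indices` from each tensor, wrap them in a `torch.utils.data.TensorDataset`, then return a dataloader (shuffled only when train is True). Relies on self.batch_size being set by a subclass.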
        """
        tensors = tuple(a[indices] for a in tensors)
        dataset = torch.utils.data.TensorDataset(*tensors)
        return torch.utils.data.DataLoader(dataset, self.batch_size,
                                           shuffle=train)
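
The index selection in `get_dataloader` above can be checked in isolation. The sketch below uses a plain list in place of `self.X` and a hypothetical helper `split_slices`; it only illustrates that the first `num_train` rows go to training and the remainder to validation:

```python
def split_slices(num_train):
    """Return the (train, validation) slices used by get_dataloader."""
    return slice(0, num_train), slice(num_train, None)


rows = list(range(10))  # stand-in for the rows of self.X
train_idx, val_idx = split_slices(num_train=7)
train_rows, val_rows = rows[train_idx], rows[val_idx]
print(train_rows, val_rows)
```

Because the split is positional, any shuffling of the raw data has to happen before the tensors are stored, otherwise the validation set is simply the tail of the data.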

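The `Module` class has two parent classes, `torch.nn.Module` and `HyperParameters`.

The `Module` class defines the following methods:

- `def __init__`:

    Defines the model parameters: `self.w` and `self.b`, or `self.net`.

    Initializes the model parameters.

    Sets up the loss plot by instantiating `ProgressBoard()` as `self.board`.
- `def forward`: defines how the model parameters and the input data are combined to compute the output.
- `def loss`: the loss function. Takes the prediction `y_hat` and the target `y`, then computes and returns the loss.
- `def configure_optimizers`: returns the optimizer, that is, the algorithm that updates the parameters so as to minimize the loss.
- `def training_step`: accepts a data batch during training and returns the loss value. The batch arrives already prepared by the data loader; all elements but the last are passed through the model to obtain the prediction, the last element is the target, and both are passed to `def loss`. The loss of this batch is also sent to `plot`.
- `def validation_step` (optional): used during validation to report the evaluation measures.
- `def plot`: draws the loss against the epoch. The parameter `self.plot_train_per_epoch` determines how many points per epoch are drawn on the figure.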
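
To make the role of `plot_train_per_epoch` concrete, here is a small sketch of the arithmetic `plot` performs, written as a standalone function. The function name `plot_position` and the sample numbers are assumptions. `ProgressBoard` is not shown in this section, so how `draw` uses `every_n` (presumably emitting one point per `every_n` calls) is also an assumption:

```python
def plot_position(train, epoch, train_batch_idx, num_train_batches,
                  num_val_batches, plot_train_per_epoch=2,
                  plot_valid_per_epoch=1):
    """Return (x, every_n) following the formulas in Module.plot."""
    if train:
        # Fractional epoch: batches seen so far over batches per epoch.
        x = train_batch_idx / num_train_batches
        n = num_train_batches / plot_train_per_epoch
    else:
        # Validation points sit at the end of the current epoch.
        x = epoch + 1
        n = num_val_batches / plot_valid_per_epoch
    return x, int(n)


# 32 training batches per epoch, currently halfway through the second epoch.
train_pos = plot_position(True, epoch=1, train_batch_idx=48,
                          num_train_batches=32, num_val_batches=8)
val_pos = plot_position(False, epoch=1, train_batch_idx=64,
                        num_train_batches=32, num_val_batches=8)
print(train_pos, val_pos)
```

With these numbers the training point lands at x = 1.5 with `every_n` = 16, which matches two plotted points per epoch of 32 batches.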

In [None]:
"""
Name: 
    Model module

Intro: 
    The Module class is the base class of all models we will implement. It inherits from `nn.Module`

    __init__, stores the learnable parameters

    training_step method accepts a data batch to return the loss value

    configure_optimizers returns the optimization method, or a list of them, that is used to update the learnable parameters

    validation_step to report the evaluation measures.
"""
class Module(nn.Module, HyperParameters):  #@save
    """The base class of models."""
    def __init__(self, plot_train_per_epoch=2, plot_valid_per_epoch=1):
        """
        intro:
            init self.net as a placeholder for the network; subclasses assign the actual model.
            init self.board, a ProgressBoard used to plot training progress.
        """
        super().__init__()
        self.save_hyperparameters()
        self.net = None
        self.board = ProgressBoard()

    def loss(self, y_hat, y):
        """
        intro:
            init loss(fn), then calculate the loss between (y_hat, y).
        """
        # raise NotImplementedError
        fn = nn.MSELoss()
        return fn(y_hat, y)

    def forward(self, X):
        """
        intro:
            compute the model output for input X by calling self.net.
        """
        # self.net is set to None in __init__, so check the value, not just the attribute
        assert getattr(self, 'net', None) is not None, 'Neural network is not defined'
        return self.net(X)

    def plot(self, key, value, train):
        """
        intro:
            plot loss. Plot a point in animation.
        """
        assert hasattr(self, 'trainer'), 'Trainer is not inited'
        self.board.xlabel = 'epoch'
        if train:
            x = self.trainer.train_batch_idx / \
                self.trainer.num_train_batches
            n = self.trainer.num_train_batches / \
                self.plot_train_per_epoch
        else:
            x = self.trainer.epoch + 1
            n = self.trainer.num_val_batches / \
                self.plot_valid_per_epoch
        self.board.draw(x, value.to(cpu()).detach().numpy(),
                        ('train_' if train else 'val_') + key,
                        every_n=int(n))

    def training_step(self, batch):
        """
        intro:
            how to calculate loss in one batch.
        """
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=True)
        return l

    def validation_step(self, batch):
        """
        intro:
            how to calculate loss in one batch.
        """
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=False)

    def configure_optimizers(self):
        """
        intro:
            return optimizer
        """
        # raise NotImplementedError
        return torch.optim.SGD(self.parameters(), self.lr)

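The `Trainer` class has one parent class, `HyperParameters`.

The `Trainer` class defines the following methods:

- `def __init__` initializes the training settings.

    `max_epochs`: the number of epochs to train for.

    `num_gpus`: the number of GPUs to train on.

    `gradient_clip_val`: whether gradients are clipped (a value greater than 0 enables clipping) and the clipping threshold.
- `def fit` takes `model`, an instance of `Module`, and `data`, an instance of `DataModule`, and runs the training.

    Calls `def prepare_data`.

    Calls `def prepare_model`.

    Calls `model.configure_optimizers` and stores the result in `self.optim`, so the optimizer returned by the module becomes the trainer's optimization algorithm.

    Initializes `self.epoch`, `self.train_batch_idx` and `self.val_batch_idx`.

    Calls `def fit_epoch()` once per epoch.

- `def prepare_data` takes `data`, a `DataModule` instance. Preprocessing has already been done in `DIYDataset` or earlier. It assigns `self.train_dataloader = data.train_dataloader()` and `self.val_dataloader = data.val_dataloader()`, and records the number of batches in each.
- `def prepare_model` takes a `Module` instance. It sets `trainer.model = model`, `model.trainer = trainer`, and `model.board.xlim` to `[0, trainer.max_epochs]`.
- `def fit_epoch` performs the work of one epoch: a training pass over `self.train_dataloader`, followed by an optional validation pass.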
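
The per-batch order of operations in `fit_epoch` (forward pass and loss, `zero_grad`, `backward`, optional clipping, `step`) can be sketched with plain PyTorch, without the `Trainer` and `Module` classes. This is only an illustration under assumptions: the synthetic data, `nn.Linear` model, learning rate and clip value are made up, and `torch.nn.utils.clip_grad_norm_` is swapped in for the `clip_gradients` method, which these notes do not define:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2  # assumed toy parameters
X = torch.randn(256, 2)
y = X @ true_w.reshape(-1, 1) + true_b + 0.01 * torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

net = nn.Linear(2, 1)
loss_fn = nn.MSELoss()
optim = torch.optim.SGD(net.parameters(), lr=0.1)
gradient_clip_val = 1.0  # assumed value; 0 would disable clipping
max_epochs = 5

epoch_losses = []
for epoch in range(max_epochs):
    net.train()
    running = 0.0
    for batch in loader:
        # Same convention as training_step: last element is the target.
        loss = loss_fn(net(*batch[:-1]), batch[-1])
        optim.zero_grad()
        loss.backward()
        if gradient_clip_val > 0:
            # Stand-in for Trainer.clip_gradients (not defined in the notes).
            nn.utils.clip_grad_norm_(net.parameters(), gradient_clip_val)
        optim.step()
        running += loss.item()
    epoch_losses.append(running / len(loader))

print(epoch_losses)
```

The mean loss per epoch should fall as training proceeds. Clipping caps the gradient norm of each step, so early progress is slower than with unclipped SGD.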

In [None]:
"""
Name:
    Training Module

Intro:
    The Trainer class trains the learnable parameters in the Module class with data specified in DataModule. 

    The key method is fit, which accepts two arguments: model, an instance of Module, and data, an instance of DataModule. It then iterates over the entire dataset max_epochs times to train the model. 
"""
class Trainer(HyperParameters):  
    """The base class for training models with data."""
    def __init__(self, max_epochs, num_gpus=0, gradient_clip_val=0):
        """
        intro:
            init trainer
        """
        self.save_hyperparameters()
        assert num_gpus == 0, 'No GPU support yet'

    def prepare_data(self, data):
        """
        intro:
            get self.train_dataloader, self.val_dataloader
        """
        self.train_dataloader = data.train_dataloader()
        self.val_dataloader = data.val_dataloader()
        try:
            self.num_train_batches = len(self.train_dataloader)
            self.num_val_batches = (len(self.val_dataloader)
                                if self.val_dataloader is not None else 0)
        except TypeError:
            # len() is undefined for loaders without __len__ (e.g. plain generators)
            print("not using torch dataset")
    
    # add
    def prepare_batch(self, batch):
        return batch
    
    def prepare_model(self, model):
        """
        intro:
            get self.model
        """
        model.trainer = self
        model.board.xlim = [0, self.max_epochs]
        self.model = model

    def fit(self, model, data):
        """
        intro:
            fit.
        """
        self.prepare_data(data)
        self.prepare_model(model)
        self.optim = model.configure_optimizers()
        self.epoch = 0
        self.train_batch_idx = 0
        self.val_batch_idx = 0
        for self.epoch in range(self.max_epochs):
            self.fit_epoch()

    def fit_epoch(self):
        """
        intro:
            fit per epoch
        """
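        # 1. training time: one optimizer step per batch
        self.model.train()
        for batch in self.train_dataloader:
            loss = self.model.training_step(self.prepare_batch(batch))
            self.optim.zero_grad()
            loss.backward()
            # 2. grad clip and parameter update need no autograd tracking.
            #    clip_gradients is not defined in this class and must be
            #    provided elsewhere before gradient_clip_val > 0 is used.
            with torch.no_grad():
                if self.gradient_clip_val > 0:
                    self.clip_gradients(self.gradient_clip_val, self.model)
                self.optim.step()
            self.train_batch_idx += 1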
        if self.val_dataloader is None:
            return 
        # 3. (optional) validation time
        self.model.eval()
        for batch in self.val_dataloader:
            with torch.no_grad():
                self.model.validation_step(self.prepare_batch(batch))
            self.val_batch_idx += 1