# PyTorch API

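Many libraries and APIs have been built on top of PyTorch, including fastai, Catalyst, PyTorch Lightning, and PyTorch-Ignite.

| Library               | Characteristics and use cases |
| --------------------- | ----------------------------- |
| **fastai**            | High-level API for beginner to intermediate users who want good accuracy with little code; also well suited to teaching |
| **PyTorch Lightning** | Keeps the feel of plain PyTorch while making code easier to structure; aimed at research and production |
| **Catalyst**          | Highly competitive settings; integrates with MLflow and Hydra; strong reproducibility and flexibility |
| **Ignite**            | Minimal framework for intermediate to advanced users who want fine-grained control |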


## PyTorch Lightning
PyTorch Lightning is a high-level framework for training and developing deep learning models in PyTorch concisely and efficiently.

In [9]:
import pytorch_lightning as pl
import torch
import torch.nn as nn
from torchmetrics import Accuracy

class MultiLayerPerceptron(pl.LightningModule):
    def __init__(self, image_shape=(1, 28, 28), hidden_units=(32, 16)):
        super().__init__()
        # New attributes for Lightning: one accuracy metric per data split
        self.train_acc = Accuracy(task="multiclass", num_classes=10)
        self.valid_acc = Accuracy(task="multiclass", num_classes=10)
        self.test_acc = Accuracy(task="multiclass", num_classes=10)

        # Same model as in the previous section
        input_size = image_shape[0] * image_shape[1] * image_shape[2]
        all_layers = [nn.Flatten()]
        for hidden_unit in hidden_units:
            layer = nn.Linear(input_size, hidden_unit)
            all_layers.append(layer)
            all_layers.append(nn.ReLU())
            input_size = hidden_unit

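        all_layers.append(nn.Linear(hidden_units[-1], 10))
        # NOTE: cross_entropy already applies log-softmax, so this Softmax is
        # applied twice and the loss cannot fall below about 1.46 for 10 classes.
        # The logged results in this section were produced with this layer in place.
        all_layers.append(nn.Softmax(dim=1))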
        self.model = nn.Sequential(*all_layers)

    def forward(self, x):
        x = self.model(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)  # reuse logits, no second forward pass
        preds = torch.argmax(logits, dim=1)
        self.train_acc.update(preds, y)
        self.log("train_loss", loss, prog_bar=True)
        return loss

    def on_train_epoch_end(self):
        self.log("train_acc", self.train_acc.compute())
        self.train_acc.reset()

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)  # reuse logits, no second forward pass
        preds = torch.argmax(logits, dim=1)
        self.valid_acc.update(preds, y)
        self.log("valid_loss", loss, prog_bar=True)
        return loss

    def on_validation_epoch_end(self):
        self.log("valid_acc", self.valid_acc.compute(), prog_bar=True)
        self.valid_acc.reset()

    def test_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)  # reuse logits, no second forward pass
        preds = torch.argmax(logits, dim=1)
        self.test_acc.update(preds, y)
        self.log("test_loss", loss, prog_bar=True)
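        # NOTE: compute() here returns the running accuracy over the batches seen
        # so far, and that running value is what gets logged at every step.
        self.log("test_acc", self.test_acc.compute(), prog_bar=True)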
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.001)
        return optimizer
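
The training loss in this section settles near 1.5 instead of approaching 0. A likely cause is that the model ends in `nn.Softmax` while `cross_entropy` expects raw logits. The following is a minimal sketch of that effect, using a hand-made input (an assumption, not data from the notebook) where the model is very confident in the correct class.

```python
import math

import torch
import torch.nn.functional as F

# One sample, 10 classes; the scores strongly favor class 3 (made-up values).
logits = torch.zeros(1, 10)
logits[0, 3] = 20.0
target = torch.tensor([3])

# Raw logits: cross_entropy applies log-softmax itself, so the loss is near 0.
loss_logits = F.cross_entropy(logits, target).item()

# Probabilities passed as if they were logits: softmax is applied twice.
probs = torch.softmax(logits, dim=1)
loss_probs = F.cross_entropy(probs, target).item()

# Lower bound in the double-softmax case: inputs lie in [0, 1], so the best
# case is 1 for the true class and 0 for the other nine.
floor = -math.log(math.e / (math.e + 9))

print(f"loss on logits:        {loss_logits:.4f}")
print(f"loss on probabilities: {loss_probs:.4f}")
print(f"floor for 10 classes:  {floor:.4f}")
```

Even for a nearly perfect prediction the second loss stays at about 1.46. Dropping the final `nn.Softmax` layer and applying `torch.softmax` only when probabilities are needed would avoid this, although the numbers would then differ from the logged run.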


In [10]:
from torch.utils.data import DataLoader
from torch.utils.data import random_split
from torchvision.datasets import MNIST
from torchvision import transforms

class MnistDataModule(pl.LightningDataModule):
    def __init__(self, data_path='./'):
        super().__init__()
        self.data_path = data_path
        self.transform = transforms.Compose([transforms.ToTensor()])

    def prepare_data(self):
        MNIST(root=self.data_path, download=True)

    def setup(self, stage=None):
        # stage is 'fit', 'validate', 'test', or 'predict'
        # (None is used here)
        mnist_all = MNIST(root=self.data_path,
                          train=True,
                          transform=self.transform,
                          download=False)
        self.train, self.val = random_split(
            mnist_all, [55000, 5000],
            generator=torch.Generator().manual_seed(1))
        self.test = MNIST(root=self.data_path,
                          train=False,
                          transform=self.transform,
                          download=False)

    def train_dataloader(self):
        return DataLoader(self.train, batch_size=64, num_workers=4)

    def val_dataloader(self):
        return DataLoader(self.val, batch_size=64, num_workers=4)

    def test_dataloader(self):
        return DataLoader(self.test, batch_size=64, num_workers=4)
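
`setup` relies on `random_split` with a seeded `torch.Generator` so that the train/validation split is the same on every run. The sketch below shows this on a small stand-in dataset (hypothetical, 100 samples instead of MNIST) and also checks that the two subsets do not overlap.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in for MNIST: 100 samples with integer labels.
features = torch.arange(100).float().unsqueeze(1)
labels = torch.arange(100)
dataset = TensorDataset(features, labels)


def split(seed):
    # The lengths must sum to len(dataset), like [55000, 5000] for MNIST.
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [90, 10], generator=generator)


train_a, val_a = split(seed=1)
train_b, val_b = split(seed=1)

# Same seed, same indices; train and validation never share a sample.
same_split = val_a.indices == val_b.indices
disjoint = set(train_a.indices).isdisjoint(val_a.indices)

print(len(train_a), len(val_a), same_split, disjoint)  # 90 10 True True
```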


In [11]:
torch.manual_seed(1)
mnist_dm = MnistDataModule()

In [12]:
mnistclassifier = MultiLayerPerceptron()
if torch.cuda.is_available():
    # `gpus=1` was removed in Lightning 2.0; use accelerator/devices instead
    trainer = pl.Trainer(max_epochs=10, accelerator="gpu", devices=1)
else:
    trainer = pl.Trainer(max_epochs=10)

trainer.fit(model=mnistclassifier, datamodule=mnist_dm)

Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs

  | Name      | Type               | Params | Mode 
---------------------------------------------------------
0 | train_acc | MulticlassAccuracy | 0      | train
1 | valid_acc | MulticlassAccuracy | 0      | train
2 | test_acc  | MulticlassAccuracy | 0      | train
3 | model     | Sequential         | 25.8 K | train
---------------------------------------------------------
25.8 K    Trainable params
0         Non-trainable params
25.8 K    Total params
0.103     Total estimated model params size (MB)
11        Modules in train mode
0         Modules in eval mode


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:420: Consider setting `persistent_workers=True` in 'val_dataloader' to speed up the dataloader worker initialization.


                                                                           

/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:420: Consider setting `persistent_workers=True` in 'train_dataloader' to speed up the dataloader worker initialization.


Epoch 9: 100%|██████████| 860/860 [00:08<00:00, 100.46it/s, v_num=3, train_loss=1.590, valid_loss=1.520, valid_acc=0.943]

`Trainer.fit` stopped: `max_epochs=10` reached.



## Evaluating the model with TensorBoard

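From the command line, TensorBoard is launched with:
tensorboard --logdir lightning_logs/
In this environment (Python 3.13), TensorBoard failed to start: the installed version imports `imghdr`, a standard-library module that was removed in Python 3.13.  
Conveniently, Lightning can also load an already trained model from a checkpoint and continue training it for a few more epochs.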

In [13]:
%load_ext tensorboard
%tensorboard --logdir lightning_logs

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
  import pkg_resources
Traceback (most recent call last):
  File "/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/bin/tensorboard", line 5, in <module>
    from tensorboard.main import run_main
  File "/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/tensorboard/main.py", line 27, in <module>
    from tensorboard import default
  File "/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/tensorboard/default.py", line 40, in <module>
    from tensorboard.plugins.image import images_plugin
  File "/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/tensorboard/plugins/image/images_plugin.py", line 18, in <module>
    import imghdr
ModuleNotFoundError: No module named 'imghdr'

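The traceback ends in `No module named 'imghdr'`. A quick way to confirm whether the current interpreter still ships that module is sketched below; it uses only the standard library. The suggested remedies in the printed message are assumptions, not something verified in this notebook.

```python
import importlib.util
import sys

# imghdr was removed from the standard library in Python 3.13 (PEP 594).
has_imghdr = importlib.util.find_spec("imghdr") is not None

print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
print(f"imghdr importable: {has_imghdr}")

if not has_imghdr:
    # Assumption: a different TensorBoard version or an older Python may help.
    print("A TensorBoard build that imports imghdr cannot start here.")
```
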
In [14]:
path = './lightning_logs/version_1/checkpoints/epoch=9-step=8600.ckpt'
if torch.cuda.is_available():
    # `gpus=1` was removed in Lightning 2.0; use accelerator/devices instead
    trainer = pl.Trainer(max_epochs=15, accelerator="gpu", devices=1)
else:
    trainer = pl.Trainer(max_epochs=15)
trainer.fit(model=mnistclassifier, datamodule=mnist_dm, ckpt_path=path)

Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Restoring states from the checkpoint path at ./lightning_logs/version_1/checkpoints/epoch=9-step=8600.ckpt
/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/StudyML/lib/python3.13/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:362: The dirpath has changed from '/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/Python機械学習プログラミング/第13章_PyTorchのメカニズム/lightning_logs/version_1/checkpoints' to '/Users/nagairyousuke/名称未設定フォルダ/StudyMLList/Python機械学習プログラミング/第13章_PyTorchのメカニズム/lightning_logs/version_4/checkpoints', therefore `best_model_score`, `kth_best_model_path`, `kth_value`, `last_model_path` and `best_k_models` won't be reloaded. Only `best_model_path` will be reloaded.

Epoch 14: 100%|██████████| 860/860 [00:08<00:00, 104.79it/s, v_num=4, train_loss=1.590, valid_loss=1.520, valid_acc=0.944]

`Trainer.fit` stopped: `max_epochs=15` reached.
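
The checkpoint name `epoch=9-step=8600.ckpt` and the `860/860` progress bar follow from the split size and batch size used in `MnistDataModule`. The arithmetic can be checked with a short sketch, which also shows why `max_epochs=15` adds only five epochs after resuming (the progress bar runs up to `Epoch 14`).

```python
import math

train_samples = 55_000   # size of the training split in MnistDataModule
batch_size = 64          # batch size of train_dataloader
first_run_epochs = 10    # max_epochs of the first Trainer
resumed_max_epochs = 15  # max_epochs of the Trainer that resumes

# The last batch is partial (drop_last keeps its default of False).
steps_per_epoch = math.ceil(train_samples / batch_size)
steps_at_checkpoint = steps_per_epoch * first_run_epochs

# max_epochs is a total, so resuming adds only the difference.
extra_epochs = resumed_max_epochs - first_run_epochs

print(steps_per_epoch, steps_at_checkpoint, extra_epochs)  # 860 8600 5
```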



In [15]:
trainer.test(model=mnistclassifier, datamodule=mnist_dm)


Testing DataLoader 0: 100%|██████████| 157/157 [00:00<00:00, 239.17it/s]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test_acc            0.9402898550033569
        test_loss            1.514886736869812
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 1.514886736869812, 'test_acc': 0.9402898550033569}]
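
The reported `test_acc` may not be the plain fraction of correct test predictions. `test_step` logs `self.test_acc.compute()` on every batch, which is a running accuracy, and the sketch below assumes that Lightning then averages these per-step values weighted by batch size. The batch counts are made up for illustration.

```python
# Hypothetical per-batch results: (correct predictions, batch size).
batches = [(50, 64), (60, 64), (62, 64), (15, 16)]

correct_so_far = 0
seen_so_far = 0
weighted_sum = 0.0

for correct, size in batches:
    correct_so_far += correct
    seen_so_far += size
    # Running accuracy over all batches seen so far.
    running_acc = correct_so_far / seen_so_far
    # Accumulate the batch-size-weighted sum of the per-step values.
    weighted_sum += running_acc * size

mean_of_running = weighted_sum / seen_so_far
overall_acc = correct_so_far / seen_so_far

print(f"mean of running accuracies: {mean_of_running:.4f}")
print(f"overall accuracy:           {overall_acc:.4f}")
```

The two numbers differ because early batches are counted again in every later running value. Logging the metric once per epoch, as `on_validation_epoch_end` does for `valid_acc`, gives the overall accuracy.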

In [16]:
# Reuse the model: restore the trained weights from the checkpoint
model = MultiLayerPerceptron.load_from_checkpoint(path)
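
Once a model has been restored, a typical next step is prediction. The sketch below shows the usual inference pattern in plain PyTorch. Because the checkpoint file is not available here, it uses a freshly initialized stand-in network and random tensors in place of MNIST images; both are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the restored MultiLayerPerceptron.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)

# Random tensors in place of four MNIST images of shape (1, 28, 28).
images = torch.rand(4, 1, 28, 28)

model.eval()           # put the model in inference mode
with torch.no_grad():  # skip gradient tracking during prediction
    scores = model(images)
    preds = torch.argmax(scores, dim=1)

print(scores.shape, preds.shape)
```

The same `eval()` and `torch.no_grad()` pattern applies to a `LightningModule`, since it is also an `nn.Module`.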