# PyTorch Tutorial

Video source: https://www.youtube.com/playlist?list=PLqnslRFeH2UrcDBWF5mfPGpqQDSta6VK4  
Code source: https://github.com/patrickloeber/pytorch-examples

## Tensor Basics

```python
import numpy as np
import torch

torch.ones(2, 2, dtype=torch.float16)
torch.rand(2, 2)
torch.zeros(2, 2)

x = torch.rand(2, 2)
y = torch.rand(2, 2)
# element-wise ops: shapes must match or be broadcastable
torch.mul(x, y)
torch.div(x, y)
torch.add(x, y)
torch.sub(x, y)
# matrix product: x.shape == (m, k) and y.shape == (k, n)
torch.matmul(x, y)

a = np.array([1.0, 2.0, 3.0])
torch.from_numpy(a) # the tensor shares memory with `a`
```
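As a small sketch of two behaviours that are easy to miss in the block above: element-wise operations broadcast compatible shapes, and `torch.from_numpy` shares memory with the source array. The variable names here are illustrative only.

```python
import numpy as np
import torch

# Element-wise ops broadcast: (3, 1) combined with (1, 4) gives (3, 4).
col = torch.ones(3, 1)
row = torch.arange(4, dtype=torch.float32).reshape(1, 4)
summed = torch.add(col, row)
print(summed.shape)

# torch.from_numpy does not copy: the tensor and the array share memory.
a = np.zeros(3, dtype=np.float32)
t = torch.from_numpy(a)
a += 1  # in-place change to the array is visible through the tensor
print(t)
```

If an independent copy is needed, `torch.tensor(a)` copies the data instead.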

In [None]:
import torch
import numpy as np 

x = torch.rand(4, 4)
print(x)

y = x.view(16)
z = x.view(-1, 8)
print(y)
print(z, z.size())

Check the device

```python
if torch.cuda.is_available():
    device = torch.device("cuda")
    x = torch.ones(5, device=device)
    y = torch.ones(5)
    y = y.to(device)
```

In [None]:
torch.cuda.is_available()

if torch.cuda.is_available():
    device = torch.device("cuda")
    x = torch.ones(5, device=device)
    y = torch.ones(5)
    y = y.to(device) # to gpu
    z = x + y
    z = z.to("cpu") 

## Gradient calculation with autograd

```python
x = torch.randn(3, requires_grad=True)
y = x.mean()
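# y is a scalar, so backward() yields the gradient dy/dx (a vector-Jacobian product)
y.backward() # dy / dx
print(x.grad)

# Three ways to stop tracking gradients
x.requires_grad_(False) # form 1: in place
z = x.detach() # form 2: returns a new tensor that shares data with x
with torch.no_grad(): # form 3: nothing inside the block is tracked
    y = x + 2
    print(y)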
```
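The following sketch checks what `detach()` and `torch.no_grad()` actually do to the `requires_grad` flag. It is a minimal illustration with made-up variable names, not part of the original tutorial code.

```python
import torch

x = torch.randn(3, requires_grad=True)

tracked = x + 2                 # recorded in the autograd graph
detached = x.detach()           # new tensor, same storage, no graph
with torch.no_grad():
    untracked = x + 2           # computed without recording a graph

print(tracked.requires_grad, detached.requires_grad, untracked.requires_grad)
print(detached.data_ptr() == x.data_ptr())  # same underlying memory
```

Note that `detach()` leaves `x` itself unchanged, whereas `x.requires_grad_(False)` modifies `x` in place.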

**Backpropagation Theory**

chain rule
$$\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$$
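
As a worked check of the chain rule, the sketch below picks an arbitrary example, $y = 3x + 1$ and $z = y^2$ at $x = 2$, and compares the autograd result with the hand-computed product $\frac{dz}{dy} \cdot \frac{dy}{dx} = 2y \cdot 3$.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = 3 * x + 1      # dy/dx = 3, and y = 7 at x = 2
z = y ** 2         # dz/dy = 2y = 14
z.backward()       # autograd applies the chain rule

manual = (2 * y.item()) * 3.0   # dz/dy * dy/dx
print(x.grad.item(), manual)
```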

## Training Pipeline

+ 1) Design our model (input, output size, forward pass)
+ 2) Construct loss and optimizer
+ 3) Training loop:   
  Iterate over data, calculate loss, perform backward pass, update weights
  + forward pass: compute prediction
  + backward pass: compute gradients
  + update weights

**Key coding**

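+ ```model.forward()```: forward pass that computes the predictions, from which the loss is computed (normally invoked as ```model(x)```)
+ ```loss.backward()```: backpropagation, computes the current gradients
+ ```optimizer.step()```: updates the network parameters from the gradients
+ ```optimizer.zero_grad()```: clears previously accumulated gradients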
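
The reason gradients have to be cleared is that `backward()` adds to `.grad` instead of overwriting it. A minimal sketch with a single scalar weight (the names are illustrative):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(2 * w).backward()
first = w.grad.item()          # gradient of 2*w with respect to w

(2 * w).backward()             # no zeroing in between
accumulated = w.grad.item()    # the second gradient was added to the first

w.grad.zero_()                 # what optimizer.zero_grad() does per parameter
(2 * w).backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)
```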

In [None]:
import torch 

X = torch.randn(100, 4)
y = torch.randn(100, 1)
# only the weights need gradients; shape (4, 1) maps 4 features to 1 target
w = torch.randn(4, 1, dtype=torch.float32, requires_grad=True)

def forward(X, w):
    return torch.matmul(X, w)

def loss_func(y, y_pred):
    return torch.mean((y_pred - y) ** 2)

learning_rate = 0.01
n_iters = 100

for epoch in range(n_iters):
    y_pred = forward(X, w)
    l = loss_func(y, y_pred)
    l.backward() # dl / dw

    with torch.no_grad():
        w -= learning_rate * w.grad
    w.grad.zero_() # zero gradients
    if epoch % 10 == 0:
        print(f"Epoch: {epoch}, Loss: {l.item()}")


In [None]:
import torch 
import torch.nn as nn 

X = torch.randn(100, 4, dtype=torch.float32)
y = torch.randn(100, 1, dtype=torch.float32)

n_samples, n_features = X.shape
# input size = number of features, output size = number of targets
input_size = n_features
output_size = y.shape[1]
# linear regression model
model = nn.Linear(input_size, output_size) 

lr = 0.01
n_iters = 100

criterion = nn.MSELoss() # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

for epoch in range(n_iters):
    y_pred = model(X)
    loss = criterion(y_pred, y)

    loss.backward() # compute gradients
    optimizer.step() # update weights   
    optimizer.zero_grad() # zero gradients

    if epoch % 10 == 0:
        print(f"Epoch = {epoch}, Loss = {loss.item():.5f}")


## Dataset and Dataloader

Reference: https://pytorch.org/tutorials/beginner/basics/data_tutorial.html  

+ Dataset: stores the samples and their labels.
  + A custom Dataset class must implement three functions: ```__init__```, ```__len__```, and ```__getitem__```.

+ DataLoader: wraps a Dataset, preprocesses the data, and produces batches.

  + The ```Dataset``` retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in “minibatches”, reshuffle the data at every epoch to reduce model overfitting, and use Python’s ```multiprocessing``` to speed up data retrieval.

  + ```DataLoader``` is an iterable that abstracts this complexity for us in an easy API.

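`DataLoader` parameters

1. `dataset` (required): the dataset to load from, usually an instance of a `torch.utils.data.Dataset` subclass.
1. `batch_size` (optional): number of samples per batch. Defaults to 1.
1. `shuffle` (optional): whether to reshuffle the data at the start of every epoch. Defaults to False.
1. `sampler` (optional): defines the strategy for drawing samples from the dataset. If specified, `shuffle` must not be set.
1. `batch_sampler` (optional): like `sampler`, but returns one batch of indices at a time. Mutually exclusive with `batch_size`, `shuffle`, `sampler`, and `drop_last`.
1. `num_workers` (optional): number of subprocesses used for data loading. Defaults to 0, meaning the data is loaded in the main process.
1. `collate_fn` (optional): a function that merges a list of samples into one mini-batch. Usually does not need to be specified.
1. `drop_last` (optional): whether to drop the last incomplete batch when the dataset size is not divisible by the batch size. Defaults to False.
1. `pin_memory` (optional): if True, the data loader uses pinned (page-locked) memory to speed up transfers to the GPU. Defaults to False.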
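
To make the three required methods and the effect of `batch_size` and `drop_last` concrete, here is a minimal sketch. `ToyDataset` is a hypothetical in-memory dataset invented for this example; it is not part of the tutorial code.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Hypothetical in-memory dataset: 10 samples with 3 features each."""
    def __init__(self):
        self.X = torch.arange(30, dtype=torch.float32).reshape(10, 3)
        self.y = torch.arange(10).reshape(10, 1)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        return self.X[index], self.y[index]

dataset = ToyDataset()

# 10 samples with batch_size=4 gives batches of 4, 4 and 2.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_shapes = [tuple(features.shape) for features, _ in loader]
print(batch_shapes)

# drop_last=True discards the final incomplete batch.
loader_drop = DataLoader(dataset, batch_size=4, drop_last=True)
print(len(loader), len(loader_drop))
```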



## Dataset Transforms

Reference: https://pytorch.org/tutorials/beginner/data_loading_tutorial.html#transforms  

+ ```torchvision.transforms```: image preprocessing
    

In [None]:
import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np
import math

class WineDataset(Dataset):
    def __init__(self, transform=None):
        # transform: optional callable applied to each (features, label) sample
        # data loading (skip the first row header)
        xy = np.loadtxt('./asset/wine/wine.csv', delimiter=",", dtype=np.float32, skiprows=1)
        self.X = torch.from_numpy(xy[:, 1:])
        self.y = torch.from_numpy(xy[:, [0]]) # n_samples, 1
        self.n_samples = xy.shape[0]
        self.transform = transform


    def __getitem__(self, index):
        sample = self.X[index], self.y[index]

        if self.transform:
            sample = self.transform(sample)
        return sample

    def __len__(self):
        return self.n_samples

dataset = WineDataset()
# first_data = dataset[0]
# features, labels = first_data
# print(features, labels)

dataloader = DataLoader(dataset=dataset, batch_size=4, shuffle=True)
# features, labels = next(iter(dataloader))
# print(features, labels)

# training loop 
num_epochs = 2
total_samples = len(dataset)
n_iters = math.ceil(total_samples / 4) # batches per epoch, rounded up
print(total_samples, n_iters)

'''
for epoch in range(num_epochs):
    for i, (inputs, _) in enumerate(dataloader):
        # forward pass, backward pass, update weights
        if (i + 1) % 5 == 0:
            print(f'epoch {epoch+1} / {num_epochs}, step {i+1} / {n_iters}, inputs {inputs.shape}')
'''
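
The `transform` argument of `WineDataset` accepts any callable that takes a `(features, label)` sample and returns a new one. The sketch below shows one possible shape for such callables. `MulTransform` and the small `Compose` class are hypothetical helpers written for this example; `Compose` only mimics the chaining idea of `torchvision.transforms.Compose`.

```python
import torch

class MulTransform:
    """Hypothetical transform: scale the features, leave the label alone."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, sample):
        features, label = sample
        return features * self.factor, label

class Compose:
    """Tiny stand-in that applies a list of transforms in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, sample):
        for t in self.transforms:
            sample = t(sample)
        return sample

transform = Compose([MulTransform(2), MulTransform(5)])
features, label = transform((torch.ones(3), torch.tensor([1.0])))
print(features, label)
```

An instance like `transform` could then be passed to the dataset constructor as `transform=transform`.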

## Feed-Forward Neural Networks


In [None]:
import torch
import torch.nn as nn

# fully connected neural network with one hidden layer
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)  
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        # no activation and no softmax at the end
        return out

In [None]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, Dataset
from torch.utils.tensorboard import SummaryWriter
import sys
writer = SummaryWriter("runs/mnist2")

# device config
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# hyper parameters 
input_size = 784 # 28x28
hidden_size = 100
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, \
                                           transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, \
                                          transform=transforms.ToTensor())
train_loader = DataLoader(dataset=train_dataset, batch_size= batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size= batch_size, shuffle=False)

examples = iter(train_loader)
samples, labels = next(examples)
print(samples.shape, labels.shape)

for i in range(6):
    plt.subplot(2, 3, i+1)
    plt.imshow(samples[i][0], cmap='gray')
    plt.title(labels[i].item())
plt.show()

img_grid = torchvision.utils.make_grid(samples)
writer.add_image('mnist_images', img_grid)
# writer.close()
# sys.exit()

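model = NeuralNetwork(input_size, hidden_size, num_classes).to(device)
# loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# the example input must live on the same device as the model
writer.add_graph(model, samples.reshape(-1, 28*28).to(device))
writer.flush() # keep the writer open; scalars are logged during training
# sys.exit()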
# training loop
n_total_steps = len(train_loader)

running_loss = 0.0 # for tensorboard scalar
running_correct = 0 # for tensorboard scalar
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # reshape images to (batch_size, input_size)
        # (100, 1, 28, 28) -> (100, 28*28)
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)
        
        # forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # backward pass and update weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, predicted = torch.max(outputs.data, 1)
        running_correct += (predicted == labels).sum().item()

        if (i+1) % 100 == 0:
            print(f"epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.4f}")
            writer.add_scalar('training loss', running_loss / 100, epoch * n_total_steps + i)
            # running_correct counts samples over 100 batches, so normalize by 100 * batch size
            writer.add_scalar('accuracy', running_correct / (100 * predicted.size(0)), epoch * n_total_steps + i)
            running_loss = 0.0
            running_correct = 0

# test
preds = [] 
labels = []
with torch.no_grad():
    n_correct = 0
    n_samples = 0
    for images, label in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        label = label.to(device)
        
        outputs = model(images)
        # max returns (value, index)
        _, predicted = torch.max(outputs.data, 1)
        n_samples += label.shape[0]
        n_correct += (predicted == label).sum().item()

        # classification results for tensorboard
        # per-class probabilities for the TensorBoard PR curves
        class_predictions = [nn.functional.softmax(output, dim=0) for output in outputs]
        preds.append(class_predictions)
        labels.append(label) # ground-truth labels, not the predictions
    
    preds = torch.cat([torch.stack(batch) for batch in preds])
    labels = torch.cat(labels)
    acc = 100.0 * n_correct / n_samples
    print(f'Accuracy on the testing images= {acc}%')

    classes = range(10)
    for i in classes:
        labels_i = labels == i
        preds_i = preds[:, i]
        writer.add_pr_curve(str(i), labels_i, preds_i, global_step=0)
    writer.close() # close once, after all curves are written


## Convolutional Neural Networks

+ The CIFAR-10 dataset 
+ Convolutional Layer
  + input size: $(n_h \times n_w)$, convolution kernel: $(k_h \times k_w)$
  + padding: $(p_h, p_w)$, the total number of rows and columns added (both sides combined); stride: $(s_h, s_w)$
  + output size with stride 1: $$(n_h - k_h + p_h + 1)\times (n_w - k_w + p_w + 1)$$
  + output size with stride $(s_h, s_w)$: $$\lfloor(n_h - k_h + p_h + s_h)/s_h \rfloor \times \lfloor (n_w - k_w + p_w + s_w)/s_w \rfloor$$
  + per dimension this is the same as $$ \lfloor (N - K + P) / S \rfloor + 1$$
  + if $p_h = k_h -1, p_w = k_w - 1$ and the input size is divisible by the stride, the output is $(n_h/s_h)\times (n_w/s_w)$
+ Max Pooling 
  + (2 x 2) max pooling, output size: $(n_h/2)\times (n_w/2)$
+ PyTorch code
  + ```torch.nn.Conv2d```: convolutional layer
  + ```torch.nn.MaxPool2d```: max pooling layer
  + ```torch.nn.Flatten```: flatten layer
  + ```torch.nn.Linear```: fully connected layer
  + ```torch.nn.Sequential```: sequential container
+ utils 
  + ```out = torchvision.utils.make_grid(images)```: arrange a batch of images into a single grid image
  + ```imshow(out, title=[class_names[x] for x in classes])```: display the image
  + ```torchvision.transforms.ToPILImage()```: convert a tensor or ndarray to a PIL image
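
The output-size formula can be checked with a few lines of plain Python. The helper `conv_output_size` is a hypothetical name introduced here; it treats `p` as the total padding along one dimension, and is applied to a 32x32 input passing through two 5x5 convolutions, each followed by 2x2 max pooling with stride 2.

```python
def conv_output_size(n, k, p=0, s=1):
    """Output length along one dimension: floor((n - k + p) / s) + 1.

    n: input size, k: kernel size, p: total padding, s: stride.
    """
    return (n - k + p) // s + 1

size = 32
sizes = [size]
for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:  # conv, pool, conv, pool
    size = conv_output_size(size, kernel, p=0, s=stride)
    sizes.append(size)

print(sizes)
```

The final spatial size of 5 with 16 channels is where an input dimension of `16*5*5` for the first fully connected layer comes from.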
    

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class ConvNetwork(nn.Module):
    # original shape of images [4, 3, 32, 32]
    # input_layer: 3 input channels, 6 output channels, 5 kernel size   
    def __init__(self):
        super(ConvNetwork, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5) # [4, 3, 32, 32] -> [4, 6, 28, 28]
        self.pool = nn.MaxPool2d(2, 2) # [4, 6, 28, 28] -> [4, 6, 14, 14]
        self.conv2 = nn.Conv2d(6, 16, 5) # [4, 6, 14, 14] -> [4, 16, 10, 10]
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        # flatten the output of conv2 to (batch_size, 16*5*5)
        x = x.view(-1, 16*5*5) 
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
    

In [None]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader, Dataset

# device config
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# hyper parameters 
input_size = 1024 # 32x32
hidden_size = 100
num_classes = 10
num_epochs = 4
batch_size = 4
learning_rate = 0.001

# CIFAR10 dataset
transform = transforms.Compose(
    [transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)
train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, \
                                           transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, \
                                          transform=transform, download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size= batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size= batch_size, shuffle=False)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 
           'dog', 'frog', 'horse', 'ship', 'truck')

examples = iter(train_loader)
samples, labels = next(examples)
print(samples.shape, labels.shape)

for i in range(4):
    plt.subplot(1, 4, i+1)
    # undo Normalize (x * 0.5 + 0.5) and move channels last for imshow
    plt.imshow((samples[i] * 0.5 + 0.5).permute(1, 2, 0))
    plt.title(classes[labels[i].item()])
plt.show()


model = ConvNetwork().to(device)
# loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# training loop
n_total_steps = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # original shape of images [4, 3, 32, 32]
        # input_layer: 3 input channels, 6 output channels, 5 kernel size
        images = images.to(device)
        labels = labels.to(device)
        
        # forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # backward pass and update weights
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print(f"epoch {epoch+1} / {num_epochs}, step {i+1}/{n_total_steps}, loss = {loss.item():.4f}")

print('Finished Training')
# test 
with torch.no_grad():
    n_correct = 0
    n_samples = 0
    n_class_correct = [0 for i in range(num_classes)]
    n_class_samples = [0 for i in range(num_classes)]

    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        
        outputs = model(images)
        # value, index
        _, predicted = torch.max(outputs.data, 1)
        n_samples += labels.shape[0]
        n_correct += (predicted == labels).sum().item()

        for i in range(labels.size(0)): # the last batch may be smaller than batch_size
            label = labels[i].item()
            pred = predicted[i].item()
            if (label == pred):
                n_class_correct[label] += 1
            n_class_samples[label] += 1

    acc = 100.0 * n_correct / n_samples
    print(f'Accuracy of Convolutional Network = {acc}%')

    for i in range(10):
        acc = 100.0 * n_class_correct[i] / n_class_samples[i]
        print(f'Accuracy of {classes[i]} = {acc}%')


## Transfer Learning

+ ```torch.utils.tensorboard```: log the training process
    

```python
from torch.utils.tensorboard import SummaryWriter
import sys

writer = SummaryWriter("runs/mnist2")
```


In [None]:
# transfer.py
import torch 
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
from torchvision import datasets, models, transforms
import numpy as np
import matplotlib.pyplot as plt
import time
import os
import copy

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225]) 

... 

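def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    # training loop omitted in these notes; it should return the trained model
    pass

####
# `pretrained=True` is deprecated in torchvision >= 0.13 in favor of `weights=`
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)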
for param in model.parameters():
    param.requires_grad = False 
    
num_ftrs = model.fc.in_features  # num_features
model.fc = nn.Linear(num_ftrs, 2) 
model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc.parameters(), lr=0.001) # only the new head is trainable

# scheduler
# Decay LR by a factor of 0.1 every 7 epochs
step_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
model = train_model(model, criterion, optimizer, step_lr_scheduler, num_epochs=25)
# for epoch in range(num_epochs):
#     train(...) 
#     validate(...) 
#     scheduler.step()
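
To see what "decay LR by a factor of 0.1 every 7 epochs" means in practice, the sketch below records the learning rate over 15 epochs. It uses a single dummy parameter instead of a real model, which is an assumption made purely to keep the example self-contained.

```python
import torch
from torch.optim import lr_scheduler

# A single dummy parameter stands in for a real model here.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

lrs = []
for epoch in range(15):
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()    # in real training this follows loss.backward()
    scheduler.step()    # called once per epoch, after the optimizer step

print(lrs)
```

Epochs 0 to 6 run at the initial rate, epochs 7 to 13 at one tenth of it, and epoch 14 at one hundredth.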


## Transformer


+ ```torch.nn.Transformer```: implements the Transformer model
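
A minimal usage sketch of `torch.nn.Transformer` follows. The sizes are arbitrary small values chosen for illustration, and the inputs are random tensors standing in for already-embedded sequences; with the default `batch_first=False`, tensors are laid out as (sequence length, batch, `d_model`).

```python
import torch
import torch.nn as nn

model = nn.Transformer(d_model=32, nhead=4, num_encoder_layers=1,
                       num_decoder_layers=1, dim_feedforward=64)
model.eval()  # disable dropout for a deterministic forward pass

src = torch.rand(10, 2, 32)  # (source length, batch, d_model)
tgt = torch.rand(7, 2, 32)   # (target length, batch, d_model)

with torch.no_grad():
    out = model(src, tgt)

print(out.shape)  # same shape as tgt
```

A real sequence model would add token embeddings, positional encodings, and a causal mask on the target side, none of which are shown here.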

## TensorBoard Usage

+ ```torch.utils.tensorboard```: log the training process  
  Reference: https://pytorch.org/tutorials/beginner/introyt/tensorboardyt_tutorial.html
  
```python

from torch.utils.tensorboard import SummaryWriter
import sys

writer = SummaryWriter("runs/mnist2")

examples = iter(train_loader)
samples, labels = next(examples)
print(samples.shape, labels.shape)


img_grid = torchvision.utils.make_grid(samples)
writer.add_image('mnist_images', img_grid)

writer.add_graph(model, samples.reshape(-1, 28*28))
# writer.close()
# training loop
running_loss = 0.0
running_correct = 0

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):

        running_loss += loss.item()
        _, predicted = torch.max(outputs.data, 1)
        running_correct += (predicted == labels).sum().item()

        if (i+1) % 100 == 0:
            writer.add_scalar('training loss', running_loss / 100, epoch * n_total_steps + i)
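            # running_correct counts samples over 100 batches, so normalize by 100 * batch size
            writer.add_scalar('accuracy', running_correct / (100 * predicted.size(0)), epoch * n_total_steps + i)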
            running_loss = 0.0
            running_correct = 0
# test
preds = []
labels = []
with torch.no_grad():
    for images, label in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        label = label.to(device)
        
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        ...
        # classification results for tensorboard
        class_predictions = [nn.functional.softmax(output, dim=0) for output in outputs]
        preds.append(class_predictions)
        labels.append(label) # ground-truth labels, not the predictions
    
    preds = torch.cat([torch.stack(batch) for batch in preds])
    labels = torch.cat(labels)
    acc = 100.0 * n_correct / n_samples
    print(f'Accuracy on the testing images= {acc}%')

    classes = range(10)
    for i in classes:
        labels_i = labels == i
        preds_i = preds[:, i]
        writer.add_pr_curve(str(i), labels_i, preds_i, global_step=0)
    writer.close() # close once, after all curves are written
```


## Saving and Loading Models

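Reference: https://pytorch.org/tutorials/beginner/basics/saveloadrun_tutorial.html

+ ```torch.save```: save model parameters
  + complete model: saves the entire model object, including its structure and parameters. When loading, the same class definition that defined the original model's structure must be available.
  + state dict: saves only the model's state dictionary, i.e. its parameters. The file is relatively small because it holds the weights but not the structure. To load it, first build the model structure from code, then load the parameters into it.
+ ```torch.load```: load model parameters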

```python
# Example
PATH = 'mymodel.pth'
#### COMPLETE MODEL ####
torch.save(model, PATH)

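# the model class must be defined (importable) when loading
# recent PyTorch releases (2.6 and later) default to weights_only=True,
# so loading a pickled full model needs weights_only=False
model = torch.load(PATH, weights_only=False)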
model.eval()

##### STATE DICT #####
torch.save(model.state_dict(), PATH)

# model must be created again with parameters
model = MyModel(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()

# Inspect the parameters
for param in model.parameters():
    print(param)

print(model.state_dict())
```

+ A pipeline for using **checkpoint** to save and load model

```python
# train your model
learning_rate = 0.01 
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
print(optimizer.state_dict())

checkpoint = {
    'epoch': current_epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    # 'loss': loss,
    # 'accuracy': accuracy
}
torch.save(checkpoint, 'checkpoint.pth')

loaded_checkpoint = torch.load('checkpoint.pth')
epoch = loaded_checkpoint['epoch']

model = MyModel(*args, **kwargs)
# lr=0 is a placeholder; load_state_dict below restores the saved learning rate
optimizer = torch.optim.SGD(model.parameters(), lr=0)

model.load_state_dict(loaded_checkpoint['model_state_dict'])
optimizer.load_state_dict(loaded_checkpoint['optimizer_state_dict'])

print(optimizer.state_dict())
```

+ **Saving and loading model on CPU or GPU**

```python
# Save on GPU, load on CPU
device = torch.device("cuda")
model.to(device)
torch.save(model.state_dict(), PATH)

target_device = torch.device("cpu")
model = MyModel(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location=target_device))

# Save on GPU, load on GPU
device = torch.device("cuda")
model.to(device)
torch.save(model.state_dict(), PATH)

model = MyModel(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.to(device)

# Save on CPU, load on GPU
torch.save(model.state_dict(), PATH)

device = torch.device("cuda") # specify the CUDA device
model = MyModel(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location="cuda:0")) # choose which CUDA device to load onto
model.to(device)
```
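
The state-dict workflow can be exercised end to end on the CPU with a throwaway model. In this sketch, `nn.Linear(4, 2)` stands in for `MyModel` and the file goes into a temporary directory; both are assumptions made so the example runs anywhere.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for MyModel(*args, **kwargs)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mymodel.pth")
    torch.save(model.state_dict(), path)

    # Re-create the architecture first, then load the saved weights into it.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

restored.eval()

# Every tensor in the restored state dict should equal the original.
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```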



## PyTorch Lightning Tutorial

Lightning Source: [github-PytorchLightning](https://github.com/Lightning-AI/pytorch-lightning);  
Reference: [pytorch-lightning入门到精通](https://github.com/3017218062/Pytorch-Lightning-Learning)  

Simple installation from PyPI or Conda

 - ```pip install pytorch-lightning``` 
 - ```conda install pytorch-lightning -c conda-forge``` 

View the logs in `tensorboard`:

 - ```logger = TensorBoardLogger('tb_logs', name='my_model')```
 - ```tensorboard --logdir ./tb_logs```

```python
# Lightning features: it takes care of the calls below for you
model.train()
model.eval()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# -> easy GPU/TPU support
# -> scaling across multiple GPUs

# Bonus: - TensorBoard support
#        - prints tips/hints
```

In [31]:
# lightning.py  
import torch 
import torch.nn as nn 
import torchvision 
import torchvision.transforms as transforms 
import matplotlib.pyplot as plt 

import pytorch_lightning as pl
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader

# Hyper-parameters 
input_size = 784  # 28x28
hidden_size = 500 
num_classes = 10 
num_epochs = 5 
batch_size = 100 
learning_rate = 0.001


class LitNeuralNet(pl.LightningModule):
    def __init__(self, input_size, hidden_size, num_classes):
        super(LitNeuralNet, self).__init__()
        self.validation_step_outputs = []
        self.input_size = input_size 
        self.l1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes) 

    def forward(self, x):
        out = self.relu(self.l1(x))
        out = self.l2(out)
        # no activation and no softmax at the end 
        return out 
    
    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=learning_rate) 
   
    def training_step(self, batch, batch_idx):
        images, labels = batch 
        images = images.reshape(-1, 28*28)

        # forward pass 
        outputs = self(images) 
        loss = F.cross_entropy(outputs, labels)
        # log through self.log; the 'log' key in the returned dict is not used by recent Lightning versions
        self.log('train_loss', loss)
        return loss
    
    def train_dataloader(self):
        train_dataset = torchvision.datasets.MNIST(root='./data/', 
                         train=True, transform=transforms.ToTensor(), download=True)
        train_loader = DataLoader(train_dataset, batch_size=batch_size, num_workers=4, shuffle=True) 
        return train_loader
    
    def validation_step(self, batch, batch_idx):
        images, labels = batch 
        images = images.reshape(-1, 28*28)

        # forward pass 
        outputs = self(images) 
        loss = F.cross_entropy(outputs, labels)
        self.validation_step_outputs.append(loss)
        self.log('val_loss', loss)
        return loss
    
    def val_dataloader(self):
        val_dataset = torchvision.datasets.MNIST(root='./data/', 
                         train=False, transform=transforms.ToTensor(), download=True)
        val_loader = DataLoader(val_dataset, batch_size=batch_size, num_workers=4, shuffle=False) 
        return val_loader    
    
    def on_validation_epoch_end(self):
        avg_loss = torch.stack(self.validation_step_outputs).mean()
        self.log('avg_val_loss', avg_loss) # the return value of this hook is ignored
        self.validation_step_outputs.clear()  # free memory
    

# if __name__ == '__main__':
trainer = pl.Trainer(max_epochs=num_epochs, fast_dev_run=False) 
model = LitNeuralNet(input_size, hidden_size, num_classes) 
trainer.fit(model) 

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: d:\Desktop\StudyNote\PythonNote\lightning_logs



  | Name | Type   | Params
--------------------------------
0 | l1   | Linear | 392 K 
1 | relu | ReLU   | 0     
2 | l2   | Linear | 5.0 K 
--------------------------------
397 K     Trainable params
0         Non-trainable params
397 K     Total params
1.590     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

d:\Code\Anaconda\envs\ml\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:436: Consider setting `persistent_workers=True` in 'val_dataloader' to speed up the dataloader worker initialization.


                                                                            

d:\Code\Anaconda\envs\ml\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:436: Consider setting `persistent_workers=True` in 'train_dataloader' to speed up the dataloader worker initialization.


Epoch 4: 100%|██████████| 600/600 [00:14<00:00, 40.85it/s, v_num=0]

`Trainer.fit` stopped: `max_epochs=5` reached.


Epoch 4: 100%|██████████| 600/600 [00:14<00:00, 40.80it/s, v_num=0]


In [27]:
# **A simple model by pytorch-lightning on github**
# main.py
# ! pip install torchvision
import torch, torch.nn as nn, torch.utils.data as data, torchvision as tv, torch.nn.functional as F
import pytorch_lightning as pl

# --------------------------------
# Step 1: Define a LightningModule
# --------------------------------
# A LightningModule (nn.Module subclass) defines a full *system*
# (ie: an LLM, diffusion model, autoencoder, or simple image classifier).


class LitAutoEncoder(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 3))
        self.decoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 28 * 28))

    def forward(self, x):
        # in lightning, forward defines the prediction/inference actions
        embedding = self.encoder(x)
        return embedding

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop. It is independent of forward
        x, _ = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = F.mse_loss(x_hat, x)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer


# -------------------
# Step 2: Define data
# -------------------
dataset = tv.datasets.MNIST("./data/", download=True, transform=tv.transforms.ToTensor())
train, val = data.random_split(dataset, [55000, 5000])

# -------------------
# Step 3: Train
# -------------------
autoencoder = LitAutoEncoder()
trainer = pl.Trainer(fast_dev_run=True)
trainer.fit(autoencoder, data.DataLoader(train), data.DataLoader(val))

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Running in `fast_dev_run` mode: will run the requested loop using 1 batch(es). Logging and checkpointing is suppressed.
d:\Code\Anaconda\envs\ml\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:72: You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.

  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 100 K 
1 | decoder | Sequential | 101 K 
---------------------------------------
202 K     Trainable params
0         Non-trainable params
202 K     Total params
0.810     Total estimated model params size (MB)
d:\Code\Anaconda\envs\ml\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argume

Epoch 0: 100%|██████████| 1/1 [00:00<00:00, 61.58it/s]

`Trainer.fit` stopped: `max_steps=1` reached.


Epoch 0: 100%|██████████| 1/1 [00:00<00:00, 54.83it/s]
