## Loss Functions

딥러닝의 핵심은 주어진 네트워크의 weight bias를 loss 가 작아지는 쪽으로 바꾸는 데 있다. 따라서 손실함수야 말로 학습을 어떻게 할 지 정의해주는 부분이라고 생각 할 수 있다. 

pytorch / pytorch lightning 에서는 기본적으로 회귀에서 사용되는 MSE, 분류에서 사용되는 Cross Entropy 뿐 아니라 다양한 손실함수를 정의하고 있으며, 이는 torch.nn.functional 에 사전 정의되어 내장되어 있다. 

> torch.nn.functional
* mse_loss      : element-wise mean squared error.
* cross_entropy : cross entropy loss between input(logit) and target(prob.).
* binary_cross_entropy : Binary Cross Entropy between the target and input probabilities.
* binary_cross_entropy_with_logits :  Binary Cross Entropy between target and input logits.
* kl_div : Kullback-Leibler divergence Loss
* l1_loss : mean element-wise absolute value difference.
* smooth_l1_loss : uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.
* nll_loss : negative log likelihood loss.
* poisson_nll_loss : Poisson negative log likelihood loss.
* gaussian_nll_loss : Gaussian negative log likelihood loss.


또한 Metric 들도 다양하게 사용할 수 있는데, Accuracy, AUCROC 등등 torchmetrics 를 통해 사용할 수 있다.
https://torchmetrics.readthedocs.io/en/stable/

대부분의 경우 내장됨 함수를 사용하게 되겠지만, 필요하다면 임의의 loss / metric 을 정의해서 쓸 수 도 있는데, 그 방법은 아래와 같다.

In [2]:
import numpy as np
import torch
from torch import nn
from torch.nn import functional as F
import torch.optim as optim

import pytorch_lightning as pl
from pytorch_lightning.accelerators import accelerator
from torchmetrics import functional as FM
from torchinfo import summary

from torchvision.datasets import MNIST
import torchvision.transforms as transforms
import torch.utils.data as data
from torch.utils.data import DataLoader
import pandas as pd
import matplotlib.pyplot as plt


In [3]:
class Onehot(object) :
    def __call__(self, sample):
        sample = sample
        target = np.eye(10)[sample] # 10x10 대각행열을 만들어서 그 중에 n번째 row 를 반환 0 --> (1,0,0,0,0....0)
        return torch.FloatTensor(target)
    

In [4]:
a = Onehot()
a(5)

tensor([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])

In [5]:
y_transform = transforms.Compose([Onehot()])         # target one-hot encoding 
x_transform = transforms.Compose([transforms.ToTensor()])  # image transform 

In [6]:
train_dataset = MNIST('', transform=x_transform, target_transform=y_transform, train=True)
test_dataset = MNIST('', transform=x_transform, target_transform=y_transform, train=False)

In [7]:
batch_size = 128
trainDatLoader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
valDataLoader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

In [8]:
class Model(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(28*28, 64)
        self.linear2 = nn.Linear(64, 32)
        self.linear3 = nn.Linear(32, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.flatten(x)
        x1 = self.linear1(x)
        x1 = self.relu(x1)
        x2 = self.linear2(x1)
        x2 = self.relu(x2)
        x3 = self.linear3(x2)
        return x3
               

In [9]:
def custom_loss_mse(pred, target):
    error = torch.mean(torch.square(pred-target))
    return error

In [10]:
def custom_mean_abs_error(y_pred, y_true):
    error = torch.abs(torch.mean(y_true - y_pred))
    return error 

In [24]:
class myModel(pl.LightningModule):

    def __init__(self):
        super().__init__()
        self.layers = Model()


    def forward(self, x):
        out = self.layers(x)
        out = torch.softmax(out, dim=-1) 
        return(out)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_pred = self(x)
        loss = custom_loss_mse(y_pred, y)
        error = custom_mean_abs_error(y_pred, y)
        metrics = {'loss' : loss, 'error' : error}
        self.log_dict(metrics)
        return loss


    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_pred = self(x)
        loss = custom_loss_mse(y_pred, y)
        error = custom_mean_abs_error(y_pred, y)
        metrics = {'val_loss':loss, 'val_error':error}
        self.log_dict(metrics)
        #return loss # validation 은 리턴 안해도 상관 없음

    def configure_optimizers(self):
        return torch.optim.Adam( self.parameters(), lr=0.001)


    

In [25]:
model = myModel()

In [26]:
summary(model, input_size=(8, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
myModel                                  [8, 10]                   --
├─Model: 1-1                             [8, 10]                   --
│    └─Flatten: 2-1                      [8, 784]                  --
│    └─Linear: 2-2                       [8, 64]                   50,240
│    └─ReLU: 2-3                         [8, 64]                   --
│    └─Linear: 2-4                       [8, 32]                   2,080
│    └─ReLU: 2-5                         [8, 32]                   --
│    └─Linear: 2-6                       [8, 10]                   330
Total params: 52,650
Trainable params: 52,650
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.42
Input size (MB): 0.03
Forward/backward pass size (MB): 0.01
Params size (MB): 0.21
Estimated Total Size (MB): 0.24

In [27]:
epoch = 3
name = 'custom_loss_model' 
logger = pl.loggers.CSVLogger("logs", name=name)

In [28]:
trainer = pl.Trainer(max_epochs= epoch, logger=logger, accelerator='auto')
trainer.fit(model, trainDatLoader, valDataLoader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type  | Params
---------------------------------
0 | layers | Model | 52.6 K
---------------------------------
52.6 K    Trainable params
0         Non-trainable params
52.6 K    Total params
0.211     Total estimated model params size (MB)


Sanity Checking: 0it [00:00, ?it/s]

  rank_zero_warn(
  rank_zero_warn(


Training: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

Validation: 0it [00:00, ?it/s]

`Trainer.fit` stopped: `max_epochs=3` reached.


In [29]:
version_num = logger.version
history = pd.read_csv(f'./logs/{name}/version_{version_num}/metrics.csv')
history

Unnamed: 0,loss,error,epoch,step,val_loss,val_error
0,0.04986,2.328306e-10,0,49,,
1,0.020648,8.847564e-10,0,99,,
2,0.014066,3.72529e-10,0,149,,
3,0.020123,9.313226e-11,0,199,,
4,0.01265,3.72529e-10,0,249,,
5,0.02146,7.683411e-10,0,299,,
6,0.018764,1.164153e-10,0,349,,
7,0.012486,3.72529e-10,0,399,,
8,0.011867,6.984919e-10,0,449,,
9,,,0,468,0.011942,4.249812e-10


In [30]:
history.groupby('epoch').last().drop('step', axis=1)

Unnamed: 0_level_0,loss,error,val_loss,val_error
epoch,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0,0.011867,6.984919e-10,0.011942,4.249812e-10
1,0.013232,2.328306e-10,0.01053,5.265187e-10
2,0.008949,6.053597e-10,0.008557,4.597271e-10
