## XCurve.AUROC.losses.StandardAUROC
This chapter shows how to conduct AUROC-oriented training on the multiclass long-tail CIFAR-10 dataset.
The examples below use SquareAUCLoss. The other losses (i.e., HingeAUCLoss and ExpAUCLoss) are used in the same way.

## Step 1
To begin with, we should import some packages:

In [2]:
from easydict import EasyDict as edict
import torch
import random
import numpy as np

from XCurve.AUROC.dataloaders import get_datasets # dataset of XCurve
from XCurve.AUROC.dataloaders import get_data_loaders # dataloader of XCurve
from XCurve.AUROC.losses import SquareAUCLoss # loss of AUROC
from torch.optim import SGD # optimizer (or any other optimizer supported by PyTorch)
from XCurve.AUROC.models import generate_net # create a model (or use any PyTorch DNN model)

## Step 2
Set seed to ensure a reproducible experiment.

In [3]:
seed = 1024
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
np.random.seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
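
The seeding calls above can also be wrapped in a small helper so that the same setup is reused across scripts. The sketch below is illustrative only: `set_seed` is a hypothetical helper name, not part of XCurve, and it uses `torch.cuda.manual_seed_all`, which seeds every visible GPU and is ignored when CUDA is unavailable.

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed the Python, NumPy and PyTorch RNGs (hypothetical helper, not part of XCurve)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Seeds all GPUs; silently ignored when CUDA is not available.
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic cuDNN kernels over auto-tuned ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding with the same value reproduces the same random draws.
set_seed(1024)
first = torch.rand(3)
set_seed(1024)
second = torch.rand(3)
print(torch.equal(first, second))  # True
```

Calling the helper once at the top of a script, before the model and data loaders are created, keeps weight initialization and shuffling repeatable.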

## Step 3
Define your model, criterion, and optimizer.

### Model
XCurve supports any PyTorch-based DNN model. XCurve itself implements ResNet-family backbones (18, 20, 35, etc.), DenseNet121, and an MLP.
> XCurve.AUROC.models.resnet18(num_classes=1, pretrained=None)

### Criterion
The AUC risk is defined over pairs of samples, so its terms are interdependent. To decouple them, we use an instance-wise reformulation strategy, which yields an end-to-end AUROC-oriented training method.
> XCurve.AUROC.losses.SquareAUCLoss(self, num_classes, gamma=1, transform='ovo')

> XCurve.AUROC.losses.HingeAUCLoss(self, num_classes, gamma=1, transform='ovo')

> XCurve.AUROC.losses.ExpAUCLoss(self, num_classes, gamma=1, transform='ovo')
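
To make the pairwise AUC risk concrete, here is a minimal sketch of the naive pairwise square surrogate for the binary case, written in plain PyTorch. It is not XCurve's implementation, which relies on the instance-wise reformulation described above. The function name `pairwise_square_auc_loss` and the toy scores are assumptions made for illustration.

```python
import torch


def pairwise_square_auc_loss(scores: torch.Tensor, labels: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    """Naive pairwise square surrogate (illustrative, not XCurve's code).

    Averages (gamma - (s_pos - s_neg))^2 over every (positive, negative) pair.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # diff[i, j] = score of positive i minus score of negative j
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)
    return ((gamma - diff) ** 2).mean()


scores = torch.tensor([0.9, 0.8, 0.3, 0.1])
labels = torch.tensor([1, 1, 0, 0])

good = pairwise_square_auc_loss(scores, labels)        # positives ranked above negatives
bad = pairwise_square_auc_loss(1.0 - scores, labels)   # ranking reversed
print(good.item(), bad.item())
```

The loss is small when positives score well above negatives and grows when the ranking is reversed. Because it enumerates every positive-negative pair, its cost is quadratic in the batch size, which is the motivation for an instance-wise form.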

#### Parameters
- num_classes (int): the number of classes in the classification problem. This example is multiclass, so num_classes is greater than 2 (10 for CIFAR-10).
- gamma (float): the safe margin for the loss, i.e., the target gap between the score of a positive sample and the score of a negative sample (Default [1.0]).
- transform (str): the manner of computing the multi-class AUROC Metric. In this case, it is set to "ovo", which stands for "one-vs-one". This means that all possible pairs of classes are compared, and the AUROC metric is computed for each pair separately. The final AUROC score is the average of all pairwise scores (Default ['ovo']).
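
The one-vs-one averaging described for `transform` can be sketched with NumPy. This follows one common convention (average the pairwise AUC over all ordered class pairs, scoring each pair with the column of the "positive" class) and is not necessarily how XCurve computes its metric. The names `binary_auc` and `ovo_auroc` are hypothetical, and labels are assumed to be integers that index the score columns.

```python
import numpy as np


def binary_auc(pos_scores: np.ndarray, neg_scores: np.ndarray) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


def ovo_auroc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Average AUC over all ordered class pairs (i, j), scored with column i."""
    classes = np.unique(labels)
    aucs = []
    for i in classes:
        for j in classes:
            if i != j:
                # Class i acts as the positive class, class j as the negative class.
                aucs.append(binary_auc(scores[labels == i, i], scores[labels == j, i]))
    return float(np.mean(aucs))


labels = np.array([0, 0, 1, 1, 2, 2])
separable = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])
uninformative = np.full((6, 3), 1.0 / 3.0)

print(ovo_auroc(separable, labels))      # 1.0
print(ovo_auroc(uninformative, labels))  # 0.5
```

Perfectly separated scores give 1.0 and constant scores give 0.5, matching the usual reading of AUROC as a ranking quality measure.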

### Optimizer
The optimizer variable is created using the SGD optimizer from the torch.optim module.

#### Parameters
- params: parameters in your pytorch model
- lr: the learning rate for the optimizer.
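
As a quick check of what `params` and `lr` do, the sketch below applies one plain SGD step to a tiny stand-in model and verifies the update rule `w_new = w_old - lr * grad`. The `torch.nn.Linear` layer and the dummy loss are assumptions for illustration, not the network used in this chapter.

```python
import torch
from torch.optim import SGD

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)  # stand-in for the real network
optimizer = SGD(model.parameters(), lr=0.01)

before = model.weight.detach().clone()
loss = model(torch.ones(1, 4)).sum()  # dummy loss, for illustration only

optimizer.zero_grad()
loss.backward()
optimizer.step()

# Plain SGD (no momentum, no weight decay): w_new = w_old - lr * grad
expected = before - 0.01 * model.weight.grad
print(torch.allclose(model.weight.detach(), expected))  # True
```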

In [4]:
# set params to create model
args = edict({
    "model_type": "resnet18", # (supports resnet, densenet121 and mlp)
    "num_classes": 10, # number of classes
    "pretrained": None # whether the model is pretrained
})
# Or you can adopt any DNN models by Pytorch
model = generate_net(args).cuda() # generate pytorch model 

num_classes = 10
criterion = SquareAUCLoss(
    num_classes=num_classes, # number of classes
    gamma=1.0, # safe margin
    transform="ovo" # how the multi-class AUROC metric is computed ('ovo' or 'ova')
) # create loss criterion
optimizer = SGD(model.parameters(), lr=0.01) # create optimizer

## Step 4
Now, we can start training!

### Dataset and DataLoaders
Before training, you should load your customized datasets and dataloaders. 
The get_datasets() function is utilized to load the dataset and create train, validation, and test datasets. The get_data_loaders() function is then employed to create dataloaders for the train, validation, and test datasets.
> XCurve.AUROC.dataloaders.get_datasets(dataset_args)

> XCurve.AUROC.dataloaders.get_data_loaders(train_set, val_set, test_set, train_batch_size, test_batch_size, num_workers=4, rpos=1, rneg=4)

#### Parameters
The dataset_args dictionary configures the dataset. Its key parameters are explained below:

- data_dir (str): the relative path of the dataset. 
- input_size (list): the size of the input images in pixels.
- norm_params (EasyDict, list, float): the normalization parameters for the dataset. Specifically, it includes the mean and standard deviation values for the RGB channels of the images.
- use_lmdb (bool): whether the dataset should be loaded as an LMDB database or not.
- sampler (EasyDict, int): only employed for binary classification and contains the sampling rate for positive and negative examples in the dataset.
- aug (bool): whether data augmentation should be used during training or not.
- class2id (EasyDict, int): a dictionary mapping the class labels to their corresponding IDs. In the binary setting, the minority class has a label of 1 and all other classes have a label of 0. It is not set in the multiclass example below.
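
The `mean` and `std` values used in this chapter are the widely used ImageNet channel statistics scaled to the 0-255 pixel range. The sketch below checks that and applies a channel-wise normalization to a random image. How XCurve applies `norm_params` internally is an assumption here: the example assumes the usual `(pixel - mean) / std` on 0-255 RGB values.

```python
import numpy as np

mean = np.array([123.675, 116.280, 103.530], dtype=np.float32)
std = np.array([58.395, 57.120, 57.375], dtype=np.float32)

# The values equal the common ImageNet statistics multiplied by 255.
print(np.allclose(mean / 255.0, [0.485, 0.456, 0.406]))  # True
print(np.allclose(std / 255.0, [0.229, 0.224, 0.225]))   # True

# Random 32x32 RGB image with pixel values in [0, 255] (toy data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float32)

# Channel-wise normalization (assumed form); broadcasting acts on the last axis.
normalized = (image - mean) / std
print(normalized.shape)
```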

In [5]:
# set dataset params, see our doc. for more details.
dataset_args = edict({
    "data_dir": "cifar-10-long-tail/", # relative path of dataset
    "input_size": [32, 32],
    "norm_params": {
        "mean": [123.675, 116.280, 103.530],
        "std": [58.395, 57.120, 57.375]
        },
    "use_lmdb": True,
    "resampler_type": "None",
    "npy_style": True,
    "aug": True, 
    "num_classes": num_classes
})

train_set, val_set, test_set = get_datasets(dataset_args) # load dataset
trainloader, valloader, testloader = get_data_loaders(
    train_set,
    val_set,
    test_set,
    train_batch_size=32,
    test_batch_size=64
) # load dataloader
# Note that get_datasets() conducts stratified sampling for train_set,
# using StratifiedSampler (from XCurve.AUROC.dataloaders import StratifiedSampler)
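
The idea behind the stratified sampling mentioned above can be pictured with a small NumPy sketch that builds batches whose class mix matches the whole dataset. This is an illustration of the general technique only, not XCurve's `StratifiedSampler`. The function name `stratified_batches` and the toy labels are assumptions.

```python
import numpy as np


def stratified_batches(labels, batch_size, seed=0):
    """Yield index batches whose class mix roughly matches the full dataset."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    keys = np.empty(len(labels), dtype=np.float64)
    for cls in np.unique(labels):
        # Shuffle the samples of this class, then spread them evenly over [0, 1).
        idx = rng.permutation(np.flatnonzero(labels == cls))
        keys[idx] = (np.arange(len(idx)) + rng.random()) / len(idx)
    # Sorting by key interleaves the classes in proportion to their frequency.
    order = np.argsort(keys, kind="stable")
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


labels = np.array([0] * 80 + [1] * 20)  # toy labels with a 4:1 imbalance
batches = list(stratified_batches(labels, batch_size=10))
minority_per_batch = [int(labels[b].sum()) for b in batches]
print(minority_per_batch)
```

With a long-tailed dataset, this kind of batching guarantees that minority classes appear in every batch, which matters for AUROC losses because they compare samples across classes.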

### Train the model

The code block below runs the training loop for one epoch. Its steps, in order, are:

- A `for` loop iterates over the training data loader (`trainloader`), with `enumerate` supplying the batch index. In each iteration, a batch of input (`x`) and target (`target`) tensors is drawn from `trainloader` and moved to the GPU with the `cuda()` method.
- The `target` tensor has a shape of `[batch_size, ]`, where `batch_size` is the number of samples in the current batch. The model is expected to output predictions in the range `[0, 1]` for binary (i.e., sigmoid) or multi-class (i.e., softmax) AUROC optimization.
- The input tensor `x` is passed through `model`. The output is a tensor of shape `[batch_size, num_classes]` if `num_classes > 2`, or `[batch_size, ]` otherwise. `torch.sigmoid` then maps each output to a score in `[0, 1]`.
- The AUROC loss (here `SquareAUCLoss`) between the predicted scores and the true targets is computed with the `criterion` object.
- The current loss value is printed to the console whenever the iteration index is a multiple of 30.
- The gradients left over from the previous iteration are set to zero with the optimizer's `zero_grad` method.
- The gradients of the loss with respect to the model parameters are computed with the `backward` method.
- The optimizer's `step` method updates the model parameters based on the computed gradients.

Together, the backward pass and the optimizer step update the model's parameters at every iteration.

In [6]:
# train the model for one epoch
for index, (x, target) in enumerate(trainloader):
    x, target = x.cuda(), target.cuda()
    # target.shape => [batch_size, ]
    # Note that the model's predictions must lie in [0, 1]
    # for any binary (i.e., sigmoid) or multi-class (i.e., softmax) AUROC optimization.
    
    # forward
    pred = torch.sigmoid(model(x)) # [batch_size, num_classes] when num_classes > 2, otherwise [batch_size, ]
    loss = criterion(pred, target)
    if index % 30 == 0:
        print("loss:", loss.item())
    
    # backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

loss: 0.5070081353187561
loss: 0.5045551061630249
