## XCurve Library

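This notebook walks through a minimal AUROC-optimization workflow with XCurve and PyTorch: importing the loss and optimizer, creating a model, setting up the loss and optimizer, loading a dataset, and training for one epoch.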

### Import optimizer, loss function, dataset, and dataloader.

The provided code block imports the necessary packages and modules for using the SquareAUCLoss loss function and the SGD optimizer in a PyTorch neural network implementation.

#### Importing Required Packages
1. The `torch` package is imported, which is the primary package used for building and training neural networks in PyTorch.
2. The `EasyDict` class from the `easydict` package is imported and aliased as `edict`. It wraps a dictionary so that its entries can be read and written with dot notation (e.g., `args.model_type`), which keeps configuration code short and readable.

#### Importing SquareAUCLoss Function
1. The `SquareAUCLoss` class is imported from the `XCurve.AUROC.losses` module.
2. It is a surrogate loss for optimizing the area under the ROC curve (AUROC), a common evaluation metric for classification, rather than a routine that merely reports the metric.
3. `SquareAUCLoss` takes the predicted scores and the true labels as inputs. As its name suggests, it relies on a square surrogate: losses of this kind typically penalize the squared gap between a margin and the score difference of positive and negative samples, not the difference between scores and labels.
4. It is one of the AUROC losses provided by `XCurve.AUROC.losses`.
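
To make the idea of a square surrogate concrete, here is a minimal plain-Python sketch for the binary case. It assumes the common pairwise form, in which every positive/negative pair contributes the squared gap between a margin `gamma` and the score difference. The function name `pairwise_square_auc_loss` is hypothetical, and this is not XCurve's implementation.

```python
def pairwise_square_auc_loss(scores, labels, gamma=1.0):
    """Mean of (gamma - (s_pos - s_neg))**2 over all positive/negative pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    total = sum((gamma - (p - n)) ** 2 for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Positives scored above negatives give a small loss ...
well_ranked = pairwise_square_auc_loss([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
# ... while the reversed ranking gives a much larger one.
badly_ranked = pairwise_square_auc_loss([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0])
print(well_ranked, badly_ranked)
```

Because this form is defined over positive/negative pairs, a batch that contains only one class has no pairs to score, which is one reason class-aware batching matters for AUROC losses.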

#### Importing SGD Optimizer
1. The `SGD` optimizer is imported from the `torch.optim` module.
2. This optimizer is a stochastic gradient descent optimizer that is commonly used for training deep neural networks.
3. The `SGD` optimizer takes in the model parameters and the learning rate as inputs and updates the parameters based on the gradients computed during the backpropagation process.
4. Other optimizers like `Adam` or `Adagrad` can also be used depending on the specific needs of the model or task.

Together, these imports provide what the rest of the notebook needs: `SquareAUCLoss` as an AUROC-oriented training objective and `SGD` to update the model parameters from the gradients computed during backpropagation.

In [None]:
import torch
from easydict import EasyDict as edict

# import loss of AUROC
from XCurve.AUROC.losses import SquareAUCLoss

# import optimizer (or one can use any optimizer supported by PyTorch)
from torch.optim import SGD

### Set up the parameters, create the model.

In the provided code, a deep learning model is created using the `generate_net` function from the `XCurve.AUROC.models` module. This function generates a neural network model of a specified type (e.g., resnet18, resnet20, densenet121, or mlp).

The `args` variable is an `edict`, which is a dictionary-like object that allows attribute-style access (e.g., `args.model_type` instead of `args['model_type']`). It contains the parameters for creating the model. Some of the essential parameters are explained below:

- `model_type`: This parameter specifies the type of model to be generated. In this case, it is set to `"resnet18"`, which means that a ResNet-18 model will be generated.

- `num_classes`: This parameter specifies the number of classes in the classification problem. In this case, it is set to `10`.

- `pretrained`: This parameter specifies whether the model should be loaded with pre-trained weights. In this case, it is set to `None`, which means that the model will not be pre-trained.

The `generate_net` function returns a PyTorch model object, stored in the `model` variable. The model is then moved to the GPU using the `cuda` method.

The `generate_net` function is part of the XCurve package, a Python library for optimizing curve-based metrics such as the Area Under the Receiver Operating Characteristic Curve (AUROC) in binary and multi-class classification problems.

In [None]:
# create model (or you can adopt any DNN model built with PyTorch)
from XCurve.AUROC.models import generate_net

# set params to create model
args = edict({
    "model_type": "resnet18", # supported: resnet18, resnet20, densenet121 and mlp
    "num_classes": 10, # number of classes
    "pretrained": None # whether to load pre-trained weights
})
model = generate_net(args).cuda() # generate model

### Load optimizer and loss function

The provided code block creates an optimizer and a loss criterion for a deep neural network (DNN) model with 10 classes, using the SGD optimizer and the `SquareAUCLoss` loss function.

#### Initializing num_classes Variable
1. The `num_classes` variable is set to 10, so the task is treated as a 10-class classification problem.
2. It matches the `num_classes` value used to create the model and is passed to the loss criterion.

#### Creating Optimizer
1. The `SGD` optimizer is created using the `model.parameters()` method, which returns an iterator over the model's trainable parameters.
2. The learning rate of the optimizer is set to 0.01, which determines the step size of the parameter updates during training.
3. Other hyperparameters like momentum and weight decay can also be specified as arguments to the optimizer constructor.
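
The roles of the learning rate, momentum, and weight decay can be sketched for a single scalar parameter. The snippet below follows the usual formulation (weight decay added to the gradient, then folded into a momentum buffer); `sgd_step` is a hypothetical helper written for illustration and is not PyTorch code.

```python
def sgd_step(w, grad, buf, lr=0.01, momentum=0.0, weight_decay=0.0):
    """One SGD update for a single scalar parameter.

    The weight-decay term is added to the gradient, the result is folded
    into the momentum buffer, and the parameter moves against the buffer.
    """
    g = grad + weight_decay * w
    buf = momentum * buf + g
    w = w - lr * buf
    return w, buf


# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, buf = 0.0, 0.0
for _ in range(200):
    w, buf = sgd_step(w, 2.0 * (w - 3.0), buf, lr=0.01, momentum=0.9)
print(w)  # close to the minimiser w = 3
```

With `momentum=0.0` and `weight_decay=0.0` the update reduces to plain gradient descent, `w - lr * grad`.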

#### Creating Loss Criterion
1. The `SquareAUCLoss` loss criterion is created using the specified parameters:
    a. `num_classes` is set to the previously initialized value of 10.
    b. `gamma` is set to 1.0, which is the safe margin used in the loss function.
    c. `transform` is set to "ovo", which stands for "one-vs-one" and indicates that the multi-class AUROC metric is computed from pairwise comparisons between classes. The alternative value is "ova".
2. `SquareAUCLoss` is a specialized loss function for optimizing models with respect to the AUROC metric; with `num_classes=10` it is used here in its multi-class form.
3. The parameters above customize the behavior of the loss function, such as the way the multi-class AUROC metric is computed.
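
As an illustration of what a one-vs-one multi-class AUROC means, the sketch below averages a binary AUC over all ordered class pairs. It assumes the usual pairwise definition and uses hypothetical helper names (`binary_auc`, `ovo_auc`); it computes the metric only and is not how XCurve implements its loss.

```python
from itertools import permutations


def binary_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


def ovo_auc(scores, labels, num_classes):
    """Average AUC over all ordered class pairs (i, j).

    For the pair (i, j), samples of class i are the positives, samples of
    class j are the negatives, and both are ranked by the class-i score.
    """
    pair_aucs = []
    for i, j in permutations(range(num_classes), 2):
        pos = [s[i] for s, y in zip(scores, labels) if y == i]
        neg = [s[i] for s, y in zip(scores, labels) if y == j]
        pair_aucs.append(binary_auc(pos, neg))
    return sum(pair_aucs) / len(pair_aucs)


# Toy predictions: one row of per-class scores per sample.
scores = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
    [0.2, 0.5, 0.3],
]
labels = [0, 0, 1, 1, 2, 2]
result = ovo_auc(scores, labels, num_classes=3)
print(result)
```

A one-vs-all scheme would instead compare each class against all remaining classes pooled together, giving one binary problem per class rather than one per class pair.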

Overall, `num_classes`, the optimizer, and the loss criterion together define how the model is trained: `SGD` updates the model parameters, while `SquareAUCLoss` provides an AUROC-oriented objective, here configured for 10 classes with the one-vs-one scheme.

In [None]:
num_classes = 10
optimizer = SGD(model.parameters(), lr=0.01) # create optimizer

# create loss criterion
criterion = SquareAUCLoss(
    num_classes=num_classes, # number of classes
    gamma=1.0, # safe margin
    transform="ovo" # the manner of computing the multi-classes AUROC Metric ('ovo' or 'ova').
)

### Create Dataset (train_set, val_set, test_set) and dataloader (trainloader)

In the provided code, the `dataset_args` dictionary contains various parameters used to configure the dataset. Some of the crucial parameters in the `dataset_args` dictionary are explained below:

- `data_dir`: This parameter specifies the relative path of the dataset. In this case, the dataset is stored in the `cifar-10-long-tail/` directory.

- `input_size`: This parameter specifies the size of the input images in pixels. Here, the input images have a dimension of 32x32 pixels.

- `norm_params`: This parameter contains the normalization parameters for the dataset. Specifically, it includes the mean and standard deviation values for the RGB channels of the images.

- `use_lmdb`: This parameter specifies whether the dataset should be loaded as an LMDB database or not.

- `sampler`: This parameter is only employed for binary classification and contains the sampling rate for positive and negative examples in the dataset.

- `aug`: This parameter specifies whether data augmentation should be used during training or not.

- `class2id`: This parameter is a dictionary mapping the original class labels to their corresponding IDs. Here, class `1` is treated as the positive (minority) class and mapped to ID `1`, while all other classes are mapped to ID `0`.
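
Two of these parameters can be illustrated in isolation. The sketch below assumes that `norm_params` is applied per channel to pixel values on the 0-255 scale, which the magnitude of the mean and standard deviation suggests, and that `class2id` is looked up by the string form of the original class. Both helpers are hypothetical; how XCurve applies these settings internally is not shown in this notebook.

```python
# Values copied from dataset_args.
norm_params = {
    "mean": [123.675, 116.280, 103.530],
    "std": [58.395, 57.120, 57.375],
}
class2id = {"1": 1, "0": 0, "2": 0, "3": 0, "4": 0,
            "5": 0, "6": 0, "7": 0, "8": 0, "9": 0}


def normalize_pixel(rgb, params):
    """Standardise one RGB pixel (0-255 scale), channel by channel."""
    return [(v - m) / s for v, m, s in zip(rgb, params["mean"], params["std"])]


def to_binary_label(original_class, mapping):
    """Map an original class index to its binary ID via class2id."""
    return mapping[str(original_class)]


# A pixel equal to the channel means is mapped to zero in every channel.
print(normalize_pixel([123.675, 116.280, 103.530], norm_params))
# Only original class 1 becomes the positive class.
print([to_binary_label(c, class2id) for c in range(10)])
```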

The `get_datasets()` function is utilized to load the dataset and create train, validation, and test datasets. The `get_data_loaders()` function is then employed to create dataloaders for the train, validation, and test datasets.

It is worth noting that `get_datasets()` conducts stratified sampling for the train set, using the `StratifiedSampler` from the `XCurve.AUROC.dataloaders` module. Stratified sampling draws samples class by class, presumably so that mini-batches contain samples from every class, which a pairwise AUROC loss needs.

`StratifiedSampler` is provided by XCurve, not by PyTorch (`torch.utils.data` has no sampler of that name); refer to the XCurve documentation for details.
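
The general idea of stratified batching can be sketched in a few lines of plain Python: samples are grouped by class and dealt round-robin into batches, so each batch keeps roughly the same class proportions. `stratified_batches` is a hypothetical helper for illustration and makes no claim about how XCurve's `StratifiedSampler` works internally.

```python
import random
from collections import defaultdict


def stratified_batches(labels, num_batches, seed=0):
    """Split sample indices into batches with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    batches = [[] for _ in range(num_batches)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class's samples round-robin so every batch gets a share.
        for k, idx in enumerate(indices):
            batches[k % num_batches].append(idx)
    return batches


labels = [1] * 4 + [0] * 36  # imbalanced toy data: 4 positives, 36 negatives
batches = stratified_batches(labels, num_batches=4)
for b in batches:
    print(sum(labels[i] for i in b), "positive(s) out of", len(b))
```

With purely random batching on data this imbalanced, some batches would likely contain no positive sample at all.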

In [None]:
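# import the dataset and dataloader helpers from XCurve
from XCurve.AUROC.dataloaders import get_datasets, get_data_loaders

# set dataset params, see our doc. for more details.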
dataset_args = edict({
    "data_dir": "cifar-10-long-tail/", # relative path of dataset
    "input_size": [32, 32],
    "norm_params": {
        "mean": [123.675, 116.280, 103.530],
        "std": [58.395, 57.120, 57.375]
        },
    "use_lmdb": True,
    "resampler_type": "None",
    "sampler": { # only used for binary classification
        "rpos": 1,
        "rneg": 10
        },
    "npy_style": True,
    "aug": True, 
    "class2id": { # positive (minority) class idx
        "1": 1, "0":0, "2":0, "3":0, "4":0, "5":0,
        "6":0, "7":0, "8":0, "9":0
    }
})

train_set, val_set, test_set = get_datasets(dataset_args) # load dataset
trainloader, valloader, testloader = get_data_loaders(
    train_set,
    val_set,
    test_set,
    train_batch_size=32,
    test_batch_size=64
) # load dataloader
# Note that get_datasets() conducts stratified sampling for train_set
# using the StratifiedSampler (from XCurve.AUROC.dataloaders import StratifiedSampler)


### Train the model

The provided code block trains the model for one epoch: for each batch it runs the forward pass, computes the loss, and updates the model parameters.

- The `for` loop iterates over the training data loader (`trainloader`), with `enumerate` supplying the batch index. Each batch yields an input tensor `x` and a label tensor `target`, both of which are moved to the GPU with `cuda()`.

- As the comment notes, `target` has shape `[batch_size, ]`, where `batch_size` is the number of samples in the current batch. The loss expects model predictions in the range `[0, 1]`, obtained with a sigmoid for binary or a softmax for multi-class AUROC optimization.

- The forward pass feeds `x` through `model`. The output has shape `[batch_size, num_classes]` when `num_classes > 2` and `[batch_size, ]` otherwise. `torch.sigmoid` then maps each output independently into `[0, 1]`; unlike a softmax, these scores do not necessarily sum to 1 across classes.

- The loss is computed from the predictions and the true targets with the `criterion` object, which is the `SquareAUCLoss` created earlier.

- The current loss value is printed whenever the batch index is a multiple of 30.

- `optimizer.zero_grad()` resets the gradients of the model parameters.

- `loss.backward()` computes the gradients of the loss with respect to the model parameters.

- `optimizer.step()` updates the model parameters using those gradients.
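
The difference between sigmoid and softmax outputs mentioned in the list above can be checked with a small standard-library example. The logits are made-up values for illustration.

```python
import math


def sigmoid(z):
    """Map a single logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, -1.0, 0.5]
sig = [sigmoid(z) for z in logits]
soft = softmax(logits)
print(sig, sum(sig))    # each value is in (0, 1); the sum is not constrained
print(soft, sum(soft))  # values form a distribution over the classes
```

Both transforms keep every score inside `[0, 1]`, but only the softmax couples the classes into a probability distribution.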

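The order of the last three steps matters: gradients are cleared first, then computed by the backward pass, then applied by the optimizer step. PyTorch accumulates gradients across `backward` calls, so omitting `zero_grad` would mix gradients from earlier batches into the current update.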
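
Why the gradients need to be reset before each backward pass can be seen in a toy setting. The sketch below mimics an accumulating gradient buffer with a hypothetical `ToyParam` class and a one-parameter quadratic objective; it is a plain-Python analogy, not PyTorch code.

```python
class ToyParam:
    """A scalar parameter whose gradient buffer accumulates across calls."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(param, target):
    """Add the gradient of (w - target)^2 to the existing gradient buffer."""
    param.grad += 2.0 * (param.value - target)


def train(steps, reset_grad, lr=0.1, target=3.0):
    w = ToyParam(0.0)
    for _ in range(steps):
        if reset_grad:
            w.grad = 0.0        # plays the role of optimizer.zero_grad()
        backward(w, target)     # plays the role of loss.backward()
        w.value -= lr * w.grad  # plays the role of optimizer.step()
    return w.value


with_reset = train(50, reset_grad=True)
without_reset = train(50, reset_grad=False)
print(with_reset, without_reset)
```

With the reset, the parameter settles near the target of 3. Without it, stale gradients keep being added to the buffer and, in this toy setting, the parameter keeps oscillating around the target instead of converging.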

In [None]:
# train the model for one epoch
for index, (x, target) in enumerate(trainloader):
    x, target = x.cuda(), target.cuda()
    # target.shape => [batch_size, ]
    # Note that the model's predictions must lie in [0, 1]
    # for any binary (i.e., sigmoid) or multi-class (i.e., softmax) AUROC optimization.

    # forward
    pred = torch.sigmoid(model(x)) # [batch_size, num_classes] when num_classes > 2, o.w. [batch_size, ]
    loss = criterion(pred, target)
    if index % 30 == 0:
        print("loss:", loss.item())
    
    # backward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

loss: 0.15990693867206573
loss: 0.2334757000207901
loss: 0.30239760875701904
loss: 0.1381421536207199
loss: 0.13158641755580902
loss: 0.38331863284111023
loss: 0.08875473588705063
loss: 0.20242173969745636
loss: 0.32185712456703186
loss: 0.15229110419750214
loss: 0.11186783015727997
loss: 0.11620910465717316
