## XCurve.AUROC.losses.AdversarialAUROC
This chapter shows how to conduct AUROC-oriented adversarial training on a binary long-tailed CIFAR-10 dataset.
Note that the following uses AdvAUCOptimizer as the example. The other optimizer (i.e., RegAdvAUCOptimizer) is used in much the same way.

## Step 1
To begin with, we should import some packages: 

In [None]:
from easydict import EasyDict as edict

from torchvision import transforms

In [None]:
import torch
import random
import numpy as np

from XCurve.AUROC.dataloaders import StratifiedSampler # for imbalanced dataset sampling
from XCurve.AUROC.dataloaders import IMBALANCECIFAR10, get_data_loaders # dataset and dataloaders of XCurve
from XCurve.AUROC.losses import AdvAUROCLoss, PGDAdversary
from XCurve.AUROC.optimizer import AdvAUCOptimizer
from XCurve.AUROC.models import generate_net # create a model, or adopt any DNN model built with PyTorch

## Step 2
Set seed to ensure a reproducible experiment.

In [None]:
seed = 1024
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
np.random.seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

## Step 3
Define your model, criterion and optimizer.

### Model
XCurve supports any PyTorch DNN model. XCurve itself implements ResNet-family backbones (18, 20, 35, etc.), DenseNet121 and an MLP.
> XCurve.AUROC.models.resnet18(num_classes=1, pretrained=None)

### Criterion
To decouple the pairwise-interdependent AUC risk, we use an instance-wise reformulation strategy, which yields an end-to-end AUROC-oriented adversarial training (AT) method.
> XCurve.AUROC.losses.AdversarialAUROC.AdvAUROCLoss(self, imratio=None, a=None, b=None, alpha=None)

> XCurve.AUROC.losses.AdversarialAUROC.RegAdvAUROCLoss(self, imratio=None, a=None, b=None, alpha=None, lambda1=1.0, lambda2=1.0)
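
For reference, a commonly used instance-wise reformulation of the square-surrogate AUC risk is the min-max problem below. It is written under the assumption that `AdvAUROCLoss` follows this standard form, with $p$ playing the role of `imratio`, $f(x) \in [0, 1]$ the model score, and labels $y \in \{0, 1\}$. The exact expression implemented in XCurve may differ.

$$
\min_{f,\,a,\,b}\ \max_{\alpha}\ \mathbb{E}_{(x,y)}\Big[(1-p)\,(f(x)-a)^2\,\mathbb{1}[y=1] + p\,(f(x)-b)^2\,\mathbb{1}[y=0] + 2(1+\alpha)\big(p\,f(x)\,\mathbb{1}[y=0] - (1-p)\,f(x)\,\mathbb{1}[y=1]\big) - p(1-p)\,\alpha^2\Big]
$$

Every term depends on a single example rather than on a positive-negative pair, which is what allows the objective to be estimated on a mini-batch. In adversarial training, the score $f(x)$ would be evaluated on the perturbed input $x + \delta$.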

#### Parameters
- imratio (float): imbalance ratio of your dataset. It must be precisely estimated from your training set, i.e., $\frac{\#Pos.Num}{\#Tot.Num}$.
- a, b, alpha (torch.Tensor, float, requires_grad=True): initial values for a, b, $\alpha$ in instance-wise AUROC loss (Default [0.0]).
- lambda1, lambda2 (torch.Tensor, float, requires_grad=True): initial coefficient values for regularization (Default [1.0]).
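
Since `imratio` has to be estimated from the training labels, a minimal sketch of that estimate is given below. The helper name `estimate_imratio` is hypothetical (it is not part of XCurve), and the labels are assumed to be a flat sequence in which the positive class is encoded as `1`.

```python
def estimate_imratio(labels, positive_label=1):
    """Return the fraction of samples in `labels` that are positive."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must be non-empty")
    num_pos = sum(1 for y in labels if y == positive_label)
    return num_pos / len(labels)


# Toy label vector (illustrative only): 2 positives out of 20 samples.
toy_labels = [1, 1] + [0] * 18
p_hat = estimate_imratio(toy_labels)
print(p_hat)
```

In this tutorial the same helper could be applied to `trainset.targets`, assuming those targets are stored as 0/1 labels.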

### Optimizer
A single-step optimizer that performs the min and max gradient updates simultaneously within the PGD-based adversarial training framework.
> XCurve.AUROC.optimizer.AdvAUCOptimizer(params, a, b, alpha, momentum=0.9, weight_decay=0, nesterov=False)

> XCurve.AUROC.optimizer.RegAdvAUCOptimizer(params, a, b, alpha, lambda1, lambda2)

#### Parameters
- params: parameters of your PyTorch model
- a, b, alpha, lambda1, lambda2: parameters in your criterion.
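
To illustrate what a simultaneous min-max update means, the sketch below runs plain gradient descent-ascent on a toy saddle function. It is not the XCurve optimizer: the function `g`, the helper `gda_step` and the learning rate are illustrative assumptions. Here `x` stands in for the minimized variables (model weights, `a`, `b`) and `y` for the maximized one (`alpha`).

```python
def gda_step(x, y, lr=0.1):
    """One simultaneous step on g(x, y) = 0.5*x**2 + x*y - 0.5*y**2."""
    grad_x = x + y  # dg/dx
    grad_y = x - y  # dg/dy
    # Both gradients are evaluated at the same point and applied together:
    # descent on x (the "min" player), ascent on y (the "max" player).
    return x - lr * grad_x, y + lr * grad_y


x, y = 1.0, 1.0
for _ in range(200):
    x, y = gda_step(x, y)

# The iterates approach the saddle point (0, 0) of g.
print(x, y)
```

The point of the single step is that the min and max variables are updated from one backward pass instead of alternating inner and outer loops.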

In [None]:

# set params to create the model
args = edict({
    "model_type": "resnet18", # (supports resnet, densenet121 and mlp)
    "num_classes": 2, # number of classes
    "pretrained": None # whether the model is pretrained
})
# Or you can adopt any DNN model built with PyTorch
model = generate_net(args).cuda() # generate the PyTorch model

p_hat = 0.1 # estimated ratio of positive samples in the training set (imratio)
criterion = AdvAUROCLoss(imratio=p_hat)
optimizer = AdvAUCOptimizer(model.parameters(), 
                            criterion.a, criterion.b, criterion.alpha,
                            lr=0.1, momentum=0.9,
                            weight_decay=1e-5)


## Step 4
Now, we can start adversarial training!

### Dataset and DataLoaders
Before training, you should load your customized datasets and dataloaders. 

Note that, since AUROC requires at least two classes in every mini-batch, you should use a stratified sampler (instead of random sampling).
> XCurve.AUROC.dataloaders.StratifiedSampler(class_vector, batch_size, rpos=1, rneg=4)

#### Parameters
- class_vector (array): target labels of the training set (e.g., `trainset.targets`), taking values such as 0 and 1 in the binary case.
- batch_size (int): training batch size.
- rpos, rneg (int): your expected imbalance ratio (works only for binary cases). After setting this, the sampler automatically adjusts the imbalance ratio in the dataset by adding/dropping some samples if necessary.
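
The idea behind stratified batching can be sketched in plain Python as follows. This is not the XCurve `StratifiedSampler` implementation. The function `stratified_batches` is a hypothetical helper that assumes binary 0/1 labels and simply drops leftover samples once either class runs out.

```python
import random


def stratified_batches(labels, batch_size, rpos=1, rneg=4, seed=0):
    """Yield index lists whose positive:negative ratio is about rpos:rneg."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Per-batch counts; keep at least one sample of each class.
    n_pos = max(1, batch_size * rpos // (rpos + rneg))
    n_neg = max(1, batch_size - n_pos)
    # Stop as soon as either class is exhausted (the rest is dropped).
    n_batches = min(len(pos) // n_pos, len(neg) // n_neg)
    for k in range(n_batches):
        batch = pos[k * n_pos:(k + 1) * n_pos] + neg[k * n_neg:(k + 1) * n_neg]
        rng.shuffle(batch)
        yield batch


# Toy labels (illustrative only): 10 positives and 90 negatives.
labels = [1] * 10 + [0] * 90
batches = list(stratified_batches(labels, batch_size=10, rpos=1, rneg=4))
print(len(batches), [sum(labels[i] for i in b) for b in batches])
```

Every batch produced this way contains both classes, so a batch-level AUROC loss is always well defined.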

### PGD Adversary
Remember to use the PGD-based adversary to generate adversarial examples for the inner maximization:
> XCurve.AUROC.losses.AdversarialAUROC.PGDAdversary(model, X, y, criterion, epsilon=8.0/255, alpha=2.0/255, attack_iters=10, restarts=1, norm='linf')

#### Parameters
- model: the model being adversarially trained.
- X (batch, $\dots$): clean inputs.
- y (batch, ): corresponding labels.
- criterion: loss used for the white-box attack.
- epsilon (float): the maximum perturbation budget.
- alpha (float): PGD step size.
- attack_iters (int): number of PGD iterations.
- restarts (int): number of times the PGD attack is repeated.
- norm (str) {'linf', 'l2'}: norm that bounds the perturbation.

#### Return
Returns the adversarial perturbation $\delta$ that induces the maximum loss among the `restarts` PGD attacks.
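
As a rough illustration of the $\ell_\infty$ PGD loop, the sketch below attacks a toy linear-sigmoid model in plain Python. It is not XCurve's `PGDAdversary`: the model, the cross-entropy loss and the helper names are assumptions made for the example, and random restarts and the `'l2'` norm are omitted.

```python
import math


def loss(w, x, y):
    """Binary cross-entropy of a linear-sigmoid model on one example."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def loss_grad_x(w, x, y):
    """Gradient of the loss above with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def pgd_linf(w, x, y, epsilon=8.0 / 255, alpha=2.0 / 255, attack_iters=10):
    """Return a perturbation delta with max |delta_i| <= epsilon."""
    delta = [0.0] * len(x)
    for _ in range(attack_iters):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        grad = loss_grad_x(w, x_adv, y)
        for i, g in enumerate(grad):
            step = alpha * ((g > 0) - (g < 0))                # alpha * sign(g)
            d = max(-epsilon, min(epsilon, delta[i] + step))  # project onto the eps-ball
            delta[i] = max(0.0, min(1.0, x[i] + d)) - x[i]    # keep x + delta in [0, 1]
    return delta


# Toy weights and input (illustrative only).
w = [2.0, -1.0, 0.5]
x = [0.5, 0.5, 0.5]
y = 1
delta = pgd_linf(w, x, y)
x_adv = [xi + di for xi, di in zip(x, delta)]
print(loss(w, x, y), loss(w, x_adv, y))
```

Each iteration takes a signed gradient ascent step on the loss and projects the perturbation back onto the $\epsilon$-ball, which corresponds to the inner maximization described above.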


In [None]:
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
])
transform_test = transforms.Compose([
    transforms.ToTensor(),
])

trainset = IMBALANCECIFAR10(root='dataset',
                            train=True,
                            download=True,
                            transform=transform_train)
testset = IMBALANCECIFAR10(root='dataset',
                           train=False,
                           transform=transform_test)

sampler = StratifiedSampler(trainset.targets,
                            batch_size=32,
                            rpos=1,
                            rneg=10)

train_loader = torch.utils.data.DataLoader(
    trainset,
    batch_size=sampler.real_batch_size,
    shuffle=(sampler is None),
    num_workers=0,
    pin_memory=True,
    sampler=sampler,
    drop_last=True)

In [None]:
# train the model for one epoch
model.train()
lower_limit, upper_limit = 0.0, 1.0

for index, (X, y) in enumerate(train_loader):
    X, y  = X.cuda(), y.cuda()
    # y.shape => [batch_size, ]
    # Note that the criterion expects model predictions in [0, 1]
    # for both binary (i.e., sigmoid) and multi-class (i.e., softmax) AUROC optimization.
    
    # obtain adversarial examples
    delta = PGDAdversary(model, X, y, criterion, epsilon=8.0/255, alpha=2.0/255, 
                         attack_iters=10, restarts=1, norm='linf')
    adv_input = torch.clamp(X + delta, min=lower_limit, max=upper_limit)
    adv_input.requires_grad_(requires_grad=False)
    # forward
    robust_output = model(adv_input).view_as(y)
    pred = torch.sigmoid(robust_output) # [batch_size, num_classes] when num_classes > 2, otherwise [batch_size, ]
    robust_loss = criterion(pred, y)
    if index % 30 == 0:
        print("loss:", robust_loss.item())
    
    # backward
    optimizer.zero_grad()
    robust_loss.backward()
    
    # If you use RegAdvAUROCLoss, first call `optimizer.record_grad()` and then `optimizer.step()`.
    # optimizer.record_grad()
    optimizer.step()