# Adversarial AutoAugment

Data augmentation (DA) is widely used to improve generalization when training deep neural networks. Recently, human-designed data augmentation has gradually been replaced by automatically learned augmentation policies.

<br/>

UntangleAI's Adversarial AutoAugment simultaneously optimizes the target network's objective and the augmentation policy search loss. The augmentation policy network tries to increase the training loss of the target network by generating adversarial augmentation policies, while the target network learns more robust features from these harder examples, which improves generalization. It also manages Distributed Data Parallel training for you, enabling faster computation with no boilerplate code.
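
The adversarial game described above can be summarized as a min-max problem. The formulation below is only a sketch: the symbols are introduced here for illustration ($F$ is the target network with weights $w$, $A(\cdot\,;\theta)$ is the policy network with parameters $\theta$, $\tau$ is a sampled augmentation policy, $L$ is the training loss and $\Omega$ is the training set) and are not taken from UntangleAI's implementation.

$$
\min_{w} \; \max_{\theta} \;\; \mathbb{E}_{(x,\,y) \sim \Omega} \; \mathbb{E}_{\tau \sim A(\cdot\,;\theta)} \Big[ L\big(F(\tau(x);\, w),\; y\big) \Big]
$$

The target network updates $w$ to lower the expected loss on augmented samples, while the policy network updates $\theta$ to raise it.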

## Training with Adversarial AutoAugment

The example below covers what is needed to use UntangleAI's `train_augment` function effectively.


In [None]:
#Required imports
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms
import pickle
from untangle import UntangleAI

Choose an optimizer and a scheduler that suit your network. Both are required: do not pass `None` for either of them.

In [None]:
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
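
As a rough guide to choosing the scheduler's `T_max`, the sketch below evaluates the closed-form cosine annealing schedule given in the PyTorch documentation for `CosineAnnealingLR`. The helper `cosine_annealing_lr` is a hypothetical stand-in written for illustration; it is not part of PyTorch or UntangleAI.

```python
import math


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Closed form: eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / t_max)) / 2."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


# Learning rate at epochs 0, 1 and 2 for base_lr=0.01 and T_max=2
schedule = [cosine_annealing_lr(0.01, epoch, t_max=2) for epoch in range(3)]
print(schedule)
```

The learning rate starts at `base_lr` and reaches `eta_min` after `T_max` epochs, so a small `T_max` anneals very quickly relative to a long training run.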

We use the `TrainAugmentArgs` class to pass in our preferences.
<br/>
**A new directory 'mname' is created, with a sub-directory named train_augment/'experiment_ID'.**
*   `mname` is the name of the project.
*   `num_class` is the number of classes in your dataset.
*   `batch_size` is the batch size for the dataset. It must be greater than your GPU count.
*   `epoch` is the number of epochs to train the network.
*   `gpu_count` is the number of GPUs available for training.
*   `experiment_ID` lets you run the same model with different configurations by giving each run a different `experiment_ID`. The results are saved in the corresponding folders.





---
If some of your GPUs are being used by someone else, mask them with the command `export CUDA_VISIBLE_DEVICES=2,3,4`. This makes only GPUs 2, 3 and 4 visible, so GPUs 0 and 1 (and any others not listed) are masked.
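
The same masking can be done from Python, as long as the variable is set before CUDA is initialized (in practice, before `torch` is imported). The sketch below also counts the visible devices, on the assumption that `gpu_count` should match that number. `visible_gpu_count` is a hypothetical helper, not part of UntangleAI.

```python
import os

# Must run before CUDA is initialized, i.e. before importing torch
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3,4"


def visible_gpu_count(env=os.environ):
    """Count the device IDs listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset, which means no masking is applied.
    """
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return len([device for device in value.split(",") if device.strip()])


gpu_count = visible_gpu_count()
print(gpu_count)
```

Inside the process, the visible devices are renumbered from 0, so physical GPUs 2, 3 and 4 appear as `cuda:0`, `cuda:1` and `cuda:2`.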


In [None]:
class TrainAugmentArgs:
    """
    A new directory 'mname' is created with sub-directory named train_augment/'experiment_ID' 
    """
    mname = 'test_net'
    num_class = 10
    batch_size = 128
    epoch = 600
    gpu_count = 8
    experiment_ID = 1
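
Before starting a long run, it can help to check the arguments against the constraints listed above and to see where the results will be written. The sketch below is a minimal illustration: `ExampleArgs`, `check_args` and `experiment_dir` are hypothetical names, and the directory layout is assumed from the description above rather than read from UntangleAI.

```python
import os


class ExampleArgs:
    # Stand-in with the same fields as TrainAugmentArgs, so the sketch runs on its own
    mname = 'test_net'
    num_class = 10
    batch_size = 128
    epoch = 600
    gpu_count = 8
    experiment_ID = 1


def check_args(args):
    """Hypothetical pre-flight check: batch_size must exceed the GPU count."""
    if args.batch_size <= args.gpu_count:
        raise ValueError('batch_size must be greater than gpu_count')


def experiment_dir(args):
    """Assumed output directory: 'mname'/train_augment/'experiment_ID'."""
    return os.path.join(args.mname, 'train_augment', str(args.experiment_ID))


args = ExampleArgs()
check_args(args)
print(experiment_dir(args))
```

Changing only `experiment_ID` between runs keeps each configuration's results in its own folder under the same project directory.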


Define your model here:

In [None]:
class LeNet(nn.Module):
    # TODO: This isn't really a LeNet, but we implement this to be
    #  consistent with the Evidential Deep Learning paper
    def __init__(self):
        super(LeNet, self).__init__()
        self.model = None
        lenet_conv = []
        lenet_conv += [nn.Conv2d(3, 6, 5)] 
        lenet_conv += [torch.nn.ReLU(inplace=True)]
        lenet_conv += [nn.MaxPool2d(2, 2)]
        lenet_conv += [nn.Conv2d(6, 16, 5)]
        lenet_conv += [torch.nn.ReLU(inplace=True)]
        lenet_conv += [nn.MaxPool2d(2, 2)]
        lenet_dense = []
        lenet_dense += [nn.Linear(16 * 5 * 5, 120)]
        lenet_dense += [torch.nn.ReLU(inplace=True)]
        lenet_dense += [nn.Linear(120, 84)]
        lenet_dense += [torch.nn.ReLU(inplace=True)]
        lenet_dense += [nn.Linear(84, 10)]

        self.features = torch.nn.Sequential(*lenet_conv)
        self.classifier = torch.nn.Sequential(*lenet_dense)

    def forward(self, input):
        output = self.features(input)
        output = output.view(-1, 16 * 5 * 5)
        output = self.classifier(output)
        return output
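
The `16 * 5 * 5` input size of the first linear layer follows from the 32x32 CIFAR-10 images used in this example. The sketch below traces the spatial size through the convolution and pooling layers using the standard output-size formula; `conv_out` and `pool_out` are helper names introduced here for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


size = 32                    # CIFAR-10 images are 32x32
size = conv_out(size, 5)     # Conv2d(3, 6, 5)   -> 28
size = pool_out(size, 2, 2)  # MaxPool2d(2, 2)   -> 14
size = conv_out(size, 5)     # Conv2d(6, 16, 5)  -> 10
size = pool_out(size, 2, 2)  # MaxPool2d(2, 2)   -> 5
flat_features = 16 * size * size
print(flat_features)         # 400, i.e. 16 * 5 * 5
```

For a dataset with a different image size, both the first `nn.Linear` layer and the `view` call in `forward` need the recomputed value. With 28x28 inputs, for instance, the same arithmetic gives `16 * 4 * 4`.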


Everything comes together here. Go through the code first, then come back to these notes.
<br/>
<br/>


1.   `if __name__ == '__main__':` **must be used**, because we make use of [Distributed Data Parallel](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) by default.
2.   There is no need to add `transforms.ToTensor`, as `train_augment` handles it internally. Declare your transforms with `transforms.Compose()`, because the augmentations generated by the policy generator are added to this list of transforms along with `transforms.ToTensor`.
3.   Use an empty `transforms.Compose([])` if no transforms are needed other than `ToTensor()`, e.g. for the MNIST dataset.
4.   **train_augment** returns the trained model (the one with the best loss).
5.   Checkpoints, such as the model at the end of each epoch, are saved in 'mname'/train_augment/'experiment_ID'/models/ for custom experiments.
6.   Logs, such as policies and losses, are saved in 'mname'/train_augment/'experiment_ID'/logs.pkl.



In [None]:
if __name__ == '__main__': #1
    args = TrainAugmentArgs()
    untangleai = UntangleAI()
    model = LeNet()

    #2 explains why transforms must be declared to use train_augment
    transform_train = transforms.Compose([
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip(),
    ])

    #3 covers the case where no transforms are needed
    transform_test = transforms.Compose([])

    trainset = torchvision.datasets.CIFAR10(
        root='./testDataset/', train=True, download=True, transform=transform_train)
    testset = torchvision.datasets.CIFAR10(
        root='./testDataset/', train=False, download=True, transform=transform_test)
    optimizer = SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True, weight_decay=1e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=2)

    #4 explains which model is returned after training
    trained_model = untangleai.train_augment(model, optimizer, scheduler, trainset, testset, args)

    #5 explains how to access checkpoint models -> Feel free to explore them
    exp_path = untangleai.train_augment_path # test_net/train_augment/1/
    
    with open('test_net/train_augment/1/logs.pkl', 'rb') as f: #logs are saved at 'mname'/train_augment/'experiment_ID'/logs.pkl
        data = pickle.load(f)
        print(data) #6 explains what these logs are
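
To explore the checkpoints mentioned in note 5, one option is simply to list whatever files exist in the `models` directory. The sketch below assumes the directory layout described above. `list_checkpoints` is a hypothetical helper, and the file names and format of the checkpoints are not specified here, so the code only enumerates them.

```python
from pathlib import Path


def list_checkpoints(mname, experiment_id, root='.'):
    """Return the files under 'mname'/train_augment/'experiment_ID'/models/, sorted by name."""
    models_dir = Path(root) / mname / 'train_augment' / str(experiment_id) / 'models'
    if not models_dir.is_dir():
        return []  # nothing has been saved yet, or the path is wrong
    return sorted(p for p in models_dir.iterdir() if p.is_file())


for checkpoint in list_checkpoints('test_net', 1):
    print(checkpoint.name)
```

How a checkpoint should be loaded depends on how it was saved, so inspect one file before relying on a particular loading call.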


