GitHub - Techget/SuperAT: next gen AT

Libraries we use and reference

Lightning-Bolts
- Our VAE implementation is adapted from their VAE implementation, and we load pretrained parameters from Bolts.
Foolbox
- Allows us to easily run adversarial attacks against machine learning models.
- We can use it to benchmark the trained VAE attacker.
RobustBench
- Allows us to benchmark the adversarially trained models.
- A model adversarially trained with our proposed method can be attacked and benchmarked against SOTA methods.
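
As a rough illustration of how Foolbox could be used for such a benchmark, here is a minimal sketch. It is not code from this repository: the function names `robust_accuracy` and `run_linf_pgd` are hypothetical, and the `8 / 255` L-inf budget is an assumed value that is common for CIFAR10 but is not stated in this README.

```python
def robust_accuracy(is_adv):
    """Fraction of inputs the model still classifies correctly after the attack."""
    flags = [bool(flag) for flag in is_adv]
    return 1.0 - sum(flags) / len(flags)


def run_linf_pgd(model, images, labels, epsilon=8 / 255):
    """Attack a PyTorch classifier with Foolbox's L-inf PGD (hypothetical helper).

    Assumes `model` is in eval mode and `images` are scaled to [0, 1].
    The default epsilon is an assumption, not a value from this project.
    """
    import foolbox as fb  # imported lazily so robust_accuracy has no dependencies

    fmodel = fb.PyTorchModel(model, bounds=(0, 1))
    attack = fb.attacks.LinfPGD()
    # Foolbox returns raw adversarials, clipped adversarials, and a success mask.
    _, clipped, is_adv = attack(fmodel, images, labels, epsilons=epsilon)
    return clipped, robust_accuracy(is_adv.tolist())
```

The same `robust_accuracy` helper could in principle be applied to the success mask of the trained VAE attacker, which would put both attackers on the same scale.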

There are roughly 2 stages to conduct the experiment to verify our theories:

Current status/progress:

As of 19 Dec 2022
- The trained attacker brought the pre-adversarially-trained model down to 40% robust accuracy, whereas the same model achieves 66% robust accuracy under AutoAttack. We reckon this is too good to be true.
- Inspected the reconstruction error: it is around 0.1-0.2 pixel-wise difference. After 32 epochs of training it remains at a similar level to the very beginning.
- The training is running on yyao0814@172.17.34.20
- We can also check the run in TensorBoard by copying it with scp -r yyao0814@172.17.34.20:/home/yyao0814/xuantong/SuperAT/runs/Dec18_17-29-36_gpu4-119-1 . and then running TensorBoard locally.
As of 18 Dec 2022, running experiment for step1
- Using a vanilla VAE as the attacker and Rebuffi2021Fixing_70_16_cutmix_extra from RobustBench as the defender. The defender is pre-adversarially-trained.

discriminator.py
- Includes the common classification models; currently using resnet18
- Pretrained resnet18 is obtained from Pytorch_CIFAR10
- todo: use pre-activation ResNet (no pretrained parameters)
vae.py
- Includes a vanilla implementation of the VAE
- References the Lightning VAE
main_adversarial_training.py
- Pretrains the VAE to return an image with little modification that fools the trained model
- The generator and discriminator are updated in turn instead of being updated simultaneously
- Run python3 main_adversarial_training.py to start training; no other input is required at the moment
- This specific file hasn't been cross-checked yet
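
To make the "updated in turn" idea concrete, here is a minimal sketch of an alternating update in PyTorch. It is not the code in main_adversarial_training.py: the function `alternating_step`, the even/odd schedule, the loss terms (MSE reconstruction minus cross-entropy for the generator), `recon_weight`, and the tiny stand-in networks are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def alternating_step(generator, discriminator, g_opt, d_opt, images, labels, step,
                     recon_weight=1.0):
    """Run one training step that updates only one of the two networks.

    Even steps update the generator (attacker), odd steps update the
    discriminator (classifier). The losses are illustrative assumptions.
    """
    if step % 2 == 0:
        # Generator turn: stay close to the input while raising the classifier's loss.
        adv = generator(images)
        loss = (recon_weight * F.mse_loss(adv, images)
                - F.cross_entropy(discriminator(adv), labels))
        g_opt.zero_grad()
        loss.backward()
        g_opt.step()  # only generator parameters are stepped
        return "generator", loss.item()
    # Discriminator turn: learn to classify the generator's outputs correctly.
    with torch.no_grad():
        adv = generator(images)
    loss = F.cross_entropy(discriminator(adv), labels)
    d_opt.zero_grad()  # also clears gradients left over from the generator turn
    loss.backward()
    d_opt.step()
    return "discriminator", loss.item()


# Tiny stand-in networks (hypothetical; the real ones are a VAE and a resnet18).
torch.manual_seed(0)
generator = nn.Sequential(nn.Flatten(), nn.Linear(12, 12), nn.Sigmoid(),
                          nn.Unflatten(1, (3, 2, 2)))
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(12, 10))
g_opt = torch.optim.SGD(generator.parameters(), lr=0.1)
d_opt = torch.optim.SGD(discriminator.parameters(), lr=0.1)
```

Calling `alternating_step` in a loop with an increasing `step` alternates between the two networks. A different schedule, such as several generator steps per discriminator step, would only change the `step % 2` condition.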
test_adversarial_trained_model.py
- Verifies the adversarially trained model on clean data, not adversarial examples
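
A check of this kind might look like the sketch below, which computes top-1 accuracy on unperturbed batches. The function name `natural_accuracy` and the `(images, labels)` batch format are assumptions and are not taken from the actual script.

```python
import torch


@torch.no_grad()
def natural_accuracy(model, batches):
    """Top-1 accuracy of `model` on clean batches of (images, labels)."""
    model.eval()
    correct = total = 0
    for images, labels in batches:
        predictions = model(images).argmax(dim=1)
        correct += (predictions == labels).sum().item()
        total += labels.numel()
    return correct / total
```

Any iterable of batches works here, so a CIFAR10 test `DataLoader` could be passed in directly.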
Reference: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
- For updating pretrained torch models. However, the torch model is trained on ImageNet, so it seems we cannot use it, as our experiment is based on CIFAR10.
artemis_script_*.pbs
- Scripts to run jobs on Artemis

Files in the repository:
- state_dicts
- .gitignore
- README.md
- artemis_script_train_attacker.pbs
- dataloader.py
- discriminator.py
- foolbox_attack_resnet18_example.py
- lightning_VAE.py
- main_adversarial_training.py
- stage1_evaluate_attack_effect.py
- stage2_evaluate_adversarial_trained_model.py
- test_adversarial_trained_model_natural_accuracy.py
- train_attacker.py
- utils.py