Adversarial training is not working #9

Open
ksouvik52 opened this issue Jul 24, 2022 · 12 comments

@ksouvik52

Hi, as you suggested in the earlier thread, we tried the exact same settings and arguments you provided to adversarially train DeiT-tiny. However, it is still not working: only about 18% accuracy after 52 epochs.

{"train_lr": 9.999999999999953e-07, "train_loss": 6.91152723201459, "test_0_loss": 6.872483930142354, "test_0_acc1": 0.26, "test_0_acc5": 1.12, "test_5_loss": 6.9177000566849856, "test_5_acc1": 0.1805, "test_5_acc5": 0.6765, "epoch": 0, "n_parameters": 5717416}
{"train_lr": 9.999999999999953e-07, "train_loss": 6.900039605433039, "test_0_loss": 6.850005263940539, "test_0_acc1": 0.362, "test_0_acc5": 1.578, "test_5_loss": 6.9177000566849856, "test_5_acc1": 0.1805, "test_5_acc5": 0.6765, "epoch": 1, "n_parameters": 5717416}
{"train_lr": 0.00040089999999999305, "train_loss": 6.754413659242894, "test_0_loss": 6.145018349987379, "test_0_acc1": 3.726, "test_0_acc5": 10.93, "test_5_loss": 6.9177000566849856, "test_5_acc1": 0.1805, "test_5_acc5": 0.6765, "epoch": 2, "n_parameters": 5717416}
{"train_lr": 0.0008008000000000196, "train_loss": 6.625597798519379, "test_0_loss": 5.638628653967449, "test_0_acc1": 7.224, "test_0_acc5": 18.596, "test_5_loss": 6.9177000566849856, "test_5_acc1": 0.1805, "test_5_acc5": 0.6765, "epoch": 3, "n_parameters": 5717416}
{"train_lr": 0.001200700000000036, "train_loss": 6.560625480233337, "test_0_loss": 5.315861636678607, "test_0_acc1": 10.064, "test_0_acc5": 23.916, "test_5_loss": 6.9177000566849856, "test_5_acc1": 0.1805, "test_5_acc5": 0.6765, "epoch": 4, "n_parameters": 5717416}
{"train_lr": 0.0016005999999999657, "train_loss": 6.54946454628099, "test_0_loss": 5.134192888048774, "test_0_acc1": 11.984, "test_0_acc5": 27.634, "test_5_loss": 5.716198673022533, "test_5_acc1": 6.383, "test_5_acc5": 16.22425, "epoch": 5, "n_parameters": 5717416}
{"train_lr": 0.0020004999999999914, "train_loss": 6.542691466810225, "test_0_loss": 5.129832990186304, "test_0_acc1": 12.648, "test_0_acc5": 28.482, "test_5_loss": 5.716198673022533, "test_5_acc1": 6.383, "test_5_acc5": 16.22425, "epoch": 6, "n_parameters": 5717416}
{"train_lr": 0.002400399999999976, "train_loss": 6.536189370113406, "test_0_loss": 5.018598655821495, "test_0_acc1": 13.942, "test_0_acc5": 31.118, "test_5_loss": 5.716198673022533, "test_5_acc1": 6.383, "test_5_acc5": 16.22425, "epoch": 7, "n_parameters": 5717416}
{"train_lr": 0.002800300000000059, "train_loss": 6.503189501430777, "test_0_loss": 4.9134670785048495, "test_0_acc1": 15.256, "test_0_acc5": 33.352, "test_5_loss": 5.716198673022533, "test_5_acc1": 6.383, "test_5_acc5": 16.22425, "epoch": 8, "n_parameters": 5717416}
{"train_lr": 0.0032002000000000983, "train_loss": 6.5094980549373975, "test_0_loss": 4.851811613246408, "test_0_acc1": 15.578, "test_0_acc5": 34.102, "test_5_loss": 5.716198673022533, "test_5_acc1": 6.383, "test_5_acc5": 16.22425, "epoch": 9, "n_parameters": 5717416}
{"train_lr": 0.0036000999999999585, "train_loss": 6.511975479259384, "test_0_loss": 5.143745462633598, "test_0_acc1": 13.898, "test_0_acc5": 30.834, "test_5_loss": 5.7486809707954, "test_5_acc1": 7.313, "test_5_acc5": 17.87725, "epoch": 10, "n_parameters": 5717416}
{"train_lr": 0.0039023577500088853, "train_loss": 6.418486285981515, "test_0_loss": 5.218988336208, "test_0_acc1": 12.054, "test_0_acc5": 27.72, "test_5_loss": 5.7486809707954, "test_5_acc1": 7.313, "test_5_acc5": 17.87725, "epoch": 11, "n_parameters": 5717416}
{"train_lr": 0.003882057134063648, "train_loss": 6.32594983450038, "test_0_loss": 5.0386335921455325, "test_0_acc1": 14.276, "test_0_acc5": 31.454, "test_5_loss": 5.7486809707954, "test_5_acc1": 7.313, "test_5_acc5": 17.87725, "epoch": 12, "n_parameters": 5717416}
{"train_lr": 0.0038599040893470705, "train_loss": 6.353343641300567, "test_0_loss": 4.799374374913163, "test_0_acc1": 16.064, "test_0_acc5": 34.754, "test_5_loss": 5.7486809707954, "test_5_acc1": 7.313, "test_5_acc5": 17.87725, "epoch": 13, "n_parameters": 5717416}
{"train_lr": 0.0038359204782395374, "train_loss": 6.200804713199274, "test_0_loss": 5.33436276099656, "test_0_acc1": 11.77, "test_0_acc5": 26.914, "test_5_loss": 5.7486809707954, "test_5_acc1": 7.313, "test_5_acc5": 17.87725, "epoch": 14, "n_parameters": 5717416}
{"train_lr": 0.0038101299696696937, "train_loss": 6.372054211956134, "test_0_loss": 5.132588769103652, "test_0_acc1": 14.968, "test_0_acc5": 32.62, "test_5_loss": 5.900749441910766, "test_5_acc1": 6.776, "test_5_acc5": 16.361, "epoch": 15, "n_parameters": 5717416}
{"train_lr": 0.0037825580157558377, "train_loss": 6.417677939938699, "test_0_loss": 4.666486162294277, "test_0_acc1": 15.988, "test_0_acc5": 34.454, "test_5_loss": 5.900749441910766, "test_5_acc1": 6.776, "test_5_acc5": 16.361, "epoch": 16, "n_parameters": 5717416}
{"train_lr": 0.0037532318266873923, "train_loss": 6.26336412826221, "test_0_loss": 5.38978447795143, "test_0_acc1": 12.402, "test_0_acc5": 28.042, "test_5_loss": 5.900749441910766, "test_5_acc1": 6.776, "test_5_acc5": 16.361, "epoch": 17, "n_parameters": 5717416}
{"train_lr": 0.003722180343872929, "train_loss": 6.142464170686537, "test_0_loss": 4.77877281021782, "test_0_acc1": 15.232, "test_0_acc5": 33.184, "test_5_loss": 5.900749441910766, "test_5_acc1": 6.776, "test_5_acc5": 16.361, "epoch": 18, "n_parameters": 5717416}
{"train_lr": 0.0036894342113766073, "train_loss": 6.004803928301679, "test_0_loss": 5.506222593730944, "test_0_acc1": 10.49, "test_0_acc5": 24.774, "test_5_loss": 5.900749441910766, "test_5_acc1": 6.776, "test_5_acc5": 16.361, "epoch": 19, "n_parameters": 5717416}
{"train_lr": 0.0036550257456777735, "train_loss": 6.135914517725877, "test_0_loss": 5.406659782199775, "test_0_acc1": 11.604, "test_0_acc5": 27.232, "test_5_loss": 6.950556825081355, "test_5_acc1": 2.013, "test_5_acc5": 5.54425, "epoch": 20, "n_parameters": 5717416}
{"train_lr": 0.0036189889037779527, "train_loss": 6.093383449254086, "test_0_loss": 5.010170250158621, "test_0_acc1": 13.934, "test_0_acc5": 30.648, "test_5_loss": 6.950556825081355, "test_5_acc1": 2.013, "test_5_acc5": 5.54425, "epoch": 21, "n_parameters": 5717416}
{"train_lr": 0.0035813592496895356, "train_loss": 6.059879529152176, "test_0_loss": 5.371730933796497, "test_0_acc1": 10.762, "test_0_acc5": 25.292, "test_5_loss": 6.950556825081355, "test_5_acc1": 2.013, "test_5_acc5": 5.54425, "epoch": 22, "n_parameters": 5717416}
{"train_lr": 0.00354217391933767, "train_loss": 6.073066240544323, "test_0_loss": 5.048453500617107, "test_0_acc1": 13.242, "test_0_acc5": 29.126, "test_5_loss": 6.950556825081355, "test_5_acc1": 2.013, "test_5_acc5": 5.54425, "epoch": 23, "n_parameters": 5717416}
{"train_lr": 0.0035014715839127445, "train_loss": 6.08733743352951, "test_0_loss": 5.331494154414533, "test_0_acc1": 10.974, "test_0_acc5": 25.072, "test_5_loss": 6.950556825081355, "test_5_acc1": 2.013, "test_5_acc5": 5.54425, "epoch": 24, "n_parameters": 5717416}
{"train_lr": 0.003459292411705762, "train_loss": 6.1378554502646505, "test_0_loss": 5.264302222605172, "test_0_acc1": 10.784, "test_0_acc5": 25.304, "test_5_loss": 7.786235228686171, "test_5_acc1": 0.8985, "test_5_acc5": 2.83875, "epoch": 25, "n_parameters": 5717416}
{"train_lr": 0.0034156780284671424, "train_loss": 6.1356256470786965, "test_0_loss": 5.568371077798074, "test_0_acc1": 8.102, "test_0_acc5": 19.622, "test_5_loss": 7.786235228686171, "test_5_acc1": 0.8985, "test_5_acc5": 2.83875, "epoch": 26, "n_parameters": 5717416}
{"train_lr": 0.003370671476327724, "train_loss": 6.071214107396029, "test_0_loss": 5.589009375886435, "test_0_acc1": 9.484, "test_0_acc5": 22.578, "test_5_loss": 7.786235228686171, "test_5_acc1": 0.8985, "test_5_acc5": 2.83875, "epoch": 27, "n_parameters": 5717416}
{"train_lr": 0.00332431717132075, "train_loss": 5.972238908568732, "test_0_loss": 5.707102177010388, "test_0_acc1": 7.978, "test_0_acc5": 19.746, "test_5_loss": 7.786235228686171, "test_5_acc1": 0.8985, "test_5_acc5": 2.83875, "epoch": 28, "n_parameters": 5717416}
{"train_lr": 0.0032766608595485736, "train_loss": 6.1025936421539955, "test_0_loss": 5.42314580428013, "test_0_acc1": 11.556, "test_0_acc5": 26.76, "test_5_loss": 7.786235228686171, "test_5_acc1": 0.8985, "test_5_acc5": 2.83875, "epoch": 29, "n_parameters": 5717416}
{"train_lr": 0.0032277495720376523, "train_loss": 6.084920492580087, "test_0_loss": 5.923481827123914, "test_0_acc1": 6.536, "test_0_acc5": 16.768, "test_5_loss": 7.121632807848168, "test_5_acc1": 1.17125, "test_5_acc5": 3.44575, "epoch": 30, "n_parameters": 5717416}
{"train_lr": 0.003177631578323426, "train_loss": 6.115835655793298, "test_0_loss": 5.460594815469596, "test_0_acc1": 9.702, "test_0_acc5": 23.214, "test_5_loss": 7.121632807848168, "test_5_acc1": 1.17125, "test_5_acc5": 3.44575, "epoch": 31, "n_parameters": 5717416}
{"train_lr": 0.003126356338814952, "train_loss": 6.0647800623846475, "test_0_loss": 5.592425890023786, "test_0_acc1": 10.268, "test_0_acc5": 23.954, "test_5_loss": 7.121632807848168, "test_5_acc1": 1.17125, "test_5_acc5": 3.44575, "epoch": 32, "n_parameters": 5717416}
{"train_lr": 0.0030739744559831173, "train_loss": 6.054409027099609, "test_0_loss": 5.690968008889499, "test_0_acc1": 9.494, "test_0_acc5": 22.702, "test_5_loss": 7.121632807848168, "test_5_acc1": 1.17125, "test_5_acc5": 3.44575, "epoch": 33, "n_parameters": 5717416}
{"train_lr": 0.003020537624422026, "train_loss": 6.063710031606597, "test_0_loss": 4.698054779361473, "test_0_acc1": 15.238, "test_0_acc5": 32.862, "test_5_loss": 7.121632807848168, "test_5_acc1": 1.17125, "test_5_acc5": 3.44575, "epoch": 34, "n_parameters": 5717416}
{"train_lr": 0.0029660985798329645, "train_loss": 5.910670252202703, "test_0_loss": 5.064415029280474, "test_0_acc1": 13.038, "test_0_acc5": 29.468, "test_5_loss": 7.283367584701997, "test_5_acc1": 1.14525, "test_5_acc5": 3.321, "epoch": 35, "n_parameters": 5717416}
{"train_lr": 0.0029107110469803756, "train_loss": 5.995033393136794, "test_0_loss": 4.890588184388418, "test_0_acc1": 14.9, "test_0_acc5": 32.556, "test_5_loss": 7.283367584701997, "test_5_acc1": 1.14525, "test_5_acc5": 3.321, "epoch": 36, "n_parameters": 5717416}
{"train_lr": 0.002854429686672256, "train_loss": 5.9739233070044975, "test_0_loss": 5.277839526410142, "test_0_acc1": 12.19, "test_0_acc5": 28.048, "test_5_loss": 7.283367584701997, "test_5_acc1": 1.14525, "test_5_acc5": 3.321, "epoch": 37, "n_parameters": 5717416}
{"train_lr": 0.002797310041816381, "train_loss": 5.918417158744318, "test_0_loss": 5.211479980214925, "test_0_acc1": 13.658, "test_0_acc5": 30.232, "test_5_loss": 7.283367584701997, "test_5_acc1": 1.14525, "test_5_acc5": 3.321, "epoch": 38, "n_parameters": 5717416}
{"train_lr": 0.002739408482605983, "train_loss": 5.925514445745116, "test_0_loss": 5.419682004859031, "test_0_acc1": 10.596, "test_0_acc5": 24.802, "test_5_loss": 7.283367584701997, "test_5_acc1": 1.14525, "test_5_acc5": 3.321, "epoch": 39, "n_parameters": 5717416}
{"train_lr": 0.002680782150889308, "train_loss": 5.883729684505341, "test_0_loss": 5.13738645015431, "test_0_acc1": 13.256, "test_0_acc5": 29.476, "test_5_loss": 7.820061174479342, "test_5_acc1": 1.0445, "test_5_acc5": 3.129, "epoch": 40, "n_parameters": 5717416}
{"train_lr": 0.0026214889037779947, "train_loss": 5.8768603530623835, "test_0_loss": 5.099718636911188, "test_0_acc1": 14.214, "test_0_acc5": 32.066, "test_5_loss": 7.820061174479342, "test_5_acc1": 1.0445, "test_5_acc5": 3.129, "epoch": 41, "n_parameters": 5717416}
{"train_lr": 0.002561587256548306, "train_loss": 5.914836130887389, "test_0_loss": 5.2416254764783865, "test_0_acc1": 11.636, "test_0_acc5": 26.856, "test_5_loss": 7.820061174479342, "test_5_acc1": 1.0445, "test_5_acc5": 3.129, "epoch": 42, "n_parameters": 5717416}
{"train_lr": 0.0025011363248938225, "train_loss": 5.900150206830386, "test_0_loss": 5.101568935318627, "test_0_acc1": 13.276, "test_0_acc5": 30.104, "test_5_loss": 7.820061174479342, "test_5_acc1": 1.0445, "test_5_acc5": 3.129, "epoch": 43, "n_parameters": 5717416}
{"train_lr": 0.00244019576658606, "train_loss": 5.86289567250809, "test_0_loss": 4.907099856829994, "test_0_acc1": 14.74, "test_0_acc5": 32.012, "test_5_loss": 7.820061174479342, "test_5_acc1": 1.0445, "test_5_acc5": 3.129, "epoch": 44, "n_parameters": 5717416}
{"train_lr": 0.0023788257225985108, "train_loss": 5.835502566836721, "test_0_loss": 5.569610283303093, "test_0_acc1": 11.684, "test_0_acc5": 26.968, "test_5_loss": 7.334822424390113, "test_5_acc1": 1.49225, "test_5_acc5": 4.25875, "epoch": 45, "n_parameters": 5717416}
{"train_lr": 0.002317086757755297, "train_loss": 5.902843760119544, "test_0_loss": 5.3058019126750535, "test_0_acc1": 13.534, "test_0_acc5": 29.732, "test_5_loss": 7.334822424390113, "test_5_acc1": 1.49225, "test_5_acc5": 4.25875, "epoch": 46, "n_parameters": 5717416}
{"train_lr": 0.0022550398009607707, "train_loss": 5.798361852205248, "test_0_loss": 5.197937856175087, "test_0_acc1": 13.114, "test_0_acc5": 29.348, "test_5_loss": 7.334822424390113, "test_5_acc1": 1.49225, "test_5_acc5": 4.25875, "epoch": 47, "n_parameters": 5717416}
{"train_lr": 0.0021927460850704548, "train_loss": 5.73072993350353, "test_0_loss": 4.734242562521595, "test_0_acc1": 17.252, "test_0_acc5": 35.896, "test_5_loss": 7.334822424390113, "test_5_acc1": 1.49225, "test_5_acc5": 4.25875, "epoch": 48, "n_parameters": 5717416}
{"train_lr": 0.0021302670864610006, "train_loss": 5.69394142336125, "test_0_loss": 5.058369449217657, "test_0_acc1": 14.906, "test_0_acc5": 32.252, "test_5_loss": 7.334822424390113, "test_5_acc1": 1.49225, "test_5_acc5": 4.25875, "epoch": 49, "n_parameters": 5717416}
{"train_lr": 0.002067664464360847, "train_loss": 5.709019121363294, "test_0_loss": 4.3592056012351925, "test_0_acc1": 19.068, "test_0_acc5": 39.056, "test_5_loss": 10.509043875979218, "test_5_acc1": 0.72375, "test_5_acc5": 2.323, "epoch": 50, "n_parameters": 5717416}
{"train_lr": 0.002005000000000026, "train_loss": 5.700551097960972, "test_0_loss": 4.753117966484123, "test_0_acc1": 15.846, "test_0_acc5": 33.87, "test_5_loss": 10.509043875979218, "test_5_acc1": 0.72375, "test_5_acc5": 2.323, "epoch": 51, "n_parameters": 5717416}
{"train_lr": 0.0019423355356391193, "train_loss": 5.7724813230031975, "test_0_loss": 4.528603377741876, "test_0_acc1": 18.206, "test_0_acc5": 37.726, "test_5_loss": 10.509043875979218, "test_5_acc1": 0.72375, "test_5_acc5": 2.323, "epoch": 52, "n_parameters": 5717416}

@ksouvik52
Author

This is the command we are using:

python -m torch.distributed.launch --nproc_per_node=8 --master_port=12349 --use_env main_adv_deit.py --model deit_tiny_patch16_224_adv --batch-size=128 --data-path /datasets/imagenet-ilsvrc2012 --attack-iter 1 --attack-epsilon 4 --attack-step-size 4 --epoch 100 --reprob 0 --no-repeated-aug --sing singln --drop 0 --drop-path 0 --start_epoch 0 --warmup-epochs 10 --cutmix 0 --output_dir save/deit_adv/deit_tiny_patch16_224
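(For context, --attack-iter 1 with --attack-epsilon 4 and --attack-step-size 4 is single-step PGD. A minimal sketch of that inner maximization step, assuming inputs in [0, 1] and epsilon/step given in /255 units; this is only an illustration, not this repo's implementation:)

import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=4/255, step=4/255, iters=1):
    # L-inf PGD; with iters=1 this reduces to a single FGSM-style step.
    x_adv = x.detach().clone()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid pixel range.
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = x.detach() + (x_adv - x.detach()).clamp(-eps, eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv

Adversarial training then takes the usual optimizer step on F.cross_entropy(model(pgd_attack(model, x, y)), y).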

@ytongbai
Owner

ytongbai commented Jul 24, 2022

Just to be sure, you are using DeiT-tiny? Did DeiT-small work on your side after yesterday? (I want to diagnose the problem step by step.)

@ksouvik52
Author

Okay, I will try with deit_small... I have never tried it yet.

@ytongbai
Owner

ytongbai commented Jul 24, 2022

Oh, but in your response yesterday (#8 (comment)), you mentioned you were using deit_small_patch16_224_adv:

'python -m torch.distributed.launch --nproc_per_node=4 --master_port=5672 --use_env main_adv_deit.py --model deit_small_patch16_224_adv --batch-size 128 --data-path /datasets/imagenet-ilsvrc2012 --attack-iter 1 --attack-epsilon 4 --attack-step-size 4 --epoch 100 --reprob 0 --no-repeated-aug --sing singln --drop 0 --drop-path 0 --start_epoch 0 --warmup-epochs 10 --cutmix 0 --output_dir save/deit_adv/deit_small_patch16_224'

@ksouvik52
Author

ksouvik52 commented Jul 24, 2022

Yes, I have never tried an 8-GPU node for DeiT-small. Trying now. So the goal is to narrow down whether it's model-specific?

@ytongbai
Owner

Oh, I see.

@ksouvik52
Author

ksouvik52 commented Jul 24, 2022

One reason to choose deit_tiny was to see results quickly, as deit_small takes a little longer to train. Unfortunately, it is not working yet.

@ytongbai
Owner

ytongbai commented Jul 24, 2022

Just want to debug the problem, since we didn't use the DeiT-tiny setting in the paper. So the first step is to make sure the original recipe works on your side, to rule out other problems.

But yes, I do have some hypotheses about DeiT-tiny. As we mentioned in the paper, the augmentations are very strong, and adversarial training itself adds strong regularization, so even for the DeiT-small 100-epoch model the overall regularization is very strong. If you shrink the model size down, I would assume the model becomes over-regularized and cannot give good results; using less augmentation would probably help (see the sketch below). If the model size goes up, like adversarial training on the base model, that wouldn't be a problem.

(In our earlier experiments, DeiT-tiny did not need strong augmentation to perform well under adversarial training; see the attached image.)
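For example (the augmentation flag names below are the upstream facebookresearch/deit names and are an assumption about this fork; the values are illustrative, not tested settings), the tiny run could be weakened along these lines:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main_adv_deit.py --model deit_tiny_patch16_224_adv --batch-size 128 --data-path /datasets/imagenet-ilsvrc2012 --attack-iter 1 --attack-epsilon 4 --attack-step-size 4 --epoch 100 --warmup-epochs 10 --drop 0 --drop-path 0 --sing singln --reprob 0 --no-repeated-aug --mixup 0 --cutmix 0 --color-jitter 0 --aa rand-m2-mstd0.5 --smoothing 0 --output_dir save/deit_adv/deit_tiny_patch16_224_lessaug

Here --mixup, --color-jitter, --aa, and --smoothing are the upstream names, and rand-m2-mstd0.5 is a low-magnitude RandAugment policy in timm.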

Hope it helps :)

@ksouvik52
Author

Can you let me know what arguments you used to tone down the augmentation on deit_tiny to make it work?

@ytongbai
Owner

ytongbai commented Jul 24, 2022

Sure. We didn't fully explore this setting since it is out of scope (our focus was a fair comparison between the two architectures), but we do have an initial recipe for you to explore.

With the tiny model: SGD, cosine LR scheduler, lr 0.1, no regularization, and no augmentation gives clean accuracy around 54.9 and PGD-5 accuracy around 31.7. (Adam gives similar results but is less stable, so I would recommend the above setting.)
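A minimal PyTorch sketch of that recipe ("no regularization" read as zero weight decay; the 100-epoch horizon and momentum value are assumptions, not settings from the paper):

import torch
from timm import create_model  # timm ships the DeiT-tiny definition

model = create_model("deit_tiny_patch16_224", num_classes=1000).cuda()

# SGD at lr 0.1; "no regularization" taken as weight_decay=0.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.0)

# Cosine decay of the learning rate over the full run (100 epochs assumed).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)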

@ksouvik52
Author

ksouvik52 commented Jul 24, 2022

Running on deit_small now; will let you know the results. It's around 2x slower than deit_tiny, so the results will be a little late.

@ytongbai
Owner

Sure, no worries. I'll be around to help.
