[2022-09-29 08:58:07 swin_tiny_patch4_window7_224] (main.py 312): INFO Full config saved to output/swin_tiny_patch4_window7_224/fix_ddp/config.json [2022-09-29 08:58:08 swin_tiny_patch4_window7_224] (main.py 315): INFO AMP_OPT_LEVEL: '' AUG: AUTO_AUGMENT: rand-m9-mstd0.5-inc1 COLOR_JITTER: 0.4 CUTMIX: 1.0 CUTMIX_MINMAX: null MIXUP: 0.8 MIXUP_MODE: batch MIXUP_PROB: 1.0 MIXUP_SWITCH_PROB: 0.5 RECOUNT: 1 REMODE: pixel REPROB: 0.25 BASE: - '' DATA: BATCH_SIZE: 128 CACHE_MODE: part DATASET: imagenet DATA_PATH: /data/ImageNet/extract/ IMG_SIZE: 224 INTERPOLATION: bicubic NUM_WORKERS: 8 PIN_MEMORY: true ZIP_MODE: false EVAL_MODE: false LOCAL_RANK: 0 MODEL: DROP_PATH_RATE: 0.2 DROP_RATE: 0.0 LABEL_SMOOTHING: 0.1 NAME: swin_tiny_patch4_window7_224 NUM_CLASSES: 1000 PRETRAINED: '' RESUME: '' SWIN: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 QKV_BIAS: true QK_SCALE: null WINDOW_SIZE: 7 SWIN_MLP: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 WINDOW_SIZE: 7 TYPE: swin OUTPUT: output/swin_tiny_patch4_window7_224/fix_ddp PRINT_FREQ: 100 SAVE_FREQ: 10 SEED: 0 TAG: fix_ddp TEST: CROP: true SEQUENTIAL: false THROUGHPUT_MODE: false TRAIN: ACCUMULATION_STEPS: 0 AUTO_RESUME: false BASE_LR: 0.001 CLIP_GRAD: 5.0 EPOCHS: 300 LR_SCHEDULER: DECAY_EPOCHS: 30 DECAY_RATE: 0.1 NAME: cosine MIN_LR: 1.0e-05 OPTIMIZER: BETAS: - 0.9 - 0.999 EPS: 1.0e-08 MOMENTUM: 0.9 NAME: adamw START_EPOCH: 0 USE_CHECKPOINT: false WARMUP_EPOCHS: 20 WARMUP_LR: 1.0e-06 WEIGHT_DECAY: 0.05 [2022-09-29 08:58:11 swin_tiny_patch4_window7_224] (main.py 70): INFO Creating model:swin/swin_tiny_patch4_window7_224 [2022-09-29 08:58:14 swin_tiny_patch4_window7_224] (main.py 74): INFO SwinTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((96,), eps=1e-05, elementwise_affine=True) ) (pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=384, out_features=192, bias=False) (norm): LayerNorm((384,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=768, out_features=384, bias=False) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=1536, out_features=768, bias=False) (norm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d() (head): Linear(in_features=768, out_features=1000, bias=True) ) [2022-09-29 08:58:14 swin_tiny_patch4_window7_224] (main.py 81): INFO number of params: 28288354 [2022-09-29 08:58:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Start training [2022-09-29 08:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][0/1251] eta 5:30:17 lr 0.000001 time 15.8413 (15.8413) loss 7.0163 (7.0163) grad_norm 1.3946 (1.3946) [2022-09-29 08:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][100/1251] eta 0:17:50 lr 0.000005 time 0.8239 (0.9298) loss 6.9452 (6.9535) grad_norm 1.1852 (1.3133) [2022-09-29 09:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][200/1251] eta 0:14:53 lr 0.000009 time 0.8480 (0.8499) loss 6.8820 (6.9359) grad_norm 1.0850 (1.2453) [2022-09-29 09:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][300/1251] eta 0:13:02 lr 0.000013 time 0.8184 (0.8228) loss 6.8865 (6.9234) grad_norm 0.9920 (1.1838) [2022-09-29 09:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][400/1251] eta 0:11:29 lr 0.000017 time 0.7919 (0.8097) loss 6.8411 (6.9130) grad_norm 0.9639 (1.1350) [2022-09-29 09:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][500/1251] eta 0:10:02 lr 0.000021 time 0.7370 (0.8029) loss 6.8816 (6.9037) grad_norm 0.9663 (1.0936) [2022-09-29 09:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][600/1251] eta 0:08:39 lr 0.000025 time 0.8118 (0.7973) loss 6.8215 (6.8961) grad_norm 0.9081 (1.0614) [2022-09-29 09:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][700/1251] eta 0:07:17 lr 0.000029 time 0.8574 (0.7944) loss 6.8698 (6.8882) grad_norm 0.8688 (1.0395) [2022-09-29 09:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][800/1251] eta 0:05:57 lr 0.000033 time 0.8370 (0.7931) loss 6.8500 (6.8812) grad_norm 0.9972 (1.0337) [2022-09-29 09:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][900/1251] eta 0:04:37 lr 0.000037 time 0.7934 (0.7911) loss 6.7978 (6.8741) grad_norm 1.0927 (1.0426) [2022-09-29 09:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1000/1251] eta 0:03:18 lr 0.000041 time 0.8919 (0.7907) loss 6.7925 (6.8662) grad_norm 0.9607 (1.0681) [2022-09-29 09:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1100/1251] eta 0:01:59 lr 0.000045 time 0.8617 (0.7900) loss 6.8216 (6.8575) grad_norm 1.0846 (1.1011) [2022-09-29 09:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1200/1251] eta 0:00:40 lr 0.000049 time 0.7210 (0.7885) loss 6.6996 (6.8484) grad_norm 1.6675 (1.1268) [2022-09-29 09:14:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 0 training takes 0:16:26 [2022-09-29 09:14:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saving...... [2022-09-29 09:14:41 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saved !!! [2022-09-29 09:14:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.940 (4.940) Loss 6.3440 (6.3440) Acc@1 2.051 (2.051) Acc@5 6.934 (6.934) [2022-09-29 09:15:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 1.868 Acc@5 6.368 [2022-09-29 09:15:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 1.9% [2022-09-29 09:15:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 1.87% [2022-09-29 09:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][0/1251] eta 1:40:38 lr 0.000051 time 4.8271 (4.8271) loss 6.7811 (6.7811) grad_norm 1.3540 (1.3540) [2022-09-29 09:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][100/1251] eta 0:15:47 lr 0.000055 time 0.8560 (0.8234) loss 6.7252 (6.7076) grad_norm 2.1416 (1.6687) [2022-09-29 09:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][200/1251] eta 0:14:05 lr 0.000059 time 0.8918 (0.8042) loss 6.8401 (6.7009) grad_norm 1.3751 (1.7283) [2022-09-29 09:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][300/1251] eta 0:12:35 lr 0.000063 time 0.8163 (0.7948) loss 6.4481 (6.6803) grad_norm 1.6437 (1.7604) [2022-09-29 09:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][400/1251] eta 0:11:14 lr 0.000067 time 0.7970 (0.7925) loss 6.5908 (6.6625) grad_norm 2.6291 (1.7950) [2022-09-29 09:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][500/1251] eta 0:09:51 lr 0.000071 time 0.7521 (0.7875) loss 6.5882 (6.6492) grad_norm 1.9736 (1.8525) [2022-09-29 09:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][600/1251] eta 0:08:31 lr 0.000075 time 0.6517 (0.7852) loss 6.4092 (6.6360) grad_norm 2.5684 (1.8624) [2022-09-29 09:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][700/1251] eta 0:07:12 lr 0.000079 time 0.7227 (0.7848) loss 6.5569 (6.6188) grad_norm 1.8803 (1.8850) [2022-09-29 09:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][800/1251] eta 0:05:52 lr 0.000083 time 0.7947 (0.7823) loss 6.6713 (6.6051) grad_norm 2.4987 (1.8954) [2022-09-29 09:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][900/1251] eta 0:04:34 lr 0.000087 time 0.7482 (0.7819) loss 6.6811 (6.5937) grad_norm 1.7285 (1.9115) [2022-09-29 09:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1000/1251] eta 0:03:16 lr 0.000091 time 0.7242 (0.7819) loss 6.6792 (6.5830) grad_norm 2.1481 (1.9288) [2022-09-29 09:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1100/1251] eta 0:01:57 lr 0.000095 time 0.8049 (0.7812) loss 6.3422 (6.5717) grad_norm 1.9727 (1.9387) [2022-09-29 09:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1200/1251] eta 0:00:39 lr 0.000099 time 0.6728 (0.7798) loss 6.5389 (6.5602) grad_norm 1.7695 (1.9543) [2022-09-29 09:31:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 1 training takes 0:16:15 [2022-09-29 09:31:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.395 (4.395) Loss 5.5704 (5.5704) Acc@1 5.469 (5.469) Acc@5 18.457 (18.457) [2022-09-29 09:31:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 6.288 Acc@5 17.846 [2022-09-29 09:31:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 6.3% [2022-09-29 09:31:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 6.29% [2022-09-29 09:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][0/1251] eta 1:38:10 lr 0.000101 time 4.7082 (4.7082) loss 6.3676 (6.3676) grad_norm 1.4015 (1.4015) [2022-09-29 09:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][100/1251] eta 0:15:30 lr 0.000105 time 0.7395 (0.8087) loss 6.1963 (6.4534) grad_norm 1.6226 (1.9848) [2022-09-29 09:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][200/1251] eta 0:13:46 lr 0.000109 time 0.8641 (0.7860) loss 6.4802 (6.4368) grad_norm 1.9104 (2.0212) [2022-09-29 09:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][300/1251] eta 0:12:22 lr 0.000113 time 0.8432 (0.7812) loss 6.6354 (6.4126) grad_norm 2.9198 (2.0213) [2022-09-29 09:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][400/1251] eta 0:11:02 lr 0.000117 time 0.8295 (0.7784) loss 6.0709 (6.3910) grad_norm 1.5452 (2.0629) [2022-09-29 09:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][500/1251] eta 0:09:44 lr 0.000121 time 0.7513 (0.7777) loss 6.5261 (6.3800) grad_norm 1.7877 (2.0716) [2022-09-29 09:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][600/1251] eta 0:08:26 lr 0.000125 time 0.8468 (0.7775) loss 6.0829 (6.3672) grad_norm 2.1291 (2.0860) [2022-09-29 09:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][700/1251] eta 0:07:08 lr 0.000129 time 0.8489 (0.7775) loss 6.5115 (6.3573) grad_norm 1.8646 (2.1039) [2022-09-29 09:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][800/1251] eta 0:05:50 lr 0.000133 time 0.6654 (0.7761) loss 6.3690 (6.3489) grad_norm 2.0158 (2.1046) [2022-09-29 09:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][900/1251] eta 0:04:32 lr 0.000137 time 0.9095 (0.7755) loss 6.3761 (6.3368) grad_norm 2.3583 (2.1134) [2022-09-29 09:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1000/1251] eta 0:03:14 lr 0.000141 time 0.7476 (0.7747) loss 6.3853 (6.3245) grad_norm 1.8534 (2.1182) [2022-09-29 09:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1100/1251] eta 0:01:56 lr 0.000145 time 0.8184 (0.7747) loss 5.6477 (6.3145) grad_norm 1.8644 (2.1247) [2022-09-29 09:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1200/1251] eta 0:00:39 lr 0.000149 time 0.8182 (0.7743) loss 6.2932 (6.3095) grad_norm 2.3926 (2.1269) [2022-09-29 09:47:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 2 training takes 0:16:09 [2022-09-29 09:47:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.765 (3.765) Loss 4.9132 (4.9132) Acc@1 11.035 (11.035) Acc@5 28.906 (28.906) [2022-09-29 09:48:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 12.226 Acc@5 29.120 [2022-09-29 09:48:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 12.2% [2022-09-29 09:48:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 12.23% [2022-09-29 09:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][0/1251] eta 1:29:52 lr 0.000151 time 4.3106 (4.3106) loss 6.0555 (6.0555) grad_norm 1.9022 (1.9022) [2022-09-29 09:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][100/1251] eta 0:15:34 lr 0.000155 time 0.8567 (0.8117) loss 5.9988 (6.1877) grad_norm 2.0112 (2.2223) [2022-09-29 09:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][200/1251] eta 0:13:52 lr 0.000159 time 0.7630 (0.7918) loss 6.3445 (6.1380) grad_norm 1.8333 (2.2330) [2022-09-29 09:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][300/1251] eta 0:12:24 lr 0.000163 time 0.6836 (0.7834) loss 6.0267 (6.1443) grad_norm 2.2801 (2.2397) [2022-09-29 09:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][400/1251] eta 0:11:06 lr 0.000167 time 0.8308 (0.7832) loss 5.7222 (6.1270) grad_norm 2.4346 (2.2420) [2022-09-29 09:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][500/1251] eta 0:09:46 lr 0.000171 time 0.7661 (0.7805) loss 6.3986 (6.1263) grad_norm 2.3545 (2.2324) [2022-09-29 09:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][600/1251] eta 0:08:28 lr 0.000175 time 0.8880 (0.7810) loss 6.2193 (6.1123) grad_norm 2.2361 (2.2365) [2022-09-29 09:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][700/1251] eta 0:07:08 lr 0.000179 time 0.6990 (0.7786) loss 5.6771 (6.1078) grad_norm 2.3789 (2.2374) [2022-09-29 09:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][800/1251] eta 0:05:50 lr 0.000183 time 0.7318 (0.7781) loss 6.0415 (6.1014) grad_norm 3.3187 (2.2454) [2022-09-29 09:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][900/1251] eta 0:04:32 lr 0.000187 time 0.8281 (0.7770) loss 6.2083 (6.0987) grad_norm 1.5674 (2.2429) [2022-09-29 10:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1000/1251] eta 0:03:15 lr 0.000191 time 0.6469 (0.7771) loss 6.2240 (6.0930) grad_norm 2.6424 (2.2398) [2022-09-29 10:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1100/1251] eta 0:01:57 lr 0.000195 time 0.8367 (0.7771) loss 6.3945 (6.0822) grad_norm 1.7188 (2.2477) [2022-09-29 10:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1200/1251] eta 0:00:39 lr 0.000199 time 0.8264 (0.7770) loss 6.0420 (6.0698) grad_norm 2.2591 (2.2548) [2022-09-29 10:04:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 3 training takes 0:16:12 [2022-09-29 10:04:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.106 (4.106) Loss 4.2891 (4.2891) Acc@1 19.629 (19.629) Acc@5 41.699 (41.699) [2022-09-29 10:04:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 18.646 Acc@5 39.810 [2022-09-29 10:04:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 18.6% [2022-09-29 10:04:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 18.65% [2022-09-29 10:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][0/1251] eta 1:44:43 lr 0.000201 time 5.0226 (5.0226) loss 6.3049 (6.3049) grad_norm 2.5165 (2.5165) [2022-09-29 10:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][100/1251] eta 0:15:42 lr 0.000205 time 0.7778 (0.8188) loss 6.2897 (5.8964) grad_norm 1.9827 (2.1908) [2022-09-29 10:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][200/1251] eta 0:13:59 lr 0.000209 time 0.8047 (0.7984) loss 6.2348 (5.9231) grad_norm 1.9054 (2.2559) [2022-09-29 10:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][300/1251] eta 0:12:30 lr 0.000213 time 0.8376 (0.7894) loss 5.7637 (5.9400) grad_norm 2.3594 (2.2577) [2022-09-29 10:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][400/1251] eta 0:11:09 lr 0.000217 time 0.8421 (0.7870) loss 6.2873 (5.9234) grad_norm 3.6221 (2.2602) [2022-09-29 10:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][500/1251] eta 0:09:48 lr 0.000221 time 0.6781 (0.7839) loss 6.1699 (5.9212) grad_norm 2.1546 (2.2850) [2022-09-29 10:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][600/1251] eta 0:08:30 lr 0.000225 time 0.8324 (0.7836) loss 5.4317 (5.9062) grad_norm 2.2151 (2.2888) [2022-09-29 10:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][700/1251] eta 0:07:10 lr 0.000229 time 0.7558 (0.7818) loss 5.6609 (5.9045) grad_norm 2.6439 (2.2842) [2022-09-29 10:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][800/1251] eta 0:05:52 lr 0.000233 time 0.8355 (0.7808) loss 6.2544 (5.8933) grad_norm 2.6279 (2.2819) [2022-09-29 10:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][900/1251] eta 0:04:34 lr 0.000237 time 0.8472 (0.7809) loss 6.1409 (5.8904) grad_norm 2.0632 (2.2626) [2022-09-29 10:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1000/1251] eta 0:03:16 lr 0.000241 time 0.7607 (0.7812) loss 6.0980 (5.8892) grad_norm 1.8440 (2.2649) [2022-09-29 10:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1100/1251] eta 0:01:57 lr 0.000245 time 0.7021 (0.7812) loss 5.5281 (5.8812) grad_norm 2.6620 (2.2641) [2022-09-29 10:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1200/1251] eta 0:00:39 lr 0.000249 time 0.7678 (0.7808) loss 5.2537 (5.8731) grad_norm 2.0275 (2.2648) [2022-09-29 10:21:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 4 training takes 0:16:16 [2022-09-29 10:21:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.759 (4.759) Loss 3.9713 (3.9713) Acc@1 23.730 (23.730) Acc@5 47.363 (47.363) [2022-09-29 10:21:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 24.674 Acc@5 48.092 [2022-09-29 10:21:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 24.7% [2022-09-29 10:21:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 24.67% [2022-09-29 10:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][0/1251] eta 1:44:13 lr 0.000251 time 4.9985 (4.9985) loss 5.8742 (5.8742) grad_norm 2.2648 (2.2648) [2022-09-29 10:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][100/1251] eta 0:15:43 lr 0.000255 time 0.8577 (0.8196) loss 5.5931 (5.8543) grad_norm 2.2197 (2.1506) [2022-09-29 10:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][200/1251] eta 0:13:51 lr 0.000259 time 0.8456 (0.7909) loss 5.0387 (5.7994) grad_norm 2.2951 (2.1841) [2022-09-29 10:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][300/1251] eta 0:12:29 lr 0.000263 time 0.7995 (0.7882) loss 5.4692 (5.7730) grad_norm 2.0317 (2.2187) [2022-09-29 10:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][400/1251] eta 0:11:07 lr 0.000267 time 0.8091 (0.7839) loss 5.6648 (5.7622) grad_norm 2.0463 (2.2140) [2022-09-29 10:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][500/1251] eta 0:09:47 lr 0.000271 time 0.9006 (0.7824) loss 6.2371 (5.7493) grad_norm 2.1944 (2.2254) [2022-09-29 10:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][600/1251] eta 0:08:29 lr 0.000275 time 0.8025 (0.7819) loss 5.5496 (5.7372) grad_norm 2.0021 (2.2349) [2022-09-29 10:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][700/1251] eta 0:07:10 lr 0.000279 time 0.6919 (0.7807) loss 5.5791 (5.7202) grad_norm 2.4424 (2.2313) [2022-09-29 10:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][800/1251] eta 0:05:52 lr 0.000283 time 0.7552 (0.7810) loss 5.9128 (5.7081) grad_norm 1.9562 (2.2337) [2022-09-29 10:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][900/1251] eta 0:04:34 lr 0.000287 time 0.8237 (0.7810) loss 6.2590 (5.7056) grad_norm 1.8753 (2.2239) [2022-09-29 10:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1000/1251] eta 0:03:16 lr 0.000291 time 0.8377 (0.7813) loss 5.6516 (5.7005) grad_norm 1.7604 (2.2239) [2022-09-29 10:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1100/1251] eta 0:01:57 lr 0.000295 time 0.8151 (0.7812) loss 5.8734 (5.6974) grad_norm 2.0717 (2.2203) [2022-09-29 10:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1200/1251] eta 0:00:39 lr 0.000299 time 0.8419 (0.7819) loss 5.7247 (5.6858) grad_norm 1.9011 (2.2156) [2022-09-29 10:37:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 5 training takes 0:16:18 [2022-09-29 10:37:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.059 (4.059) Loss 3.4614 (3.4614) Acc@1 29.492 (29.492) Acc@5 55.566 (55.566) [2022-09-29 10:38:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 29.646 Acc@5 54.566 [2022-09-29 10:38:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 29.6% [2022-09-29 10:38:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 29.65% [2022-09-29 10:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][0/1251] eta 1:44:31 lr 0.000301 time 5.0135 (5.0135) loss 4.8082 (4.8082) grad_norm 3.0551 (3.0551) [2022-09-29 10:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][100/1251] eta 0:15:35 lr 0.000305 time 0.7002 (0.8129) loss 6.0273 (5.5590) grad_norm 2.4552 (2.1798) [2022-09-29 10:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][200/1251] eta 0:13:54 lr 0.000309 time 0.9379 (0.7943) loss 5.6036 (5.5905) grad_norm 2.3451 (2.1672) [2022-09-29 10:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][300/1251] eta 0:12:30 lr 0.000313 time 0.8654 (0.7892) loss 5.5950 (5.5607) grad_norm 1.9170 (2.1658) [2022-09-29 10:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][400/1251] eta 0:11:05 lr 0.000317 time 0.8381 (0.7820) loss 5.8955 (5.5758) grad_norm 1.9541 (2.1817) [2022-09-29 10:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][500/1251] eta 0:09:45 lr 0.000321 time 0.7697 (0.7793) loss 5.8565 (5.5581) grad_norm 2.6425 (2.1716) [2022-09-29 10:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][600/1251] eta 0:08:27 lr 0.000325 time 0.8080 (0.7793) loss 5.9450 (5.5494) grad_norm 1.8668 (2.1684) [2022-09-29 10:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][700/1251] eta 0:07:09 lr 0.000329 time 0.8592 (0.7792) loss 5.9089 (5.5467) grad_norm 2.8226 (2.1752) [2022-09-29 10:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][800/1251] eta 0:05:51 lr 0.000333 time 0.7905 (0.7786) loss 4.9552 (5.5408) grad_norm 2.4190 (2.1737) [2022-09-29 10:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][900/1251] eta 0:04:32 lr 0.000337 time 0.8298 (0.7776) loss 5.4326 (5.5409) grad_norm 2.2144 (2.1758) [2022-09-29 10:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1000/1251] eta 0:03:14 lr 0.000341 time 0.7910 (0.7765) loss 5.9644 (5.5394) grad_norm 1.6409 (2.1641) [2022-09-29 10:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1100/1251] eta 0:01:57 lr 0.000345 time 0.7779 (0.7766) loss 5.8514 (5.5385) grad_norm 1.9386 (2.1627) [2022-09-29 10:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1200/1251] eta 0:00:39 lr 0.000349 time 0.8961 (0.7766) loss 5.2266 (5.5295) grad_norm 1.6776 (2.1538) [2022-09-29 10:54:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 6 training takes 0:16:12 [2022-09-29 10:54:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.778 (3.778) Loss 3.2757 (3.2757) Acc@1 32.520 (32.520) Acc@5 58.789 (58.789) [2022-09-29 10:54:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 33.382 Acc@5 58.986 [2022-09-29 10:54:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 33.4% [2022-09-29 10:54:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 33.38% [2022-09-29 10:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][0/1251] eta 1:39:19 lr 0.000351 time 4.7639 (4.7639) loss 5.6382 (5.6382) grad_norm 2.2718 (2.2718) [2022-09-29 10:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][100/1251] eta 0:15:48 lr 0.000355 time 0.8467 (0.8241) loss 4.8801 (5.4849) grad_norm 1.9891 (2.1872) [2022-09-29 10:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][200/1251] eta 0:14:01 lr 0.000359 time 0.8353 (0.8004) loss 5.8452 (5.4563) grad_norm 1.7652 (2.1669) [2022-09-29 10:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][300/1251] eta 0:12:36 lr 0.000363 time 0.8365 (0.7951) loss 5.0264 (5.4382) grad_norm 1.7054 (2.1517) [2022-09-29 10:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][400/1251] eta 0:11:13 lr 0.000367 time 0.9155 (0.7918) loss 5.4748 (5.4346) grad_norm 2.1285 (2.1356) [2022-09-29 11:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][500/1251] eta 0:09:52 lr 0.000371 time 0.7601 (0.7892) loss 5.4064 (5.4274) grad_norm 1.8801 (2.1405) [2022-09-29 11:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][600/1251] eta 0:08:31 lr 0.000375 time 0.8248 (0.7856) loss 5.3491 (5.4198) grad_norm 2.3655 (2.1362) [2022-09-29 11:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][700/1251] eta 0:07:12 lr 0.000379 time 0.7683 (0.7843) loss 5.4760 (5.4238) grad_norm 1.5885 (2.1223) [2022-09-29 11:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][800/1251] eta 0:05:52 lr 0.000383 time 0.7336 (0.7825) loss 5.9051 (5.4169) grad_norm 2.6329 (2.1129) [2022-09-29 11:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][900/1251] eta 0:04:29 lr 0.000387 time 0.8170 (0.7668) loss 4.7505 (5.4052) grad_norm 2.1349 (2.1085) [2022-09-29 11:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1000/1251] eta 0:03:12 lr 0.000391 time 0.8236 (0.7679) loss 5.7885 (5.4092) grad_norm 2.9396 (2.1055) [2022-09-29 11:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1100/1251] eta 0:01:56 lr 0.000395 time 0.8381 (0.7700) loss 5.3567 (5.3983) grad_norm 2.6254 (2.0988) [2022-09-29 11:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1200/1251] eta 0:00:39 lr 0.000399 time 0.8112 (0.7711) loss 4.3369 (5.3917) grad_norm 2.3756 (2.0973) [2022-09-29 11:10:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 7 training takes 0:16:04 [2022-09-29 11:10:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.728 (4.728) Loss 2.7704 (2.7704) Acc@1 41.504 (41.504) Acc@5 67.383 (67.383) [2022-09-29 11:11:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 38.206 Acc@5 63.704 [2022-09-29 11:11:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 38.2% [2022-09-29 11:11:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 38.21% [2022-09-29 11:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][0/1251] eta 1:44:55 lr 0.000401 time 5.0324 (5.0324) loss 5.5416 (5.5416) grad_norm 2.7080 (2.7080) [2022-09-29 11:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][100/1251] eta 0:15:43 lr 0.000405 time 0.7484 (0.8195) loss 4.8687 (5.3804) grad_norm 2.2659 (2.0408) [2022-09-29 11:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][200/1251] eta 0:13:55 lr 0.000409 time 0.7333 (0.7954) loss 4.6605 (5.3685) grad_norm 2.6630 (2.0511) [2022-09-29 11:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][300/1251] eta 0:12:30 lr 0.000413 time 0.8364 (0.7888) loss 4.6796 (5.3655) grad_norm 1.8730 (2.0306) [2022-09-29 11:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][400/1251] eta 0:11:04 lr 0.000417 time 0.8116 (0.7805) loss 5.5098 (5.3371) grad_norm 1.7603 (2.0221) [2022-09-29 11:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][500/1251] eta 0:09:43 lr 0.000421 time 0.7202 (0.7774) loss 5.2423 (5.3232) grad_norm 1.7040 (2.0369) [2022-09-29 11:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][600/1251] eta 0:08:25 lr 0.000425 time 0.8394 (0.7761) loss 5.5893 (5.3013) grad_norm 2.0268 (2.0465) [2022-09-29 11:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][700/1251] eta 0:07:07 lr 0.000429 time 0.7111 (0.7761) loss 4.7164 (5.3002) grad_norm 2.4586 (2.0392) [2022-09-29 11:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][800/1251] eta 0:05:49 lr 0.000433 time 0.8136 (0.7760) loss 5.2955 (5.2926) grad_norm 1.5794 (2.0233) [2022-09-29 11:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][900/1251] eta 0:04:32 lr 0.000437 time 0.7066 (0.7755) loss 5.3650 (5.2903) grad_norm 2.0094 (2.0146) [2022-09-29 11:23:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1000/1251] eta 0:03:14 lr 0.000441 time 0.5890 (0.7748) loss 5.5982 (5.2855) grad_norm 1.7108 (2.0110) [2022-09-29 11:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1100/1251] eta 0:01:56 lr 0.000445 time 0.8649 (0.7746) loss 5.4652 (5.2833) grad_norm 2.2997 (2.0054) [2022-09-29 11:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1200/1251] eta 0:00:39 lr 0.000449 time 0.8453 (0.7749) loss 4.4620 (5.2784) grad_norm 2.0495 (2.0078) [2022-09-29 11:27:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 8 training takes 0:16:08 [2022-09-29 11:27:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.751 (3.751) Loss 2.7503 (2.7503) Acc@1 43.555 (43.555) Acc@5 67.969 (67.969) [2022-09-29 11:27:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 40.898 Acc@5 66.664 [2022-09-29 11:27:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 40.9% [2022-09-29 11:27:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 40.90% [2022-09-29 11:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][0/1251] eta 1:32:28 lr 0.000451 time 4.4353 (4.4353) loss 4.5080 (4.5080) grad_norm 1.9842 (1.9842) [2022-09-29 11:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][100/1251] eta 0:15:32 lr 0.000455 time 0.8338 (0.8102) loss 5.5751 (5.2584) grad_norm 2.9310 (1.9674) [2022-09-29 11:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][200/1251] eta 0:14:00 lr 0.000459 time 0.8197 (0.7995) loss 5.1445 (5.2530) grad_norm 1.7497 (1.9564) [2022-09-29 11:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][300/1251] eta 0:12:31 lr 0.000463 time 0.8543 (0.7905) loss 4.5436 (5.2644) grad_norm 1.9191 (1.9324) [2022-09-29 11:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][400/1251] eta 0:11:11 lr 0.000467 time 0.8126 (0.7889) loss 4.8990 (5.2301) grad_norm 1.7337 (1.9223) [2022-09-29 11:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][500/1251] eta 0:09:48 lr 0.000471 time 0.8121 (0.7841) loss 5.4447 (5.2083) grad_norm 1.9484 (1.9243) [2022-09-29 11:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][600/1251] eta 0:08:28 lr 0.000475 time 0.8482 (0.7816) loss 4.9417 (5.2065) grad_norm 3.2435 (1.9340) [2022-09-29 11:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][700/1251] eta 0:07:10 lr 0.000478 time 0.7405 (0.7808) loss 5.7614 (5.2002) grad_norm 1.7403 (1.9209) [2022-09-29 11:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][800/1251] eta 0:05:52 lr 0.000482 time 0.8368 (0.7811) loss 4.3053 (5.1928) grad_norm 1.6276 (1.9152) [2022-09-29 11:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][900/1251] eta 0:04:34 lr 0.000486 time 0.8562 (0.7816) loss 5.7972 (5.1798) grad_norm 2.0748 (1.9156) [2022-09-29 11:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1000/1251] eta 0:03:16 lr 0.000490 time 0.8160 (0.7809) loss 5.6690 (5.1676) grad_norm 2.5077 (1.9179) [2022-09-29 11:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1100/1251] eta 0:01:57 lr 0.000494 time 0.6622 (0.7800) loss 5.0106 (5.1650) grad_norm 1.6055 (1.9125) [2022-09-29 11:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1200/1251] eta 0:00:39 lr 0.000498 time 0.8202 (0.7796) loss 4.5043 (5.1507) grad_norm 2.6334 (1.9041) [2022-09-29 11:43:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 9 training takes 0:16:15 [2022-09-29 11:43:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.688 (3.688) Loss 2.6115 (2.6115) Acc@1 42.871 (42.871) Acc@5 71.875 (71.875) [2022-09-29 11:44:09 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 43.920 Acc@5 70.040 [2022-09-29 11:44:09 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 43.9% [2022-09-29 11:44:09 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 43.92% [2022-09-29 11:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][0/1251] eta 1:43:43 lr 0.000501 time 4.9744 (4.9744) loss 5.3760 (5.3760) grad_norm 1.6901 (1.6901) [2022-09-29 11:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][100/1251] eta 0:15:28 lr 0.000504 time 0.6332 (0.8070) loss 4.9186 (5.1148) grad_norm 2.0983 (1.8452) [2022-09-29 11:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][200/1251] eta 0:13:50 lr 0.000508 time 0.6208 (0.7904) loss 5.1235 (5.1389) grad_norm 1.7349 (1.8319) [2022-09-29 11:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][300/1251] eta 0:12:26 lr 0.000512 time 0.8027 (0.7854) loss 5.4729 (5.1145) grad_norm 1.4392 (1.8344) [2022-09-29 11:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][400/1251] eta 0:11:04 lr 0.000516 time 0.8034 (0.7806) loss 4.0973 (5.1205) grad_norm 1.2999 (1.8414) [2022-09-29 11:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][500/1251] eta 0:09:46 lr 0.000520 time 0.9063 (0.7814) loss 5.4676 (5.1100) grad_norm 1.8469 (1.8464) [2022-09-29 11:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][600/1251] eta 0:08:27 lr 0.000524 time 0.7298 (0.7801) loss 5.2178 (5.1136) grad_norm 1.8992 (1.8429) [2022-09-29 11:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][700/1251] eta 0:07:09 lr 0.000528 time 0.6833 (0.7793) loss 5.6060 (5.1099) grad_norm 1.4990 (1.8379) [2022-09-29 11:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][800/1251] eta 0:05:51 lr 0.000532 time 0.8052 (0.7800) loss 4.0589 (5.0950) grad_norm 2.1496 (1.8305) [2022-09-29 11:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][900/1251] eta 0:04:33 lr 0.000536 time 0.8450 (0.7801) loss 4.0958 (5.0861) grad_norm 2.3171 (1.8293) [2022-09-29 11:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1000/1251] eta 0:03:15 lr 0.000540 time 0.7359 (0.7789) loss 5.6835 (5.0897) grad_norm 1.7481 (1.8211) [2022-09-29 11:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1100/1251] eta 0:01:57 lr 0.000544 time 0.7712 (0.7784) loss 5.3485 (5.0748) grad_norm 1.7633 (1.8203) [2022-09-29 11:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1200/1251] eta 0:00:39 lr 0.000548 time 0.8132 (0.7783) loss 5.6226 (5.0686) grad_norm 2.0805 (1.8173) [2022-09-29 12:00:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 10 training takes 0:16:13 [2022-09-29 12:00:22 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saving...... [2022-09-29 12:00:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saved !!! [2022-09-29 12:00:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.387 (4.387) Loss 2.5064 (2.5064) Acc@1 46.191 (46.191) Acc@5 71.289 (71.289) [2022-09-29 12:00:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 46.494 Acc@5 72.486 [2022-09-29 12:00:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 46.5% [2022-09-29 12:00:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 46.49% [2022-09-29 12:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][0/1251] eta 1:37:56 lr 0.000550 time 4.6972 (4.6972) loss 5.4990 (5.4990) grad_norm 1.6074 (1.6074) [2022-09-29 12:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][100/1251] eta 0:15:28 lr 0.000554 time 0.7402 (0.8064) loss 4.9167 (5.0124) grad_norm 1.7350 (1.7594) [2022-09-29 12:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][200/1251] eta 0:13:48 lr 0.000558 time 0.9029 (0.7881) loss 4.1819 (4.9749) grad_norm 2.5979 (1.7900) [2022-09-29 12:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][300/1251] eta 0:12:25 lr 0.000562 time 0.8161 (0.7834) loss 5.4890 (4.9950) grad_norm 1.7757 (1.7570) [2022-09-29 12:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][400/1251] eta 0:11:05 lr 0.000566 time 0.7569 (0.7820) loss 5.4449 (5.0100) grad_norm 1.4042 (1.7613) [2022-09-29 12:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][500/1251] eta 0:09:47 lr 0.000570 time 0.8516 (0.7819) loss 5.1469 (4.9950) grad_norm 1.8557 (1.7722) [2022-09-29 12:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][600/1251] eta 0:08:27 lr 0.000574 time 0.8674 (0.7792) loss 5.3643 (4.9871) grad_norm 1.6905 (1.7621) [2022-09-29 12:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][700/1251] eta 0:07:08 lr 0.000578 time 0.8102 (0.7777) loss 5.5004 (4.9985) grad_norm 1.6659 (1.7585) [2022-09-29 12:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][800/1251] eta 0:05:50 lr 0.000582 time 0.8023 (0.7780) loss 5.0970 (5.0052) grad_norm 1.6428 (1.7574) [2022-09-29 12:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][900/1251] eta 0:04:33 lr 0.000586 time 0.8279 (0.7779) loss 4.2556 (5.0027) grad_norm 1.6619 (1.7513) [2022-09-29 12:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1000/1251] eta 0:03:15 lr 0.000590 time 0.6772 (0.7778) loss 5.2273 (4.9856) grad_norm 1.7408 (1.7433) [2022-09-29 12:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1100/1251] eta 0:01:57 lr 0.000594 time 0.7961 (0.7776) loss 5.7447 (4.9827) grad_norm 1.4676 (1.7410) [2022-09-29 12:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1200/1251] eta 0:00:39 lr 0.000598 time 0.9140 (0.7773) loss 4.7663 (4.9813) grad_norm 1.6894 (1.7362) [2022-09-29 12:16:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 11 training takes 0:16:12 [2022-09-29 12:17:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.461 (4.461) Loss 2.3256 (2.3256) Acc@1 50.098 (50.098) Acc@5 76.367 (76.367) [2022-09-29 12:17:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 49.118 Acc@5 74.382 [2022-09-29 12:17:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 49.1% [2022-09-29 12:17:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 49.12% [2022-09-29 12:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][0/1251] eta 1:44:32 lr 0.000600 time 5.0136 (5.0136) loss 5.6861 (5.6861) grad_norm 1.4432 (1.4432) [2022-09-29 12:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][100/1251] eta 0:15:35 lr 0.000604 time 0.8461 (0.8127) loss 4.5113 (4.8755) grad_norm 1.9762 (1.7915) [2022-09-29 12:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][200/1251] eta 0:13:54 lr 0.000608 time 0.6569 (0.7941) loss 5.0685 (4.9108) grad_norm 2.0521 (1.7026) [2022-09-29 12:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][300/1251] eta 0:12:28 lr 0.000612 time 0.7144 (0.7875) loss 5.0817 (4.9243) grad_norm 1.6517 (1.6979) [2022-09-29 12:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][400/1251] eta 0:11:08 lr 0.000616 time 0.8501 (0.7859) loss 5.1188 (4.9120) grad_norm 2.7773 (1.7074) [2022-09-29 12:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][500/1251] eta 0:09:49 lr 0.000620 time 0.8604 (0.7844) loss 5.6521 (4.9017) grad_norm 1.7055 (1.7007) [2022-09-29 12:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][600/1251] eta 0:08:29 lr 0.000624 time 0.6564 (0.7832) loss 5.2050 (4.9104) grad_norm 1.4822 (1.6924) [2022-09-29 12:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][700/1251] eta 0:07:11 lr 0.000628 time 0.6344 (0.7832) loss 4.7997 (4.9090) grad_norm 1.6745 (1.6844) [2022-09-29 12:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][800/1251] eta 0:05:52 lr 0.000632 time 0.7342 (0.7808) loss 3.9735 (4.9037) grad_norm 1.7960 (1.6816) [2022-09-29 12:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][900/1251] eta 0:04:33 lr 0.000636 time 0.9402 (0.7798) loss 5.3490 (4.9152) grad_norm 1.4474 (1.6780) [2022-09-29 12:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1000/1251] eta 0:03:15 lr 0.000640 time 0.8271 (0.7797) loss 4.8053 (4.9012) grad_norm 1.6722 (1.6808) [2022-09-29 12:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1100/1251] eta 0:01:57 lr 0.000644 time 0.8410 (0.7797) loss 4.7992 (4.8980) grad_norm 1.7844 (1.6755) [2022-09-29 12:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1200/1251] eta 0:00:39 lr 0.000648 time 0.8693 (0.7788) loss 4.2730 (4.8990) grad_norm 1.7577 (1.6732) [2022-09-29 12:33:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 12 training takes 0:16:14 [2022-09-29 12:33:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.718 (4.718) Loss 2.2063 (2.2063) Acc@1 52.246 (52.246) Acc@5 77.246 (77.246) [2022-09-29 12:33:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 50.248 Acc@5 75.672 [2022-09-29 12:33:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 50.2% [2022-09-29 12:33:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 50.25% [2022-09-29 12:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][0/1251] eta 1:34:47 lr 0.000650 time 4.5464 (4.5464) loss 3.5218 (3.5218) grad_norm 1.5610 (1.5610) [2022-09-29 12:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][100/1251] eta 0:15:41 lr 0.000654 time 0.7823 (0.8177) loss 3.8092 (4.7712) grad_norm 1.7909 (1.6296) [2022-09-29 12:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][200/1251] eta 0:13:56 lr 0.000658 time 0.6588 (0.7955) loss 4.3978 (4.8015) grad_norm 1.8482 (1.6318) [2022-09-29 12:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][300/1251] eta 0:12:28 lr 0.000662 time 0.8235 (0.7871) loss 5.3326 (4.7938) grad_norm 1.7583 (1.6297) [2022-09-29 12:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][400/1251] eta 0:11:08 lr 0.000666 time 0.6404 (0.7860) loss 5.3429 (4.8033) grad_norm 1.6661 (1.6223) [2022-09-29 12:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][500/1251] eta 0:09:49 lr 0.000670 time 0.8433 (0.7843) loss 5.5341 (4.8174) grad_norm 1.2477 (1.6199) [2022-09-29 12:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][600/1251] eta 0:08:30 lr 0.000674 time 0.9095 (0.7835) loss 5.3372 (4.8171) grad_norm 1.6944 (1.6106) [2022-09-29 12:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][700/1251] eta 0:07:10 lr 0.000678 time 0.8136 (0.7819) loss 5.9294 (4.8179) grad_norm 1.8668 (1.6152) [2022-09-29 12:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][800/1251] eta 0:05:52 lr 0.000682 time 0.7026 (0.7807) loss 4.6474 (4.8067) grad_norm 1.5850 (1.6152) [2022-09-29 12:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][900/1251] eta 0:04:33 lr 0.000686 time 0.7549 (0.7794) loss 4.1121 (4.8146) grad_norm 1.4845 (1.6146) [2022-09-29 12:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1000/1251] eta 0:03:15 lr 0.000690 time 0.7803 (0.7791) loss 4.9745 (4.8113) grad_norm 1.3683 (1.6150) [2022-09-29 12:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1100/1251] eta 0:01:57 lr 0.000694 time 0.7915 (0.7788) loss 5.3207 (4.8099) grad_norm 1.2648 (1.6050) [2022-09-29 12:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1200/1251] eta 0:00:39 lr 0.000698 time 0.6962 (0.7784) loss 4.7089 (4.8059) grad_norm 1.2235 (1.6035) [2022-09-29 12:50:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 13 training takes 0:16:13 [2022-09-29 12:50:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.762 (4.762) Loss 2.2196 (2.2196) Acc@1 50.977 (50.977) Acc@5 77.344 (77.344) [2022-09-29 12:50:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 52.434 Acc@5 77.412 [2022-09-29 12:50:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 52.4% [2022-09-29 12:50:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 52.43% [2022-09-29 12:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][0/1251] eta 1:38:52 lr 0.000700 time 4.7426 (4.7426) loss 5.1517 (5.1517) grad_norm 1.4404 (1.4404) [2022-09-29 12:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][100/1251] eta 0:15:41 lr 0.000704 time 0.8197 (0.8182) loss 4.4888 (4.8160) grad_norm 1.5086 (1.5376) [2022-09-29 12:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][200/1251] eta 0:13:55 lr 0.000708 time 0.8524 (0.7950) loss 5.0643 (4.7781) grad_norm 1.7742 (1.5704) [2022-09-29 12:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][300/1251] eta 0:12:31 lr 0.000712 time 0.8030 (0.7905) loss 4.6734 (4.7738) grad_norm 1.7078 (1.5580) [2022-09-29 12:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][400/1251] eta 0:11:09 lr 0.000716 time 0.8600 (0.7864) loss 4.9300 (4.7650) grad_norm 1.2272 (1.5464) [2022-09-29 12:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][500/1251] eta 0:09:49 lr 0.000720 time 0.7497 (0.7855) loss 5.6593 (4.7524) grad_norm 1.5838 (1.5472) [2022-09-29 12:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][600/1251] eta 0:08:30 lr 0.000724 time 0.7432 (0.7848) loss 4.9447 (4.7651) grad_norm 1.2868 (1.5529) [2022-09-29 12:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][700/1251] eta 0:07:11 lr 0.000728 time 0.8170 (0.7829) loss 4.9108 (4.7663) grad_norm 1.2677 (1.5512) [2022-09-29 13:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][800/1251] eta 0:05:52 lr 0.000732 time 0.8988 (0.7818) loss 3.7165 (4.7770) grad_norm 1.6106 (1.5509) [2022-09-29 13:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][900/1251] eta 0:04:33 lr 0.000736 time 0.7111 (0.7799) loss 4.2393 (4.7769) grad_norm 1.7329 (1.5475) [2022-09-29 13:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1000/1251] eta 0:03:15 lr 0.000740 time 0.7383 (0.7801) loss 4.3971 (4.7707) grad_norm 1.5055 (1.5409) [2022-09-29 13:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1100/1251] eta 0:01:57 lr 0.000744 time 0.8052 (0.7803) loss 5.2889 (4.7644) grad_norm 1.7875 (1.5417) [2022-09-29 13:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1200/1251] eta 0:00:39 lr 0.000748 time 0.6411 (0.7790) loss 4.8629 (4.7594) grad_norm 1.3604 (1.5375) [2022-09-29 13:06:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 14 training takes 0:16:14 [2022-09-29 13:06:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.509 (4.509) Loss 2.1546 (2.1546) Acc@1 53.516 (53.516) Acc@5 76.758 (76.758) [2022-09-29 13:07:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 54.104 Acc@5 78.788 [2022-09-29 13:07:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 54.1% [2022-09-29 13:07:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 54.10% [2022-09-29 13:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][0/1251] eta 1:40:20 lr 0.000750 time 4.8127 (4.8127) loss 4.9298 (4.9298) grad_norm 1.2261 (1.2261) [2022-09-29 13:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][100/1251] eta 0:15:45 lr 0.000754 time 0.6679 (0.8211) loss 5.4008 (4.7135) grad_norm 1.4464 (1.5533) [2022-09-29 13:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][200/1251] eta 0:13:56 lr 0.000758 time 0.6047 (0.7956) loss 5.5609 (4.7047) grad_norm 1.7509 (1.5478) [2022-09-29 13:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][300/1251] eta 0:12:31 lr 0.000762 time 0.6860 (0.7901) loss 5.0971 (4.7054) grad_norm 1.3077 (1.5464) [2022-09-29 13:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][400/1251] eta 0:11:09 lr 0.000766 time 0.7923 (0.7872) loss 4.1537 (4.7048) grad_norm 1.5498 (1.5419) [2022-09-29 13:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][500/1251] eta 0:09:49 lr 0.000770 time 0.9383 (0.7855) loss 4.2875 (4.7211) grad_norm 1.4077 (1.5289) [2022-09-29 13:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][600/1251] eta 0:08:28 lr 0.000774 time 0.8107 (0.7819) loss 4.4222 (4.7185) grad_norm 1.5475 (1.5192) [2022-09-29 13:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][700/1251] eta 0:07:10 lr 0.000778 time 0.7962 (0.7816) loss 4.9843 (4.7158) grad_norm 1.2057 (1.5133) [2022-09-29 13:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][800/1251] eta 0:05:52 lr 0.000782 time 0.7915 (0.7808) loss 4.7407 (4.7186) grad_norm 1.7276 (1.5128) [2022-09-29 13:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][900/1251] eta 0:04:33 lr 0.000786 time 0.8452 (0.7800) loss 4.2501 (4.7130) grad_norm 1.4507 (1.5136) [2022-09-29 13:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1000/1251] eta 0:03:15 lr 0.000790 time 0.9476 (0.7794) loss 5.0888 (4.7129) grad_norm 1.3764 (1.5099) [2022-09-29 13:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1100/1251] eta 0:01:57 lr 0.000794 time 0.6795 (0.7785) loss 5.3699 (4.7018) grad_norm 1.2176 (1.5051) [2022-09-29 13:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1200/1251] eta 0:00:39 lr 0.000798 time 0.8186 (0.7786) loss 5.5989 (4.7057) grad_norm 1.9565 (1.5012) [2022-09-29 13:23:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 15 training takes 0:16:13 [2022-09-29 13:23:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.517 (4.517) Loss 2.0776 (2.0776) Acc@1 53.906 (53.906) Acc@5 77.734 (77.734) [2022-09-29 13:23:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 55.182 Acc@5 79.742 [2022-09-29 13:23:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 55.2% [2022-09-29 13:23:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 55.18% [2022-09-29 13:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][0/1251] eta 1:28:47 lr 0.000800 time 4.2583 (4.2583) loss 4.8850 (4.8850) grad_norm 1.5180 (1.5180) [2022-09-29 13:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][100/1251] eta 0:15:35 lr 0.000804 time 0.6726 (0.8127) loss 4.7526 (4.7263) grad_norm 1.9630 (1.5224) [2022-09-29 13:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][200/1251] eta 0:13:55 lr 0.000808 time 0.8433 (0.7945) loss 5.0230 (4.6562) grad_norm 1.3867 (1.4810) [2022-09-29 13:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][300/1251] eta 0:12:30 lr 0.000812 time 0.8173 (0.7889) loss 4.6227 (4.6685) grad_norm 1.6571 (1.4791) [2022-09-29 13:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][400/1251] eta 0:10:39 lr 0.000816 time 0.6838 (0.7511) loss 4.8928 (4.6563) grad_norm 2.3208 (1.4741) [2022-09-29 13:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][500/1251] eta 0:09:26 lr 0.000820 time 0.6502 (0.7545) loss 5.0571 (4.6672) grad_norm 1.4631 (1.4648) [2022-09-29 13:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][600/1251] eta 0:08:14 lr 0.000824 time 0.6807 (0.7589) loss 4.8910 (4.6793) grad_norm 1.3509 (1.4620) [2022-09-29 13:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][700/1251] eta 0:06:59 lr 0.000828 time 0.9310 (0.7608) loss 5.0206 (4.6784) grad_norm 1.3618 (1.4573) [2022-09-29 13:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][800/1251] eta 0:05:44 lr 0.000832 time 0.7519 (0.7631) loss 5.1011 (4.6752) grad_norm 1.7154 (1.4540) [2022-09-29 13:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][900/1251] eta 0:04:28 lr 0.000836 time 0.6569 (0.7635) loss 5.0232 (4.6711) grad_norm 1.3521 (1.4458) [2022-09-29 13:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1000/1251] eta 0:03:11 lr 0.000840 time 0.7673 (0.7646) loss 4.9732 (4.6660) grad_norm 1.3424 (1.4460) [2022-09-29 13:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1100/1251] eta 0:01:55 lr 0.000844 time 0.8413 (0.7654) loss 4.7655 (4.6569) grad_norm 1.6892 (1.4442) [2022-09-29 13:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1200/1251] eta 0:00:39 lr 0.000848 time 0.8932 (0.7653) loss 4.8255 (4.6574) grad_norm 1.4152 (1.4432) [2022-09-29 13:39:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 16 training takes 0:15:57 [2022-09-29 13:39:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.147 (4.147) Loss 1.8881 (1.8881) Acc@1 58.984 (58.984) Acc@5 81.152 (81.152) [2022-09-29 13:39:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 56.876 Acc@5 81.190 [2022-09-29 13:39:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 56.9% [2022-09-29 13:39:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 56.88% [2022-09-29 13:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][0/1251] eta 1:15:55 lr 0.000850 time 3.6414 (3.6414) loss 4.1229 (4.1229) grad_norm 1.7183 (1.7183) [2022-09-29 13:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][100/1251] eta 0:15:05 lr 0.000854 time 0.6697 (0.7863) loss 4.9185 (4.5751) grad_norm 1.5343 (1.3583) [2022-09-29 13:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][200/1251] eta 0:13:38 lr 0.000858 time 0.7789 (0.7788) loss 5.0636 (4.5603) grad_norm 1.1883 (1.3936) [2022-09-29 13:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][300/1251] eta 0:12:18 lr 0.000862 time 0.8037 (0.7769) loss 4.8124 (4.5476) grad_norm 1.2851 (1.4006) [2022-09-29 13:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][400/1251] eta 0:10:59 lr 0.000866 time 0.8579 (0.7750) loss 5.1919 (4.5476) grad_norm 1.3124 (1.4034) [2022-09-29 13:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][500/1251] eta 0:09:42 lr 0.000870 time 0.8412 (0.7760) loss 4.0336 (4.5743) grad_norm 1.3079 (1.4018) [2022-09-29 13:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][600/1251] eta 0:08:24 lr 0.000874 time 0.7085 (0.7751) loss 5.2831 (4.5825) grad_norm 1.2977 (1.4027) [2022-09-29 13:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][700/1251] eta 0:07:07 lr 0.000878 time 0.8467 (0.7753) loss 4.8564 (4.5886) grad_norm 1.5630 (1.4122) [2022-09-29 13:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][800/1251] eta 0:05:49 lr 0.000882 time 0.8497 (0.7748) loss 4.9570 (4.5946) grad_norm 1.1129 (1.4064) [2022-09-29 13:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][900/1251] eta 0:04:31 lr 0.000886 time 0.8646 (0.7749) loss 5.1140 (4.5888) grad_norm 1.1669 (1.4048) [2022-09-29 13:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1000/1251] eta 0:03:14 lr 0.000890 time 0.5966 (0.7742) loss 4.9880 (4.5973) grad_norm 1.2414 (1.4021) [2022-09-29 13:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1100/1251] eta 0:01:56 lr 0.000894 time 0.8465 (0.7738) loss 4.9035 (4.6072) grad_norm 1.2158 (1.3997) [2022-09-29 13:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1200/1251] eta 0:00:39 lr 0.000898 time 0.7193 (0.7740) loss 5.1880 (4.6154) grad_norm 1.6206 (1.3982) [2022-09-29 13:56:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 17 training takes 0:16:08 [2022-09-29 13:56:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.356 (4.356) Loss 1.9326 (1.9326) Acc@1 56.641 (56.641) Acc@5 82.129 (82.129) [2022-09-29 13:56:25 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 57.076 Acc@5 81.558 [2022-09-29 13:56:25 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 57.1% [2022-09-29 13:56:25 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 57.08% [2022-09-29 13:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][0/1251] eta 1:37:27 lr 0.000900 time 4.6740 (4.6740) loss 4.3245 (4.3245) grad_norm 1.2444 (1.2444) [2022-09-29 13:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][100/1251] eta 0:15:26 lr 0.000904 time 0.8699 (0.8052) loss 4.3945 (4.5115) grad_norm 1.4969 (1.4174) [2022-09-29 13:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][200/1251] eta 0:13:49 lr 0.000908 time 0.8258 (0.7897) loss 5.1868 (4.5276) grad_norm 1.3128 (1.4067) [2022-09-29 14:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][300/1251] eta 0:12:27 lr 0.000912 time 0.6610 (0.7856) loss 4.7745 (4.5656) grad_norm 1.2233 (1.4026) [2022-09-29 14:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][400/1251] eta 0:11:05 lr 0.000916 time 0.8320 (0.7823) loss 4.7905 (4.5507) grad_norm 1.5491 (1.3943) [2022-09-29 14:02:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][500/1251] eta 0:09:46 lr 0.000920 time 0.8049 (0.7814) loss 4.6214 (4.5554) grad_norm 1.4372 (1.3885) [2022-09-29 14:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][600/1251] eta 0:08:29 lr 0.000924 time 0.7604 (0.7820) loss 5.1737 (4.5443) grad_norm 1.0632 (1.3757) [2022-09-29 14:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][700/1251] eta 0:07:09 lr 0.000928 time 0.8206 (0.7804) loss 4.7469 (4.5428) grad_norm 1.2205 (1.3665) [2022-09-29 14:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][800/1251] eta 0:05:51 lr 0.000932 time 0.7898 (0.7802) loss 4.4172 (4.5498) grad_norm 1.3059 (1.3638) [2022-09-29 14:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][900/1251] eta 0:04:33 lr 0.000936 time 0.8308 (0.7798) loss 4.6253 (4.5402) grad_norm 1.4159 (1.3648) [2022-09-29 14:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1000/1251] eta 0:03:15 lr 0.000940 time 0.8672 (0.7792) loss 5.0683 (4.5454) grad_norm 1.2866 (1.3628) [2022-09-29 14:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1100/1251] eta 0:01:57 lr 0.000944 time 0.8424 (0.7775) loss 5.1238 (4.5551) grad_norm 1.3616 (1.3667) [2022-09-29 14:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1200/1251] eta 0:00:39 lr 0.000948 time 0.8148 (0.7769) loss 5.5567 (4.5578) grad_norm 1.1188 (1.3688) [2022-09-29 14:12:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 18 training takes 0:16:11 [2022-09-29 14:12:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.564 (4.564) Loss 1.8679 (1.8679) Acc@1 59.082 (59.082) Acc@5 83.008 (83.008) [2022-09-29 14:12:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 58.210 Acc@5 82.020 [2022-09-29 14:12:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 58.2% [2022-09-29 14:12:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 58.21% [2022-09-29 14:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][0/1251] eta 1:27:51 lr 0.000950 time 4.2137 (4.2137) loss 4.4942 (4.4942) grad_norm 1.2207 (1.2207) [2022-09-29 14:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][100/1251] eta 0:15:37 lr 0.000954 time 0.8343 (0.8141) loss 4.6495 (4.4984) grad_norm 1.3087 (1.3530) [2022-09-29 14:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][200/1251] eta 0:13:57 lr 0.000958 time 0.8314 (0.7968) loss 3.5556 (4.5224) grad_norm 1.2477 (1.3390) [2022-09-29 14:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][300/1251] eta 0:12:34 lr 0.000962 time 0.8216 (0.7935) loss 3.6749 (4.5202) grad_norm 1.3500 (1.3270) [2022-09-29 14:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][400/1251] eta 0:11:11 lr 0.000966 time 0.6783 (0.7885) loss 3.7199 (4.5426) grad_norm 1.4493 (1.3213) [2022-09-29 14:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][500/1251] eta 0:09:49 lr 0.000970 time 0.6777 (0.7844) loss 4.4055 (4.5446) grad_norm 1.2047 (1.3201) [2022-09-29 14:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][600/1251] eta 0:08:28 lr 0.000974 time 0.7973 (0.7814) loss 4.1338 (4.5414) grad_norm 1.4570 (1.3239) [2022-09-29 14:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][700/1251] eta 0:07:10 lr 0.000978 time 0.8491 (0.7808) loss 4.5482 (4.5300) grad_norm 1.3535 (1.3208) [2022-09-29 14:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][800/1251] eta 0:05:51 lr 0.000982 time 0.9064 (0.7795) loss 4.6784 (4.5325) grad_norm 1.1884 (1.3165) [2022-09-29 14:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][900/1251] eta 0:04:33 lr 0.000986 time 0.8017 (0.7788) loss 4.7665 (4.5378) grad_norm 1.0170 (1.3154) [2022-09-29 14:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1000/1251] eta 0:03:15 lr 0.000990 time 0.7759 (0.7793) loss 4.6655 (4.5503) grad_norm 1.3288 (1.3138) [2022-09-29 14:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1100/1251] eta 0:01:57 lr 0.000994 time 0.7749 (0.7782) loss 5.5268 (4.5456) grad_norm 1.3672 (1.3133) [2022-09-29 14:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1200/1251] eta 0:00:39 lr 0.000998 time 0.8414 (0.7783) loss 4.8210 (4.5441) grad_norm 1.3292 (1.3119) [2022-09-29 14:29:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 19 training takes 0:16:12 [2022-09-29 14:29:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.713 (3.713) Loss 1.8243 (1.8243) Acc@1 59.766 (59.766) Acc@5 83.301 (83.301) [2022-09-29 14:29:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 59.054 Acc@5 82.830 [2022-09-29 14:29:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 59.1% [2022-09-29 14:29:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 59.05% [2022-09-29 14:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][0/1251] eta 1:46:23 lr 0.000989 time 5.1030 (5.1030) loss 4.3567 (4.3567) grad_norm 1.0675 (1.0675) [2022-09-29 14:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][100/1251] eta 0:15:35 lr 0.000989 time 0.8300 (0.8129) loss 4.1558 (4.4178) grad_norm 1.3771 (1.3012) [2022-09-29 14:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][200/1251] eta 0:13:53 lr 0.000989 time 0.8214 (0.7934) loss 4.0797 (4.4812) grad_norm 1.0919 (1.2892) [2022-09-29 14:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][300/1251] eta 0:12:26 lr 0.000989 time 0.8483 (0.7848) loss 3.4632 (4.4993) grad_norm 0.9972 (1.2752) [2022-09-29 14:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][400/1251] eta 0:11:03 lr 0.000989 time 0.8170 (0.7795) loss 5.0547 (4.4935) grad_norm 1.1200 (1.2804) [2022-09-29 14:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][500/1251] eta 0:09:44 lr 0.000989 time 0.7555 (0.7782) loss 4.7235 (4.4782) grad_norm 1.0648 (1.2758) [2022-09-29 14:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][600/1251] eta 0:08:26 lr 0.000989 time 0.6763 (0.7773) loss 4.8621 (4.4785) grad_norm 1.5842 (1.2797) [2022-09-29 14:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][700/1251] eta 0:07:08 lr 0.000989 time 0.8495 (0.7775) loss 5.2073 (4.4748) grad_norm 1.0966 (1.2746) [2022-09-29 14:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][800/1251] eta 0:05:50 lr 0.000988 time 0.7963 (0.7772) loss 4.3338 (4.4722) grad_norm 1.3055 (1.2720) [2022-09-29 14:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][900/1251] eta 0:04:32 lr 0.000988 time 0.8159 (0.7773) loss 4.7936 (4.4672) grad_norm 1.5816 (1.2691) [2022-09-29 14:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1000/1251] eta 0:03:15 lr 0.000988 time 0.9113 (0.7770) loss 5.0203 (4.4581) grad_norm 1.3246 (1.2693) [2022-09-29 14:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1100/1251] eta 0:01:57 lr 0.000988 time 0.7227 (0.7765) loss 4.5430 (4.4527) grad_norm 1.0218 (1.2661) [2022-09-29 14:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1200/1251] eta 0:00:39 lr 0.000988 time 0.6550 (0.7761) loss 4.5002 (4.4541) grad_norm 0.9478 (1.2636) [2022-09-29 14:45:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 20 training takes 0:16:10 [2022-09-29 14:45:41 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saving...... [2022-09-29 14:45:41 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saved !!! [2022-09-29 14:45:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.342 (4.342) Loss 1.7332 (1.7332) Acc@1 61.230 (61.230) Acc@5 85.156 (85.156) [2022-09-29 14:46:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 60.464 Acc@5 83.824 [2022-09-29 14:46:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 60.5% [2022-09-29 14:46:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 60.46% [2022-09-29 14:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][0/1251] eta 1:29:54 lr 0.000988 time 4.3125 (4.3125) loss 3.8595 (3.8595) grad_norm 0.9677 (0.9677) [2022-09-29 14:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][100/1251] eta 0:15:39 lr 0.000988 time 0.6970 (0.8165) loss 4.8002 (4.5281) grad_norm 1.4077 (1.2727) [2022-09-29 14:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][200/1251] eta 0:13:57 lr 0.000988 time 0.9227 (0.7965) loss 4.5786 (4.5204) grad_norm 1.0932 (1.2510) [2022-09-29 14:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][300/1251] eta 0:12:31 lr 0.000988 time 0.7599 (0.7901) loss 4.5138 (4.5111) grad_norm 1.1663 (1.2584) [2022-09-29 14:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][400/1251] eta 0:11:08 lr 0.000988 time 0.8545 (0.7856) loss 4.0627 (4.4906) grad_norm 1.2928 (1.2505) [2022-09-29 14:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][500/1251] eta 0:09:48 lr 0.000988 time 0.7792 (0.7830) loss 4.7715 (4.4937) grad_norm 1.3341 (1.2533) [2022-09-29 14:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][600/1251] eta 0:08:28 lr 0.000988 time 0.8218 (0.7807) loss 5.1102 (4.4975) grad_norm 1.5484 (1.2523) [2022-09-29 14:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][700/1251] eta 0:07:09 lr 0.000987 time 0.9044 (0.7801) loss 4.1236 (4.4976) grad_norm 1.1580 (1.2549) [2022-09-29 14:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][800/1251] eta 0:05:51 lr 0.000987 time 0.8004 (0.7786) loss 3.3219 (4.4834) grad_norm 1.2848 (1.2521) [2022-09-29 14:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][900/1251] eta 0:04:33 lr 0.000987 time 0.8000 (0.7786) loss 4.1867 (4.4923) grad_norm 1.1055 (1.2528) [2022-09-29 14:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1000/1251] eta 0:03:15 lr 0.000987 time 0.8348 (0.7778) loss 4.8202 (4.4823) grad_norm 0.8952 (1.2491) [2022-09-29 15:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1100/1251] eta 0:01:57 lr 0.000987 time 0.7645 (0.7775) loss 5.4063 (4.4750) grad_norm 1.2033 (1.2499) [2022-09-29 15:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1200/1251] eta 0:00:39 lr 0.000987 time 0.8226 (0.7764) loss 4.4924 (4.4672) grad_norm 1.2437 (1.2460) [2022-09-29 15:02:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 21 training takes 0:16:12 [2022-09-29 15:02:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.075 (4.075) Loss 1.7215 (1.7215) Acc@1 60.938 (60.938) Acc@5 83.594 (83.594) [2022-09-29 15:02:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 60.934 Acc@5 84.180 [2022-09-29 15:02:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 60.9% [2022-09-29 15:02:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 60.93% [2022-09-29 15:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][0/1251] eta 1:42:22 lr 0.000987 time 4.9097 (4.9097) loss 5.2409 (5.2409) grad_norm 1.1358 (1.1358) [2022-09-29 15:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][100/1251] eta 0:15:37 lr 0.000987 time 0.8481 (0.8143) loss 4.6487 (4.3890) grad_norm 1.2483 (1.2394) [2022-09-29 15:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][200/1251] eta 0:13:58 lr 0.000987 time 0.7199 (0.7980) loss 4.5331 (4.3943) grad_norm 1.3570 (1.2454) [2022-09-29 15:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][300/1251] eta 0:12:31 lr 0.000987 time 0.7331 (0.7901) loss 4.2625 (4.3965) grad_norm 1.2250 (1.2496) [2022-09-29 15:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][400/1251] eta 0:11:11 lr 0.000987 time 0.9193 (0.7887) loss 4.6754 (4.4013) grad_norm 0.9958 (1.2521) [2022-09-29 15:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][500/1251] eta 0:09:49 lr 0.000986 time 0.7526 (0.7848) loss 4.8805 (4.4043) grad_norm 1.1720 (1.2342) [2022-09-29 15:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][600/1251] eta 0:08:29 lr 0.000986 time 0.7264 (0.7830) loss 4.3990 (4.4016) grad_norm 0.9505 (1.2356) [2022-09-29 15:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][700/1251] eta 0:07:09 lr 0.000986 time 0.7608 (0.7794) loss 4.7893 (4.3966) grad_norm 1.1745 (1.2303) [2022-09-29 15:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][800/1251] eta 0:05:51 lr 0.000986 time 0.8333 (0.7786) loss 4.6342 (4.4007) grad_norm 1.5953 (1.2328) [2022-09-29 15:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][900/1251] eta 0:04:33 lr 0.000986 time 0.9240 (0.7779) loss 4.6143 (4.3905) grad_norm 1.1166 (1.2302) [2022-09-29 15:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1000/1251] eta 0:03:15 lr 0.000986 time 0.7182 (0.7779) loss 4.6879 (4.3903) grad_norm 1.0116 (1.2306) [2022-09-29 15:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1100/1251] eta 0:01:57 lr 0.000986 time 0.6641 (0.7770) loss 5.2314 (4.3981) grad_norm 1.5092 (1.2284) [2022-09-29 15:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1200/1251] eta 0:00:39 lr 0.000986 time 0.7314 (0.7767) loss 4.3638 (4.4029) grad_norm 1.4037 (1.2273) [2022-09-29 15:18:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 22 training takes 0:16:11 [2022-09-29 15:18:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.401 (4.401) Loss 1.6676 (1.6676) Acc@1 64.062 (64.062) Acc@5 85.352 (85.352) [2022-09-29 15:19:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 62.082 Acc@5 84.796 [2022-09-29 15:19:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 62.1% [2022-09-29 15:19:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 62.08% [2022-09-29 15:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][0/1251] eta 1:35:31 lr 0.000986 time 4.5813 (4.5813) loss 3.5243 (3.5243) grad_norm 1.3699 (1.3699) [2022-09-29 15:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][100/1251] eta 0:15:37 lr 0.000986 time 0.7315 (0.8142) loss 4.1531 (4.4121) grad_norm 1.3725 (1.1844) [2022-09-29 15:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][200/1251] eta 0:13:57 lr 0.000986 time 0.8298 (0.7972) loss 4.2854 (4.3471) grad_norm 1.2695 (1.1927) [2022-09-29 15:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][300/1251] eta 0:12:31 lr 0.000985 time 0.8549 (0.7898) loss 4.2222 (4.3531) grad_norm 1.1418 (1.1992) [2022-09-29 15:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][400/1251] eta 0:11:08 lr 0.000985 time 0.7722 (0.7856) loss 4.8632 (4.3586) grad_norm 1.1894 (1.1979) [2022-09-29 15:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][500/1251] eta 0:09:49 lr 0.000985 time 0.7991 (0.7844) loss 4.5740 (4.3517) grad_norm 1.2131 (1.2006) [2022-09-29 15:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][600/1251] eta 0:08:30 lr 0.000985 time 0.7894 (0.7835) loss 5.3267 (4.3798) grad_norm 1.2150 (1.2002) [2022-09-29 15:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][700/1251] eta 0:07:09 lr 0.000985 time 0.7918 (0.7803) loss 4.6561 (4.3710) grad_norm 1.1035 (1.1965) [2022-09-29 15:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][800/1251] eta 0:05:52 lr 0.000985 time 0.8158 (0.7812) loss 4.8139 (4.3772) grad_norm 1.0291 (1.1974) [2022-09-29 15:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][900/1251] eta 0:04:33 lr 0.000985 time 0.8343 (0.7806) loss 4.5352 (4.3695) grad_norm 1.0464 (1.1913) [2022-09-29 15:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1000/1251] eta 0:03:15 lr 0.000985 time 0.6327 (0.7791) loss 3.8040 (4.3671) grad_norm 1.2721 (1.1944) [2022-09-29 15:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1100/1251] eta 0:01:57 lr 0.000985 time 0.7891 (0.7787) loss 5.2217 (4.3685) grad_norm 1.2333 (1.1896) [2022-09-29 15:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1200/1251] eta 0:00:39 lr 0.000985 time 0.6635 (0.7783) loss 3.0887 (4.3675) grad_norm 1.1461 (1.1903) [2022-09-29 15:35:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 23 training takes 0:16:13 [2022-09-29 15:35:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.331 (4.331) Loss 1.5893 (1.5893) Acc@1 64.453 (64.453) Acc@5 87.793 (87.793) [2022-09-29 15:35:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 62.202 Acc@5 84.990 [2022-09-29 15:35:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 62.2% [2022-09-29 15:35:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 62.20% [2022-09-29 15:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][0/1251] eta 1:47:34 lr 0.000984 time 5.1599 (5.1599) loss 4.9390 (4.9390) grad_norm 1.0515 (1.0515) [2022-09-29 15:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][100/1251] eta 0:15:38 lr 0.000984 time 0.7758 (0.8154) loss 4.6913 (4.3815) grad_norm 1.1643 (1.1964) [2022-09-29 15:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][200/1251] eta 0:14:00 lr 0.000984 time 0.8270 (0.7999) loss 3.1921 (4.3761) grad_norm 1.1554 (1.1870) [2022-09-29 15:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][300/1251] eta 0:12:33 lr 0.000984 time 0.8499 (0.7918) loss 4.2845 (4.3692) grad_norm 1.1215 (1.2008) [2022-09-29 15:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][400/1251] eta 0:11:09 lr 0.000984 time 0.7972 (0.7868) loss 4.6543 (4.3563) grad_norm 1.0448 (1.1942) [2022-09-29 15:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][500/1251] eta 0:09:48 lr 0.000984 time 0.7994 (0.7837) loss 3.4584 (4.3451) grad_norm 0.9263 (1.1872) [2022-09-29 15:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][600/1251] eta 0:08:28 lr 0.000984 time 0.6883 (0.7815) loss 3.9687 (4.3345) grad_norm 1.2066 (1.1824) [2022-09-29 15:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][700/1251] eta 0:07:10 lr 0.000984 time 0.6770 (0.7805) loss 4.8473 (4.3303) grad_norm 1.0089 (1.1840) [2022-09-29 15:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][800/1251] eta 0:05:51 lr 0.000984 time 0.9355 (0.7798) loss 4.5481 (4.3517) grad_norm 1.0862 (1.1795) [2022-09-29 15:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][900/1251] eta 0:04:33 lr 0.000984 time 0.6802 (0.7789) loss 3.3243 (4.3538) grad_norm 1.2142 (1.1810) [2022-09-29 15:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1000/1251] eta 0:03:15 lr 0.000983 time 0.7970 (0.7777) loss 5.1737 (4.3447) grad_norm 1.2565 (1.1804) [2022-09-29 15:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1100/1251] eta 0:01:57 lr 0.000983 time 0.6452 (0.7776) loss 4.6063 (4.3542) grad_norm 1.1376 (1.1757) [2022-09-29 15:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1200/1251] eta 0:00:39 lr 0.000983 time 0.3112 (0.7736) loss 4.2596 (4.3499) grad_norm 1.1356 (1.1757) [2022-09-29 15:51:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 24 training takes 0:15:58 [2022-09-29 15:51:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.216 (4.216) Loss 1.5992 (1.5992) Acc@1 62.305 (62.305) Acc@5 86.816 (86.816) [2022-09-29 15:52:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 63.114 Acc@5 85.818 [2022-09-29 15:52:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 63.1% [2022-09-29 15:52:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 63.11% [2022-09-29 15:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][0/1251] eta 1:41:34 lr 0.000983 time 4.8721 (4.8721) loss 5.3171 (5.3171) grad_norm 0.8895 (0.8895) [2022-09-29 15:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][100/1251] eta 0:15:29 lr 0.000983 time 0.8507 (0.8074) loss 4.9868 (4.3152) grad_norm 1.0250 (1.1617) [2022-09-29 15:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][200/1251] eta 0:13:52 lr 0.000983 time 0.8491 (0.7922) loss 4.6268 (4.3268) grad_norm 1.1443 (1.1456) [2022-09-29 15:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][300/1251] eta 0:12:27 lr 0.000983 time 0.7028 (0.7863) loss 5.1022 (4.3210) grad_norm 0.9980 (1.1526) [2022-09-29 15:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][400/1251] eta 0:11:05 lr 0.000983 time 0.7556 (0.7824) loss 5.0647 (4.3230) grad_norm 1.2010 (1.1551) [2022-09-29 15:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][500/1251] eta 0:09:45 lr 0.000983 time 0.9129 (0.7802) loss 4.5224 (4.3291) grad_norm 1.1152 (1.1510) [2022-09-29 15:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][600/1251] eta 0:08:27 lr 0.000982 time 0.6592 (0.7801) loss 4.6403 (4.3269) grad_norm 1.3495 (1.1557) [2022-09-29 16:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][700/1251] eta 0:07:08 lr 0.000982 time 0.7917 (0.7783) loss 3.3800 (4.3228) grad_norm 1.2048 (1.1601) [2022-09-29 16:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][800/1251] eta 0:05:50 lr 0.000982 time 0.7813 (0.7772) loss 4.3882 (4.3137) grad_norm 0.9867 (1.1615) [2022-09-29 16:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][900/1251] eta 0:04:32 lr 0.000982 time 0.7633 (0.7769) loss 4.3509 (4.3267) grad_norm 1.1325 (1.1620) [2022-09-29 16:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1000/1251] eta 0:03:14 lr 0.000982 time 0.8553 (0.7768) loss 3.5780 (4.3272) grad_norm 0.9992 (1.1615) [2022-09-29 16:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1100/1251] eta 0:01:57 lr 0.000982 time 0.8052 (0.7769) loss 3.7283 (4.3298) grad_norm 1.0206 (1.1607) [2022-09-29 16:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1200/1251] eta 0:00:39 lr 0.000982 time 0.6825 (0.7762) loss 5.1803 (4.3277) grad_norm 1.0698 (1.1561) [2022-09-29 16:08:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 25 training takes 0:16:11 [2022-09-29 16:08:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.434 (4.434) Loss 1.5449 (1.5449) Acc@1 64.453 (64.453) Acc@5 86.035 (86.035) [2022-09-29 16:08:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 63.540 Acc@5 86.256 [2022-09-29 16:08:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 63.5% [2022-09-29 16:08:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 63.54% [2022-09-29 16:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][0/1251] eta 1:33:18 lr 0.000982 time 4.4753 (4.4753) loss 4.4396 (4.4396) grad_norm 1.3142 (1.3142) [2022-09-29 16:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][100/1251] eta 0:15:24 lr 0.000982 time 0.8283 (0.8030) loss 4.9668 (4.2643) grad_norm 1.1439 (1.1607) [2022-09-29 16:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][200/1251] eta 0:13:44 lr 0.000982 time 0.7663 (0.7848) loss 4.6569 (4.2821) grad_norm 1.0120 (1.1594) [2022-09-29 16:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][300/1251] eta 0:12:21 lr 0.000981 time 0.6474 (0.7794) loss 5.1852 (4.3093) grad_norm 1.2391 (1.1444) [2022-09-29 16:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][400/1251] eta 0:11:04 lr 0.000981 time 0.6735 (0.7812) loss 4.4832 (4.2975) grad_norm 1.4562 (1.1431) [2022-09-29 16:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][500/1251] eta 0:09:46 lr 0.000981 time 0.7495 (0.7807) loss 4.7002 (4.2912) grad_norm 1.0996 (1.1480) [2022-09-29 16:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][600/1251] eta 0:08:28 lr 0.000981 time 0.7792 (0.7817) loss 4.6014 (4.3033) grad_norm 0.9547 (1.1446) [2022-09-29 16:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][700/1251] eta 0:07:10 lr 0.000981 time 0.9133 (0.7815) loss 5.0422 (4.2918) grad_norm 1.2139 (1.1398) [2022-09-29 16:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][800/1251] eta 0:05:51 lr 0.000981 time 0.8220 (0.7803) loss 4.9768 (4.3007) grad_norm 1.3627 (1.1410) [2022-09-29 16:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][900/1251] eta 0:04:33 lr 0.000981 time 0.8017 (0.7801) loss 4.7354 (4.2925) grad_norm 1.0581 (1.1392) [2022-09-29 16:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1000/1251] eta 0:03:15 lr 0.000981 time 0.8154 (0.7782) loss 4.5534 (4.2976) grad_norm 1.2175 (1.1370) [2022-09-29 16:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1100/1251] eta 0:01:57 lr 0.000981 time 0.7810 (0.7783) loss 4.4430 (4.3058) grad_norm 0.9737 (1.1398) [2022-09-29 16:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1200/1251] eta 0:00:39 lr 0.000980 time 0.7913 (0.7774) loss 4.5562 (4.2916) grad_norm 1.2180 (1.1391) [2022-09-29 16:24:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 26 training takes 0:16:12 [2022-09-29 16:24:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.939 (3.939) Loss 1.6566 (1.6566) Acc@1 61.035 (61.035) Acc@5 83.594 (83.594) [2022-09-29 16:25:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 64.222 Acc@5 86.436 [2022-09-29 16:25:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 64.2% [2022-09-29 16:25:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 64.22% [2022-09-29 16:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][0/1251] eta 1:22:02 lr 0.000980 time 3.9345 (3.9345) loss 4.6873 (4.6873) grad_norm 1.2058 (1.2058) [2022-09-29 16:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][100/1251] eta 0:15:34 lr 0.000980 time 0.7437 (0.8117) loss 4.6519 (4.3601) grad_norm 1.2220 (1.1606) [2022-09-29 16:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][200/1251] eta 0:13:58 lr 0.000980 time 0.8660 (0.7979) loss 4.6312 (4.2642) grad_norm 0.9696 (1.1847) [2022-09-29 16:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][300/1251] eta 0:12:35 lr 0.000980 time 0.8065 (0.7945) loss 3.9414 (4.2526) grad_norm 1.0023 (1.1698) [2022-09-29 16:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][400/1251] eta 0:11:13 lr 0.000980 time 0.8119 (0.7914) loss 3.5928 (4.2394) grad_norm 1.0673 (1.1617) [2022-09-29 16:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][500/1251] eta 0:09:53 lr 0.000980 time 0.8452 (0.7898) loss 4.3197 (4.2238) grad_norm 1.0413 (1.1510) [2022-09-29 16:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][600/1251] eta 0:08:33 lr 0.000980 time 0.7432 (0.7887) loss 4.5906 (4.2406) grad_norm 1.0689 (1.1411) [2022-09-29 16:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][700/1251] eta 0:07:14 lr 0.000980 time 0.7452 (0.7885) loss 3.9296 (4.2350) grad_norm 1.1666 (1.1387) [2022-09-29 16:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][800/1251] eta 0:05:54 lr 0.000979 time 0.8634 (0.7863) loss 4.2713 (4.2397) grad_norm 0.9971 (1.1381) [2022-09-29 16:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][900/1251] eta 0:04:35 lr 0.000979 time 0.7938 (0.7854) loss 3.3856 (4.2246) grad_norm 0.9843 (1.1388) [2022-09-29 16:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1000/1251] eta 0:03:16 lr 0.000979 time 0.8678 (0.7847) loss 4.9904 (4.2409) grad_norm 1.1759 (1.1351) [2022-09-29 16:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1100/1251] eta 0:01:58 lr 0.000979 time 0.8437 (0.7842) loss 3.5685 (4.2447) grad_norm 1.1197 (1.1315) [2022-09-29 16:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1200/1251] eta 0:00:39 lr 0.000979 time 0.6978 (0.7834) loss 3.5495 (4.2460) grad_norm 1.0591 (1.1348) [2022-09-29 16:41:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 27 training takes 0:16:20 [2022-09-29 16:41:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.080 (4.080) Loss 1.6463 (1.6463) Acc@1 63.281 (63.281) Acc@5 85.254 (85.254) [2022-09-29 16:41:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.010 Acc@5 86.774 [2022-09-29 16:41:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.0% [2022-09-29 16:41:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.01% [2022-09-29 16:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][0/1251] eta 1:41:06 lr 0.000979 time 4.8497 (4.8497) loss 4.6423 (4.6423) grad_norm 1.0688 (1.0688) [2022-09-29 16:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][100/1251] eta 0:15:57 lr 0.000979 time 0.8443 (0.8321) loss 4.9028 (4.2452) grad_norm 1.0195 (1.1321) [2022-09-29 16:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][200/1251] eta 0:14:07 lr 0.000979 time 0.8510 (0.8064) loss 3.9322 (4.3023) grad_norm 1.5783 (1.1232) [2022-09-29 16:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][300/1251] eta 0:12:33 lr 0.000979 time 0.6390 (0.7928) loss 3.5614 (4.2784) grad_norm 0.9879 (1.1138) [2022-09-29 16:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][400/1251] eta 0:11:10 lr 0.000978 time 0.7666 (0.7878) loss 3.8393 (4.2360) grad_norm 1.1400 (1.1258) [2022-09-29 16:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][500/1251] eta 0:09:50 lr 0.000978 time 0.7886 (0.7866) loss 4.0165 (4.2432) grad_norm 1.0690 (1.1256) [2022-09-29 16:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][600/1251] eta 0:08:31 lr 0.000978 time 0.8734 (0.7855) loss 4.4121 (4.2570) grad_norm 0.9188 (1.1279) [2022-09-29 16:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][700/1251] eta 0:07:11 lr 0.000978 time 0.8217 (0.7827) loss 4.3105 (4.2347) grad_norm 1.1747 (1.1250) [2022-09-29 16:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][800/1251] eta 0:05:51 lr 0.000978 time 0.8019 (0.7805) loss 4.2600 (4.2330) grad_norm 0.9843 (1.1235) [2022-09-29 16:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][900/1251] eta 0:04:33 lr 0.000978 time 0.6767 (0.7796) loss 4.4197 (4.2209) grad_norm 1.1266 (1.1252) [2022-09-29 16:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1000/1251] eta 0:03:15 lr 0.000978 time 0.8289 (0.7790) loss 4.0809 (4.2172) grad_norm 1.0132 (1.1292) [2022-09-29 16:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1100/1251] eta 0:01:57 lr 0.000978 time 0.7058 (0.7787) loss 4.1308 (4.2147) grad_norm 0.9921 (1.1273) [2022-09-29 16:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1200/1251] eta 0:00:39 lr 0.000977 time 0.7989 (0.7775) loss 4.4136 (4.2120) grad_norm 1.0107 (1.1268) [2022-09-29 16:57:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 28 training takes 0:16:11 [2022-09-29 16:58:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 5.168 (5.168) Loss 1.5130 (1.5130) Acc@1 66.699 (66.699) Acc@5 85.840 (85.840) [2022-09-29 16:58:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.470 Acc@5 87.168 [2022-09-29 16:58:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.5% [2022-09-29 16:58:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.47% [2022-09-29 16:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][0/1251] eta 1:27:56 lr 0.000977 time 4.2181 (4.2181) loss 4.6066 (4.6066) grad_norm 1.1593 (1.1593) [2022-09-29 16:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][100/1251] eta 0:15:39 lr 0.000977 time 0.8204 (0.8165) loss 5.0113 (4.1005) grad_norm 1.0647 (1.1665) [2022-09-29 17:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][200/1251] eta 0:13:58 lr 0.000977 time 0.8574 (0.7983) loss 3.2157 (4.1253) grad_norm 1.1073 (1.1746) [2022-09-29 17:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][300/1251] eta 0:12:31 lr 0.000977 time 0.8115 (0.7902) loss 4.2719 (4.1582) grad_norm 0.8417 (1.1553) [2022-09-29 17:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][400/1251] eta 0:11:09 lr 0.000977 time 0.8523 (0.7873) loss 3.9502 (4.1737) grad_norm 0.9948 (1.1434) [2022-09-29 17:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][500/1251] eta 0:09:50 lr 0.000977 time 0.8067 (0.7867) loss 3.5241 (4.1756) grad_norm 1.0407 (1.1432) [2022-09-29 17:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][600/1251] eta 0:08:30 lr 0.000977 time 0.7893 (0.7849) loss 4.3589 (4.1699) grad_norm 1.0728 (1.1349) [2022-09-29 17:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][700/1251] eta 0:07:11 lr 0.000976 time 0.7177 (0.7837) loss 4.7696 (4.1821) grad_norm 0.9336 (1.1354) [2022-09-29 17:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][800/1251] eta 0:05:52 lr 0.000976 time 0.8532 (0.7826) loss 4.7413 (4.1851) grad_norm 1.0499 (1.1303) [2022-09-29 17:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][900/1251] eta 0:04:34 lr 0.000976 time 0.6371 (0.7819) loss 4.8871 (4.1884) grad_norm 0.9847 (1.1286) [2022-09-29 17:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1000/1251] eta 0:03:15 lr 0.000976 time 0.7660 (0.7805) loss 5.3236 (4.1848) grad_norm 1.2387 (1.1272) [2022-09-29 17:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1100/1251] eta 0:01:57 lr 0.000976 time 0.7998 (0.7804) loss 4.0926 (4.1873) grad_norm 1.1772 (1.1252) [2022-09-29 17:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1200/1251] eta 0:00:39 lr 0.000976 time 0.8287 (0.7805) loss 4.4655 (4.1874) grad_norm 1.3459 (1.1228) [2022-09-29 17:14:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 29 training takes 0:16:16 [2022-09-29 17:14:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.787 (3.787) Loss 1.5183 (1.5183) Acc@1 65.234 (65.234) Acc@5 88.184 (88.184) [2022-09-29 17:14:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.598 Acc@5 87.528 [2022-09-29 17:14:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.6% [2022-09-29 17:14:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.60% [2022-09-29 17:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][0/1251] eta 1:45:14 lr 0.000976 time 5.0473 (5.0473) loss 4.8927 (4.8927) grad_norm 1.0204 (1.0204) [2022-09-29 17:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][100/1251] eta 0:15:49 lr 0.000976 time 0.8000 (0.8246) loss 3.4183 (4.1689) grad_norm 1.2247 (1.1018) [2022-09-29 17:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][200/1251] eta 0:14:02 lr 0.000976 time 0.8440 (0.8021) loss 4.9746 (4.2093) grad_norm 1.1693 (1.1086) [2022-09-29 17:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][300/1251] eta 0:12:36 lr 0.000975 time 0.8572 (0.7956) loss 4.2931 (4.1746) grad_norm 1.2608 (1.1013) [2022-09-29 17:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][400/1251] eta 0:11:12 lr 0.000975 time 0.8239 (0.7906) loss 4.5064 (4.1835) grad_norm 1.1006 (1.1084) [2022-09-29 17:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][500/1251] eta 0:09:51 lr 0.000975 time 0.9228 (0.7880) loss 3.2404 (4.1623) grad_norm 1.1743 (1.1054) [2022-09-29 17:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][600/1251] eta 0:08:30 lr 0.000975 time 0.6822 (0.7842) loss 2.7941 (4.1719) grad_norm 0.8882 (1.1079) [2022-09-29 17:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][700/1251] eta 0:07:11 lr 0.000975 time 0.8221 (0.7828) loss 3.7593 (4.1772) grad_norm 1.0145 (1.1087) [2022-09-29 17:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][800/1251] eta 0:05:52 lr 0.000975 time 0.6537 (0.7809) loss 4.3691 (4.1719) grad_norm 1.3289 (1.1094) [2022-09-29 17:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][900/1251] eta 0:04:33 lr 0.000975 time 0.6513 (0.7803) loss 3.6718 (4.1717) grad_norm 1.3015 (1.1076) [2022-09-29 17:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1000/1251] eta 0:03:15 lr 0.000974 time 0.8715 (0.7797) loss 4.2785 (4.1734) grad_norm 1.2768 (1.1091) [2022-09-29 17:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1100/1251] eta 0:01:57 lr 0.000974 time 0.8565 (0.7800) loss 4.0868 (4.1658) grad_norm 1.1709 (1.1066) [2022-09-29 17:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1200/1251] eta 0:00:39 lr 0.000974 time 0.7843 (0.7795) loss 4.3550 (4.1652) grad_norm 0.9711 (1.1033) [2022-09-29 17:31:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 30 training takes 0:16:15 [2022-09-29 17:31:13 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saving...... [2022-09-29 17:31:13 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saved !!! [2022-09-29 17:31:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.100 (4.100) Loss 1.5439 (1.5439) Acc@1 64.746 (64.746) Acc@5 86.914 (86.914) [2022-09-29 17:31:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.890 Acc@5 87.546 [2022-09-29 17:31:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.9% [2022-09-29 17:31:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.89% [2022-09-29 17:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][0/1251] eta 1:37:41 lr 0.000974 time 4.6851 (4.6851) loss 3.0640 (3.0640) grad_norm 1.3178 (1.3178) [2022-09-29 17:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][100/1251] eta 0:15:39 lr 0.000974 time 0.8529 (0.8162) loss 4.5324 (4.0325) grad_norm 0.9256 (1.1271) [2022-09-29 17:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][200/1251] eta 0:14:00 lr 0.000974 time 0.7162 (0.7993) loss 4.2727 (4.1165) grad_norm 1.1213 (1.1163) [2022-09-29 17:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][300/1251] eta 0:12:30 lr 0.000974 time 0.8031 (0.7895) loss 3.7474 (4.1368) grad_norm 0.9656 (1.1052) [2022-09-29 17:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][400/1251] eta 0:11:10 lr 0.000974 time 0.7803 (0.7875) loss 4.1247 (4.1481) grad_norm 0.9789 (1.1129) [2022-09-29 17:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][500/1251] eta 0:09:50 lr 0.000973 time 0.7130 (0.7859) loss 3.8999 (4.1511) grad_norm 1.1334 (1.1135) [2022-09-29 17:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][600/1251] eta 0:08:31 lr 0.000973 time 0.8413 (0.7858) loss 4.6039 (4.1445) grad_norm 0.9780 (1.1156) [2022-09-29 17:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][700/1251] eta 0:07:13 lr 0.000973 time 0.8065 (0.7860) loss 4.6172 (4.1571) grad_norm 0.9510 (1.1164) [2022-09-29 17:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][800/1251] eta 0:05:54 lr 0.000973 time 0.7320 (0.7850) loss 4.0680 (4.1618) grad_norm 1.1741 (1.1132) [2022-09-29 17:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][900/1251] eta 0:04:35 lr 0.000973 time 0.8286 (0.7837) loss 4.3785 (4.1679) grad_norm 1.2250 (1.1125) [2022-09-29 17:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1000/1251] eta 0:03:16 lr 0.000973 time 0.8096 (0.7831) loss 4.3709 (4.1692) grad_norm 1.6427 (1.1097) [2022-09-29 17:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1100/1251] eta 0:01:58 lr 0.000973 time 0.6401 (0.7815) loss 3.0904 (4.1662) grad_norm 1.0911 (1.1098) [2022-09-29 17:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1200/1251] eta 0:00:39 lr 0.000973 time 0.9389 (0.7815) loss 4.4310 (4.1638) grad_norm 1.2228 (1.1088) [2022-09-29 17:47:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 31 training takes 0:16:17 [2022-09-29 17:47:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.420 (4.420) Loss 1.5007 (1.5007) Acc@1 65.723 (65.723) Acc@5 87.793 (87.793) [2022-09-29 17:48:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 66.262 Acc@5 87.816 [2022-09-29 17:48:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 66.3% [2022-09-29 17:48:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 66.26% [2022-09-29 17:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][0/1251] eta 1:42:48 lr 0.000972 time 4.9306 (4.9306) loss 3.7816 (3.7816) grad_norm 0.9581 (0.9581) [2022-09-29 17:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][100/1251] eta 0:15:18 lr 0.000972 time 0.6703 (0.7984) loss 3.7881 (4.1100) grad_norm 0.9642 (1.0717) [2022-09-29 17:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][200/1251] eta 0:13:47 lr 0.000972 time 0.8337 (0.7874) loss 3.9409 (4.1298) grad_norm 1.1587 (1.0912) [2022-09-29 17:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][300/1251] eta 0:12:24 lr 0.000972 time 0.6888 (0.7825) loss 4.6166 (4.1599) grad_norm 1.1233 (1.0977) [2022-09-29 17:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][400/1251] eta 0:11:02 lr 0.000972 time 0.9594 (0.7782) loss 3.3288 (4.1516) grad_norm 1.0070 (1.0982) [2022-09-29 17:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][500/1251] eta 0:09:43 lr 0.000972 time 0.8312 (0.7766) loss 3.8783 (4.1763) grad_norm 1.1905 (1.1038) [2022-09-29 17:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][600/1251] eta 0:08:25 lr 0.000972 time 0.7832 (0.7767) loss 3.3010 (4.1546) grad_norm 1.2125 (1.1004) [2022-09-29 17:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][700/1251] eta 0:07:07 lr 0.000972 time 0.8127 (0.7766) loss 4.2607 (4.1506) grad_norm 0.9083 (1.1013) [2022-09-29 17:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][800/1251] eta 0:05:49 lr 0.000971 time 0.7473 (0.7752) loss 3.9750 (4.1381) grad_norm 1.1919 (1.1007) [2022-09-29 17:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][900/1251] eta 0:04:31 lr 0.000971 time 0.9212 (0.7747) loss 4.4874 (4.1345) grad_norm 0.9638 (1.0967) [2022-09-29 18:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1000/1251] eta 0:03:14 lr 0.000971 time 0.8001 (0.7743) loss 4.1869 (4.1365) grad_norm 1.1800 (1.0968) [2022-09-29 18:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1100/1251] eta 0:01:56 lr 0.000971 time 0.8265 (0.7735) loss 4.1963 (4.1454) grad_norm 0.9868 (1.0960) [2022-09-29 18:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1200/1251] eta 0:00:39 lr 0.000971 time 0.7278 (0.7730) loss 4.9165 (4.1472) grad_norm 1.0612 (1.0965) [2022-09-29 18:04:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 32 training takes 0:16:07 [2022-09-29 18:04:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.196 (4.196) Loss 1.3691 (1.3691) Acc@1 68.066 (68.066) Acc@5 90.332 (90.332) [2022-09-29 18:04:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 66.632 Acc@5 88.148 [2022-09-29 18:04:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 66.6% [2022-09-29 18:04:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 66.63% [2022-09-29 18:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][0/1251] eta 1:46:46 lr 0.000971 time 5.1207 (5.1207) loss 4.4136 (4.4136) grad_norm 1.3381 (1.3381) [2022-09-29 18:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][100/1251] eta 0:15:39 lr 0.000971 time 0.7576 (0.8160) loss 4.0227 (4.1176) grad_norm 0.9799 (1.1515) [2022-09-29 18:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][200/1251] eta 0:13:54 lr 0.000970 time 0.8213 (0.7943) loss 3.3488 (4.0999) grad_norm 0.9217 (1.1123) [2022-09-29 18:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][300/1251] eta 0:12:27 lr 0.000970 time 0.8171 (0.7861) loss 4.2904 (4.1124) grad_norm 1.0573 (1.0959) [2022-09-29 18:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][400/1251] eta 0:11:04 lr 0.000970 time 0.7748 (0.7809) loss 3.5683 (4.0877) grad_norm 0.9204 (1.0907) [2022-09-29 18:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][500/1251] eta 0:09:43 lr 0.000970 time 0.7835 (0.7770) loss 4.2306 (4.0796) grad_norm 1.2734 (1.0926) [2022-09-29 18:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][600/1251] eta 0:08:26 lr 0.000970 time 0.8996 (0.7781) loss 3.4250 (4.0708) grad_norm 0.9623 (1.0941) [2022-09-29 18:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][700/1251] eta 0:07:08 lr 0.000970 time 0.6835 (0.7768) loss 4.6961 (4.0881) grad_norm 1.1672 (1.0980) [2022-09-29 18:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][800/1251] eta 0:05:44 lr 0.000970 time 0.5350 (0.7645) loss 3.2517 (4.0912) grad_norm 1.3788 (1.1023) [2022-09-29 18:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][900/1251] eta 0:04:25 lr 0.000969 time 0.8162 (0.7566) loss 3.3342 (4.0967) grad_norm 1.0544 (1.0999) [2022-09-29 18:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1000/1251] eta 0:03:10 lr 0.000969 time 0.6494 (0.7578) loss 4.5663 (4.1009) grad_norm 1.0274 (1.1000) [2022-09-29 18:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1100/1251] eta 0:01:54 lr 0.000969 time 0.9072 (0.7589) loss 3.9933 (4.1017) grad_norm 1.2596 (1.1016) [2022-09-29 18:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1200/1251] eta 0:00:38 lr 0.000969 time 0.7934 (0.7596) loss 4.6390 (4.1067) grad_norm 1.1663 (1.0999) [2022-09-29 18:20:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 33 training takes 0:15:51 [2022-09-29 18:20:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.341 (4.341) Loss 1.4409 (1.4409) Acc@1 65.430 (65.430) Acc@5 87.207 (87.207) [2022-09-29 18:20:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.110 Acc@5 88.344 [2022-09-29 18:20:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.1% [2022-09-29 18:20:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.11% [2022-09-29 18:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][0/1251] eta 1:44:31 lr 0.000969 time 5.0130 (5.0130) loss 4.2931 (4.2931) grad_norm 1.1578 (1.1578) [2022-09-29 18:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][100/1251] eta 0:15:43 lr 0.000969 time 0.7404 (0.8201) loss 4.3705 (4.0037) grad_norm 1.0591 (1.0897) [2022-09-29 18:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][200/1251] eta 0:13:57 lr 0.000969 time 0.8061 (0.7971) loss 4.5212 (4.0462) grad_norm 1.2156 (1.1143) [2022-09-29 18:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][300/1251] eta 0:12:31 lr 0.000969 time 0.8992 (0.7907) loss 4.1886 (4.0652) grad_norm 1.3401 (1.1154) [2022-09-29 18:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][400/1251] eta 0:11:09 lr 0.000968 time 0.7900 (0.7872) loss 3.4327 (4.0612) grad_norm 1.0611 (1.1125) [2022-09-29 18:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][500/1251] eta 0:09:48 lr 0.000968 time 0.7750 (0.7836) loss 4.3758 (4.0565) grad_norm 1.2939 (1.1073) [2022-09-29 18:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][600/1251] eta 0:08:28 lr 0.000968 time 0.7043 (0.7816) loss 3.6126 (4.0679) grad_norm 0.9033 (1.1061) [2022-09-29 18:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][700/1251] eta 0:07:10 lr 0.000968 time 0.7876 (0.7812) loss 4.4575 (4.0811) grad_norm 1.0466 (1.1018) [2022-09-29 18:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][800/1251] eta 0:05:51 lr 0.000968 time 0.7786 (0.7804) loss 4.1715 (4.0937) grad_norm 0.9292 (1.1033) [2022-09-29 18:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][900/1251] eta 0:04:33 lr 0.000968 time 0.7977 (0.7797) loss 4.2914 (4.1029) grad_norm 1.2098 (1.1036) [2022-09-29 18:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1000/1251] eta 0:03:15 lr 0.000967 time 0.7232 (0.7801) loss 4.8018 (4.0990) grad_norm 0.9346 (1.1008) [2022-09-29 18:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1100/1251] eta 0:01:57 lr 0.000967 time 0.7976 (0.7792) loss 3.8508 (4.0972) grad_norm 0.9148 (1.0973) [2022-09-29 18:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1200/1251] eta 0:00:39 lr 0.000967 time 0.8091 (0.7792) loss 4.9944 (4.0999) grad_norm 0.9581 (1.0968) [2022-09-29 18:37:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 34 training takes 0:16:14 [2022-09-29 18:37:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.136 (4.136) Loss 1.3950 (1.3950) Acc@1 66.699 (66.699) Acc@5 87.988 (87.988) [2022-09-29 18:37:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.276 Acc@5 88.344 [2022-09-29 18:37:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.3% [2022-09-29 18:37:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.28% [2022-09-29 18:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][0/1251] eta 1:40:23 lr 0.000967 time 4.8147 (4.8147) loss 4.4138 (4.4138) grad_norm 1.6406 (1.6406) [2022-09-29 18:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][100/1251] eta 0:15:48 lr 0.000967 time 0.7815 (0.8242) loss 3.1405 (4.0451) grad_norm 0.8680 (1.0972) [2022-09-29 18:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][200/1251] eta 0:13:57 lr 0.000967 time 0.8512 (0.7966) loss 2.9199 (4.0753) grad_norm 0.9805 (1.0839) [2022-09-29 18:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][300/1251] eta 0:12:30 lr 0.000967 time 0.8098 (0.7897) loss 4.5813 (4.0966) grad_norm 0.9979 (1.0778) [2022-09-29 18:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][400/1251] eta 0:11:10 lr 0.000967 time 0.8428 (0.7873) loss 4.6781 (4.1223) grad_norm 0.9584 (1.0855) [2022-09-29 18:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][500/1251] eta 0:09:48 lr 0.000966 time 0.8367 (0.7830) loss 4.1979 (4.1245) grad_norm 0.9988 (1.0797) [2022-09-29 18:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][600/1251] eta 0:08:28 lr 0.000966 time 0.8015 (0.7805) loss 3.9126 (4.1339) grad_norm 0.9105 (1.0873) [2022-09-29 18:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][700/1251] eta 0:07:09 lr 0.000966 time 0.6799 (0.7793) loss 4.0637 (4.1139) grad_norm 1.1238 (1.0894) [2022-09-29 18:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][800/1251] eta 0:05:51 lr 0.000966 time 0.8048 (0.7793) loss 4.5275 (4.1073) grad_norm 1.1165 (1.0958) [2022-09-29 18:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][900/1251] eta 0:04:33 lr 0.000966 time 0.8650 (0.7793) loss 3.3514 (4.1090) grad_norm 1.1698 (1.0994) [2022-09-29 18:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1000/1251] eta 0:03:15 lr 0.000966 time 0.8256 (0.7786) loss 4.1214 (4.1143) grad_norm 1.2612 (1.0954) [2022-09-29 18:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1100/1251] eta 0:01:57 lr 0.000965 time 0.7608 (0.7775) loss 4.9051 (4.1062) grad_norm 1.2004 (1.0959) [2022-09-29 18:53:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1200/1251] eta 0:00:39 lr 0.000965 time 0.7480 (0.7768) loss 3.0709 (4.1051) grad_norm 1.1765 (1.0936) [2022-09-29 18:53:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 35 training takes 0:16:11 [2022-09-29 18:53:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.056 (4.056) Loss 1.5434 (1.5434) Acc@1 65.039 (65.039) Acc@5 87.305 (87.305) [2022-09-29 18:54:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.398 Acc@5 88.512 [2022-09-29 18:54:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.4% [2022-09-29 18:54:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.40% [2022-09-29 18:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][0/1251] eta 1:42:32 lr 0.000965 time 4.9179 (4.9179) loss 4.3926 (4.3926) grad_norm 1.0863 (1.0863) [2022-09-29 18:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][100/1251] eta 0:15:34 lr 0.000965 time 0.8524 (0.8119) loss 4.1578 (4.1364) grad_norm 1.1763 (1.1300) [2022-09-29 18:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][200/1251] eta 0:13:51 lr 0.000965 time 0.8770 (0.7914) loss 4.7914 (4.0996) grad_norm 1.0278 (1.1009) [2022-09-29 18:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][300/1251] eta 0:12:25 lr 0.000965 time 0.8520 (0.7843) loss 4.5442 (4.1071) grad_norm 1.2512 (1.1008) [2022-09-29 18:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][400/1251] eta 0:11:06 lr 0.000965 time 0.8486 (0.7834) loss 4.8307 (4.0944) grad_norm 1.1556 (1.0954) [2022-09-29 19:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][500/1251] eta 0:09:47 lr 0.000964 time 0.6962 (0.7825) loss 3.7066 (4.0722) grad_norm 0.9920 (1.0925) [2022-09-29 19:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][600/1251] eta 0:08:27 lr 0.000964 time 0.6697 (0.7801) loss 4.3747 (4.0671) grad_norm 1.1742 (1.0982) [2022-09-29 19:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][700/1251] eta 0:07:10 lr 0.000964 time 0.8764 (0.7805) loss 4.0724 (4.0708) grad_norm 1.2670 (1.1005) [2022-09-29 19:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][800/1251] eta 0:05:50 lr 0.000964 time 0.7530 (0.7782) loss 3.5549 (4.0662) grad_norm 1.3582 (1.1024) [2022-09-29 19:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][900/1251] eta 0:04:32 lr 0.000964 time 0.8338 (0.7772) loss 3.4891 (4.0577) grad_norm 0.9783 (1.1044) [2022-09-29 19:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1000/1251] eta 0:03:14 lr 0.000964 time 0.8092 (0.7758) loss 3.2386 (4.0677) grad_norm 1.3525 (1.1020) [2022-09-29 19:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1100/1251] eta 0:01:57 lr 0.000964 time 0.8166 (0.7757) loss 3.9861 (4.0673) grad_norm 0.9745 (1.1023) [2022-09-29 19:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1200/1251] eta 0:00:39 lr 0.000963 time 0.9187 (0.7754) loss 4.4444 (4.0743) grad_norm 1.0299 (1.0987) [2022-09-29 19:10:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 36 training takes 0:16:09 [2022-09-29 19:10:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.440 (4.440) Loss 1.4687 (1.4687) Acc@1 67.773 (67.773) Acc@5 89.062 (89.062) [2022-09-29 19:10:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.828 Acc@5 88.658 [2022-09-29 19:10:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.8% [2022-09-29 19:10:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.83% [2022-09-29 19:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][0/1251] eta 1:40:48 lr 0.000963 time 4.8351 (4.8351) loss 4.2414 (4.2414) grad_norm 1.2790 (1.2790) [2022-09-29 19:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][100/1251] eta 0:15:36 lr 0.000963 time 0.8742 (0.8140) loss 4.6633 (4.1118) grad_norm 1.1375 (1.1048) [2022-09-29 19:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][200/1251] eta 0:13:54 lr 0.000963 time 0.8227 (0.7941) loss 4.5481 (4.0797) grad_norm 1.5075 (1.1006) [2022-09-29 19:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][300/1251] eta 0:12:28 lr 0.000963 time 0.8031 (0.7867) loss 3.2644 (4.0550) grad_norm 1.0257 (1.1021) [2022-09-29 19:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][400/1251] eta 0:11:07 lr 0.000963 time 0.9500 (0.7844) loss 4.1517 (4.0573) grad_norm 1.0995 (1.0976) [2022-09-29 19:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][500/1251] eta 0:09:46 lr 0.000963 time 0.8352 (0.7810) loss 3.3299 (4.0603) grad_norm 0.9608 (1.0978) [2022-09-29 19:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][600/1251] eta 0:08:27 lr 0.000962 time 0.8265 (0.7803) loss 4.8471 (4.0707) grad_norm 1.2102 (1.0915) [2022-09-29 19:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][700/1251] eta 0:07:09 lr 0.000962 time 0.7225 (0.7792) loss 3.1889 (4.0745) grad_norm 1.1919 (1.0973) [2022-09-29 19:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][800/1251] eta 0:05:51 lr 0.000962 time 0.7281 (0.7788) loss 3.7490 (4.0840) grad_norm 1.0260 (1.0951) [2022-09-29 19:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][900/1251] eta 0:04:33 lr 0.000962 time 0.9273 (0.7795) loss 3.9937 (4.0788) grad_norm 1.1521 (1.0926) [2022-09-29 19:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1000/1251] eta 0:03:15 lr 0.000962 time 0.8239 (0.7799) loss 4.3741 (4.0794) grad_norm 1.0354 (1.0913) [2022-09-29 19:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1100/1251] eta 0:01:57 lr 0.000962 time 0.7257 (0.7792) loss 4.8312 (4.0793) grad_norm 0.9331 (1.0919) [2022-09-29 19:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1200/1251] eta 0:00:39 lr 0.000961 time 0.8224 (0.7777) loss 4.5930 (4.0846) grad_norm 1.0731 (1.0913) [2022-09-29 19:26:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 37 training takes 0:16:13 [2022-09-29 19:26:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.777 (4.777) Loss 1.3948 (1.3948) Acc@1 68.066 (68.066) Acc@5 87.305 (87.305) [2022-09-29 19:27:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.074 Acc@5 88.894 [2022-09-29 19:27:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.1% [2022-09-29 19:27:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.07% [2022-09-29 19:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][0/1251] eta 1:38:27 lr 0.000961 time 4.7223 (4.7223) loss 4.1195 (4.1195) grad_norm 0.8789 (0.8789) [2022-09-29 19:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][100/1251] eta 0:15:45 lr 0.000961 time 0.8331 (0.8217) loss 3.0360 (4.0596) grad_norm 1.0673 (1.1027) [2022-09-29 19:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][200/1251] eta 0:13:59 lr 0.000961 time 0.6688 (0.7992) loss 4.2599 (4.0853) grad_norm 0.9294 (1.0971) [2022-09-29 19:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][300/1251] eta 0:12:29 lr 0.000961 time 0.8357 (0.7880) loss 3.9250 (4.1005) grad_norm 1.0354 (1.0874) [2022-09-29 19:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][400/1251] eta 0:11:07 lr 0.000961 time 0.6642 (0.7840) loss 4.7902 (4.0994) grad_norm 1.0480 (1.0872) [2022-09-29 19:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][500/1251] eta 0:09:46 lr 0.000961 time 0.8220 (0.7808) loss 4.0701 (4.0901) grad_norm 1.0727 (1.0887) [2022-09-29 19:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][600/1251] eta 0:08:27 lr 0.000960 time 0.8588 (0.7799) loss 4.6066 (4.0755) grad_norm 1.0598 (1.0840) [2022-09-29 19:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][700/1251] eta 0:07:09 lr 0.000960 time 0.8239 (0.7794) loss 5.0874 (4.0802) grad_norm 1.5132 (1.0843) [2022-09-29 19:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][800/1251] eta 0:05:51 lr 0.000960 time 0.7411 (0.7787) loss 4.1846 (4.0746) grad_norm 1.2389 (1.0884) [2022-09-29 19:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][900/1251] eta 0:04:33 lr 0.000960 time 0.7386 (0.7788) loss 4.1925 (4.0818) grad_norm 1.1796 (1.0912) [2022-09-29 19:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1000/1251] eta 0:03:15 lr 0.000960 time 0.7450 (0.7781) loss 4.9024 (4.0803) grad_norm 0.9638 (1.0897) [2022-09-29 19:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1100/1251] eta 0:01:57 lr 0.000960 time 0.8047 (0.7783) loss 2.9794 (4.0754) grad_norm 1.2425 (1.0893) [2022-09-29 19:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1200/1251] eta 0:00:39 lr 0.000959 time 0.7364 (0.7779) loss 3.4875 (4.0687) grad_norm 1.1549 (1.0900) [2022-09-29 19:43:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 38 training takes 0:16:12 [2022-09-29 19:43:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.591 (4.591) Loss 1.2902 (1.2902) Acc@1 70.020 (70.020) Acc@5 90.234 (90.234) [2022-09-29 19:43:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.454 Acc@5 89.116 [2022-09-29 19:43:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.5% [2022-09-29 19:43:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.45% [2022-09-29 19:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][0/1251] eta 1:41:25 lr 0.000959 time 4.8641 (4.8641) loss 3.7344 (3.7344) grad_norm 0.9235 (0.9235) [2022-09-29 19:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][100/1251] eta 0:15:25 lr 0.000959 time 0.6919 (0.8037) loss 3.4462 (4.0287) grad_norm 0.9962 (1.0741) [2022-09-29 19:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][200/1251] eta 0:13:50 lr 0.000959 time 0.7690 (0.7899) loss 2.9862 (4.0838) grad_norm 1.2156 (1.0755) [2022-09-29 19:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][300/1251] eta 0:12:26 lr 0.000959 time 0.7531 (0.7855) loss 4.9485 (4.0649) grad_norm 1.0785 (1.0789) [2022-09-29 19:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][400/1251] eta 0:11:06 lr 0.000959 time 0.8131 (0.7829) loss 4.4119 (4.0642) grad_norm 1.2866 (1.0900) [2022-09-29 19:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][500/1251] eta 0:09:45 lr 0.000958 time 0.8669 (0.7797) loss 4.4336 (4.0581) grad_norm 0.9878 (1.0968) [2022-09-29 19:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][600/1251] eta 0:08:27 lr 0.000958 time 0.7613 (0.7797) loss 4.3488 (4.0697) grad_norm 1.4005 (1.0963) [2022-09-29 19:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][700/1251] eta 0:07:08 lr 0.000958 time 0.6638 (0.7782) loss 3.5818 (4.0655) grad_norm 1.0787 (1.0948) [2022-09-29 19:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][800/1251] eta 0:05:51 lr 0.000958 time 0.8007 (0.7785) loss 4.5262 (4.0600) grad_norm 1.1323 (1.0921) [2022-09-29 19:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][900/1251] eta 0:04:33 lr 0.000958 time 0.7317 (0.7788) loss 3.9327 (4.0588) grad_norm 0.9974 (1.0903) [2022-09-29 19:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1000/1251] eta 0:03:15 lr 0.000958 time 0.7861 (0.7776) loss 4.2067 (4.0532) grad_norm 1.1203 (1.0893) [2022-09-29 19:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1100/1251] eta 0:01:57 lr 0.000957 time 0.8277 (0.7775) loss 4.2222 (4.0562) grad_norm 0.9441 (1.0904) [2022-09-29 19:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1200/1251] eta 0:00:39 lr 0.000957 time 0.8524 (0.7766) loss 4.3273 (4.0661) grad_norm 1.0577 (1.0890) [2022-09-29 19:59:52 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 39 training takes 0:16:12 [2022-09-29 19:59:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.633 (4.633) Loss 1.3223 (1.3223) Acc@1 70.508 (70.508) Acc@5 89.746 (89.746) [2022-09-29 20:00:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.360 Acc@5 88.996 [2022-09-29 20:00:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-09-29 20:00:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.45% [2022-09-29 20:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][0/1251] eta 1:42:50 lr 0.000957 time 4.9327 (4.9327) loss 3.1103 (3.1103) grad_norm 1.2140 (1.2140) [2022-09-29 20:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][100/1251] eta 0:15:41 lr 0.000957 time 0.6655 (0.8181) loss 4.6109 (3.9785) grad_norm 0.9149 (1.1064) [2022-09-29 20:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][200/1251] eta 0:13:53 lr 0.000957 time 0.7962 (0.7932) loss 3.9440 (3.9876) grad_norm 1.0330 (1.1003) [2022-09-29 20:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][300/1251] eta 0:12:27 lr 0.000957 time 0.8560 (0.7865) loss 4.6128 (4.0102) grad_norm 1.0465 (1.0994) [2022-09-29 20:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][400/1251] eta 0:11:04 lr 0.000957 time 0.6451 (0.7807) loss 4.7783 (4.0484) grad_norm 0.9339 (1.1007) [2022-09-29 20:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][500/1251] eta 0:09:45 lr 0.000956 time 0.8903 (0.7802) loss 4.7119 (4.0520) grad_norm 1.1328 (1.0978) [2022-09-29 20:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][600/1251] eta 0:08:27 lr 0.000956 time 0.7624 (0.7799) loss 4.1408 (4.0454) grad_norm 1.1571 (1.1014) [2022-09-29 20:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][700/1251] eta 0:07:09 lr 0.000956 time 0.8350 (0.7796) loss 3.9759 (4.0598) grad_norm 1.0457 (1.0979) [2022-09-29 20:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][800/1251] eta 0:05:51 lr 0.000956 time 0.8187 (0.7784) loss 3.4950 (4.0447) grad_norm 1.1993 (1.0968) [2022-09-29 20:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][900/1251] eta 0:04:32 lr 0.000956 time 0.8215 (0.7777) loss 4.1397 (4.0604) grad_norm 0.8677 (1.0969) [2022-09-29 20:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1000/1251] eta 0:03:15 lr 0.000956 time 0.9201 (0.7775) loss 4.0075 (4.0564) grad_norm 1.3060 (1.0941) [2022-09-29 20:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1100/1251] eta 0:01:57 lr 0.000955 time 0.8051 (0.7768) loss 4.4025 (4.0516) grad_norm 0.9320 (1.0932) [2022-09-29 20:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1200/1251] eta 0:00:39 lr 0.000955 time 0.7890 (0.7771) loss 4.7464 (4.0421) grad_norm 0.9960 (1.0901) [2022-09-29 20:16:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 40 training takes 0:16:11 [2022-09-29 20:16:25 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saving...... [2022-09-29 20:16:25 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saved !!! [2022-09-29 20:16:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 5.024 (5.024) Loss 1.4635 (1.4635) Acc@1 66.602 (66.602) Acc@5 88.184 (88.184) [2022-09-29 20:16:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.554 Acc@5 89.146 [2022-09-29 20:16:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.6% [2022-09-29 20:16:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.55% [2022-09-29 20:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][0/1251] eta 1:44:28 lr 0.000955 time 5.0107 (5.0107) loss 4.4465 (4.4465) grad_norm 1.0807 (1.0807) [2022-09-29 20:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][100/1251] eta 0:15:34 lr 0.000955 time 0.7129 (0.8121) loss 4.0738 (4.0220) grad_norm 1.2189 (1.0842) [2022-09-29 20:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][200/1251] eta 0:13:57 lr 0.000955 time 0.9233 (0.7971) loss 4.2906 (4.0188) grad_norm 1.0811 (1.0890) [2022-09-29 20:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][300/1251] eta 0:12:33 lr 0.000955 time 0.6939 (0.7921) loss 4.7586 (4.0411) grad_norm 1.4357 (1.0948) [2022-09-29 20:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][400/1251] eta 0:11:08 lr 0.000954 time 0.7782 (0.7857) loss 4.6052 (4.0570) grad_norm 1.0397 (1.0926) [2022-09-29 20:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][500/1251] eta 0:09:47 lr 0.000954 time 0.7090 (0.7824) loss 2.8739 (4.0313) grad_norm 1.0253 (1.0909) [2022-09-29 20:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][600/1251] eta 0:08:27 lr 0.000954 time 0.8571 (0.7802) loss 4.3168 (4.0259) grad_norm 0.9790 (1.0870) [2022-09-29 20:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][700/1251] eta 0:07:09 lr 0.000954 time 0.9077 (0.7801) loss 3.8395 (4.0333) grad_norm 1.1314 (1.0844) [2022-09-29 20:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][800/1251] eta 0:05:51 lr 0.000954 time 0.6324 (0.7790) loss 3.0129 (4.0285) grad_norm 0.9729 (1.0833) [2022-09-29 20:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][900/1251] eta 0:04:33 lr 0.000954 time 0.7624 (0.7778) loss 3.9361 (4.0414) grad_norm 1.1335 (1.0842) [2022-09-29 20:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1000/1251] eta 0:03:15 lr 0.000953 time 0.6743 (0.7774) loss 3.6502 (4.0396) grad_norm 1.0408 (1.0834) [2022-09-29 20:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1100/1251] eta 0:01:57 lr 0.000953 time 0.8165 (0.7777) loss 3.6232 (4.0419) grad_norm 1.1741 (1.0864) [2022-09-29 20:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1200/1251] eta 0:00:39 lr 0.000953 time 0.7843 (0.7761) loss 3.1589 (4.0430) grad_norm 0.9742 (1.0881) [2022-09-29 20:32:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 41 training takes 0:16:11 [2022-09-29 20:33:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.771 (3.771) Loss 1.3836 (1.3836) Acc@1 66.504 (66.504) Acc@5 88.672 (88.672) [2022-09-29 20:33:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.614 Acc@5 89.348 [2022-09-29 20:33:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.6% [2022-09-29 20:33:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.61% [2022-09-29 20:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][0/1251] eta 1:41:21 lr 0.000953 time 4.8613 (4.8613) loss 4.0443 (4.0443) grad_norm 0.9411 (0.9411) [2022-09-29 20:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][100/1251] eta 0:15:31 lr 0.000953 time 0.7054 (0.8094) loss 4.9427 (4.0845) grad_norm 0.9527 (1.0737) [2022-09-29 20:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][200/1251] eta 0:13:51 lr 0.000953 time 0.7492 (0.7912) loss 3.2089 (4.0067) grad_norm 1.1188 (1.0806) [2022-09-29 20:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][300/1251] eta 0:12:28 lr 0.000952 time 0.8309 (0.7869) loss 4.2671 (3.9900) grad_norm 1.0215 (1.0899) [2022-09-29 20:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][400/1251] eta 0:10:46 lr 0.000952 time 0.7033 (0.7600) loss 3.7644 (3.9901) grad_norm 1.3708 (1.0927) [2022-09-29 20:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][500/1251] eta 0:09:22 lr 0.000952 time 0.8133 (0.7494) loss 3.9775 (3.9981) grad_norm 0.9874 (1.0938) [2022-09-29 20:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][600/1251] eta 0:08:10 lr 0.000952 time 0.6472 (0.7536) loss 3.5974 (4.0006) grad_norm 0.9730 (1.0951) [2022-09-29 20:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][700/1251] eta 0:06:57 lr 0.000952 time 0.7657 (0.7571) loss 3.6286 (3.9931) grad_norm 1.0146 (1.0926) [2022-09-29 20:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][800/1251] eta 0:05:42 lr 0.000951 time 0.8272 (0.7593) loss 3.3640 (3.9872) grad_norm 1.0022 (1.0916) [2022-09-29 20:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][900/1251] eta 0:04:27 lr 0.000951 time 0.9361 (0.7621) loss 4.7993 (3.9857) grad_norm 1.1495 (1.0877) [2022-09-29 20:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1000/1251] eta 0:03:11 lr 0.000951 time 0.8515 (0.7624) loss 3.3245 (3.9906) grad_norm 1.0311 (1.0838) [2022-09-29 20:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1100/1251] eta 0:01:55 lr 0.000951 time 0.7646 (0.7633) loss 4.0560 (3.9900) grad_norm 1.3397 (1.0841) [2022-09-29 20:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1200/1251] eta 0:00:38 lr 0.000951 time 0.7623 (0.7645) loss 4.5400 (3.9923) grad_norm 0.9885 (1.0859) [2022-09-29 20:49:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 42 training takes 0:15:57 [2022-09-29 20:49:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.495 (4.495) Loss 1.4180 (1.4180) Acc@1 67.871 (67.871) Acc@5 88.672 (88.672) [2022-09-29 20:49:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.072 Acc@5 89.548 [2022-09-29 20:49:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.1% [2022-09-29 20:49:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.07% [2022-09-29 20:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][0/1251] eta 1:25:28 lr 0.000951 time 4.0996 (4.0996) loss 4.0309 (4.0309) grad_norm 1.1053 (1.1053) [2022-09-29 20:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][100/1251] eta 0:15:28 lr 0.000950 time 0.7903 (0.8065) loss 4.0050 (4.0501) grad_norm 0.9890 (1.0820) [2022-09-29 20:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][200/1251] eta 0:13:50 lr 0.000950 time 0.8267 (0.7899) loss 4.1090 (4.0161) grad_norm 1.0120 (1.0882) [2022-09-29 20:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][300/1251] eta 0:12:23 lr 0.000950 time 0.7547 (0.7817) loss 4.1907 (4.0302) grad_norm 1.0524 (1.0892) [2022-09-29 20:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][400/1251] eta 0:11:05 lr 0.000950 time 0.8588 (0.7814) loss 3.0361 (4.0050) grad_norm 1.0782 (1.0864) [2022-09-29 20:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][500/1251] eta 0:09:44 lr 0.000950 time 0.6949 (0.7788) loss 4.0948 (4.0208) grad_norm 1.0075 (1.0915) [2022-09-29 20:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][600/1251] eta 0:08:25 lr 0.000950 time 0.8715 (0.7773) loss 3.9839 (4.0125) grad_norm 0.8812 (1.0933) [2022-09-29 20:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][700/1251] eta 0:07:08 lr 0.000949 time 0.7295 (0.7775) loss 4.6469 (4.0002) grad_norm 0.8805 (1.0902) [2022-09-29 20:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][800/1251] eta 0:05:50 lr 0.000949 time 0.6848 (0.7764) loss 4.7688 (4.0081) grad_norm 1.1307 (1.0903) [2022-09-29 21:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][900/1251] eta 0:04:32 lr 0.000949 time 0.7026 (0.7776) loss 4.2447 (4.0040) grad_norm 1.2172 (1.0934) [2022-09-29 21:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1000/1251] eta 0:03:15 lr 0.000949 time 0.8039 (0.7773) loss 3.4368 (3.9905) grad_norm 1.5700 (1.0948) [2022-09-29 21:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1100/1251] eta 0:01:57 lr 0.000949 time 0.8450 (0.7763) loss 4.6018 (3.9957) grad_norm 1.3560 (1.0944) [2022-09-29 21:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1200/1251] eta 0:00:39 lr 0.000948 time 0.7932 (0.7770) loss 2.9036 (3.9917) grad_norm 1.1744 (1.0937) [2022-09-29 21:05:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 43 training takes 0:16:11 [2022-09-29 21:05:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.523 (4.523) Loss 1.3421 (1.3421) Acc@1 68.066 (68.066) Acc@5 88.867 (88.867) [2022-09-29 21:06:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.846 Acc@5 89.414 [2022-09-29 21:06:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.8% [2022-09-29 21:06:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.07% [2022-09-29 21:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][0/1251] eta 1:39:28 lr 0.000948 time 4.7713 (4.7713) loss 3.6566 (3.6566) grad_norm 1.0656 (1.0656) [2022-09-29 21:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][100/1251] eta 0:15:41 lr 0.000948 time 0.7417 (0.8182) loss 4.0977 (4.0433) grad_norm 1.0360 (1.1092) [2022-09-29 21:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][200/1251] eta 0:13:59 lr 0.000948 time 0.6727 (0.7987) loss 2.9704 (4.0048) grad_norm 1.1227 (1.0955) [2022-09-29 21:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][300/1251] eta 0:12:30 lr 0.000948 time 0.8041 (0.7896) loss 4.0972 (3.9942) grad_norm 0.9192 (1.0945) [2022-09-29 21:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][400/1251] eta 0:11:07 lr 0.000948 time 0.6414 (0.7842) loss 3.8003 (4.0043) grad_norm 1.2630 (1.0875) [2022-09-29 21:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][500/1251] eta 0:09:47 lr 0.000947 time 0.6504 (0.7825) loss 2.8934 (4.0017) grad_norm 1.0823 (1.0850) [2022-09-29 21:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][600/1251] eta 0:08:28 lr 0.000947 time 0.7812 (0.7809) loss 3.9338 (4.0111) grad_norm 1.1851 (1.0901) [2022-09-29 21:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][700/1251] eta 0:07:09 lr 0.000947 time 0.8286 (0.7801) loss 2.7415 (3.9961) grad_norm 1.0228 (1.0886) [2022-09-29 21:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][800/1251] eta 0:05:51 lr 0.000947 time 0.7854 (0.7798) loss 3.4937 (3.9860) grad_norm 1.0937 (1.0887) [2022-09-29 21:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][900/1251] eta 0:04:33 lr 0.000947 time 0.7668 (0.7797) loss 3.8053 (3.9864) grad_norm 1.1557 (1.0889) [2022-09-29 21:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1000/1251] eta 0:03:15 lr 0.000947 time 0.8204 (0.7792) loss 4.1712 (3.9922) grad_norm 1.2187 (1.0890) [2022-09-29 21:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1100/1251] eta 0:01:57 lr 0.000946 time 0.8250 (0.7792) loss 4.4316 (3.9883) grad_norm 1.0900 (1.0844) [2022-09-29 21:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1200/1251] eta 0:00:39 lr 0.000946 time 0.6718 (0.7798) loss 3.9260 (3.9864) grad_norm 0.8964 (1.0845) [2022-09-29 21:22:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 44 training takes 0:16:16 [2022-09-29 21:22:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.016 (4.016) Loss 1.2853 (1.2853) Acc@1 69.434 (69.434) Acc@5 89.453 (89.453) [2022-09-29 21:22:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.398 Acc@5 89.522 [2022-09-29 21:22:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.4% [2022-09-29 21:22:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.40% [2022-09-29 21:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][0/1251] eta 1:46:53 lr 0.000946 time 5.1269 (5.1269) loss 4.7128 (4.7128) grad_norm 1.0772 (1.0772) [2022-09-29 21:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][100/1251] eta 0:15:44 lr 0.000946 time 0.6938 (0.8207) loss 4.1260 (4.1277) grad_norm 1.0169 (1.0747) [2022-09-29 21:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][200/1251] eta 0:14:04 lr 0.000946 time 0.7965 (0.8031) loss 4.5337 (4.0520) grad_norm 1.0966 (1.0835) [2022-09-29 21:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][300/1251] eta 0:12:34 lr 0.000945 time 0.7257 (0.7933) loss 3.5472 (4.0242) grad_norm 1.1052 (1.0807) [2022-09-29 21:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][400/1251] eta 0:11:10 lr 0.000945 time 0.7188 (0.7874) loss 4.0693 (4.0232) grad_norm 1.1700 (1.0834) [2022-09-29 21:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][500/1251] eta 0:09:48 lr 0.000945 time 0.8737 (0.7835) loss 2.8413 (4.0165) grad_norm 1.0616 (1.0839) [2022-09-29 21:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][600/1251] eta 0:08:29 lr 0.000945 time 0.8153 (0.7824) loss 4.1040 (4.0121) grad_norm 0.9173 (1.0830) [2022-09-29 21:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][700/1251] eta 0:07:10 lr 0.000945 time 0.6971 (0.7808) loss 4.3882 (4.0092) grad_norm 1.0913 (1.0846) [2022-09-29 21:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][800/1251] eta 0:05:51 lr 0.000945 time 0.7177 (0.7805) loss 4.3306 (4.0009) grad_norm 0.9798 (1.0877) [2022-09-29 21:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][900/1251] eta 0:04:33 lr 0.000944 time 0.7270 (0.7790) loss 3.7507 (3.9935) grad_norm 1.0052 (1.0848) [2022-09-29 21:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1000/1251] eta 0:03:15 lr 0.000944 time 0.9271 (0.7793) loss 4.5491 (3.9859) grad_norm 0.9843 (1.0844) [2022-09-29 21:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1100/1251] eta 0:01:57 lr 0.000944 time 0.7978 (0.7790) loss 2.8638 (3.9900) grad_norm 1.0544 (1.0847) [2022-09-29 21:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1200/1251] eta 0:00:39 lr 0.000944 time 0.8094 (0.7787) loss 4.7690 (3.9825) grad_norm 1.0456 (1.0857) [2022-09-29 21:39:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 45 training takes 0:16:13 [2022-09-29 21:39:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.916 (4.916) Loss 1.3366 (1.3366) Acc@1 67.773 (67.773) Acc@5 89.062 (89.062) [2022-09-29 21:39:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.238 Acc@5 89.694 [2022-09-29 21:39:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.2% [2022-09-29 21:39:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.40% [2022-09-29 21:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][0/1251] eta 1:42:07 lr 0.000944 time 4.8979 (4.8979) loss 3.0841 (3.0841) grad_norm 1.1925 (1.1925) [2022-09-29 21:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][100/1251] eta 0:15:37 lr 0.000943 time 0.7898 (0.8143) loss 4.2566 (4.0563) grad_norm 1.0431 (1.0771) [2022-09-29 21:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][200/1251] eta 0:13:59 lr 0.000943 time 0.7988 (0.7987) loss 4.1681 (4.0032) grad_norm 0.9474 (1.0860) [2022-09-29 21:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][300/1251] eta 0:12:32 lr 0.000943 time 0.6856 (0.7915) loss 4.3421 (3.9950) grad_norm 1.1945 (1.0902) [2022-09-29 21:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][400/1251] eta 0:11:11 lr 0.000943 time 0.8352 (0.7887) loss 4.2582 (3.9629) grad_norm 1.0779 (1.0901) [2022-09-29 21:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][500/1251] eta 0:09:49 lr 0.000943 time 0.6318 (0.7847) loss 4.6375 (3.9697) grad_norm 1.2378 (1.0885) [2022-09-29 21:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][600/1251] eta 0:08:31 lr 0.000943 time 0.7054 (0.7853) loss 3.4254 (3.9755) grad_norm 1.2337 (1.0890) [2022-09-29 21:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][700/1251] eta 0:07:12 lr 0.000942 time 0.7925 (0.7848) loss 3.9114 (3.9656) grad_norm 1.0331 (1.0855) [2022-09-29 21:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][800/1251] eta 0:05:53 lr 0.000942 time 0.8530 (0.7842) loss 4.2749 (3.9737) grad_norm 0.9903 (1.0879) [2022-09-29 21:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][900/1251] eta 0:04:35 lr 0.000942 time 0.8759 (0.7843) loss 4.3407 (3.9668) grad_norm 1.1157 (1.0872) [2022-09-29 21:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1000/1251] eta 0:03:16 lr 0.000942 time 0.8016 (0.7836) loss 3.2334 (3.9688) grad_norm 0.9663 (1.0838) [2022-09-29 21:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1100/1251] eta 0:01:58 lr 0.000942 time 0.8202 (0.7832) loss 2.7659 (3.9585) grad_norm 0.9746 (1.0827) [2022-09-29 21:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1200/1251] eta 0:00:39 lr 0.000941 time 0.9414 (0.7826) loss 4.3250 (3.9616) grad_norm 1.2956 (1.0817) [2022-09-29 21:55:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 46 training takes 0:16:18 [2022-09-29 21:55:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.581 (4.581) Loss 1.3115 (1.3115) Acc@1 68.555 (68.555) Acc@5 90.430 (90.430) [2022-09-29 21:56:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.838 Acc@5 90.010 [2022-09-29 21:56:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-09-29 21:56:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.84% [2022-09-29 21:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][0/1251] eta 1:47:09 lr 0.000941 time 5.1395 (5.1395) loss 4.2888 (4.2888) grad_norm 1.0193 (1.0193) [2022-09-29 21:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][100/1251] eta 0:15:39 lr 0.000941 time 0.7227 (0.8162) loss 3.3550 (3.9305) grad_norm 0.9843 (1.0876) [2022-09-29 21:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][200/1251] eta 0:13:58 lr 0.000941 time 0.7767 (0.7976) loss 4.0728 (3.9441) grad_norm 0.9933 (1.0902) [2022-09-29 22:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][300/1251] eta 0:12:29 lr 0.000941 time 0.8219 (0.7884) loss 4.4075 (3.9437) grad_norm 1.0358 (1.0874) [2022-09-29 22:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][400/1251] eta 0:11:10 lr 0.000940 time 0.8530 (0.7875) loss 4.2807 (3.9436) grad_norm 1.0670 (1.0854) [2022-09-29 22:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][500/1251] eta 0:09:49 lr 0.000940 time 0.8005 (0.7843) loss 4.0604 (3.9439) grad_norm 0.9603 (1.0796) [2022-09-29 22:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][600/1251] eta 0:08:28 lr 0.000940 time 0.8147 (0.7805) loss 3.3474 (3.9683) grad_norm 1.1926 (1.0784) [2022-09-29 22:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][700/1251] eta 0:07:09 lr 0.000940 time 0.6337 (0.7788) loss 4.2963 (3.9639) grad_norm 1.2165 (1.0806) [2022-09-29 22:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][800/1251] eta 0:05:51 lr 0.000940 time 0.7212 (0.7790) loss 4.3735 (3.9761) grad_norm 1.2260 (1.0827) [2022-09-29 22:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][900/1251] eta 0:04:32 lr 0.000939 time 0.7890 (0.7777) loss 2.7720 (3.9627) grad_norm 1.3440 (1.0847) [2022-09-29 22:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1000/1251] eta 0:03:15 lr 0.000939 time 0.7524 (0.7777) loss 4.8754 (3.9608) grad_norm 1.0762 (1.0858) [2022-09-29 22:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1100/1251] eta 0:01:57 lr 0.000939 time 0.8354 (0.7768) loss 4.0999 (3.9636) grad_norm 0.9420 (1.0847) [2022-09-29 22:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1200/1251] eta 0:00:39 lr 0.000939 time 0.7038 (0.7764) loss 4.1938 (3.9683) grad_norm 1.0079 (1.0858) [2022-09-29 22:12:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 47 training takes 0:16:10 [2022-09-29 22:12:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.577 (4.577) Loss 1.2838 (1.2838) Acc@1 70.215 (70.215) Acc@5 90.137 (90.137) [2022-09-29 22:12:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.798 Acc@5 89.846 [2022-09-29 22:12:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-09-29 22:12:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.84% [2022-09-29 22:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][0/1251] eta 1:44:00 lr 0.000939 time 4.9888 (4.9888) loss 4.2456 (4.2456) grad_norm 0.9676 (0.9676) [2022-09-29 22:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][100/1251] eta 0:15:42 lr 0.000939 time 0.8937 (0.8187) loss 3.2065 (3.9902) grad_norm 1.2672 (1.1075) [2022-09-29 22:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][200/1251] eta 0:14:00 lr 0.000938 time 0.7392 (0.7997) loss 4.2184 (3.9756) grad_norm 1.0267 (1.0945) [2022-09-29 22:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][300/1251] eta 0:12:36 lr 0.000938 time 0.8378 (0.7956) loss 4.3325 (3.9453) grad_norm 1.2522 (1.0956) [2022-09-29 22:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][400/1251] eta 0:11:11 lr 0.000938 time 0.6719 (0.7885) loss 4.2583 (3.9489) grad_norm 1.2498 (1.0991) [2022-09-29 22:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][500/1251] eta 0:09:50 lr 0.000938 time 0.7044 (0.7866) loss 4.1851 (3.9612) grad_norm 1.0888 (1.0934) [2022-09-29 22:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][600/1251] eta 0:08:31 lr 0.000938 time 0.9311 (0.7855) loss 4.0379 (3.9647) grad_norm 1.0524 (1.0953) [2022-09-29 22:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][700/1251] eta 0:07:11 lr 0.000937 time 0.8305 (0.7836) loss 3.9916 (3.9680) grad_norm 1.1715 (1.0909) [2022-09-29 22:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][800/1251] eta 0:05:52 lr 0.000937 time 0.7544 (0.7825) loss 4.1615 (3.9746) grad_norm 0.9573 (1.0934) [2022-09-29 22:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][900/1251] eta 0:04:34 lr 0.000937 time 0.7560 (0.7823) loss 4.0935 (3.9666) grad_norm 0.9454 (1.0935) [2022-09-29 22:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1000/1251] eta 0:03:16 lr 0.000937 time 0.8190 (0.7810) loss 3.5876 (3.9672) grad_norm 1.2027 (1.0931) [2022-09-29 22:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1100/1251] eta 0:01:57 lr 0.000937 time 0.7399 (0.7794) loss 4.2168 (3.9646) grad_norm 1.0145 (1.0924) [2022-09-29 22:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1200/1251] eta 0:00:39 lr 0.000936 time 0.6701 (0.7790) loss 3.6593 (3.9616) grad_norm 1.0659 (1.0944) [2022-09-29 22:28:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 48 training takes 0:16:15 [2022-09-29 22:28:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.024 (4.024) Loss 1.2888 (1.2888) Acc@1 70.508 (70.508) Acc@5 90.527 (90.527) [2022-09-29 22:29:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.864 Acc@5 89.946 [2022-09-29 22:29:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.9% [2022-09-29 22:29:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.86% [2022-09-29 22:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][0/1251] eta 1:37:56 lr 0.000936 time 4.6976 (4.6976) loss 4.3289 (4.3289) grad_norm 1.0047 (1.0047) [2022-09-29 22:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][100/1251] eta 0:15:37 lr 0.000936 time 0.8362 (0.8145) loss 3.4380 (3.9230) grad_norm 0.8730 (1.1150) [2022-09-29 22:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][200/1251] eta 0:13:56 lr 0.000936 time 0.7528 (0.7956) loss 4.0232 (3.9029) grad_norm 0.8988 (1.1041) [2022-09-29 22:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][300/1251] eta 0:12:32 lr 0.000936 time 0.8249 (0.7918) loss 3.9613 (3.9019) grad_norm 1.1218 (1.1021) [2022-09-29 22:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][400/1251] eta 0:11:11 lr 0.000935 time 0.7756 (0.7887) loss 4.0483 (3.9149) grad_norm 0.9591 (1.0979) [2022-09-29 22:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][500/1251] eta 0:09:50 lr 0.000935 time 0.7078 (0.7862) loss 3.5017 (3.9216) grad_norm 1.0039 (1.0933) [2022-09-29 22:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][600/1251] eta 0:08:30 lr 0.000935 time 0.7607 (0.7841) loss 4.0647 (3.9167) grad_norm 1.1577 (1.0953) [2022-09-29 22:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][700/1251] eta 0:07:11 lr 0.000935 time 0.8093 (0.7824) loss 3.9658 (3.9302) grad_norm 1.1978 (1.0924) [2022-09-29 22:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][800/1251] eta 0:05:52 lr 0.000935 time 0.9037 (0.7817) loss 4.0625 (3.9279) grad_norm 0.9000 (1.0894) [2022-09-29 22:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][900/1251] eta 0:04:34 lr 0.000934 time 0.8066 (0.7814) loss 2.5341 (3.9294) grad_norm 1.4242 (1.0897) [2022-09-29 22:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1000/1251] eta 0:03:15 lr 0.000934 time 0.8252 (0.7802) loss 4.5098 (3.9226) grad_norm 1.2257 (1.0903) [2022-09-29 22:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1100/1251] eta 0:01:57 lr 0.000934 time 0.8158 (0.7805) loss 3.6910 (3.9247) grad_norm 1.1634 (1.0881) [2022-09-29 22:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1200/1251] eta 0:00:39 lr 0.000934 time 0.7655 (0.7797) loss 4.5770 (3.9260) grad_norm 1.0678 (1.0921) [2022-09-29 22:45:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 49 training takes 0:16:15 [2022-09-29 22:45:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.944 (4.944) Loss 1.2984 (1.2984) Acc@1 70.410 (70.410) Acc@5 89.844 (89.844) [2022-09-29 22:45:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.998 Acc@5 90.098 [2022-09-29 22:45:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-09-29 22:45:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.00% [2022-09-29 22:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][0/1251] eta 1:48:04 lr 0.000934 time 5.1836 (5.1836) loss 3.4040 (3.4040) grad_norm 0.9879 (0.9879) [2022-09-29 22:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][100/1251] eta 0:15:40 lr 0.000933 time 0.7041 (0.8174) loss 4.2558 (3.8879) grad_norm 1.4534 (1.0854) [2022-09-29 22:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][200/1251] eta 0:14:00 lr 0.000933 time 0.7546 (0.7998) loss 3.9215 (3.9362) grad_norm 1.1276 (1.1014) [2022-09-29 22:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][300/1251] eta 0:12:31 lr 0.000933 time 0.8403 (0.7906) loss 2.6984 (3.9297) grad_norm 1.0160 (1.1035) [2022-09-29 22:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][400/1251] eta 0:11:09 lr 0.000933 time 0.6536 (0.7871) loss 3.7742 (3.9602) grad_norm 1.0840 (1.0982) [2022-09-29 22:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][500/1251] eta 0:09:49 lr 0.000933 time 0.8059 (0.7854) loss 4.1921 (3.9612) grad_norm 1.2218 (1.0976) [2022-09-29 22:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][600/1251] eta 0:08:29 lr 0.000932 time 0.8451 (0.7827) loss 3.4804 (3.9620) grad_norm 1.0300 (1.0987) [2022-09-29 22:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][700/1251] eta 0:07:10 lr 0.000932 time 0.7820 (0.7812) loss 3.4459 (3.9530) grad_norm 1.0241 (1.0974) [2022-09-29 22:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][800/1251] eta 0:05:51 lr 0.000932 time 0.8131 (0.7797) loss 3.0256 (3.9585) grad_norm 0.8932 (1.0982) [2022-09-29 22:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][900/1251] eta 0:04:33 lr 0.000932 time 0.7927 (0.7789) loss 4.6776 (3.9589) grad_norm 1.2569 (1.0964) [2022-09-29 22:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1000/1251] eta 0:03:15 lr 0.000932 time 0.9098 (0.7781) loss 3.1385 (3.9546) grad_norm 1.1744 (1.0972) [2022-09-29 23:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1100/1251] eta 0:01:57 lr 0.000931 time 0.6567 (0.7783) loss 3.6679 (3.9502) grad_norm 1.1539 (1.0976) [2022-09-29 23:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1200/1251] eta 0:00:39 lr 0.000931 time 0.8623 (0.7783) loss 3.3600 (3.9525) grad_norm 1.0907 (1.0972) [2022-09-29 23:01:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 50 training takes 0:16:06 [2022-09-29 23:01:53 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saving...... [2022-09-29 23:01:53 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saved !!! [2022-09-29 23:01:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.682 (3.682) Loss 1.2970 (1.2970) Acc@1 70.703 (70.703) Acc@5 89.648 (89.648) [2022-09-29 23:02:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.034 Acc@5 90.314 [2022-09-29 23:02:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-09-29 23:02:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.03% [2022-09-29 23:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][0/1251] eta 1:36:20 lr 0.000931 time 4.6210 (4.6210) loss 2.7872 (2.7872) grad_norm 1.0088 (1.0088) [2022-09-29 23:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][100/1251] eta 0:15:41 lr 0.000931 time 0.7596 (0.8177) loss 3.2364 (3.9976) grad_norm 1.2011 (1.1162) [2022-09-29 23:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][200/1251] eta 0:13:55 lr 0.000931 time 0.8894 (0.7948) loss 4.2767 (3.9604) grad_norm 0.9881 (1.1162) [2022-09-29 23:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][300/1251] eta 0:12:28 lr 0.000930 time 0.8383 (0.7874) loss 4.0371 (3.9474) grad_norm 1.0418 (1.1025) [2022-09-29 23:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][400/1251] eta 0:11:08 lr 0.000930 time 0.6950 (0.7853) loss 3.8834 (3.9248) grad_norm 0.9818 (1.1057) [2022-09-29 23:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][500/1251] eta 0:09:47 lr 0.000930 time 0.8337 (0.7826) loss 4.3918 (3.9192) grad_norm 1.0637 (1.1029) [2022-09-29 23:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][600/1251] eta 0:08:28 lr 0.000930 time 0.7339 (0.7813) loss 3.9287 (3.9204) grad_norm 1.1059 (1.0956) [2022-09-29 23:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][700/1251] eta 0:07:09 lr 0.000930 time 0.9020 (0.7803) loss 3.1847 (3.9222) grad_norm 0.9226 (1.0969) [2022-09-29 23:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][800/1251] eta 0:05:50 lr 0.000929 time 0.6099 (0.7782) loss 3.6280 (3.9200) grad_norm 1.2667 (1.0983) [2022-09-29 23:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][900/1251] eta 0:04:33 lr 0.000929 time 0.7272 (0.7785) loss 4.0308 (3.9197) grad_norm 0.9766 (1.0974) [2022-09-29 23:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1000/1251] eta 0:03:15 lr 0.000929 time 0.8449 (0.7775) loss 3.3624 (3.9145) grad_norm 1.2239 (1.1023) [2022-09-29 23:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1100/1251] eta 0:01:57 lr 0.000929 time 0.8165 (0.7786) loss 4.3151 (3.9180) grad_norm 1.1353 (1.1029) [2022-09-29 23:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1200/1251] eta 0:00:39 lr 0.000929 time 0.7935 (0.7779) loss 3.4008 (3.9202) grad_norm 1.1373 (1.0998) [2022-09-29 23:18:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 51 training takes 0:16:13 [2022-09-29 23:18:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.994 (3.994) Loss 1.3554 (1.3554) Acc@1 69.434 (69.434) Acc@5 89.160 (89.160) [2022-09-29 23:18:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.076 Acc@5 89.888 [2022-09-29 23:18:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.1% [2022-09-29 23:18:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.08% [2022-09-29 23:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][0/1251] eta 1:39:14 lr 0.000928 time 4.7598 (4.7598) loss 3.7932 (3.7932) grad_norm 1.2613 (1.2613) [2022-09-29 23:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][100/1251] eta 0:15:30 lr 0.000928 time 0.8471 (0.8086) loss 4.7184 (3.8835) grad_norm 1.1180 (1.0754) [2022-09-29 23:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][200/1251] eta 0:13:53 lr 0.000928 time 0.8420 (0.7929) loss 3.8149 (3.9567) grad_norm 0.9260 (1.0904) [2022-09-29 23:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][300/1251] eta 0:12:28 lr 0.000928 time 0.7529 (0.7870) loss 4.1487 (3.9281) grad_norm 1.1150 (1.1027) [2022-09-29 23:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][400/1251] eta 0:11:07 lr 0.000928 time 0.9186 (0.7843) loss 4.4884 (3.9457) grad_norm 1.0589 (1.0967) [2022-09-29 23:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][500/1251] eta 0:09:46 lr 0.000927 time 0.8220 (0.7810) loss 4.5861 (3.9607) grad_norm 0.9261 (1.0904) [2022-09-29 23:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][600/1251] eta 0:08:26 lr 0.000927 time 0.6510 (0.7788) loss 3.8928 (3.9498) grad_norm 1.3585 (1.0968) [2022-09-29 23:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][700/1251] eta 0:07:08 lr 0.000927 time 0.7692 (0.7786) loss 4.0004 (3.9382) grad_norm 1.1548 (1.0964) [2022-09-29 23:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][800/1251] eta 0:05:51 lr 0.000927 time 0.6607 (0.7787) loss 4.6189 (3.9305) grad_norm 1.1430 (1.0946) [2022-09-29 23:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][900/1251] eta 0:04:33 lr 0.000926 time 0.7401 (0.7789) loss 4.1959 (3.9359) grad_norm 1.0168 (1.0922) [2022-09-29 23:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1000/1251] eta 0:03:15 lr 0.000926 time 0.8553 (0.7798) loss 3.9674 (3.9335) grad_norm 0.9872 (1.0912) [2022-09-29 23:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1100/1251] eta 0:01:57 lr 0.000926 time 0.7913 (0.7796) loss 3.9102 (3.9367) grad_norm 1.1918 (1.0892) [2022-09-29 23:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1200/1251] eta 0:00:39 lr 0.000926 time 0.8468 (0.7794) loss 3.7688 (3.9388) grad_norm 0.9139 (1.0915) [2022-09-29 23:34:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 52 training takes 0:16:14 [2022-09-29 23:35:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.340 (4.340) Loss 1.1773 (1.1773) Acc@1 72.266 (72.266) Acc@5 92.285 (92.285) [2022-09-29 23:35:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.402 Acc@5 90.506 [2022-09-29 23:35:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-09-29 23:35:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.40% [2022-09-29 23:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][0/1251] eta 1:38:57 lr 0.000926 time 4.7465 (4.7465) loss 4.2059 (4.2059) grad_norm 1.1328 (1.1328) [2022-09-29 23:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][100/1251] eta 0:15:29 lr 0.000925 time 0.7638 (0.8075) loss 3.6174 (3.9139) grad_norm 1.0413 (1.0967) [2022-09-29 23:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][200/1251] eta 0:13:59 lr 0.000925 time 0.7111 (0.7990) loss 4.7047 (3.8753) grad_norm 1.0762 (1.0878) [2022-09-29 23:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][300/1251] eta 0:12:31 lr 0.000925 time 0.6784 (0.7898) loss 2.9465 (3.9002) grad_norm 1.0835 (1.1000) [2022-09-29 23:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][400/1251] eta 0:11:09 lr 0.000925 time 0.7953 (0.7863) loss 4.1819 (3.9046) grad_norm 1.1416 (1.1004) [2022-09-29 23:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][500/1251] eta 0:09:48 lr 0.000925 time 0.7361 (0.7839) loss 4.1445 (3.9149) grad_norm 1.1879 (1.1012) [2022-09-29 23:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][600/1251] eta 0:08:29 lr 0.000924 time 0.8046 (0.7824) loss 4.4658 (3.9304) grad_norm 0.9928 (1.1041) [2022-09-29 23:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][700/1251] eta 0:07:10 lr 0.000924 time 0.8455 (0.7814) loss 3.2570 (3.9299) grad_norm 0.9293 (1.1082) [2022-09-29 23:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][800/1251] eta 0:05:51 lr 0.000924 time 0.8351 (0.7804) loss 3.4670 (3.9207) grad_norm 1.4104 (1.1101) [2022-09-29 23:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][900/1251] eta 0:04:33 lr 0.000924 time 0.6023 (0.7788) loss 4.0734 (3.9194) grad_norm 0.9331 (1.1087) [2022-09-29 23:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1000/1251] eta 0:03:15 lr 0.000923 time 0.7153 (0.7787) loss 2.9730 (3.9171) grad_norm 1.1053 (1.1054) [2022-09-29 23:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1100/1251] eta 0:01:57 lr 0.000923 time 0.8345 (0.7786) loss 3.7398 (3.9121) grad_norm 0.9795 (1.1028) [2022-09-29 23:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1200/1251] eta 0:00:39 lr 0.000923 time 0.7034 (0.7788) loss 4.2172 (3.9115) grad_norm 1.1525 (1.1038) [2022-09-29 23:51:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 53 training takes 0:16:14 [2022-09-29 23:51:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.008 (4.008) Loss 1.2154 (1.2154) Acc@1 70.996 (70.996) Acc@5 91.016 (91.016) [2022-09-29 23:51:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.492 Acc@5 90.424 [2022-09-29 23:51:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-09-29 23:51:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.49% [2022-09-29 23:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][0/1251] eta 1:45:06 lr 0.000923 time 5.0411 (5.0411) loss 2.5622 (2.5622) grad_norm 1.0088 (1.0088) [2022-09-29 23:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][100/1251] eta 0:15:40 lr 0.000923 time 0.8285 (0.8175) loss 4.1532 (3.9297) grad_norm 0.9093 (1.0951) [2022-09-29 23:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][200/1251] eta 0:13:57 lr 0.000922 time 0.7479 (0.7968) loss 3.8726 (3.9416) grad_norm 1.0846 (1.0973) [2022-09-29 23:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][300/1251] eta 0:12:31 lr 0.000922 time 0.8902 (0.7904) loss 3.5870 (3.9156) grad_norm 1.1827 (1.1017) [2022-09-29 23:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][400/1251] eta 0:11:09 lr 0.000922 time 0.7952 (0.7865) loss 4.3113 (3.9063) grad_norm 0.9902 (1.0944) [2022-09-29 23:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][500/1251] eta 0:09:47 lr 0.000922 time 0.8147 (0.7828) loss 2.8827 (3.8990) grad_norm 1.2816 (1.0984) [2022-09-29 23:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][600/1251] eta 0:08:28 lr 0.000922 time 0.7466 (0.7810) loss 3.9537 (3.9048) grad_norm 1.4667 (1.0979) [2022-09-30 00:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][700/1251] eta 0:07:09 lr 0.000921 time 0.8315 (0.7794) loss 3.9713 (3.9240) grad_norm 1.0612 (1.0953) [2022-09-30 00:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][800/1251] eta 0:05:51 lr 0.000921 time 0.8551 (0.7793) loss 4.4510 (3.9304) grad_norm 1.1713 (1.0968) [2022-09-30 00:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][900/1251] eta 0:04:33 lr 0.000921 time 0.7886 (0.7783) loss 4.4080 (3.9385) grad_norm 1.2222 (1.0994) [2022-09-30 00:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1000/1251] eta 0:03:15 lr 0.000921 time 0.7503 (0.7776) loss 4.0986 (3.9403) grad_norm 1.0644 (1.1000) [2022-09-30 00:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1100/1251] eta 0:01:57 lr 0.000920 time 0.8321 (0.7776) loss 4.5305 (3.9429) grad_norm 1.1117 (1.0990) [2022-09-30 00:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1200/1251] eta 0:00:39 lr 0.000920 time 0.7415 (0.7767) loss 4.5195 (3.9399) grad_norm 1.0029 (1.0984) [2022-09-30 00:08:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 54 training takes 0:16:11 [2022-09-30 00:08:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.569 (4.569) Loss 1.2851 (1.2851) Acc@1 69.824 (69.824) Acc@5 90.430 (90.430) [2022-09-30 00:08:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.358 Acc@5 90.432 [2022-09-30 00:08:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-09-30 00:08:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.49% [2022-09-30 00:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][0/1251] eta 1:41:10 lr 0.000920 time 4.8524 (4.8524) loss 3.7761 (3.7761) grad_norm 1.1171 (1.1171) [2022-09-30 00:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][100/1251] eta 0:15:36 lr 0.000920 time 0.6800 (0.8137) loss 4.2561 (3.8856) grad_norm 1.0915 (1.0838) [2022-09-30 00:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][200/1251] eta 0:13:57 lr 0.000920 time 0.8623 (0.7970) loss 4.0825 (3.9394) grad_norm 1.2640 (1.0967) [2022-09-30 00:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][300/1251] eta 0:12:29 lr 0.000919 time 0.7055 (0.7880) loss 3.5142 (3.9042) grad_norm 1.0496 (1.1017) [2022-09-30 00:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][400/1251] eta 0:11:07 lr 0.000919 time 0.8467 (0.7845) loss 2.8380 (3.8974) grad_norm 1.0011 (1.1001) [2022-09-30 00:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][500/1251] eta 0:09:48 lr 0.000919 time 0.8379 (0.7835) loss 3.2500 (3.8978) grad_norm 1.0841 (1.1022) [2022-09-30 00:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][600/1251] eta 0:08:29 lr 0.000919 time 0.8259 (0.7826) loss 3.6525 (3.9008) grad_norm 1.0316 (1.1003) [2022-09-30 00:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][700/1251] eta 0:07:10 lr 0.000919 time 0.8287 (0.7812) loss 4.9595 (3.9060) grad_norm 1.3018 (1.1029) [2022-09-30 00:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][800/1251] eta 0:05:51 lr 0.000918 time 0.6788 (0.7791) loss 3.1731 (3.8964) grad_norm 0.9941 (1.1053) [2022-09-30 00:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][900/1251] eta 0:04:33 lr 0.000918 time 0.7279 (0.7778) loss 4.2125 (3.8955) grad_norm 1.0853 (1.1008) [2022-09-30 00:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1000/1251] eta 0:03:15 lr 0.000918 time 0.8148 (0.7770) loss 4.1175 (3.9067) grad_norm 1.2889 (1.1028) [2022-09-30 00:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1100/1251] eta 0:01:57 lr 0.000918 time 0.6838 (0.7763) loss 4.2278 (3.9005) grad_norm 1.2606 (1.1029) [2022-09-30 00:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1200/1251] eta 0:00:39 lr 0.000917 time 0.7791 (0.7763) loss 3.0480 (3.8971) grad_norm 1.2267 (1.1029) [2022-09-30 00:24:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 55 training takes 0:16:12 [2022-09-30 00:24:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.674 (4.674) Loss 1.2502 (1.2502) Acc@1 72.461 (72.461) Acc@5 90.527 (90.527) [2022-09-30 00:25:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.782 Acc@5 90.470 [2022-09-30 00:25:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.8% [2022-09-30 00:25:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.78% [2022-09-30 00:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][0/1251] eta 1:41:42 lr 0.000917 time 4.8780 (4.8780) loss 3.8195 (3.8195) grad_norm 0.9407 (0.9407) [2022-09-30 00:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][100/1251] eta 0:15:39 lr 0.000917 time 0.8478 (0.8165) loss 3.1185 (3.8704) grad_norm 0.9890 (1.0584) [2022-09-30 00:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][200/1251] eta 0:13:57 lr 0.000917 time 0.8544 (0.7967) loss 3.0201 (3.8651) grad_norm 1.2556 (1.0803) [2022-09-30 00:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][300/1251] eta 0:12:29 lr 0.000917 time 0.7881 (0.7884) loss 4.5381 (3.8874) grad_norm 1.4329 (1.0977) [2022-09-30 00:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][400/1251] eta 0:11:09 lr 0.000916 time 0.8113 (0.7870) loss 4.3743 (3.9085) grad_norm 1.0703 (1.0991) [2022-09-30 00:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][500/1251] eta 0:09:48 lr 0.000916 time 0.7426 (0.7836) loss 4.3569 (3.9086) grad_norm 0.9539 (1.0963) [2022-09-30 00:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][600/1251] eta 0:08:29 lr 0.000916 time 0.8370 (0.7831) loss 3.9234 (3.9156) grad_norm 0.9508 (1.0944) [2022-09-30 00:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][700/1251] eta 0:07:10 lr 0.000916 time 0.8685 (0.7815) loss 3.2744 (3.9199) grad_norm 0.9831 (1.1003) [2022-09-30 00:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][800/1251] eta 0:05:52 lr 0.000915 time 0.7218 (0.7806) loss 4.0893 (3.9106) grad_norm 1.0781 (1.0989) [2022-09-30 00:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][900/1251] eta 0:04:33 lr 0.000915 time 0.7665 (0.7788) loss 4.1899 (3.9163) grad_norm 1.4233 (1.1014) [2022-09-30 00:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1000/1251] eta 0:03:15 lr 0.000915 time 0.8292 (0.7777) loss 4.3114 (3.9232) grad_norm 1.0162 (1.1030) [2022-09-30 00:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1100/1251] eta 0:01:57 lr 0.000915 time 0.5983 (0.7770) loss 3.2885 (3.9166) grad_norm 1.0268 (1.1042) [2022-09-30 00:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1200/1251] eta 0:00:39 lr 0.000915 time 0.9007 (0.7768) loss 3.1396 (3.9149) grad_norm 1.0522 (1.1037) [2022-09-30 00:41:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 56 training takes 0:16:12 [2022-09-30 00:41:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.285 (4.285) Loss 1.2576 (1.2576) Acc@1 70.898 (70.898) Acc@5 91.016 (91.016) [2022-09-30 00:41:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.818 Acc@5 90.384 [2022-09-30 00:41:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.8% [2022-09-30 00:41:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.82% [2022-09-30 00:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][0/1251] eta 1:32:06 lr 0.000914 time 4.4174 (4.4174) loss 2.5354 (2.5354) grad_norm 1.2707 (1.2707) [2022-09-30 00:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][100/1251] eta 0:15:21 lr 0.000914 time 0.6584 (0.8008) loss 4.2142 (3.8155) grad_norm 1.2841 (1.1010) [2022-09-30 00:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][200/1251] eta 0:13:48 lr 0.000914 time 0.8287 (0.7886) loss 3.2820 (3.8760) grad_norm 0.9550 (1.1038) [2022-09-30 00:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][300/1251] eta 0:12:25 lr 0.000914 time 0.8271 (0.7835) loss 3.4789 (3.8602) grad_norm 1.1237 (1.1084) [2022-09-30 00:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][400/1251] eta 0:11:02 lr 0.000913 time 0.9614 (0.7787) loss 4.0208 (3.8891) grad_norm 1.1399 (1.1105) [2022-09-30 00:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][500/1251] eta 0:09:44 lr 0.000913 time 0.8125 (0.7779) loss 4.3451 (3.8845) grad_norm 1.2535 (1.1104) [2022-09-30 00:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][600/1251] eta 0:08:26 lr 0.000913 time 0.8661 (0.7780) loss 3.0156 (3.8739) grad_norm 1.1241 (1.1130) [2022-09-30 00:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][700/1251] eta 0:07:07 lr 0.000913 time 0.6962 (0.7760) loss 4.8585 (3.8776) grad_norm 1.0366 (1.1094) [2022-09-30 00:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][800/1251] eta 0:05:49 lr 0.000913 time 0.7420 (0.7750) loss 3.6765 (3.8829) grad_norm 1.3943 (1.1068) [2022-09-30 00:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][900/1251] eta 0:04:32 lr 0.000912 time 0.7027 (0.7751) loss 4.3015 (3.8751) grad_norm 1.2073 (1.1050) [2022-09-30 00:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1000/1251] eta 0:03:14 lr 0.000912 time 0.8447 (0.7753) loss 3.7486 (3.8867) grad_norm 1.1540 (1.1047) [2022-09-30 00:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1100/1251] eta 0:01:57 lr 0.000912 time 0.7260 (0.7751) loss 4.3278 (3.8808) grad_norm 1.2268 (1.1051) [2022-09-30 00:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1200/1251] eta 0:00:39 lr 0.000912 time 0.8359 (0.7746) loss 4.2330 (3.8781) grad_norm 1.0397 (1.1061) [2022-09-30 00:57:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 57 training takes 0:16:09 [2022-09-30 00:57:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.891 (3.891) Loss 1.1958 (1.1958) Acc@1 73.047 (73.047) Acc@5 90.918 (90.918) [2022-09-30 00:58:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.886 Acc@5 90.674 [2022-09-30 00:58:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-09-30 00:58:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.89% [2022-09-30 00:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][0/1251] eta 1:37:40 lr 0.000911 time 4.6848 (4.6848) loss 4.1724 (4.1724) grad_norm 1.1173 (1.1173) [2022-09-30 00:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][100/1251] eta 0:15:24 lr 0.000911 time 0.9300 (0.8032) loss 4.3007 (3.8270) grad_norm 1.1494 (1.1178) [2022-09-30 01:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][200/1251] eta 0:13:50 lr 0.000911 time 0.7737 (0.7905) loss 3.4111 (3.8607) grad_norm 1.2416 (1.1060) [2022-09-30 01:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][300/1251] eta 0:12:24 lr 0.000911 time 0.7736 (0.7827) loss 4.1737 (3.8505) grad_norm 0.9932 (1.1104) [2022-09-30 01:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][400/1251] eta 0:11:04 lr 0.000911 time 0.7953 (0.7807) loss 3.5341 (3.8681) grad_norm 1.1196 (1.1025) [2022-09-30 01:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][500/1251] eta 0:09:44 lr 0.000910 time 0.7586 (0.7785) loss 2.8108 (3.8831) grad_norm 1.3143 (1.1021) [2022-09-30 01:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][600/1251] eta 0:08:26 lr 0.000910 time 0.9257 (0.7784) loss 4.2438 (3.8990) grad_norm 1.1282 (1.1051) [2022-09-30 01:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][700/1251] eta 0:07:08 lr 0.000910 time 0.6563 (0.7785) loss 4.1177 (3.9046) grad_norm 1.1250 (1.1041) [2022-09-30 01:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][800/1251] eta 0:05:51 lr 0.000910 time 0.6670 (0.7792) loss 2.7533 (3.9042) grad_norm 1.0965 (1.1019) [2022-09-30 01:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][900/1251] eta 0:04:33 lr 0.000909 time 0.7212 (0.7792) loss 3.8164 (3.9029) grad_norm 1.0450 (1.1013) [2022-09-30 01:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1000/1251] eta 0:03:15 lr 0.000909 time 0.7596 (0.7789) loss 3.7611 (3.8996) grad_norm 1.1092 (1.1004) [2022-09-30 01:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1100/1251] eta 0:01:57 lr 0.000909 time 0.8928 (0.7789) loss 4.2201 (3.8960) grad_norm 0.9486 (1.1029) [2022-09-30 01:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1200/1251] eta 0:00:39 lr 0.000909 time 0.8059 (0.7793) loss 4.1449 (3.8887) grad_norm 1.0533 (1.1053) [2022-09-30 01:14:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 58 training takes 0:16:14 [2022-09-30 01:14:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.101 (4.101) Loss 1.2890 (1.2890) Acc@1 69.922 (69.922) Acc@5 89.844 (89.844) [2022-09-30 01:14:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.992 Acc@5 90.772 [2022-09-30 01:14:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.0% [2022-09-30 01:14:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.99% [2022-09-30 01:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][0/1251] eta 1:41:52 lr 0.000908 time 4.8863 (4.8863) loss 4.4074 (4.4074) grad_norm 1.1228 (1.1228) [2022-09-30 01:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][100/1251] eta 0:15:54 lr 0.000908 time 0.8023 (0.8294) loss 3.6287 (3.8239) grad_norm 1.1455 (1.0980) [2022-09-30 01:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][200/1251] eta 0:14:02 lr 0.000908 time 0.8323 (0.8017) loss 4.3129 (3.8376) grad_norm 0.9813 (1.1024) [2022-09-30 01:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][300/1251] eta 0:12:33 lr 0.000908 time 0.8392 (0.7924) loss 4.1175 (3.8607) grad_norm 0.9706 (1.1039) [2022-09-30 01:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][400/1251] eta 0:11:12 lr 0.000908 time 0.8067 (0.7897) loss 3.4970 (3.8642) grad_norm 1.0357 (1.1004) [2022-09-30 01:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][500/1251] eta 0:09:51 lr 0.000907 time 0.6350 (0.7873) loss 3.5785 (3.8593) grad_norm 1.3569 (1.1006) [2022-09-30 01:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][600/1251] eta 0:08:30 lr 0.000907 time 0.7765 (0.7834) loss 3.9857 (3.8752) grad_norm 1.1861 (1.1034) [2022-09-30 01:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][700/1251] eta 0:07:11 lr 0.000907 time 0.7953 (0.7826) loss 4.2604 (3.8733) grad_norm 1.1746 (1.1020) [2022-09-30 01:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][800/1251] eta 0:05:45 lr 0.000907 time 0.7171 (0.7667) loss 2.9751 (3.8757) grad_norm 1.1567 (1.1015) [2022-09-30 01:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][900/1251] eta 0:04:28 lr 0.000906 time 0.6626 (0.7647) loss 3.7236 (3.8786) grad_norm 1.1708 (1.1012) [2022-09-30 01:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1000/1251] eta 0:03:12 lr 0.000906 time 0.7938 (0.7657) loss 3.1352 (3.8882) grad_norm 0.9402 (1.1021) [2022-09-30 01:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1100/1251] eta 0:01:55 lr 0.000906 time 0.8535 (0.7673) loss 4.3239 (3.8885) grad_norm 1.1181 (1.1015) [2022-09-30 01:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1200/1251] eta 0:00:39 lr 0.000906 time 0.8573 (0.7673) loss 4.1925 (3.8836) grad_norm 1.0560 (1.1040) [2022-09-30 01:30:42 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 59 training takes 0:16:00 [2022-09-30 01:30:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.084 (4.084) Loss 1.2770 (1.2770) Acc@1 72.363 (72.363) Acc@5 90.820 (90.820) [2022-09-30 01:31:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.406 Acc@5 90.620 [2022-09-30 01:31:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-09-30 01:31:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.41% [2022-09-30 01:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][0/1251] eta 1:41:55 lr 0.000905 time 4.8888 (4.8888) loss 3.7653 (3.7653) grad_norm 1.1852 (1.1852) [2022-09-30 01:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][100/1251] eta 0:15:28 lr 0.000905 time 0.7898 (0.8070) loss 4.9302 (3.8616) grad_norm 1.3688 (1.1280) [2022-09-30 01:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][200/1251] eta 0:13:54 lr 0.000905 time 0.7985 (0.7937) loss 4.1452 (3.8809) grad_norm 1.0451 (1.1297) [2022-09-30 01:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][300/1251] eta 0:12:27 lr 0.000905 time 0.7828 (0.7859) loss 4.5124 (3.8791) grad_norm 1.0596 (1.1205) [2022-09-30 01:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][400/1251] eta 0:11:06 lr 0.000904 time 0.8168 (0.7836) loss 4.5967 (3.8883) grad_norm 1.1340 (1.1229) [2022-09-30 01:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][500/1251] eta 0:09:46 lr 0.000904 time 0.9213 (0.7809) loss 3.7959 (3.8919) grad_norm 0.9613 (1.1164) [2022-09-30 01:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][600/1251] eta 0:08:27 lr 0.000904 time 0.7778 (0.7795) loss 3.6437 (3.9048) grad_norm 1.1628 (1.1174) [2022-09-30 01:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][700/1251] eta 0:07:08 lr 0.000904 time 0.8544 (0.7778) loss 3.9072 (3.9102) grad_norm 1.0669 (1.1141) [2022-09-30 01:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][800/1251] eta 0:05:50 lr 0.000904 time 0.6539 (0.7777) loss 2.7181 (3.9040) grad_norm 1.3558 (1.1169) [2022-09-30 01:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][900/1251] eta 0:04:32 lr 0.000903 time 0.8242 (0.7774) loss 2.7951 (3.8986) grad_norm 1.2218 (1.1148) [2022-09-30 01:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1000/1251] eta 0:03:15 lr 0.000903 time 0.8837 (0.7770) loss 2.9405 (3.8874) grad_norm 1.0347 (1.1148) [2022-09-30 01:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1100/1251] eta 0:01:57 lr 0.000903 time 0.8007 (0.7768) loss 2.7817 (3.8926) grad_norm 1.0039 (1.1123) [2022-09-30 01:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1200/1251] eta 0:00:39 lr 0.000903 time 0.7630 (0.7763) loss 4.4058 (3.8940) grad_norm 1.1657 (1.1129) [2022-09-30 01:47:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 60 training takes 0:16:11 [2022-09-30 01:47:15 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saving...... [2022-09-30 01:47:15 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saved !!! [2022-09-30 01:47:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.836 (3.836) Loss 1.2034 (1.2034) Acc@1 71.777 (71.777) Acc@5 90.723 (90.723) [2022-09-30 01:47:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.102 Acc@5 90.766 [2022-09-30 01:47:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-09-30 01:47:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.41% [2022-09-30 01:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][0/1251] eta 1:33:10 lr 0.000902 time 4.4685 (4.4685) loss 4.4195 (4.4195) grad_norm 1.0475 (1.0475) [2022-09-30 01:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][100/1251] eta 0:15:38 lr 0.000902 time 0.7371 (0.8156) loss 4.4943 (3.8571) grad_norm 0.9718 (1.1168) [2022-09-30 01:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][200/1251] eta 0:14:02 lr 0.000902 time 0.8997 (0.8013) loss 3.9483 (3.8475) grad_norm 1.0661 (1.1339) [2022-09-30 01:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][300/1251] eta 0:12:33 lr 0.000902 time 0.6424 (0.7928) loss 4.5973 (3.8622) grad_norm 1.3410 (1.1266) [2022-09-30 01:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][400/1251] eta 0:11:12 lr 0.000901 time 0.8597 (0.7908) loss 4.4956 (3.8718) grad_norm 1.2397 (1.1212) [2022-09-30 01:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][500/1251] eta 0:09:51 lr 0.000901 time 0.5846 (0.7870) loss 2.8990 (3.8703) grad_norm 1.1533 (1.1185) [2022-09-30 01:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][600/1251] eta 0:08:31 lr 0.000901 time 0.8460 (0.7856) loss 3.3999 (3.8843) grad_norm 1.0740 (1.1174) [2022-09-30 01:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][700/1251] eta 0:07:12 lr 0.000901 time 0.8566 (0.7851) loss 3.2482 (3.8747) grad_norm 1.0975 (1.1187) [2022-09-30 01:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][800/1251] eta 0:05:53 lr 0.000900 time 0.7780 (0.7844) loss 3.9890 (3.8846) grad_norm 1.0509 (1.1157) [2022-09-30 01:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][900/1251] eta 0:04:34 lr 0.000900 time 0.8565 (0.7834) loss 4.3281 (3.8933) grad_norm 1.0869 (1.1179) [2022-09-30 02:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1000/1251] eta 0:03:16 lr 0.000900 time 0.8269 (0.7823) loss 4.1708 (3.8930) grad_norm 1.0532 (1.1162) [2022-09-30 02:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1100/1251] eta 0:01:58 lr 0.000900 time 0.7042 (0.7818) loss 3.5990 (3.8984) grad_norm 1.1346 (1.1144) [2022-09-30 02:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1200/1251] eta 0:00:39 lr 0.000899 time 0.8970 (0.7811) loss 2.8740 (3.8942) grad_norm 1.0195 (1.1128) [2022-09-30 02:03:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 61 training takes 0:16:17 [2022-09-30 02:03:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.849 (4.849) Loss 1.2069 (1.2069) Acc@1 71.875 (71.875) Acc@5 91.113 (91.113) [2022-09-30 02:04:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.022 Acc@5 90.742 [2022-09-30 02:04:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.0% [2022-09-30 02:04:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.41% [2022-09-30 02:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][0/1251] eta 1:40:26 lr 0.000899 time 4.8175 (4.8175) loss 4.1053 (4.1053) grad_norm 1.1054 (1.1054) [2022-09-30 02:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][100/1251] eta 0:15:38 lr 0.000899 time 0.6931 (0.8155) loss 4.4757 (3.8104) grad_norm 1.0543 (1.0994) [2022-09-30 02:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][200/1251] eta 0:14:02 lr 0.000899 time 0.7417 (0.8015) loss 4.1001 (3.7883) grad_norm 1.2173 (1.1087) [2022-09-30 02:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][300/1251] eta 0:12:33 lr 0.000899 time 0.8356 (0.7924) loss 3.9636 (3.8158) grad_norm 1.1089 (1.1215) [2022-09-30 02:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][400/1251] eta 0:11:11 lr 0.000898 time 0.7673 (0.7885) loss 4.4339 (3.8352) grad_norm 1.1120 (1.1184) [2022-09-30 02:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][500/1251] eta 0:09:50 lr 0.000898 time 0.8377 (0.7864) loss 3.7770 (3.8375) grad_norm 0.9748 (1.1244) [2022-09-30 02:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][600/1251] eta 0:08:29 lr 0.000898 time 0.6831 (0.7827) loss 3.5009 (3.8478) grad_norm 1.0292 (1.1222) [2022-09-30 02:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][700/1251] eta 0:07:11 lr 0.000898 time 0.6626 (0.7829) loss 4.2038 (3.8525) grad_norm 1.0361 (1.1207) [2022-09-30 02:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][800/1251] eta 0:05:52 lr 0.000897 time 0.8520 (0.7819) loss 2.9298 (3.8514) grad_norm 1.3176 (1.1204) [2022-09-30 02:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][900/1251] eta 0:04:34 lr 0.000897 time 0.9212 (0.7814) loss 3.2439 (3.8460) grad_norm 0.9881 (1.1190) [2022-09-30 02:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1000/1251] eta 0:03:15 lr 0.000897 time 0.7612 (0.7804) loss 3.6824 (3.8463) grad_norm 0.9888 (1.1186) [2022-09-30 02:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1100/1251] eta 0:01:57 lr 0.000897 time 0.8235 (0.7797) loss 4.2437 (3.8529) grad_norm 1.1592 (1.1166) [2022-09-30 02:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1200/1251] eta 0:00:39 lr 0.000896 time 0.8261 (0.7790) loss 4.3510 (3.8596) grad_norm 0.9310 (1.1170) [2022-09-30 02:20:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 62 training takes 0:16:15 [2022-09-30 02:20:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.347 (4.347) Loss 1.2287 (1.2287) Acc@1 70.703 (70.703) Acc@5 90.430 (90.430) [2022-09-30 02:20:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.546 Acc@5 90.812 [2022-09-30 02:20:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-09-30 02:20:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.55% [2022-09-30 02:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][0/1251] eta 1:24:45 lr 0.000896 time 4.0651 (4.0651) loss 4.0212 (4.0212) grad_norm 1.1188 (1.1188) [2022-09-30 02:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][100/1251] eta 0:15:39 lr 0.000896 time 0.9095 (0.8158) loss 4.3800 (3.8507) grad_norm 0.9423 (1.1120) [2022-09-30 02:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][200/1251] eta 0:13:50 lr 0.000896 time 0.7904 (0.7903) loss 3.7074 (3.8235) grad_norm 1.1714 (1.1279) [2022-09-30 02:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][300/1251] eta 0:12:25 lr 0.000895 time 0.8369 (0.7835) loss 3.3306 (3.8188) grad_norm 1.0379 (1.1259) [2022-09-30 02:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][400/1251] eta 0:11:06 lr 0.000895 time 0.6525 (0.7830) loss 3.5636 (3.8413) grad_norm 1.0059 (1.1274) [2022-09-30 02:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][500/1251] eta 0:09:46 lr 0.000895 time 0.6846 (0.7812) loss 3.8868 (3.8654) grad_norm 1.0934 (1.1316) [2022-09-30 02:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][600/1251] eta 0:08:28 lr 0.000895 time 0.8069 (0.7806) loss 3.1296 (3.8648) grad_norm 1.1185 (1.1310) [2022-09-30 02:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][700/1251] eta 0:07:09 lr 0.000894 time 0.7083 (0.7792) loss 4.1753 (3.8803) grad_norm 1.1301 (1.1294) [2022-09-30 02:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][800/1251] eta 0:05:50 lr 0.000894 time 0.8141 (0.7780) loss 4.5877 (3.8696) grad_norm 1.1399 (1.1288) [2022-09-30 02:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][900/1251] eta 0:04:32 lr 0.000894 time 0.8017 (0.7777) loss 3.6853 (3.8617) grad_norm 1.1938 (1.1284) [2022-09-30 02:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1000/1251] eta 0:03:15 lr 0.000894 time 0.8370 (0.7776) loss 4.3971 (3.8600) grad_norm 1.0466 (1.1278) [2022-09-30 02:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1100/1251] eta 0:01:57 lr 0.000893 time 0.7328 (0.7770) loss 3.4821 (3.8558) grad_norm 1.0956 (1.1272) [2022-09-30 02:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1200/1251] eta 0:00:39 lr 0.000893 time 0.8286 (0.7758) loss 3.2507 (3.8627) grad_norm 1.1458 (1.1252) [2022-09-30 02:37:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 63 training takes 0:16:10 [2022-09-30 02:37:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.440 (4.440) Loss 1.1724 (1.1724) Acc@1 72.559 (72.559) Acc@5 92.676 (92.676) [2022-09-30 02:37:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.536 Acc@5 90.872 [2022-09-30 02:37:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-09-30 02:37:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.55% [2022-09-30 02:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][0/1251] eta 1:37:13 lr 0.000893 time 4.6633 (4.6633) loss 3.8379 (3.8379) grad_norm 0.9389 (0.9389) [2022-09-30 02:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][100/1251] eta 0:15:37 lr 0.000893 time 0.7679 (0.8141) loss 3.9381 (3.8490) grad_norm 1.2919 (1.1248) [2022-09-30 02:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][200/1251] eta 0:13:54 lr 0.000892 time 0.6985 (0.7938) loss 3.5440 (3.8488) grad_norm 1.1337 (1.1341) [2022-09-30 02:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][300/1251] eta 0:12:30 lr 0.000892 time 0.7447 (0.7888) loss 3.3577 (3.8521) grad_norm 1.0345 (1.1427) [2022-09-30 02:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][400/1251] eta 0:11:06 lr 0.000892 time 0.8256 (0.7829) loss 3.7337 (3.8459) grad_norm 1.4672 (1.1315) [2022-09-30 02:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][500/1251] eta 0:09:45 lr 0.000892 time 0.8496 (0.7791) loss 3.5639 (3.8418) grad_norm 0.9717 (1.1341) [2022-09-30 02:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][600/1251] eta 0:08:26 lr 0.000891 time 0.5950 (0.7782) loss 4.7693 (3.8524) grad_norm 1.1486 (1.1275) [2022-09-30 02:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][700/1251] eta 0:07:07 lr 0.000891 time 0.7910 (0.7761) loss 3.5469 (3.8440) grad_norm 1.1139 (1.1262) [2022-09-30 02:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][800/1251] eta 0:05:50 lr 0.000891 time 0.9111 (0.7769) loss 4.7908 (3.8515) grad_norm 1.0259 (1.1247) [2022-09-30 02:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][900/1251] eta 0:04:32 lr 0.000891 time 0.8243 (0.7764) loss 4.3197 (3.8631) grad_norm 1.2917 (1.1247) [2022-09-30 02:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1000/1251] eta 0:03:14 lr 0.000890 time 0.8762 (0.7766) loss 3.8607 (3.8679) grad_norm 1.3554 (1.1239) [2022-09-30 02:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1100/1251] eta 0:01:57 lr 0.000890 time 0.7587 (0.7773) loss 3.7994 (3.8654) grad_norm 1.2206 (1.1216) [2022-09-30 02:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1200/1251] eta 0:00:39 lr 0.000890 time 0.7466 (0.7768) loss 3.9806 (3.8623) grad_norm 1.0876 (1.1219) [2022-09-30 02:53:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 64 training takes 0:16:11 [2022-09-30 02:53:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.797 (3.797) Loss 1.2621 (1.2621) Acc@1 69.922 (69.922) Acc@5 89.941 (89.941) [2022-09-30 02:53:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.576 Acc@5 90.910 [2022-09-30 02:53:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-09-30 02:53:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.58% [2022-09-30 02:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][0/1251] eta 1:47:32 lr 0.000890 time 5.1575 (5.1575) loss 3.3158 (3.3158) grad_norm 1.0824 (1.0824) [2022-09-30 02:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][100/1251] eta 0:15:40 lr 0.000889 time 0.8327 (0.8170) loss 4.7497 (3.7677) grad_norm 1.2089 (1.1314) [2022-09-30 02:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][200/1251] eta 0:13:52 lr 0.000889 time 0.6751 (0.7919) loss 4.6195 (3.7895) grad_norm 1.2139 (1.1333) [2022-09-30 02:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][300/1251] eta 0:12:27 lr 0.000889 time 0.8516 (0.7863) loss 3.8781 (3.7891) grad_norm 1.0173 (1.1203) [2022-09-30 02:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][400/1251] eta 0:11:07 lr 0.000889 time 0.6794 (0.7849) loss 3.0022 (3.8093) grad_norm 1.1201 (1.1229) [2022-09-30 03:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][500/1251] eta 0:09:47 lr 0.000888 time 0.8732 (0.7826) loss 2.9542 (3.8156) grad_norm 1.0670 (1.1301) [2022-09-30 03:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][600/1251] eta 0:08:27 lr 0.000888 time 0.6848 (0.7800) loss 4.1098 (3.8268) grad_norm 1.1625 (1.1281) [2022-09-30 03:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][700/1251] eta 0:07:08 lr 0.000888 time 0.8101 (0.7777) loss 4.1295 (3.8407) grad_norm 1.2913 (1.1252) [2022-09-30 03:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][800/1251] eta 0:05:50 lr 0.000888 time 0.8073 (0.7774) loss 3.5617 (3.8565) grad_norm 1.2314 (1.1221) [2022-09-30 03:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][900/1251] eta 0:04:33 lr 0.000887 time 0.6988 (0.7781) loss 3.4004 (3.8558) grad_norm 1.2438 (1.1229) [2022-09-30 03:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1000/1251] eta 0:03:15 lr 0.000887 time 0.8208 (0.7787) loss 2.4761 (3.8545) grad_norm 1.0465 (1.1247) [2022-09-30 03:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1100/1251] eta 0:01:57 lr 0.000887 time 0.7607 (0.7785) loss 3.9325 (3.8523) grad_norm 1.1781 (1.1257) [2022-09-30 03:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1200/1251] eta 0:00:39 lr 0.000887 time 0.8207 (0.7776) loss 4.4924 (3.8607) grad_norm 1.1825 (1.1269) [2022-09-30 03:10:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 65 training takes 0:16:12 [2022-09-30 03:10:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.003 (4.003) Loss 1.1757 (1.1757) Acc@1 72.363 (72.363) Acc@5 90.723 (90.723) [2022-09-30 03:10:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.400 Acc@5 90.886 [2022-09-30 03:10:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-09-30 03:10:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.58% [2022-09-30 03:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][0/1251] eta 1:42:24 lr 0.000886 time 4.9117 (4.9117) loss 3.1187 (3.1187) grad_norm 1.2075 (1.2075) [2022-09-30 03:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][100/1251] eta 0:15:36 lr 0.000886 time 0.7170 (0.8136) loss 3.7665 (3.8543) grad_norm 1.0371 (1.1101) [2022-09-30 03:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][200/1251] eta 0:13:50 lr 0.000886 time 0.8820 (0.7899) loss 3.0819 (3.8657) grad_norm 1.3081 (1.1248) [2022-09-30 03:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][300/1251] eta 0:12:28 lr 0.000886 time 0.8287 (0.7869) loss 3.8502 (3.8559) grad_norm 1.2544 (1.1234) [2022-09-30 03:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][400/1251] eta 0:11:07 lr 0.000885 time 0.7440 (0.7839) loss 4.1946 (3.8652) grad_norm 1.3617 (1.1247) [2022-09-30 03:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][500/1251] eta 0:09:47 lr 0.000885 time 0.7860 (0.7828) loss 3.9885 (3.8414) grad_norm 1.0899 (1.1242) [2022-09-30 03:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][600/1251] eta 0:08:28 lr 0.000885 time 0.6562 (0.7808) loss 3.7392 (3.8512) grad_norm 1.1703 (1.1241) [2022-09-30 03:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][700/1251] eta 0:07:09 lr 0.000885 time 0.9284 (0.7792) loss 3.1033 (3.8552) grad_norm 1.1442 (1.1242) [2022-09-30 03:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][800/1251] eta 0:05:51 lr 0.000884 time 0.8661 (0.7790) loss 3.8900 (3.8520) grad_norm 1.1474 (1.1279) [2022-09-30 03:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][900/1251] eta 0:04:33 lr 0.000884 time 0.7823 (0.7789) loss 2.9545 (3.8513) grad_norm 1.1909 (1.1288) [2022-09-30 03:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1000/1251] eta 0:03:15 lr 0.000884 time 0.7360 (0.7792) loss 2.9585 (3.8517) grad_norm 1.2589 (1.1266) [2022-09-30 03:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1100/1251] eta 0:01:57 lr 0.000883 time 0.6613 (0.7794) loss 4.1087 (3.8569) grad_norm 1.1243 (1.1279) [2022-09-30 03:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1200/1251] eta 0:00:39 lr 0.000883 time 0.9011 (0.7780) loss 4.0160 (3.8571) grad_norm 0.9571 (1.1286) [2022-09-30 03:26:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 66 training takes 0:16:12 [2022-09-30 03:26:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.748 (3.748) Loss 1.2427 (1.2427) Acc@1 72.559 (72.559) Acc@5 91.113 (91.113) [2022-09-30 03:27:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.496 Acc@5 90.896 [2022-09-30 03:27:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-09-30 03:27:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.58% [2022-09-30 03:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][0/1251] eta 1:37:23 lr 0.000883 time 4.6708 (4.6708) loss 3.9372 (3.9372) grad_norm 1.1920 (1.1920) [2022-09-30 03:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][100/1251] eta 0:15:36 lr 0.000883 time 0.7887 (0.8137) loss 3.1561 (3.8134) grad_norm 1.0813 (1.1474) [2022-09-30 03:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][200/1251] eta 0:13:53 lr 0.000883 time 0.8340 (0.7929) loss 4.2440 (3.8258) grad_norm 1.3078 (1.1387) [2022-09-30 03:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][300/1251] eta 0:12:30 lr 0.000882 time 0.8289 (0.7896) loss 4.4509 (3.8323) grad_norm 1.2299 (1.1324) [2022-09-30 03:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][400/1251] eta 0:11:08 lr 0.000882 time 0.8995 (0.7851) loss 3.0433 (3.8435) grad_norm 1.1872 (1.1295) [2022-09-30 03:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][500/1251] eta 0:09:49 lr 0.000882 time 0.7997 (0.7846) loss 4.1765 (3.8499) grad_norm 1.2213 (1.1338) [2022-09-30 03:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][600/1251] eta 0:08:28 lr 0.000881 time 0.6140 (0.7817) loss 4.1067 (3.8586) grad_norm 1.0122 (1.1296) [2022-09-30 03:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][700/1251] eta 0:07:10 lr 0.000881 time 0.7158 (0.7805) loss 3.8457 (3.8571) grad_norm 1.2458 (1.1301) [2022-09-30 03:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][800/1251] eta 0:05:52 lr 0.000881 time 0.8099 (0.7806) loss 2.6923 (3.8549) grad_norm 1.1440 (1.1300) [2022-09-30 03:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][900/1251] eta 0:04:33 lr 0.000881 time 0.8545 (0.7798) loss 2.9655 (3.8643) grad_norm 1.1294 (1.1290) [2022-09-30 03:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1000/1251] eta 0:03:15 lr 0.000880 time 0.6952 (0.7789) loss 3.6071 (3.8631) grad_norm 1.2974 (1.1272) [2022-09-30 03:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1100/1251] eta 0:01:57 lr 0.000880 time 0.8288 (0.7785) loss 4.1844 (3.8684) grad_norm 1.0251 (1.1263) [2022-09-30 03:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1200/1251] eta 0:00:39 lr 0.000880 time 0.7930 (0.7780) loss 4.4098 (3.8698) grad_norm 1.1188 (1.1269) [2022-09-30 03:43:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 67 training takes 0:16:14 [2022-09-30 03:43:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.500 (4.500) Loss 1.2224 (1.2224) Acc@1 71.680 (71.680) Acc@5 91.406 (91.406) [2022-09-30 03:43:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.738 Acc@5 91.140 [2022-09-30 03:43:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-09-30 03:43:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.74% [2022-09-30 03:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][0/1251] eta 1:43:22 lr 0.000880 time 4.9583 (4.9583) loss 4.7703 (4.7703) grad_norm 1.1718 (1.1718) [2022-09-30 03:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][100/1251] eta 0:15:40 lr 0.000879 time 0.7537 (0.8174) loss 2.5954 (3.8504) grad_norm 1.0800 (1.0935) [2022-09-30 03:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][200/1251] eta 0:14:01 lr 0.000879 time 0.7878 (0.8006) loss 3.1560 (3.8235) grad_norm 1.1607 (1.1112) [2022-09-30 03:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][300/1251] eta 0:12:05 lr 0.000879 time 0.5203 (0.7631) loss 4.6027 (3.8288) grad_norm 0.9872 (1.1122) [2022-09-30 03:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][400/1251] eta 0:10:39 lr 0.000879 time 0.8538 (0.7515) loss 4.1000 (3.8353) grad_norm 1.1423 (1.1209) [2022-09-30 03:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][500/1251] eta 0:09:28 lr 0.000878 time 0.7214 (0.7572) loss 4.1140 (3.8420) grad_norm 1.0810 (1.1210) [2022-09-30 03:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][600/1251] eta 0:08:14 lr 0.000878 time 0.8368 (0.7603) loss 3.4125 (3.8403) grad_norm 1.0750 (1.1206) [2022-09-30 03:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][700/1251] eta 0:06:59 lr 0.000878 time 0.8420 (0.7622) loss 4.2830 (3.8466) grad_norm 1.3115 (1.1233) [2022-09-30 03:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][800/1251] eta 0:05:44 lr 0.000878 time 0.8340 (0.7630) loss 2.9856 (3.8510) grad_norm 1.0157 (1.1254) [2022-09-30 03:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][900/1251] eta 0:04:28 lr 0.000877 time 0.6536 (0.7642) loss 3.8445 (3.8435) grad_norm 1.0150 (1.1303) [2022-09-30 03:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1000/1251] eta 0:03:11 lr 0.000877 time 0.6958 (0.7641) loss 4.7157 (3.8394) grad_norm 1.1447 (1.1285) [2022-09-30 03:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1100/1251] eta 0:01:55 lr 0.000877 time 0.7472 (0.7652) loss 2.8548 (3.8355) grad_norm 1.0420 (1.1304) [2022-09-30 03:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1200/1251] eta 0:00:39 lr 0.000876 time 0.7618 (0.7660) loss 2.5688 (3.8332) grad_norm 1.0167 (1.1326) [2022-09-30 03:59:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 68 training takes 0:15:58 [2022-09-30 03:59:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.772 (3.772) Loss 1.2279 (1.2279) Acc@1 71.680 (71.680) Acc@5 91.504 (91.504) [2022-09-30 03:59:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.504 Acc@5 91.144 [2022-09-30 03:59:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-09-30 03:59:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.74% [2022-09-30 04:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][0/1251] eta 1:27:04 lr 0.000876 time 4.1765 (4.1765) loss 3.5344 (3.5344) grad_norm 0.9573 (0.9573) [2022-09-30 04:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][100/1251] eta 0:15:29 lr 0.000876 time 0.8159 (0.8073) loss 4.1928 (3.8573) grad_norm 1.1835 (1.1131) [2022-09-30 04:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][200/1251] eta 0:13:53 lr 0.000876 time 0.7282 (0.7935) loss 4.1769 (3.8230) grad_norm 1.1061 (1.1240) [2022-09-30 04:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][300/1251] eta 0:12:32 lr 0.000875 time 0.9016 (0.7915) loss 4.0643 (3.8190) grad_norm 1.0049 (1.1257) [2022-09-30 04:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][400/1251] eta 0:11:08 lr 0.000875 time 0.6506 (0.7857) loss 3.9528 (3.8028) grad_norm 1.1768 (1.1260) [2022-09-30 04:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][500/1251] eta 0:09:48 lr 0.000875 time 0.8199 (0.7830) loss 3.9482 (3.7944) grad_norm 1.0293 (1.1262) [2022-09-30 04:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][600/1251] eta 0:08:29 lr 0.000875 time 0.8072 (0.7820) loss 4.1807 (3.8119) grad_norm 1.4352 (1.1310) [2022-09-30 04:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][700/1251] eta 0:07:10 lr 0.000874 time 0.7696 (0.7806) loss 3.8606 (3.8101) grad_norm 1.0524 (1.1311) [2022-09-30 04:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][800/1251] eta 0:05:51 lr 0.000874 time 0.9501 (0.7804) loss 4.3219 (3.8157) grad_norm 1.0613 (1.1311) [2022-09-30 04:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][900/1251] eta 0:04:33 lr 0.000874 time 0.8427 (0.7799) loss 3.3392 (3.8150) grad_norm 1.2626 (1.1328) [2022-09-30 04:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1000/1251] eta 0:03:15 lr 0.000874 time 0.8687 (0.7793) loss 3.8500 (3.8253) grad_norm 1.0568 (1.1336) [2022-09-30 04:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1100/1251] eta 0:01:57 lr 0.000873 time 0.7434 (0.7795) loss 3.6411 (3.8285) grad_norm 0.9932 (1.1375) [2022-09-30 04:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1200/1251] eta 0:00:39 lr 0.000873 time 0.7981 (0.7790) loss 2.9110 (3.8267) grad_norm 1.1033 (1.1395) [2022-09-30 04:16:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 69 training takes 0:16:15 [2022-09-30 04:16:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.511 (4.511) Loss 1.1493 (1.1493) Acc@1 73.730 (73.730) Acc@5 92.188 (92.188) [2022-09-30 04:16:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.092 Acc@5 91.208 [2022-09-30 04:16:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-09-30 04:16:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.09% [2022-09-30 04:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][0/1251] eta 1:48:26 lr 0.000873 time 5.2013 (5.2013) loss 4.0082 (4.0082) grad_norm 0.9343 (0.9343) [2022-09-30 04:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][100/1251] eta 0:15:32 lr 0.000873 time 0.6614 (0.8101) loss 3.1939 (3.8101) grad_norm 1.1012 (1.1396) [2022-09-30 04:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][200/1251] eta 0:13:53 lr 0.000872 time 0.8250 (0.7933) loss 3.2566 (3.8153) grad_norm 1.0642 (1.1325) [2022-09-30 04:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][300/1251] eta 0:12:26 lr 0.000872 time 0.8187 (0.7847) loss 4.6570 (3.8146) grad_norm 0.9966 (1.1283) [2022-09-30 04:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][400/1251] eta 0:11:05 lr 0.000872 time 0.6320 (0.7826) loss 4.1544 (3.8141) grad_norm 1.2027 (1.1282) [2022-09-30 04:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][500/1251] eta 0:09:47 lr 0.000871 time 0.8595 (0.7823) loss 3.6001 (3.8155) grad_norm 1.2038 (1.1321) [2022-09-30 04:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][600/1251] eta 0:08:28 lr 0.000871 time 0.7612 (0.7818) loss 4.0994 (3.8056) grad_norm 1.0629 (1.1340) [2022-09-30 04:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][700/1251] eta 0:07:10 lr 0.000871 time 0.8451 (0.7813) loss 3.9342 (3.8103) grad_norm 1.4952 (1.1329) [2022-09-30 04:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][800/1251] eta 0:05:51 lr 0.000871 time 0.7919 (0.7801) loss 4.1490 (3.8089) grad_norm 1.1804 (1.1352) [2022-09-30 04:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][900/1251] eta 0:04:33 lr 0.000870 time 0.7274 (0.7796) loss 4.4561 (3.8236) grad_norm 1.2577 (1.1350) [2022-09-30 04:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1000/1251] eta 0:03:15 lr 0.000870 time 0.8957 (0.7780) loss 4.6323 (3.8292) grad_norm 1.0590 (1.1371) [2022-09-30 04:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1100/1251] eta 0:01:57 lr 0.000870 time 0.7349 (0.7785) loss 4.5483 (3.8400) grad_norm 1.0569 (1.1384) [2022-09-30 04:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1200/1251] eta 0:00:39 lr 0.000870 time 0.8127 (0.7785) loss 3.7045 (3.8379) grad_norm 0.9591 (1.1369) [2022-09-30 04:32:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 70 training takes 0:16:14 [2022-09-30 04:32:47 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saving...... [2022-09-30 04:32:47 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saved !!! [2022-09-30 04:32:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.234 (4.234) Loss 1.2299 (1.2299) Acc@1 70.410 (70.410) Acc@5 90.234 (90.234) [2022-09-30 04:33:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.710 Acc@5 91.058 [2022-09-30 04:33:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-09-30 04:33:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.09% [2022-09-30 04:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][0/1251] eta 1:30:43 lr 0.000869 time 4.3514 (4.3514) loss 3.3080 (3.3080) grad_norm 0.9973 (0.9973) [2022-09-30 04:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][100/1251] eta 0:15:33 lr 0.000869 time 0.8289 (0.8108) loss 4.2587 (3.8297) grad_norm 1.0132 (1.1365) [2022-09-30 04:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][200/1251] eta 0:13:59 lr 0.000869 time 0.9286 (0.7985) loss 3.9476 (3.8017) grad_norm 1.2932 (1.1464) [2022-09-30 04:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][300/1251] eta 0:12:33 lr 0.000869 time 0.8149 (0.7923) loss 4.3905 (3.7840) grad_norm 1.0411 (1.1395) [2022-09-30 04:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][400/1251] eta 0:11:10 lr 0.000868 time 0.6595 (0.7875) loss 3.3879 (3.7762) grad_norm 1.1170 (1.1350) [2022-09-30 04:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][500/1251] eta 0:09:50 lr 0.000868 time 0.8460 (0.7859) loss 3.5902 (3.7794) grad_norm 1.0625 (1.1349) [2022-09-30 04:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][600/1251] eta 0:08:30 lr 0.000868 time 0.8180 (0.7841) loss 3.0643 (3.7768) grad_norm 1.0819 (1.1391) [2022-09-30 04:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][700/1251] eta 0:07:11 lr 0.000867 time 0.7571 (0.7834) loss 4.0991 (3.7961) grad_norm 1.1774 (1.1415) [2022-09-30 04:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][800/1251] eta 0:05:52 lr 0.000867 time 0.7964 (0.7819) loss 4.5875 (3.8101) grad_norm 1.3812 (1.1419) [2022-09-30 04:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][900/1251] eta 0:04:33 lr 0.000867 time 0.6841 (0.7804) loss 2.7473 (3.8057) grad_norm 1.2917 (1.1411) [2022-09-30 04:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1000/1251] eta 0:03:15 lr 0.000867 time 0.8341 (0.7800) loss 3.6991 (3.8113) grad_norm 1.0546 (1.1384) [2022-09-30 04:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1100/1251] eta 0:01:57 lr 0.000866 time 0.8352 (0.7803) loss 3.9413 (3.8107) grad_norm 1.0488 (1.1383) [2022-09-30 04:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1200/1251] eta 0:00:39 lr 0.000866 time 0.8063 (0.7807) loss 4.2473 (3.8075) grad_norm 1.2081 (1.1390) [2022-09-30 04:49:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 71 training takes 0:16:15 [2022-09-30 04:49:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.281 (4.281) Loss 1.2201 (1.2201) Acc@1 72.070 (72.070) Acc@5 90.625 (90.625) [2022-09-30 04:49:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.930 Acc@5 91.196 [2022-09-30 04:49:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-09-30 04:49:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.09% [2022-09-30 04:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][0/1251] eta 1:39:08 lr 0.000866 time 4.7548 (4.7548) loss 4.0602 (4.0602) grad_norm 1.1725 (1.1725) [2022-09-30 04:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][100/1251] eta 0:15:42 lr 0.000866 time 0.8273 (0.8187) loss 2.7617 (3.7951) grad_norm 1.1855 (1.1309) [2022-09-30 04:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][200/1251] eta 0:13:51 lr 0.000865 time 0.8272 (0.7914) loss 4.4987 (3.8121) grad_norm 1.0847 (1.1315) [2022-09-30 04:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][300/1251] eta 0:12:30 lr 0.000865 time 0.8049 (0.7888) loss 4.4718 (3.7894) grad_norm 1.0589 (1.1422) [2022-09-30 04:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][400/1251] eta 0:11:09 lr 0.000865 time 0.9248 (0.7861) loss 4.5114 (3.7990) grad_norm 1.1358 (1.1386) [2022-09-30 04:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][500/1251] eta 0:09:48 lr 0.000864 time 0.7602 (0.7841) loss 3.1768 (3.8067) grad_norm 1.1504 (1.1470) [2022-09-30 04:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][600/1251] eta 0:08:29 lr 0.000864 time 0.8171 (0.7825) loss 4.3359 (3.7908) grad_norm 1.4933 (1.1479) [2022-09-30 04:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][700/1251] eta 0:07:11 lr 0.000864 time 0.7880 (0.7827) loss 4.5026 (3.8047) grad_norm 1.1802 (1.1435) [2022-09-30 05:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][800/1251] eta 0:05:52 lr 0.000864 time 0.7975 (0.7823) loss 4.1869 (3.8125) grad_norm 1.0962 (1.1419) [2022-09-30 05:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][900/1251] eta 0:04:34 lr 0.000863 time 0.8208 (0.7821) loss 3.9999 (3.8167) grad_norm 1.2388 (1.1387) [2022-09-30 05:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1000/1251] eta 0:03:16 lr 0.000863 time 0.6711 (0.7818) loss 4.0445 (3.8133) grad_norm 1.0729 (1.1383) [2022-09-30 05:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1100/1251] eta 0:01:57 lr 0.000863 time 0.7513 (0.7811) loss 4.5662 (3.8206) grad_norm 1.2181 (1.1391) [2022-09-30 05:05:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1200/1251] eta 0:00:39 lr 0.000862 time 0.8328 (0.7793) loss 3.2271 (3.8164) grad_norm 1.0440 (1.1357) [2022-09-30 05:06:00 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 72 training takes 0:16:15 [2022-09-30 05:06:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.285 (4.285) Loss 1.2305 (1.2305) Acc@1 72.363 (72.363) Acc@5 90.820 (90.820) [2022-09-30 05:06:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.102 Acc@5 91.302 [2022-09-30 05:06:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-09-30 05:06:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.10% [2022-09-30 05:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][0/1251] eta 1:24:53 lr 0.000862 time 4.0718 (4.0718) loss 4.0050 (4.0050) grad_norm 1.3176 (1.3176) [2022-09-30 05:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][100/1251] eta 0:15:27 lr 0.000862 time 0.9375 (0.8061) loss 3.6242 (3.6865) grad_norm 1.2205 (1.1413) [2022-09-30 05:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][200/1251] eta 0:13:51 lr 0.000862 time 0.7734 (0.7916) loss 3.3511 (3.7368) grad_norm 1.0684 (1.1513) [2022-09-30 05:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][300/1251] eta 0:12:25 lr 0.000861 time 0.7733 (0.7839) loss 3.7276 (3.7689) grad_norm 1.0827 (1.1562) [2022-09-30 05:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][400/1251] eta 0:11:04 lr 0.000861 time 0.8618 (0.7811) loss 3.3966 (3.7711) grad_norm 0.9582 (1.1444) [2022-09-30 05:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][500/1251] eta 0:09:45 lr 0.000861 time 0.7784 (0.7800) loss 4.5059 (3.7548) grad_norm 1.0343 (1.1444) [2022-09-30 05:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][600/1251] eta 0:08:26 lr 0.000861 time 0.8450 (0.7788) loss 3.8956 (3.7474) grad_norm 1.1103 (1.1463) [2022-09-30 05:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][700/1251] eta 0:07:09 lr 0.000860 time 0.8315 (0.7791) loss 4.4651 (3.7580) grad_norm 1.2247 (1.1470) [2022-09-30 05:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][800/1251] eta 0:05:50 lr 0.000860 time 0.7508 (0.7781) loss 4.4378 (3.7585) grad_norm 1.1869 (1.1505) [2022-09-30 05:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][900/1251] eta 0:04:33 lr 0.000860 time 0.8163 (0.7781) loss 4.4954 (3.7672) grad_norm 1.2390 (1.1486) [2022-09-30 05:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1000/1251] eta 0:03:15 lr 0.000859 time 0.6797 (0.7769) loss 4.1289 (3.7682) grad_norm 1.0768 (1.1494) [2022-09-30 05:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1100/1251] eta 0:01:57 lr 0.000859 time 0.8940 (0.7768) loss 4.0972 (3.7666) grad_norm 1.0927 (1.1493) [2022-09-30 05:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1200/1251] eta 0:00:39 lr 0.000859 time 0.8438 (0.7761) loss 2.8729 (3.7674) grad_norm 1.1247 (1.1486) [2022-09-30 05:22:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 73 training takes 0:16:11 [2022-09-30 05:22:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.734 (4.734) Loss 1.2323 (1.2323) Acc@1 70.703 (70.703) Acc@5 90.332 (90.332) [2022-09-30 05:22:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.950 Acc@5 91.288 [2022-09-30 05:22:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-09-30 05:22:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.10% [2022-09-30 05:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][0/1251] eta 1:46:59 lr 0.000859 time 5.1313 (5.1313) loss 3.6715 (3.6715) grad_norm 1.1324 (1.1324) [2022-09-30 05:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][100/1251] eta 0:15:44 lr 0.000858 time 0.8399 (0.8203) loss 3.9962 (3.7888) grad_norm 1.1325 (1.1518) [2022-09-30 05:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][200/1251] eta 0:14:02 lr 0.000858 time 0.8679 (0.8013) loss 2.6903 (3.7955) grad_norm 0.9877 (1.1417) [2022-09-30 05:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][300/1251] eta 0:12:35 lr 0.000858 time 0.8737 (0.7942) loss 4.4707 (3.7691) grad_norm 1.1182 (1.1403) [2022-09-30 05:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][400/1251] eta 0:11:13 lr 0.000858 time 0.7147 (0.7912) loss 3.9372 (3.8017) grad_norm 1.1541 (1.1397) [2022-09-30 05:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][500/1251] eta 0:09:50 lr 0.000857 time 0.8328 (0.7864) loss 3.4197 (3.8043) grad_norm 1.0191 (1.1435) [2022-09-30 05:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][600/1251] eta 0:08:31 lr 0.000857 time 0.8396 (0.7855) loss 3.1281 (3.8129) grad_norm 1.2792 (1.1453) [2022-09-30 05:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][700/1251] eta 0:07:11 lr 0.000857 time 0.8398 (0.7838) loss 4.1557 (3.8052) grad_norm 1.0443 (1.1506) [2022-09-30 05:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][800/1251] eta 0:05:53 lr 0.000856 time 0.8850 (0.7840) loss 4.5264 (3.7996) grad_norm 1.3755 (1.1493) [2022-09-30 05:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][900/1251] eta 0:04:34 lr 0.000856 time 0.8324 (0.7822) loss 4.4293 (3.8043) grad_norm 1.1364 (1.1513) [2022-09-30 05:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1000/1251] eta 0:03:16 lr 0.000856 time 0.6230 (0.7820) loss 4.0312 (3.8049) grad_norm 1.1549 (1.1489) [2022-09-30 05:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1100/1251] eta 0:01:57 lr 0.000855 time 0.8550 (0.7812) loss 4.2149 (3.8064) grad_norm 1.2041 (1.1448) [2022-09-30 05:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1200/1251] eta 0:00:39 lr 0.000855 time 0.8540 (0.7807) loss 3.2438 (3.8039) grad_norm 1.0633 (1.1433) [2022-09-30 05:39:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 74 training takes 0:16:15 [2022-09-30 05:39:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.651 (4.651) Loss 1.1427 (1.1427) Acc@1 74.902 (74.902) Acc@5 93.164 (93.164) [2022-09-30 05:39:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.244 Acc@5 91.386 [2022-09-30 05:39:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-09-30 05:39:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.24% [2022-09-30 05:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][0/1251] eta 1:44:26 lr 0.000855 time 5.0093 (5.0093) loss 4.2645 (4.2645) grad_norm 1.1520 (1.1520) [2022-09-30 05:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][100/1251] eta 0:15:37 lr 0.000855 time 0.6825 (0.8146) loss 3.8260 (3.7969) grad_norm 1.0522 (1.1744) [2022-09-30 05:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][200/1251] eta 0:13:57 lr 0.000854 time 0.8184 (0.7971) loss 4.2668 (3.8287) grad_norm 1.1977 (1.1662) [2022-09-30 05:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][300/1251] eta 0:12:29 lr 0.000854 time 0.8159 (0.7884) loss 3.8904 (3.8324) grad_norm 0.9179 (1.1591) [2022-09-30 05:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][400/1251] eta 0:11:07 lr 0.000854 time 0.6860 (0.7844) loss 4.3313 (3.8281) grad_norm 1.1064 (1.1541) [2022-09-30 05:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][500/1251] eta 0:09:48 lr 0.000854 time 0.9469 (0.7834) loss 3.4999 (3.8260) grad_norm 1.2472 (1.1521) [2022-09-30 05:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][600/1251] eta 0:08:29 lr 0.000853 time 0.8362 (0.7820) loss 3.4805 (3.8362) grad_norm 1.0594 (1.1480) [2022-09-30 05:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][700/1251] eta 0:07:09 lr 0.000853 time 0.8346 (0.7804) loss 3.5288 (3.8449) grad_norm 1.1444 (1.1474) [2022-09-30 05:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][800/1251] eta 0:05:51 lr 0.000853 time 0.7707 (0.7784) loss 3.3938 (3.8410) grad_norm 1.3392 (1.1502) [2022-09-30 05:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][900/1251] eta 0:04:33 lr 0.000852 time 0.6547 (0.7778) loss 4.0446 (3.8389) grad_norm 1.0787 (1.1483) [2022-09-30 05:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1000/1251] eta 0:03:15 lr 0.000852 time 0.7883 (0.7778) loss 4.4549 (3.8410) grad_norm 1.1063 (1.1512) [2022-09-30 05:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1100/1251] eta 0:01:57 lr 0.000852 time 0.7923 (0.7783) loss 3.7156 (3.8406) grad_norm 1.0969 (1.1512) [2022-09-30 05:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1200/1251] eta 0:00:39 lr 0.000851 time 0.6880 (0.7783) loss 4.5094 (3.8364) grad_norm 1.3160 (1.1516) [2022-09-30 05:55:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 75 training takes 0:16:12 [2022-09-30 05:55:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.602 (4.602) Loss 1.2294 (1.2294) Acc@1 71.875 (71.875) Acc@5 91.504 (91.504) [2022-09-30 05:56:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.110 Acc@5 91.206 [2022-09-30 05:56:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-09-30 05:56:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.24% [2022-09-30 05:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][0/1251] eta 1:26:21 lr 0.000851 time 4.1416 (4.1416) loss 4.0508 (4.0508) grad_norm 1.0333 (1.0333) [2022-09-30 05:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][100/1251] eta 0:15:36 lr 0.000851 time 0.6834 (0.8137) loss 3.9661 (3.8517) grad_norm 1.0462 (1.1687) [2022-09-30 05:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][200/1251] eta 0:13:56 lr 0.000851 time 0.9692 (0.7961) loss 2.6843 (3.8332) grad_norm 1.1344 (1.1588) [2022-09-30 06:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][300/1251] eta 0:12:36 lr 0.000850 time 0.8283 (0.7951) loss 3.4299 (3.7876) grad_norm 1.0926 (1.1572) [2022-09-30 06:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][400/1251] eta 0:11:13 lr 0.000850 time 0.8264 (0.7917) loss 2.8639 (3.7779) grad_norm 1.1726 (1.1526) [2022-09-30 06:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][500/1251] eta 0:09:53 lr 0.000850 time 0.7911 (0.7897) loss 3.1143 (3.7871) grad_norm 1.0918 (1.1575) [2022-09-30 06:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][600/1251] eta 0:08:33 lr 0.000850 time 0.5736 (0.7884) loss 4.0870 (3.7929) grad_norm 1.3209 (1.1555) [2022-09-30 06:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][700/1251] eta 0:07:14 lr 0.000849 time 0.8112 (0.7883) loss 4.5316 (3.8086) grad_norm 1.0661 (1.1559) [2022-09-30 06:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][800/1251] eta 0:05:54 lr 0.000849 time 0.7991 (0.7870) loss 3.9849 (3.8131) grad_norm 1.0170 (1.1521) [2022-09-30 06:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][900/1251] eta 0:04:35 lr 0.000849 time 0.7663 (0.7858) loss 4.1711 (3.8131) grad_norm 1.3059 (1.1517) [2022-09-30 06:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1000/1251] eta 0:03:17 lr 0.000848 time 0.7239 (0.7858) loss 4.1130 (3.8047) grad_norm 1.0501 (1.1504) [2022-09-30 06:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1100/1251] eta 0:01:58 lr 0.000848 time 0.8403 (0.7851) loss 3.8530 (3.8071) grad_norm 1.1893 (1.1525) [2022-09-30 06:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1200/1251] eta 0:00:39 lr 0.000848 time 0.4717 (0.7839) loss 4.4940 (3.7991) grad_norm 1.3325 (1.1543) [2022-09-30 06:12:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 76 training takes 0:16:07 [2022-09-30 06:12:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.933 (3.933) Loss 1.2193 (1.2193) Acc@1 71.387 (71.387) Acc@5 90.039 (90.039) [2022-09-30 06:12:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.370 Acc@5 91.472 [2022-09-30 06:12:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-09-30 06:12:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.37% [2022-09-30 06:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][0/1251] eta 1:34:04 lr 0.000848 time 4.5117 (4.5117) loss 2.9521 (2.9521) grad_norm 0.9786 (0.9786) [2022-09-30 06:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][100/1251] eta 0:15:27 lr 0.000847 time 0.8409 (0.8055) loss 3.6191 (3.8505) grad_norm 1.0566 (1.1198) [2022-09-30 06:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][200/1251] eta 0:13:51 lr 0.000847 time 0.8275 (0.7907) loss 2.9315 (3.8372) grad_norm 1.0589 (1.1303) [2022-09-30 06:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][300/1251] eta 0:12:28 lr 0.000847 time 0.6697 (0.7871) loss 3.8580 (3.8438) grad_norm 0.9981 (1.1438) [2022-09-30 06:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][400/1251] eta 0:11:06 lr 0.000846 time 0.8121 (0.7833) loss 3.7189 (3.8416) grad_norm 1.1258 (1.1433) [2022-09-30 06:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][500/1251] eta 0:09:47 lr 0.000846 time 0.8163 (0.7820) loss 2.8544 (3.8453) grad_norm 1.1646 (1.1466) [2022-09-30 06:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][600/1251] eta 0:08:27 lr 0.000846 time 0.7721 (0.7799) loss 3.8239 (3.8326) grad_norm 1.1923 (1.1487) [2022-09-30 06:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][700/1251] eta 0:07:08 lr 0.000846 time 0.6961 (0.7779) loss 4.3173 (3.8289) grad_norm 1.1486 (1.1455) [2022-09-30 06:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][800/1251] eta 0:05:50 lr 0.000845 time 0.7757 (0.7774) loss 2.8422 (3.8296) grad_norm 1.4516 (1.1465) [2022-09-30 06:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][900/1251] eta 0:04:32 lr 0.000845 time 0.9154 (0.7773) loss 2.7474 (3.8230) grad_norm 1.1433 (1.1520) [2022-09-30 06:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1000/1251] eta 0:03:15 lr 0.000845 time 0.7031 (0.7779) loss 2.9218 (3.8222) grad_norm 1.2046 (1.1551) [2022-09-30 06:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1100/1251] eta 0:01:57 lr 0.000844 time 0.7475 (0.7776) loss 4.3798 (3.8251) grad_norm 1.3718 (1.1552) [2022-09-30 06:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1200/1251] eta 0:00:39 lr 0.000844 time 0.6905 (0.7773) loss 4.0098 (3.8127) grad_norm 0.9936 (1.1563) [2022-09-30 06:28:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 77 training takes 0:16:11 [2022-09-30 06:28:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.444 (4.444) Loss 1.1615 (1.1615) Acc@1 72.754 (72.754) Acc@5 92.578 (92.578) [2022-09-30 06:29:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.610 Acc@5 91.498 [2022-09-30 06:29:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-09-30 06:29:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.61% [2022-09-30 06:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][0/1251] eta 1:38:27 lr 0.000844 time 4.7223 (4.7223) loss 4.1068 (4.1068) grad_norm 1.2713 (1.2713) [2022-09-30 06:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][100/1251] eta 0:15:39 lr 0.000844 time 0.9123 (0.8159) loss 4.0839 (3.8184) grad_norm 1.0551 (1.1798) [2022-09-30 06:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][200/1251] eta 0:14:02 lr 0.000843 time 0.7671 (0.8018) loss 4.2803 (3.7656) grad_norm 1.0875 (1.1620) [2022-09-30 06:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][300/1251] eta 0:12:34 lr 0.000843 time 0.8508 (0.7930) loss 2.6583 (3.7598) grad_norm 1.2524 (1.1639) [2022-09-30 06:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][400/1251] eta 0:11:09 lr 0.000843 time 0.7910 (0.7869) loss 4.4539 (3.7560) grad_norm 1.3603 (1.1652) [2022-09-30 06:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][500/1251] eta 0:09:48 lr 0.000842 time 0.7317 (0.7841) loss 4.6199 (3.7706) grad_norm 1.2054 (1.1594) [2022-09-30 06:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][600/1251] eta 0:08:29 lr 0.000842 time 0.8058 (0.7830) loss 2.2764 (3.7594) grad_norm 1.2349 (1.1545) [2022-09-30 06:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][700/1251] eta 0:07:09 lr 0.000842 time 0.6995 (0.7798) loss 4.3128 (3.7629) grad_norm 1.0735 (1.1559) [2022-09-30 06:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][800/1251] eta 0:05:51 lr 0.000841 time 0.8587 (0.7794) loss 3.0204 (3.7766) grad_norm 1.2625 (1.1573) [2022-09-30 06:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][900/1251] eta 0:04:33 lr 0.000841 time 0.7717 (0.7783) loss 3.8479 (3.7807) grad_norm 1.1622 (1.1558) [2022-09-30 06:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1000/1251] eta 0:03:14 lr 0.000841 time 0.6722 (0.7768) loss 2.6472 (3.7904) grad_norm 1.1680 (1.1573) [2022-09-30 06:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1100/1251] eta 0:01:57 lr 0.000841 time 0.8795 (0.7763) loss 2.8361 (3.7941) grad_norm 1.0324 (1.1570) [2022-09-30 06:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1200/1251] eta 0:00:39 lr 0.000840 time 0.7414 (0.7763) loss 4.7949 (3.7978) grad_norm 1.2878 (1.1571) [2022-09-30 06:45:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 78 training takes 0:16:10 [2022-09-30 06:45:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.057 (4.057) Loss 1.1055 (1.1055) Acc@1 73.047 (73.047) Acc@5 92.871 (92.871) [2022-09-30 06:45:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.796 Acc@5 91.582 [2022-09-30 06:45:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-09-30 06:45:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 06:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][0/1251] eta 1:46:44 lr 0.000840 time 5.1199 (5.1199) loss 2.9715 (2.9715) grad_norm 1.1647 (1.1647) [2022-09-30 06:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][100/1251] eta 0:15:35 lr 0.000840 time 0.7933 (0.8128) loss 3.7019 (3.7105) grad_norm 1.2883 (1.1473) [2022-09-30 06:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][200/1251] eta 0:13:56 lr 0.000839 time 0.6915 (0.7959) loss 3.7778 (3.7793) grad_norm 1.2278 (1.1547) [2022-09-30 06:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][300/1251] eta 0:12:30 lr 0.000839 time 0.8747 (0.7887) loss 3.6135 (3.7733) grad_norm 1.2070 (1.1624) [2022-09-30 06:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][400/1251] eta 0:11:08 lr 0.000839 time 0.7838 (0.7856) loss 2.9631 (3.7675) grad_norm 1.2266 (1.1605) [2022-09-30 06:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][500/1251] eta 0:09:47 lr 0.000839 time 0.8221 (0.7817) loss 3.6280 (3.7912) grad_norm 1.0903 (1.1567) [2022-09-30 06:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][600/1251] eta 0:08:27 lr 0.000838 time 0.8344 (0.7801) loss 3.5490 (3.8048) grad_norm 1.2321 (1.1608) [2022-09-30 06:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][700/1251] eta 0:07:08 lr 0.000838 time 0.8160 (0.7781) loss 4.2760 (3.8101) grad_norm 1.1010 (1.1600) [2022-09-30 06:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][800/1251] eta 0:05:50 lr 0.000838 time 0.6719 (0.7765) loss 3.8300 (3.7993) grad_norm 1.2236 (1.1602) [2022-09-30 06:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][900/1251] eta 0:04:32 lr 0.000837 time 0.7903 (0.7768) loss 3.7042 (3.7944) grad_norm 1.1382 (1.1621) [2022-09-30 06:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1000/1251] eta 0:03:15 lr 0.000837 time 0.8474 (0.7771) loss 3.3864 (3.7913) grad_norm 1.0536 (1.1637) [2022-09-30 06:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1100/1251] eta 0:01:57 lr 0.000837 time 0.7967 (0.7769) loss 4.2613 (3.7933) grad_norm 0.9635 (1.1622) [2022-09-30 07:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1200/1251] eta 0:00:39 lr 0.000836 time 0.7786 (0.7768) loss 3.4449 (3.7951) grad_norm 1.1527 (1.1616) [2022-09-30 07:01:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 79 training takes 0:16:11 [2022-09-30 07:01:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.579 (4.579) Loss 1.1757 (1.1757) Acc@1 72.754 (72.754) Acc@5 91.992 (91.992) [2022-09-30 07:02:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.438 Acc@5 91.444 [2022-09-30 07:02:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-09-30 07:02:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 07:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][0/1251] eta 1:42:28 lr 0.000836 time 4.9148 (4.9148) loss 3.1242 (3.1242) grad_norm 1.1620 (1.1620) [2022-09-30 07:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][100/1251] eta 0:15:39 lr 0.000836 time 0.6165 (0.8160) loss 4.1611 (3.7450) grad_norm 1.2301 (1.1311) [2022-09-30 07:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][200/1251] eta 0:13:52 lr 0.000836 time 0.8104 (0.7924) loss 4.0401 (3.7760) grad_norm 1.1382 (1.1443) [2022-09-30 07:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][300/1251] eta 0:12:30 lr 0.000835 time 0.7948 (0.7888) loss 3.1346 (3.7950) grad_norm 1.2641 (1.1539) [2022-09-30 07:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][400/1251] eta 0:11:09 lr 0.000835 time 0.7756 (0.7861) loss 3.6345 (3.7746) grad_norm 1.3768 (1.1538) [2022-09-30 07:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][500/1251] eta 0:09:48 lr 0.000835 time 0.9008 (0.7832) loss 4.0523 (3.7691) grad_norm 1.1130 (1.1585) [2022-09-30 07:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][600/1251] eta 0:08:28 lr 0.000834 time 0.8447 (0.7816) loss 3.1915 (3.7631) grad_norm 1.0227 (1.1555) [2022-09-30 07:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][700/1251] eta 0:07:09 lr 0.000834 time 0.6747 (0.7793) loss 3.0509 (3.7671) grad_norm 0.9462 (1.1548) [2022-09-30 07:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][800/1251] eta 0:05:51 lr 0.000834 time 0.7142 (0.7789) loss 4.1471 (3.7728) grad_norm 1.2660 (1.1588) [2022-09-30 07:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][900/1251] eta 0:04:33 lr 0.000833 time 0.6885 (0.7790) loss 4.0545 (3.7642) grad_norm 1.1390 (1.1610) [2022-09-30 07:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1000/1251] eta 0:03:15 lr 0.000833 time 0.7893 (0.7795) loss 3.0338 (3.7660) grad_norm 1.1031 (1.1593) [2022-09-30 07:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1100/1251] eta 0:01:57 lr 0.000833 time 0.6500 (0.7790) loss 2.8534 (3.7683) grad_norm 1.0483 (1.1605) [2022-09-30 07:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1200/1251] eta 0:00:39 lr 0.000833 time 0.8367 (0.7790) loss 4.3110 (3.7677) grad_norm 1.1658 (1.1613) [2022-09-30 07:18:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 80 training takes 0:16:14 [2022-09-30 07:18:25 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saving...... [2022-09-30 07:18:26 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saved !!! [2022-09-30 07:18:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.200 (4.200) Loss 1.2040 (1.2040) Acc@1 70.410 (70.410) Acc@5 91.211 (91.211) [2022-09-30 07:18:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.682 Acc@5 91.584 [2022-09-30 07:18:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-09-30 07:18:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 07:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][0/1251] eta 1:38:49 lr 0.000832 time 4.7400 (4.7400) loss 3.0728 (3.0728) grad_norm 1.0656 (1.0656) [2022-09-30 07:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][100/1251] eta 0:15:34 lr 0.000832 time 0.6797 (0.8119) loss 3.8007 (3.7573) grad_norm 0.9750 (1.1815) [2022-09-30 07:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][200/1251] eta 0:13:56 lr 0.000832 time 0.8577 (0.7955) loss 3.4888 (3.7582) grad_norm 1.2671 (1.1891) [2022-09-30 07:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][300/1251] eta 0:12:32 lr 0.000831 time 0.7341 (0.7911) loss 4.0007 (3.7572) grad_norm 1.1302 (1.1818) [2022-09-30 07:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][400/1251] eta 0:11:09 lr 0.000831 time 0.6804 (0.7862) loss 4.2965 (3.7737) grad_norm 1.0594 (1.1778) [2022-09-30 07:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][500/1251] eta 0:09:47 lr 0.000831 time 0.7032 (0.7828) loss 4.3231 (3.7857) grad_norm 1.3491 (1.1764) [2022-09-30 07:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][600/1251] eta 0:08:27 lr 0.000830 time 0.6876 (0.7803) loss 4.1707 (3.7852) grad_norm 1.1540 (1.1743) [2022-09-30 07:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][700/1251] eta 0:07:10 lr 0.000830 time 0.8999 (0.7807) loss 4.0832 (3.7923) grad_norm 1.0439 (1.1746) [2022-09-30 07:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][800/1251] eta 0:05:51 lr 0.000830 time 0.7814 (0.7803) loss 4.3229 (3.7816) grad_norm 1.4093 (1.1703) [2022-09-30 07:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][900/1251] eta 0:04:33 lr 0.000830 time 0.7234 (0.7798) loss 4.1931 (3.7904) grad_norm 0.9456 (1.1709) [2022-09-30 07:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1000/1251] eta 0:03:15 lr 0.000829 time 0.7154 (0.7795) loss 4.0037 (3.7884) grad_norm 1.2053 (1.1690) [2022-09-30 07:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1100/1251] eta 0:01:57 lr 0.000829 time 0.8360 (0.7794) loss 3.5814 (3.7848) grad_norm 1.1784 (1.1679) [2022-09-30 07:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1200/1251] eta 0:00:39 lr 0.000829 time 0.9074 (0.7794) loss 3.2961 (3.7828) grad_norm 1.1491 (1.1678) [2022-09-30 07:35:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 81 training takes 0:16:15 [2022-09-30 07:35:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.622 (4.622) Loss 1.0860 (1.0860) Acc@1 73.438 (73.438) Acc@5 93.652 (93.652) [2022-09-30 07:35:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.622 Acc@5 91.594 [2022-09-30 07:35:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-09-30 07:35:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 07:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][0/1251] eta 1:32:39 lr 0.000828 time 4.4443 (4.4443) loss 2.8405 (2.8405) grad_norm 1.0102 (1.0102) [2022-09-30 07:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][100/1251] eta 0:15:31 lr 0.000828 time 0.7830 (0.8094) loss 4.3063 (3.7523) grad_norm 1.1617 (1.1504) [2022-09-30 07:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][200/1251] eta 0:13:52 lr 0.000828 time 0.7788 (0.7920) loss 3.0339 (3.7764) grad_norm 0.9942 (1.1714) [2022-09-30 07:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][300/1251] eta 0:12:23 lr 0.000828 time 0.8071 (0.7816) loss 4.0402 (3.7876) grad_norm 1.2042 (1.1728) [2022-09-30 07:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][400/1251] eta 0:11:02 lr 0.000827 time 0.8341 (0.7786) loss 2.8316 (3.7819) grad_norm 1.1992 (1.1723) [2022-09-30 07:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][500/1251] eta 0:09:45 lr 0.000827 time 0.5993 (0.7790) loss 3.9312 (3.7667) grad_norm 1.0013 (1.1730) [2022-09-30 07:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][600/1251] eta 0:08:27 lr 0.000827 time 0.8259 (0.7789) loss 4.4646 (3.7645) grad_norm 1.0758 (1.1710) [2022-09-30 07:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][700/1251] eta 0:07:09 lr 0.000826 time 0.7999 (0.7795) loss 4.6024 (3.7760) grad_norm 1.1779 (1.1705) [2022-09-30 07:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][800/1251] eta 0:05:51 lr 0.000826 time 0.7745 (0.7788) loss 3.0560 (3.7730) grad_norm 1.0488 (1.1696) [2022-09-30 07:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][900/1251] eta 0:04:33 lr 0.000826 time 0.7305 (0.7787) loss 3.9565 (3.7785) grad_norm 1.2150 (1.1699) [2022-09-30 07:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1000/1251] eta 0:03:15 lr 0.000825 time 0.8016 (0.7785) loss 3.2936 (3.7837) grad_norm 1.0881 (1.1695) [2022-09-30 07:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1100/1251] eta 0:01:57 lr 0.000825 time 0.7369 (0.7790) loss 4.4442 (3.7804) grad_norm 1.2350 (1.1692) [2022-09-30 07:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1200/1251] eta 0:00:39 lr 0.000825 time 0.8232 (0.7789) loss 4.6512 (3.7739) grad_norm 1.5174 (1.1710) [2022-09-30 07:51:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 82 training takes 0:16:13 [2022-09-30 07:51:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.406 (4.406) Loss 1.1615 (1.1615) Acc@1 73.047 (73.047) Acc@5 91.895 (91.895) [2022-09-30 07:51:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.738 Acc@5 91.728 [2022-09-30 07:51:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-09-30 07:51:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 07:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][0/1251] eta 1:47:34 lr 0.000825 time 5.1594 (5.1594) loss 3.4716 (3.4716) grad_norm 1.1940 (1.1940) [2022-09-30 07:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][100/1251] eta 0:15:49 lr 0.000824 time 0.8917 (0.8246) loss 3.4319 (3.8360) grad_norm 1.4426 (1.1424) [2022-09-30 07:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][200/1251] eta 0:14:03 lr 0.000824 time 0.7780 (0.8029) loss 3.3761 (3.8016) grad_norm 1.1493 (1.1566) [2022-09-30 07:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][300/1251] eta 0:12:32 lr 0.000824 time 0.8176 (0.7916) loss 4.1716 (3.7859) grad_norm 1.1122 (1.1687) [2022-09-30 07:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][400/1251] eta 0:11:09 lr 0.000823 time 0.7294 (0.7862) loss 3.5121 (3.7657) grad_norm 1.1207 (1.1655) [2022-09-30 07:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][500/1251] eta 0:09:48 lr 0.000823 time 0.6860 (0.7831) loss 3.2638 (3.7562) grad_norm 1.1412 (1.1674) [2022-09-30 07:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][600/1251] eta 0:08:29 lr 0.000823 time 0.8725 (0.7831) loss 2.5863 (3.7515) grad_norm 1.0799 (1.1684) [2022-09-30 08:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][700/1251] eta 0:07:10 lr 0.000822 time 0.8032 (0.7813) loss 2.6953 (3.7576) grad_norm 1.1607 (1.1658) [2022-09-30 08:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][800/1251] eta 0:05:52 lr 0.000822 time 0.7284 (0.7809) loss 3.3716 (3.7544) grad_norm 1.1699 (1.1667) [2022-09-30 08:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][900/1251] eta 0:04:33 lr 0.000822 time 0.8438 (0.7799) loss 3.3696 (3.7540) grad_norm 1.0743 (1.1699) [2022-09-30 08:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1000/1251] eta 0:03:15 lr 0.000821 time 0.8489 (0.7789) loss 4.3772 (3.7569) grad_norm 1.2624 (1.1706) [2022-09-30 08:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1100/1251] eta 0:01:57 lr 0.000821 time 0.9402 (0.7782) loss 3.5277 (3.7505) grad_norm 1.0408 (1.1721) [2022-09-30 08:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1200/1251] eta 0:00:39 lr 0.000821 time 0.7248 (0.7772) loss 3.4087 (3.7550) grad_norm 1.1174 (1.1731) [2022-09-30 08:08:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 83 training takes 0:16:12 [2022-09-30 08:08:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.653 (4.653) Loss 1.1929 (1.1929) Acc@1 70.996 (70.996) Acc@5 91.699 (91.699) [2022-09-30 08:08:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.644 Acc@5 91.714 [2022-09-30 08:08:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-09-30 08:08:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 08:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][0/1251] eta 1:39:16 lr 0.000821 time 4.7614 (4.7614) loss 3.4942 (3.4942) grad_norm 0.9929 (0.9929) [2022-09-30 08:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][100/1251] eta 0:15:38 lr 0.000820 time 0.8517 (0.8154) loss 2.7270 (3.7432) grad_norm 1.0344 (1.1674) [2022-09-30 08:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][200/1251] eta 0:13:56 lr 0.000820 time 0.8443 (0.7958) loss 4.0921 (3.7487) grad_norm 1.1574 (1.1800) [2022-09-30 08:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][300/1251] eta 0:12:30 lr 0.000820 time 0.7529 (0.7889) loss 2.6285 (3.7315) grad_norm 1.3046 (1.1786) [2022-09-30 08:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][400/1251] eta 0:11:08 lr 0.000819 time 0.7535 (0.7856) loss 2.5820 (3.7578) grad_norm 1.0944 (1.1773) [2022-09-30 08:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][500/1251] eta 0:09:46 lr 0.000819 time 0.7828 (0.7807) loss 3.8870 (3.7718) grad_norm 1.3556 (1.1772) [2022-09-30 08:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][600/1251] eta 0:08:27 lr 0.000819 time 0.6862 (0.7795) loss 4.0571 (3.7651) grad_norm 1.4093 (1.1714) [2022-09-30 08:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][700/1251] eta 0:07:09 lr 0.000818 time 0.8606 (0.7791) loss 2.8757 (3.7731) grad_norm 1.0291 (1.1715) [2022-09-30 08:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][800/1251] eta 0:05:51 lr 0.000818 time 0.8867 (0.7796) loss 3.8586 (3.7742) grad_norm 1.0703 (1.1729) [2022-09-30 08:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][900/1251] eta 0:04:33 lr 0.000818 time 0.7772 (0.7786) loss 3.9590 (3.7814) grad_norm 1.2708 (1.1724) [2022-09-30 08:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1000/1251] eta 0:03:15 lr 0.000817 time 0.8447 (0.7783) loss 2.9859 (3.7821) grad_norm 1.1750 (1.1722) [2022-09-30 08:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1100/1251] eta 0:01:57 lr 0.000817 time 0.8540 (0.7785) loss 3.2028 (3.7797) grad_norm 1.0038 (1.1732) [2022-09-30 08:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1200/1251] eta 0:00:39 lr 0.000817 time 0.7604 (0.7779) loss 4.3975 (3.7789) grad_norm 1.0529 (1.1754) [2022-09-30 08:24:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 84 training takes 0:16:12 [2022-09-30 08:24:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.395 (4.395) Loss 1.2106 (1.2106) Acc@1 72.559 (72.559) Acc@5 90.820 (90.820) [2022-09-30 08:25:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.642 Acc@5 91.520 [2022-09-30 08:25:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-09-30 08:25:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.80% [2022-09-30 08:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][0/1251] eta 1:47:00 lr 0.000817 time 5.1321 (5.1321) loss 2.9530 (2.9530) grad_norm 1.2214 (1.2214) [2022-09-30 08:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][100/1251] eta 0:15:41 lr 0.000816 time 0.7578 (0.8182) loss 2.8841 (3.7590) grad_norm 1.0790 (1.1850) [2022-09-30 08:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][200/1251] eta 0:13:56 lr 0.000816 time 0.6675 (0.7956) loss 3.9935 (3.7509) grad_norm 1.0505 (1.1841) [2022-09-30 08:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][300/1251] eta 0:12:28 lr 0.000816 time 0.8043 (0.7875) loss 4.3247 (3.7544) grad_norm 1.2031 (1.1870) [2022-09-30 08:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][400/1251] eta 0:11:07 lr 0.000815 time 0.6816 (0.7847) loss 4.0743 (3.7659) grad_norm 1.1412 (1.1861) [2022-09-30 08:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][500/1251] eta 0:09:48 lr 0.000815 time 0.7948 (0.7837) loss 3.4311 (3.7551) grad_norm 1.2168 (1.1861) [2022-09-30 08:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][600/1251] eta 0:08:28 lr 0.000815 time 0.8194 (0.7818) loss 3.2721 (3.7552) grad_norm 1.2211 (1.1853) [2022-09-30 08:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][700/1251] eta 0:07:10 lr 0.000814 time 0.8335 (0.7812) loss 4.7067 (3.7568) grad_norm 1.1123 (1.1847) [2022-09-30 08:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][800/1251] eta 0:05:44 lr 0.000814 time 0.8220 (0.7634) loss 3.9239 (3.7518) grad_norm 1.3829 (1.1860) [2022-09-30 08:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][900/1251] eta 0:04:28 lr 0.000814 time 0.7181 (0.7647) loss 4.1933 (3.7492) grad_norm 1.0499 (1.1842) [2022-09-30 08:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1000/1251] eta 0:03:12 lr 0.000813 time 0.8617 (0.7661) loss 3.3825 (3.7532) grad_norm 1.2479 (1.1841) [2022-09-30 08:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1100/1251] eta 0:01:55 lr 0.000813 time 0.8372 (0.7674) loss 4.4089 (3.7529) grad_norm 0.9426 (1.1818) [2022-09-30 08:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1200/1251] eta 0:00:39 lr 0.000813 time 0.8232 (0.7673) loss 4.2320 (3.7579) grad_norm 1.0709 (1.1797) [2022-09-30 08:41:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 85 training takes 0:15:59 [2022-09-30 08:41:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.741 (4.741) Loss 1.1713 (1.1713) Acc@1 71.387 (71.387) Acc@5 92.090 (92.090) [2022-09-30 08:41:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.988 Acc@5 91.654 [2022-09-30 08:41:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-09-30 08:41:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.99% [2022-09-30 08:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][0/1251] eta 1:29:51 lr 0.000812 time 4.3099 (4.3099) loss 4.3812 (4.3812) grad_norm 1.0875 (1.0875) [2022-09-30 08:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][100/1251] eta 0:15:26 lr 0.000812 time 0.7857 (0.8050) loss 4.4300 (3.8322) grad_norm 1.0976 (1.2224) [2022-09-30 08:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][200/1251] eta 0:13:54 lr 0.000812 time 0.8883 (0.7941) loss 3.9329 (3.8079) grad_norm 1.1053 (1.1938) [2022-09-30 08:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][300/1251] eta 0:12:29 lr 0.000811 time 0.7173 (0.7877) loss 3.8834 (3.8470) grad_norm 1.0350 (1.1831) [2022-09-30 08:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][400/1251] eta 0:11:07 lr 0.000811 time 0.7004 (0.7840) loss 4.0385 (3.8226) grad_norm 1.1918 (1.1851) [2022-09-30 08:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][500/1251] eta 0:09:47 lr 0.000811 time 0.7865 (0.7827) loss 3.8230 (3.7987) grad_norm 1.0914 (1.1882) [2022-09-30 08:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][600/1251] eta 0:08:28 lr 0.000811 time 0.7783 (0.7816) loss 4.5363 (3.7933) grad_norm 1.2243 (1.1867) [2022-09-30 08:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][700/1251] eta 0:07:10 lr 0.000810 time 0.8307 (0.7804) loss 3.0148 (3.7986) grad_norm 1.2227 (1.1849) [2022-09-30 08:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][800/1251] eta 0:05:51 lr 0.000810 time 0.7847 (0.7787) loss 4.4733 (3.7917) grad_norm 1.0480 (1.1841) [2022-09-30 08:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][900/1251] eta 0:04:33 lr 0.000810 time 0.8333 (0.7782) loss 4.3731 (3.7929) grad_norm 1.2519 (1.1846) [2022-09-30 08:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1000/1251] eta 0:03:15 lr 0.000809 time 0.8390 (0.7782) loss 3.8305 (3.7839) grad_norm 1.1050 (1.1894) [2022-09-30 08:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1100/1251] eta 0:01:57 lr 0.000809 time 0.8204 (0.7782) loss 3.8132 (3.7803) grad_norm 1.0846 (1.1883) [2022-09-30 08:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1200/1251] eta 0:00:39 lr 0.000809 time 0.8989 (0.7778) loss 4.4072 (3.7752) grad_norm 1.3252 (1.1876) [2022-09-30 08:57:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 86 training takes 0:16:13 [2022-09-30 08:57:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 5.024 (5.024) Loss 1.1853 (1.1853) Acc@1 73.633 (73.633) Acc@5 91.699 (91.699) [2022-09-30 08:58:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.924 Acc@5 91.828 [2022-09-30 08:58:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-09-30 08:58:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.99% [2022-09-30 08:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][0/1251] eta 1:32:02 lr 0.000808 time 4.4141 (4.4141) loss 3.9200 (3.9200) grad_norm 1.1875 (1.1875) [2022-09-30 08:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][100/1251] eta 0:15:34 lr 0.000808 time 0.7294 (0.8118) loss 4.0367 (3.7885) grad_norm 1.0826 (1.1660) [2022-09-30 09:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][200/1251] eta 0:13:57 lr 0.000808 time 0.7126 (0.7969) loss 3.2992 (3.6918) grad_norm 1.1576 (1.1752) [2022-09-30 09:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][300/1251] eta 0:12:30 lr 0.000807 time 0.7815 (0.7897) loss 3.5093 (3.7151) grad_norm 1.1176 (1.1825) [2022-09-30 09:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][400/1251] eta 0:11:07 lr 0.000807 time 0.8207 (0.7845) loss 2.3901 (3.7195) grad_norm 1.0756 (1.1832) [2022-09-30 09:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][500/1251] eta 0:09:45 lr 0.000807 time 0.8017 (0.7802) loss 4.0924 (3.7270) grad_norm 1.0931 (1.1826) [2022-09-30 09:05:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][600/1251] eta 0:08:27 lr 0.000806 time 0.6743 (0.7798) loss 3.4595 (3.7402) grad_norm 1.3576 (1.1842) [2022-09-30 09:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][700/1251] eta 0:07:09 lr 0.000806 time 0.6554 (0.7793) loss 3.1085 (3.7405) grad_norm 1.1497 (1.1859) [2022-09-30 09:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][800/1251] eta 0:05:51 lr 0.000806 time 0.7114 (0.7785) loss 3.6005 (3.7416) grad_norm 1.2707 (1.1835) [2022-09-30 09:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][900/1251] eta 0:04:33 lr 0.000805 time 0.7603 (0.7779) loss 3.6815 (3.7296) grad_norm 1.3917 (1.1830) [2022-09-30 09:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1000/1251] eta 0:03:14 lr 0.000805 time 0.6291 (0.7769) loss 2.5369 (3.7303) grad_norm 1.0886 (1.1820) [2022-09-30 09:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1100/1251] eta 0:01:57 lr 0.000805 time 0.8181 (0.7760) loss 4.4691 (3.7375) grad_norm 1.1148 (1.1821) [2022-09-30 09:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1200/1251] eta 0:00:39 lr 0.000804 time 0.7293 (0.7755) loss 3.8478 (3.7429) grad_norm 1.0799 (1.1827) [2022-09-30 09:14:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 87 training takes 0:16:10 [2022-09-30 09:14:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.852 (3.852) Loss 1.2320 (1.2320) Acc@1 71.875 (71.875) Acc@5 91.016 (91.016) [2022-09-30 09:14:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.868 Acc@5 91.828 [2022-09-30 09:14:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-09-30 09:14:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.99% [2022-09-30 09:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][0/1251] eta 1:33:24 lr 0.000804 time 4.4797 (4.4797) loss 4.2315 (4.2315) grad_norm 1.2340 (1.2340) [2022-09-30 09:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][100/1251] eta 0:15:52 lr 0.000804 time 0.8154 (0.8277) loss 4.1168 (3.6975) grad_norm 1.1253 (1.1908) [2022-09-30 09:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][200/1251] eta 0:14:03 lr 0.000804 time 0.8226 (0.8022) loss 4.4178 (3.7133) grad_norm 1.1915 (1.1769) [2022-09-30 09:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][300/1251] eta 0:12:33 lr 0.000803 time 0.8322 (0.7927) loss 4.0282 (3.7152) grad_norm 1.0508 (1.1777) [2022-09-30 09:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][400/1251] eta 0:11:10 lr 0.000803 time 0.7791 (0.7877) loss 3.5289 (3.7041) grad_norm 1.5437 (1.1768) [2022-09-30 09:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][500/1251] eta 0:09:49 lr 0.000803 time 0.7906 (0.7845) loss 2.9111 (3.7219) grad_norm 1.0315 (1.1810) [2022-09-30 09:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][600/1251] eta 0:08:30 lr 0.000802 time 0.8990 (0.7835) loss 4.0933 (3.7250) grad_norm 1.1083 (1.1814) [2022-09-30 09:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][700/1251] eta 0:07:10 lr 0.000802 time 0.6864 (0.7819) loss 4.5409 (3.7235) grad_norm 1.1762 (1.1792) [2022-09-30 09:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][800/1251] eta 0:05:51 lr 0.000802 time 0.6226 (0.7793) loss 2.7839 (3.7250) grad_norm 1.0258 (1.1810) [2022-09-30 09:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][900/1251] eta 0:04:33 lr 0.000801 time 0.7750 (0.7801) loss 3.2504 (3.7328) grad_norm 1.1181 (1.1852) [2022-09-30 09:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1000/1251] eta 0:03:15 lr 0.000801 time 0.8540 (0.7799) loss 4.2517 (3.7452) grad_norm 1.2639 (1.1855) [2022-09-30 09:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1100/1251] eta 0:01:57 lr 0.000801 time 0.9281 (0.7799) loss 3.2580 (3.7523) grad_norm 1.6536 (1.1861) [2022-09-30 09:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1200/1251] eta 0:00:39 lr 0.000800 time 0.8417 (0.7795) loss 3.8644 (3.7514) grad_norm 1.2467 (1.1875) [2022-09-30 09:30:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 88 training takes 0:16:15 [2022-09-30 09:30:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.248 (4.248) Loss 1.2263 (1.2263) Acc@1 71.191 (71.191) Acc@5 91.699 (91.699) [2022-09-30 09:31:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.894 Acc@5 91.746 [2022-09-30 09:31:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-09-30 09:31:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.99% [2022-09-30 09:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][0/1251] eta 1:45:46 lr 0.000800 time 5.0730 (5.0730) loss 3.8526 (3.8526) grad_norm 1.1594 (1.1594) [2022-09-30 09:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][100/1251] eta 0:15:32 lr 0.000800 time 0.6901 (0.8102) loss 3.6718 (3.6298) grad_norm 0.9380 (1.1971) [2022-09-30 09:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][200/1251] eta 0:13:54 lr 0.000799 time 0.8396 (0.7945) loss 4.1397 (3.6993) grad_norm 1.2121 (1.2080) [2022-09-30 09:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][300/1251] eta 0:12:27 lr 0.000799 time 0.8404 (0.7859) loss 4.1907 (3.7344) grad_norm 1.5301 (1.2004) [2022-09-30 09:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][400/1251] eta 0:11:06 lr 0.000799 time 0.7646 (0.7838) loss 4.3307 (3.7237) grad_norm 1.0942 (1.1965) [2022-09-30 09:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][500/1251] eta 0:09:46 lr 0.000798 time 0.8437 (0.7814) loss 4.0582 (3.7234) grad_norm 1.2639 (1.1974) [2022-09-30 09:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][600/1251] eta 0:08:28 lr 0.000798 time 0.7638 (0.7808) loss 4.3693 (3.7260) grad_norm 1.0895 (1.1992) [2022-09-30 09:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][700/1251] eta 0:07:08 lr 0.000798 time 0.8138 (0.7786) loss 3.8201 (3.7374) grad_norm 1.1238 (1.1973) [2022-09-30 09:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][800/1251] eta 0:05:50 lr 0.000797 time 0.7663 (0.7767) loss 3.9202 (3.7445) grad_norm 1.5509 (1.1991) [2022-09-30 09:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][900/1251] eta 0:04:32 lr 0.000797 time 0.6587 (0.7774) loss 3.5149 (3.7469) grad_norm 1.5119 (1.1975) [2022-09-30 09:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1000/1251] eta 0:03:15 lr 0.000797 time 0.8031 (0.7775) loss 3.2266 (3.7488) grad_norm 1.0935 (1.1973) [2022-09-30 09:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1100/1251] eta 0:01:57 lr 0.000796 time 0.8426 (0.7778) loss 4.1769 (3.7508) grad_norm 1.0696 (1.2004) [2022-09-30 09:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1200/1251] eta 0:00:39 lr 0.000796 time 0.8548 (0.7768) loss 4.0987 (3.7438) grad_norm 1.1016 (1.2002) [2022-09-30 09:47:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 89 training takes 0:16:11 [2022-09-30 09:47:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.960 (3.960) Loss 1.1448 (1.1448) Acc@1 73.828 (73.828) Acc@5 91.504 (91.504) [2022-09-30 09:47:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.048 Acc@5 91.960 [2022-09-30 09:47:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-09-30 09:47:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.05% [2022-09-30 09:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][0/1251] eta 1:48:23 lr 0.000796 time 5.1988 (5.1988) loss 4.0713 (4.0713) grad_norm 1.0981 (1.0981) [2022-09-30 09:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][100/1251] eta 0:15:41 lr 0.000796 time 0.8176 (0.8179) loss 4.5166 (3.7699) grad_norm 1.3209 (1.1924) [2022-09-30 09:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][200/1251] eta 0:13:55 lr 0.000795 time 0.8000 (0.7953) loss 3.8143 (3.7069) grad_norm 0.9897 (1.1950) [2022-09-30 09:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][300/1251] eta 0:12:29 lr 0.000795 time 0.8122 (0.7876) loss 4.2265 (3.7252) grad_norm 1.0777 (1.1935) [2022-09-30 09:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][400/1251] eta 0:11:07 lr 0.000795 time 0.8336 (0.7846) loss 3.1351 (3.7391) grad_norm 1.1602 (1.2010) [2022-09-30 09:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][500/1251] eta 0:09:49 lr 0.000794 time 0.7785 (0.7850) loss 3.7836 (3.7453) grad_norm 1.0901 (1.1968) [2022-09-30 09:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][600/1251] eta 0:08:29 lr 0.000794 time 0.8326 (0.7824) loss 3.8073 (3.7359) grad_norm 1.1189 (1.1987) [2022-09-30 09:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][700/1251] eta 0:07:10 lr 0.000794 time 0.7514 (0.7813) loss 4.1523 (3.7376) grad_norm 1.1619 (1.1999) [2022-09-30 09:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][800/1251] eta 0:05:51 lr 0.000793 time 0.8200 (0.7799) loss 4.3369 (3.7359) grad_norm 1.1249 (1.2027) [2022-09-30 09:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][900/1251] eta 0:04:33 lr 0.000793 time 0.7896 (0.7789) loss 3.7400 (3.7399) grad_norm 1.2727 (1.1994) [2022-09-30 10:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1000/1251] eta 0:03:15 lr 0.000793 time 0.8524 (0.7789) loss 3.3191 (3.7408) grad_norm 1.0432 (1.1973) [2022-09-30 10:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1100/1251] eta 0:01:57 lr 0.000792 time 0.8317 (0.7780) loss 3.5754 (3.7386) grad_norm 1.0538 (1.1970) [2022-09-30 10:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1200/1251] eta 0:00:39 lr 0.000792 time 0.8298 (0.7782) loss 3.0949 (3.7329) grad_norm 1.2556 (1.1968) [2022-09-30 10:03:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 90 training takes 0:16:12 [2022-09-30 10:03:55 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saving...... [2022-09-30 10:03:55 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saved !!! [2022-09-30 10:04:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.342 (4.342) Loss 1.1311 (1.1311) Acc@1 73.438 (73.438) Acc@5 91.797 (91.797) [2022-09-30 10:04:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.380 Acc@5 91.976 [2022-09-30 10:04:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-09-30 10:04:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.38% [2022-09-30 10:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][0/1251] eta 1:45:28 lr 0.000792 time 5.0589 (5.0589) loss 3.4551 (3.4551) grad_norm 1.0955 (1.0955) [2022-09-30 10:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][100/1251] eta 0:15:41 lr 0.000791 time 0.6057 (0.8183) loss 4.0044 (3.7779) grad_norm 1.0664 (1.1980) [2022-09-30 10:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][200/1251] eta 0:13:58 lr 0.000791 time 0.7527 (0.7981) loss 4.1643 (3.7568) grad_norm 1.1020 (1.2030) [2022-09-30 10:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][300/1251] eta 0:12:34 lr 0.000791 time 0.8223 (0.7931) loss 3.8953 (3.7405) grad_norm 1.1315 (1.2158) [2022-09-30 10:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][400/1251] eta 0:11:09 lr 0.000790 time 0.7169 (0.7871) loss 2.9805 (3.7618) grad_norm 1.0745 (1.2096) [2022-09-30 10:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][500/1251] eta 0:09:50 lr 0.000790 time 0.7965 (0.7861) loss 3.5798 (3.7552) grad_norm 1.4002 (1.2064) [2022-09-30 10:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][600/1251] eta 0:08:30 lr 0.000790 time 0.7812 (0.7849) loss 4.2077 (3.7604) grad_norm 1.3077 (1.2101) [2022-09-30 10:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][700/1251] eta 0:07:11 lr 0.000789 time 0.8846 (0.7828) loss 4.3835 (3.7706) grad_norm 1.5110 (1.2126) [2022-09-30 10:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][800/1251] eta 0:05:52 lr 0.000789 time 0.8056 (0.7805) loss 4.0169 (3.7737) grad_norm 1.2154 (1.2118) [2022-09-30 10:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][900/1251] eta 0:04:33 lr 0.000789 time 0.8283 (0.7802) loss 4.1293 (3.7758) grad_norm 1.1139 (1.2096) [2022-09-30 10:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1000/1251] eta 0:03:15 lr 0.000788 time 0.7929 (0.7794) loss 4.1077 (3.7788) grad_norm 1.3682 (1.2090) [2022-09-30 10:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1100/1251] eta 0:01:57 lr 0.000788 time 0.7946 (0.7795) loss 3.7957 (3.7796) grad_norm 1.2130 (1.2078) [2022-09-30 10:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1200/1251] eta 0:00:39 lr 0.000788 time 0.8161 (0.7792) loss 2.8943 (3.7732) grad_norm 1.2193 (1.2074) [2022-09-30 10:20:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 91 training takes 0:16:13 [2022-09-30 10:20:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.096 (4.096) Loss 1.0424 (1.0424) Acc@1 76.758 (76.758) Acc@5 93.262 (93.262) [2022-09-30 10:20:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.580 Acc@5 92.084 [2022-09-30 10:20:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 10:20:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 10:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][0/1251] eta 1:36:55 lr 0.000788 time 4.6483 (4.6483) loss 3.7314 (3.7314) grad_norm 1.2828 (1.2828) [2022-09-30 10:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][100/1251] eta 0:15:42 lr 0.000787 time 0.8459 (0.8188) loss 4.0267 (3.6498) grad_norm 0.9789 (1.2139) [2022-09-30 10:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][200/1251] eta 0:14:04 lr 0.000787 time 0.8328 (0.8033) loss 3.3545 (3.7304) grad_norm 1.2730 (1.2000) [2022-09-30 10:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][300/1251] eta 0:12:35 lr 0.000786 time 0.7592 (0.7947) loss 3.6216 (3.7404) grad_norm 1.0175 (1.1913) [2022-09-30 10:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][400/1251] eta 0:11:12 lr 0.000786 time 0.8574 (0.7903) loss 3.7277 (3.7616) grad_norm 1.1770 (1.1921) [2022-09-30 10:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][500/1251] eta 0:09:51 lr 0.000786 time 0.8372 (0.7880) loss 2.8958 (3.7679) grad_norm 1.1484 (1.1899) [2022-09-30 10:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][600/1251] eta 0:08:31 lr 0.000785 time 0.7260 (0.7859) loss 4.3863 (3.7652) grad_norm 1.1547 (1.1876) [2022-09-30 10:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][700/1251] eta 0:07:12 lr 0.000785 time 0.8732 (0.7843) loss 3.7698 (3.7658) grad_norm 1.1860 (1.1913) [2022-09-30 10:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][800/1251] eta 0:05:53 lr 0.000785 time 0.8063 (0.7835) loss 4.1200 (3.7637) grad_norm 1.4685 (1.1925) [2022-09-30 10:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][900/1251] eta 0:04:34 lr 0.000784 time 0.9340 (0.7834) loss 4.3049 (3.7600) grad_norm 1.1609 (1.1937) [2022-09-30 10:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1000/1251] eta 0:03:16 lr 0.000784 time 0.7102 (0.7822) loss 4.4355 (3.7546) grad_norm 1.0511 (1.1958) [2022-09-30 10:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1100/1251] eta 0:01:58 lr 0.000784 time 0.6400 (0.7817) loss 4.0398 (3.7522) grad_norm 1.0402 (1.1939) [2022-09-30 10:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1200/1251] eta 0:00:39 lr 0.000783 time 0.8533 (0.7812) loss 3.3288 (3.7523) grad_norm 1.2405 (1.1975) [2022-09-30 10:37:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 92 training takes 0:16:16 [2022-09-30 10:37:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.375 (4.375) Loss 1.1727 (1.1727) Acc@1 73.145 (73.145) Acc@5 92.188 (92.188) [2022-09-30 10:37:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.572 Acc@5 92.038 [2022-09-30 10:37:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 10:37:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 10:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][0/1251] eta 1:29:57 lr 0.000783 time 4.3147 (4.3147) loss 4.3660 (4.3660) grad_norm 1.2264 (1.2264) [2022-09-30 10:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][100/1251] eta 0:15:35 lr 0.000783 time 0.7982 (0.8124) loss 4.2688 (3.8164) grad_norm 1.1517 (1.2107) [2022-09-30 10:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][200/1251] eta 0:13:51 lr 0.000783 time 0.8197 (0.7914) loss 4.0279 (3.7959) grad_norm 1.0134 (1.2098) [2022-09-30 10:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][300/1251] eta 0:12:30 lr 0.000782 time 0.6661 (0.7890) loss 2.5482 (3.8022) grad_norm 1.3535 (1.2119) [2022-09-30 10:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][400/1251] eta 0:11:08 lr 0.000782 time 0.8173 (0.7853) loss 3.2025 (3.7920) grad_norm 1.2767 (1.2114) [2022-09-30 10:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][500/1251] eta 0:09:49 lr 0.000782 time 0.8346 (0.7851) loss 4.2462 (3.7899) grad_norm 1.7306 (1.2160) [2022-09-30 10:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][600/1251] eta 0:08:29 lr 0.000781 time 0.9152 (0.7833) loss 3.8212 (3.7803) grad_norm 1.0080 (1.2150) [2022-09-30 10:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][700/1251] eta 0:07:11 lr 0.000781 time 0.8084 (0.7823) loss 4.4677 (3.7765) grad_norm 1.0855 (1.2125) [2022-09-30 10:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][800/1251] eta 0:05:52 lr 0.000780 time 0.7931 (0.7820) loss 2.9450 (3.7783) grad_norm 1.2379 (1.2150) [2022-09-30 10:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][900/1251] eta 0:04:33 lr 0.000780 time 0.8108 (0.7804) loss 2.6806 (3.7718) grad_norm 1.0797 (1.2112) [2022-09-30 10:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1000/1251] eta 0:03:15 lr 0.000780 time 0.8044 (0.7794) loss 4.5330 (3.7673) grad_norm 1.1867 (1.2081) [2022-09-30 10:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1100/1251] eta 0:01:57 lr 0.000779 time 0.9408 (0.7790) loss 3.7114 (3.7664) grad_norm 1.1929 (1.2092) [2022-09-30 10:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1200/1251] eta 0:00:39 lr 0.000779 time 0.7061 (0.7786) loss 3.4788 (3.7639) grad_norm 1.0725 (1.2090) [2022-09-30 10:53:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 93 training takes 0:16:13 [2022-09-30 10:53:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.464 (4.464) Loss 1.1194 (1.1194) Acc@1 73.730 (73.730) Acc@5 92.480 (92.480) [2022-09-30 10:54:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.350 Acc@5 91.920 [2022-09-30 10:54:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-09-30 10:54:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 10:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][0/1251] eta 1:31:03 lr 0.000779 time 4.3676 (4.3676) loss 3.4200 (3.4200) grad_norm 1.1522 (1.1522) [2022-09-30 10:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][100/1251] eta 0:15:36 lr 0.000779 time 0.8396 (0.8138) loss 4.4294 (3.6869) grad_norm 1.1665 (1.2265) [2022-09-30 10:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][200/1251] eta 0:13:55 lr 0.000778 time 0.8483 (0.7949) loss 2.7618 (3.7294) grad_norm 1.1135 (1.2131) [2022-09-30 10:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][300/1251] eta 0:12:19 lr 0.000778 time 0.5759 (0.7779) loss 3.2438 (3.7271) grad_norm 1.0932 (1.2096) [2022-09-30 10:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][400/1251] eta 0:10:41 lr 0.000778 time 0.6694 (0.7543) loss 3.8033 (3.7577) grad_norm 1.3292 (1.2160) [2022-09-30 11:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][500/1251] eta 0:09:29 lr 0.000777 time 0.7995 (0.7586) loss 4.3542 (3.7391) grad_norm 1.3000 (1.2113) [2022-09-30 11:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][600/1251] eta 0:08:15 lr 0.000777 time 0.8655 (0.7612) loss 2.6567 (3.7357) grad_norm 1.0484 (1.2096) [2022-09-30 11:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][700/1251] eta 0:06:59 lr 0.000777 time 0.7389 (0.7621) loss 3.6571 (3.7473) grad_norm 1.2441 (1.2112) [2022-09-30 11:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][800/1251] eta 0:05:44 lr 0.000776 time 0.9127 (0.7633) loss 3.6227 (3.7531) grad_norm 1.3422 (1.2128) [2022-09-30 11:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][900/1251] eta 0:04:28 lr 0.000776 time 0.8513 (0.7653) loss 4.2703 (3.7484) grad_norm 1.3324 (1.2140) [2022-09-30 11:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1000/1251] eta 0:03:12 lr 0.000775 time 0.8584 (0.7663) loss 4.2768 (3.7441) grad_norm 1.0612 (1.2120) [2022-09-30 11:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1100/1251] eta 0:01:55 lr 0.000775 time 0.8172 (0.7669) loss 3.5689 (3.7430) grad_norm 1.1765 (1.2117) [2022-09-30 11:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1200/1251] eta 0:00:39 lr 0.000775 time 0.8274 (0.7666) loss 3.1913 (3.7379) grad_norm 1.1592 (1.2123) [2022-09-30 11:10:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 94 training takes 0:15:59 [2022-09-30 11:10:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.728 (4.728) Loss 1.2253 (1.2253) Acc@1 72.070 (72.070) Acc@5 91.113 (91.113) [2022-09-30 11:10:25 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.404 Acc@5 92.014 [2022-09-30 11:10:25 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-09-30 11:10:25 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 11:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][0/1251] eta 1:47:13 lr 0.000775 time 5.1429 (5.1429) loss 4.1692 (4.1692) grad_norm 1.0659 (1.0659) [2022-09-30 11:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][100/1251] eta 0:15:36 lr 0.000774 time 0.8280 (0.8132) loss 3.0750 (3.7466) grad_norm 1.3496 (1.1883) [2022-09-30 11:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][200/1251] eta 0:13:52 lr 0.000774 time 0.7738 (0.7918) loss 3.2503 (3.7252) grad_norm 1.1591 (1.2159) [2022-09-30 11:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][300/1251] eta 0:12:28 lr 0.000774 time 0.8353 (0.7870) loss 4.1160 (3.7570) grad_norm 1.0068 (1.2085) [2022-09-30 11:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][400/1251] eta 0:11:08 lr 0.000773 time 0.7866 (0.7852) loss 4.4158 (3.7549) grad_norm 1.0643 (1.2074) [2022-09-30 11:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][500/1251] eta 0:09:47 lr 0.000773 time 0.9071 (0.7816) loss 2.5558 (3.7422) grad_norm 1.2663 (1.2078) [2022-09-30 11:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][600/1251] eta 0:08:27 lr 0.000773 time 0.7074 (0.7803) loss 2.4362 (3.7437) grad_norm 1.4170 (1.2067) [2022-09-30 11:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][700/1251] eta 0:07:10 lr 0.000772 time 0.6919 (0.7807) loss 3.3707 (3.7387) grad_norm 1.2177 (1.2111) [2022-09-30 11:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][800/1251] eta 0:05:51 lr 0.000772 time 0.7930 (0.7798) loss 3.7155 (3.7315) grad_norm 1.0401 (1.2143) [2022-09-30 11:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][900/1251] eta 0:04:33 lr 0.000771 time 0.7923 (0.7793) loss 3.7868 (3.7321) grad_norm 1.1919 (1.2136) [2022-09-30 11:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1000/1251] eta 0:03:15 lr 0.000771 time 0.6559 (0.7790) loss 3.0860 (3.7295) grad_norm 1.1723 (1.2136) [2022-09-30 11:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1100/1251] eta 0:01:57 lr 0.000771 time 0.6783 (0.7777) loss 4.3803 (3.7301) grad_norm 1.3019 (1.2127) [2022-09-30 11:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1200/1251] eta 0:00:39 lr 0.000770 time 0.8650 (0.7781) loss 3.9123 (3.7273) grad_norm 1.4142 (1.2121) [2022-09-30 11:26:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 95 training takes 0:16:13 [2022-09-30 11:26:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.878 (3.878) Loss 1.1951 (1.1951) Acc@1 71.484 (71.484) Acc@5 92.480 (92.480) [2022-09-30 11:26:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.216 Acc@5 91.968 [2022-09-30 11:26:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.2% [2022-09-30 11:26:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 11:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][0/1251] eta 1:34:32 lr 0.000770 time 4.5346 (4.5346) loss 4.3705 (4.3705) grad_norm 1.3764 (1.3764) [2022-09-30 11:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][100/1251] eta 0:15:28 lr 0.000770 time 0.8544 (0.8068) loss 4.3418 (3.6192) grad_norm 1.3176 (1.2027) [2022-09-30 11:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][200/1251] eta 0:13:51 lr 0.000770 time 0.8278 (0.7916) loss 2.7415 (3.6930) grad_norm 1.2744 (1.2087) [2022-09-30 11:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][300/1251] eta 0:12:26 lr 0.000769 time 0.8011 (0.7854) loss 2.4792 (3.6979) grad_norm 1.0653 (1.2092) [2022-09-30 11:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][400/1251] eta 0:11:06 lr 0.000769 time 0.8455 (0.7836) loss 2.6454 (3.7117) grad_norm 1.4415 (1.2222) [2022-09-30 11:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][500/1251] eta 0:09:47 lr 0.000768 time 0.6815 (0.7821) loss 3.0338 (3.7095) grad_norm 1.2040 (1.2268) [2022-09-30 11:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][600/1251] eta 0:08:27 lr 0.000768 time 0.8594 (0.7802) loss 3.9255 (3.7335) grad_norm 1.2964 (1.2256) [2022-09-30 11:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][700/1251] eta 0:07:08 lr 0.000768 time 0.8394 (0.7782) loss 4.2937 (3.7350) grad_norm 1.3165 (1.2241) [2022-09-30 11:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][800/1251] eta 0:05:51 lr 0.000767 time 0.7345 (0.7784) loss 4.3792 (3.7434) grad_norm 1.2698 (1.2249) [2022-09-30 11:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][900/1251] eta 0:04:32 lr 0.000767 time 0.6883 (0.7767) loss 4.4640 (3.7454) grad_norm 1.1934 (1.2265) [2022-09-30 11:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1000/1251] eta 0:03:15 lr 0.000767 time 0.6591 (0.7773) loss 3.4603 (3.7379) grad_norm 1.0215 (1.2248) [2022-09-30 11:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1100/1251] eta 0:01:57 lr 0.000766 time 0.7219 (0.7773) loss 3.3180 (3.7389) grad_norm 1.2143 (1.2262) [2022-09-30 11:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1200/1251] eta 0:00:39 lr 0.000766 time 0.8252 (0.7769) loss 3.9234 (3.7424) grad_norm 1.2718 (1.2269) [2022-09-30 11:43:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 96 training takes 0:16:11 [2022-09-30 11:43:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.370 (4.370) Loss 1.0685 (1.0685) Acc@1 74.609 (74.609) Acc@5 92.285 (92.285) [2022-09-30 11:43:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.572 Acc@5 91.998 [2022-09-30 11:43:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 11:43:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.58% [2022-09-30 11:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][0/1251] eta 1:37:22 lr 0.000766 time 4.6699 (4.6699) loss 3.5076 (3.5076) grad_norm 1.1577 (1.1577) [2022-09-30 11:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][100/1251] eta 0:15:41 lr 0.000765 time 0.8247 (0.8182) loss 4.1698 (3.6896) grad_norm 1.2296 (1.2215) [2022-09-30 11:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][200/1251] eta 0:13:58 lr 0.000765 time 0.7822 (0.7981) loss 3.3735 (3.7599) grad_norm 1.1650 (1.2256) [2022-09-30 11:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][300/1251] eta 0:12:32 lr 0.000765 time 0.8011 (0.7909) loss 4.5093 (3.7537) grad_norm 1.2446 (1.2206) [2022-09-30 11:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][400/1251] eta 0:11:09 lr 0.000764 time 0.8822 (0.7867) loss 4.0968 (3.7584) grad_norm 1.6242 (1.2148) [2022-09-30 11:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][500/1251] eta 0:09:48 lr 0.000764 time 0.7748 (0.7833) loss 4.3024 (3.7447) grad_norm 1.4666 (1.2143) [2022-09-30 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][600/1251] eta 0:08:28 lr 0.000764 time 0.7958 (0.7818) loss 4.3114 (3.7385) grad_norm 1.2007 (1.2117) [2022-09-30 11:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][700/1251] eta 0:07:10 lr 0.000763 time 0.8405 (0.7810) loss 3.7042 (3.7409) grad_norm 1.3249 (1.2136) [2022-09-30 11:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][800/1251] eta 0:05:52 lr 0.000763 time 0.8060 (0.7806) loss 4.7512 (3.7323) grad_norm 1.3894 (1.2160) [2022-09-30 11:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][900/1251] eta 0:04:33 lr 0.000763 time 0.8853 (0.7801) loss 4.1588 (3.7365) grad_norm 1.0048 (1.2187) [2022-09-30 11:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1000/1251] eta 0:03:15 lr 0.000762 time 0.6402 (0.7794) loss 2.6980 (3.7250) grad_norm 1.2562 (1.2190) [2022-09-30 11:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1100/1251] eta 0:01:57 lr 0.000762 time 0.8187 (0.7806) loss 3.9507 (3.7191) grad_norm 1.2242 (1.2196) [2022-09-30 11:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1200/1251] eta 0:00:39 lr 0.000762 time 0.8116 (0.7800) loss 3.4203 (3.7234) grad_norm 1.0467 (1.2222) [2022-09-30 11:59:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 97 training takes 0:16:15 [2022-09-30 11:59:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.887 (3.887) Loss 1.1778 (1.1778) Acc@1 72.070 (72.070) Acc@5 91.309 (91.309) [2022-09-30 12:00:09 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.602 Acc@5 92.068 [2022-09-30 12:00:09 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 12:00:09 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.60% [2022-09-30 12:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][0/1251] eta 1:23:32 lr 0.000761 time 4.0071 (4.0071) loss 4.4315 (4.4315) grad_norm 1.2827 (1.2827) [2022-09-30 12:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][100/1251] eta 0:15:29 lr 0.000761 time 0.9077 (0.8071) loss 4.4094 (3.7073) grad_norm 1.1721 (1.2315) [2022-09-30 12:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][200/1251] eta 0:13:52 lr 0.000761 time 0.8123 (0.7925) loss 4.4748 (3.6927) grad_norm 1.2462 (1.2160) [2022-09-30 12:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][300/1251] eta 0:12:30 lr 0.000760 time 0.8356 (0.7896) loss 3.6728 (3.7075) grad_norm 1.2834 (1.2229) [2022-09-30 12:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][400/1251] eta 0:11:09 lr 0.000760 time 0.7780 (0.7862) loss 3.3029 (3.7141) grad_norm 1.1069 (1.2217) [2022-09-30 12:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][500/1251] eta 0:09:48 lr 0.000760 time 0.8438 (0.7830) loss 3.6604 (3.7063) grad_norm 1.4839 (1.2215) [2022-09-30 12:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][600/1251] eta 0:08:28 lr 0.000759 time 0.9146 (0.7805) loss 3.7414 (3.7005) grad_norm 1.1314 (1.2211) [2022-09-30 12:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][700/1251] eta 0:07:09 lr 0.000759 time 0.6502 (0.7800) loss 2.9632 (3.6989) grad_norm 1.1526 (1.2181) [2022-09-30 12:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][800/1251] eta 0:05:51 lr 0.000759 time 0.8037 (0.7789) loss 3.2665 (3.7018) grad_norm 1.1356 (1.2185) [2022-09-30 12:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][900/1251] eta 0:04:32 lr 0.000758 time 0.8552 (0.7777) loss 3.5642 (3.6969) grad_norm 1.1680 (1.2215) [2022-09-30 12:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1000/1251] eta 0:03:15 lr 0.000758 time 0.8206 (0.7770) loss 4.0255 (3.7044) grad_norm 1.2098 (1.2231) [2022-09-30 12:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1100/1251] eta 0:01:57 lr 0.000758 time 0.9112 (0.7765) loss 3.9177 (3.7094) grad_norm 1.1411 (1.2218) [2022-09-30 12:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1200/1251] eta 0:00:39 lr 0.000757 time 0.7878 (0.7760) loss 2.7007 (3.7110) grad_norm 1.1237 (1.2240) [2022-09-30 12:16:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 98 training takes 0:16:10 [2022-09-30 12:16:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.210 (4.210) Loss 1.1631 (1.1631) Acc@1 72.949 (72.949) Acc@5 91.895 (91.895) [2022-09-30 12:16:41 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.736 Acc@5 92.038 [2022-09-30 12:16:41 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-09-30 12:16:41 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 12:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][0/1251] eta 1:41:58 lr 0.000757 time 4.8911 (4.8911) loss 3.0120 (3.0120) grad_norm 1.0944 (1.0944) [2022-09-30 12:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][100/1251] eta 0:15:32 lr 0.000757 time 0.8112 (0.8098) loss 2.4822 (3.6638) grad_norm 1.5271 (1.2416) [2022-09-30 12:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][200/1251] eta 0:13:52 lr 0.000756 time 0.7981 (0.7921) loss 3.0332 (3.6554) grad_norm 1.3048 (1.2344) [2022-09-30 12:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][300/1251] eta 0:12:31 lr 0.000756 time 0.9154 (0.7897) loss 2.6996 (3.6598) grad_norm 1.4238 (1.2259) [2022-09-30 12:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][400/1251] eta 0:11:10 lr 0.000756 time 0.8639 (0.7881) loss 3.1527 (3.6983) grad_norm 1.0360 (1.2348) [2022-09-30 12:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][500/1251] eta 0:09:48 lr 0.000755 time 0.6165 (0.7841) loss 2.8540 (3.6991) grad_norm 1.3737 (1.2344) [2022-09-30 12:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][600/1251] eta 0:08:29 lr 0.000755 time 0.7455 (0.7822) loss 4.6118 (3.6873) grad_norm 1.1205 (1.2283) [2022-09-30 12:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][700/1251] eta 0:07:10 lr 0.000754 time 0.8351 (0.7818) loss 3.5180 (3.6872) grad_norm 1.4389 (1.2271) [2022-09-30 12:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][800/1251] eta 0:05:52 lr 0.000754 time 0.9179 (0.7808) loss 4.1617 (3.6876) grad_norm 1.5658 (1.2309) [2022-09-30 12:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][900/1251] eta 0:04:33 lr 0.000754 time 0.7869 (0.7796) loss 3.3746 (3.7004) grad_norm 1.2336 (1.2311) [2022-09-30 12:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1000/1251] eta 0:03:15 lr 0.000753 time 0.7495 (0.7804) loss 3.3408 (3.7058) grad_norm 1.2187 (1.2326) [2022-09-30 12:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1100/1251] eta 0:01:57 lr 0.000753 time 0.8518 (0.7802) loss 2.8965 (3.7098) grad_norm 1.2155 (1.2320) [2022-09-30 12:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1200/1251] eta 0:00:39 lr 0.000753 time 0.8272 (0.7797) loss 3.7831 (3.7152) grad_norm 1.2138 (1.2301) [2022-09-30 12:32:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 99 training takes 0:16:15 [2022-09-30 12:33:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.806 (3.806) Loss 1.1333 (1.1333) Acc@1 74.316 (74.316) Acc@5 91.895 (91.895) [2022-09-30 12:33:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.588 Acc@5 91.942 [2022-09-30 12:33:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 12:33:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 12:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][0/1251] eta 1:44:26 lr 0.000753 time 5.0096 (5.0096) loss 2.8817 (2.8817) grad_norm 1.3178 (1.3178) [2022-09-30 12:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][100/1251] eta 0:15:35 lr 0.000752 time 0.8454 (0.8125) loss 2.8441 (3.7152) grad_norm 1.1859 (1.2104) [2022-09-30 12:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][200/1251] eta 0:13:52 lr 0.000752 time 0.8089 (0.7920) loss 3.1989 (3.6562) grad_norm 1.1536 (1.2305) [2022-09-30 12:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][300/1251] eta 0:12:28 lr 0.000751 time 0.7732 (0.7873) loss 3.4713 (3.6828) grad_norm 1.3166 (1.2285) [2022-09-30 12:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][400/1251] eta 0:11:05 lr 0.000751 time 0.8067 (0.7825) loss 3.5610 (3.6960) grad_norm 1.2564 (1.2333) [2022-09-30 12:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][500/1251] eta 0:09:46 lr 0.000751 time 0.9480 (0.7815) loss 3.8134 (3.7110) grad_norm 1.2167 (1.2371) [2022-09-30 12:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][600/1251] eta 0:08:26 lr 0.000750 time 0.7107 (0.7784) loss 3.6354 (3.7213) grad_norm 1.1658 (1.2370) [2022-09-30 12:42:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][700/1251] eta 0:07:08 lr 0.000750 time 0.7602 (0.7783) loss 4.1748 (3.7258) grad_norm 1.2185 (1.2412) [2022-09-30 12:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][800/1251] eta 0:05:50 lr 0.000750 time 0.6453 (0.7777) loss 3.7684 (3.7138) grad_norm 1.1567 (1.2425) [2022-09-30 12:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][900/1251] eta 0:04:32 lr 0.000749 time 0.6314 (0.7770) loss 4.4073 (3.7117) grad_norm 1.1731 (1.2416) [2022-09-30 12:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1000/1251] eta 0:03:14 lr 0.000749 time 0.7635 (0.7757) loss 3.6229 (3.7105) grad_norm 1.1275 (1.2424) [2022-09-30 12:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1100/1251] eta 0:01:57 lr 0.000749 time 0.8200 (0.7760) loss 3.4472 (3.7139) grad_norm 1.2031 (1.2433) [2022-09-30 12:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1200/1251] eta 0:00:39 lr 0.000748 time 0.8088 (0.7751) loss 3.2560 (3.7137) grad_norm 1.0884 (1.2439) [2022-09-30 12:49:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 100 training takes 0:16:09 [2022-09-30 12:49:26 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saving...... [2022-09-30 12:49:27 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saved !!! [2022-09-30 12:49:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.985 (3.985) Loss 1.1278 (1.1278) Acc@1 74.219 (74.219) Acc@5 92.480 (92.480) [2022-09-30 12:49:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.622 Acc@5 92.140 [2022-09-30 12:49:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 12:49:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 12:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][0/1251] eta 1:32:47 lr 0.000748 time 4.4504 (4.4504) loss 3.2348 (3.2348) grad_norm 1.1081 (1.1081) [2022-09-30 12:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][100/1251] eta 0:15:37 lr 0.000748 time 0.6672 (0.8144) loss 4.0911 (3.7229) grad_norm 1.2441 (1.2433) [2022-09-30 12:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][200/1251] eta 0:13:58 lr 0.000747 time 0.9007 (0.7979) loss 3.4817 (3.7194) grad_norm 1.3520 (1.2300) [2022-09-30 12:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][300/1251] eta 0:12:30 lr 0.000747 time 0.6663 (0.7896) loss 3.7894 (3.6852) grad_norm 1.2262 (1.2329) [2022-09-30 12:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][400/1251] eta 0:11:08 lr 0.000747 time 0.8486 (0.7852) loss 2.8543 (3.6854) grad_norm 1.0943 (1.2277) [2022-09-30 12:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][500/1251] eta 0:09:47 lr 0.000746 time 0.6954 (0.7823) loss 2.8078 (3.6828) grad_norm 1.1192 (1.2198) [2022-09-30 12:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][600/1251] eta 0:08:27 lr 0.000746 time 0.8065 (0.7800) loss 3.3480 (3.6954) grad_norm 1.3563 (1.2170) [2022-09-30 12:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][700/1251] eta 0:07:09 lr 0.000745 time 0.8849 (0.7795) loss 3.9380 (3.6941) grad_norm 1.1716 (1.2183) [2022-09-30 13:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][800/1251] eta 0:05:51 lr 0.000745 time 0.8139 (0.7790) loss 3.9415 (3.7037) grad_norm 1.2258 (1.2180) [2022-09-30 13:01:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][900/1251] eta 0:04:33 lr 0.000745 time 0.7432 (0.7784) loss 3.5263 (3.7022) grad_norm 1.1975 (1.2186) [2022-09-30 13:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1000/1251] eta 0:03:15 lr 0.000744 time 0.6450 (0.7780) loss 3.3048 (3.7009) grad_norm 1.1204 (1.2210) [2022-09-30 13:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1100/1251] eta 0:01:57 lr 0.000744 time 0.8136 (0.7771) loss 3.7595 (3.7032) grad_norm 1.2176 (1.2213) [2022-09-30 13:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1200/1251] eta 0:00:39 lr 0.000744 time 0.7409 (0.7768) loss 4.0849 (3.7071) grad_norm 1.1738 (1.2250) [2022-09-30 13:05:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 101 training takes 0:16:11 [2022-09-30 13:06:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.013 (4.013) Loss 1.1741 (1.1741) Acc@1 74.023 (74.023) Acc@5 91.211 (91.211) [2022-09-30 13:06:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.482 Acc@5 92.030 [2022-09-30 13:06:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-09-30 13:06:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 13:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][0/1251] eta 1:39:03 lr 0.000743 time 4.7510 (4.7510) loss 3.9199 (3.9199) grad_norm 1.1966 (1.1966) [2022-09-30 13:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][100/1251] eta 0:15:39 lr 0.000743 time 0.7741 (0.8160) loss 4.3890 (3.6615) grad_norm 1.1614 (1.2452) [2022-09-30 13:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][200/1251] eta 0:14:03 lr 0.000743 time 0.7734 (0.8022) loss 4.1738 (3.6432) grad_norm 1.2027 (1.2291) [2022-09-30 13:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][300/1251] eta 0:12:36 lr 0.000742 time 0.6233 (0.7958) loss 3.7830 (3.7024) grad_norm 1.0441 (1.2272) [2022-09-30 13:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][400/1251] eta 0:11:13 lr 0.000742 time 0.9476 (0.7915) loss 3.9634 (3.7171) grad_norm 1.0264 (1.2297) [2022-09-30 13:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][500/1251] eta 0:09:52 lr 0.000742 time 0.8277 (0.7889) loss 3.3679 (3.7089) grad_norm 1.6244 (1.2321) [2022-09-30 13:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][600/1251] eta 0:08:32 lr 0.000741 time 0.7312 (0.7867) loss 3.5162 (3.6987) grad_norm 1.1144 (1.2335) [2022-09-30 13:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][700/1251] eta 0:07:12 lr 0.000741 time 0.6685 (0.7848) loss 4.5174 (3.6958) grad_norm 1.4803 (1.2304) [2022-09-30 13:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][800/1251] eta 0:05:53 lr 0.000741 time 0.6548 (0.7835) loss 3.9017 (3.6999) grad_norm 1.3525 (1.2323) [2022-09-30 13:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][900/1251] eta 0:04:34 lr 0.000740 time 0.8851 (0.7825) loss 4.2153 (3.7018) grad_norm 1.1513 (1.2336) [2022-09-30 13:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1000/1251] eta 0:03:16 lr 0.000740 time 0.7610 (0.7814) loss 4.2255 (3.7045) grad_norm 1.1341 (1.2325) [2022-09-30 13:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1100/1251] eta 0:01:57 lr 0.000739 time 0.8277 (0.7805) loss 4.3508 (3.7037) grad_norm 1.4147 (1.2316) [2022-09-30 13:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1200/1251] eta 0:00:39 lr 0.000739 time 0.5194 (0.7726) loss 3.3227 (3.6988) grad_norm 1.1099 (1.2319) [2022-09-30 13:22:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 102 training takes 0:16:00 [2022-09-30 13:22:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.684 (4.684) Loss 1.1807 (1.1807) Acc@1 72.559 (72.559) Acc@5 91.699 (91.699) [2022-09-30 13:22:41 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.590 Acc@5 92.090 [2022-09-30 13:22:41 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-09-30 13:22:41 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 13:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][0/1251] eta 1:39:28 lr 0.000739 time 4.7710 (4.7710) loss 3.8587 (3.8587) grad_norm 1.2802 (1.2802) [2022-09-30 13:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][100/1251] eta 0:15:33 lr 0.000739 time 0.9287 (0.8108) loss 3.6835 (3.6767) grad_norm 1.2457 (1.2574) [2022-09-30 13:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][200/1251] eta 0:13:56 lr 0.000738 time 0.7607 (0.7955) loss 3.8128 (3.6762) grad_norm 1.2073 (1.2320) [2022-09-30 13:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][300/1251] eta 0:12:31 lr 0.000738 time 0.6809 (0.7897) loss 3.0231 (3.6621) grad_norm 1.2172 (1.2372) [2022-09-30 13:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][400/1251] eta 0:11:10 lr 0.000737 time 0.8252 (0.7875) loss 4.1573 (3.6645) grad_norm 1.1480 (1.2417) [2022-09-30 13:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][500/1251] eta 0:09:48 lr 0.000737 time 0.8552 (0.7839) loss 3.9132 (3.6635) grad_norm 1.1113 (1.2395) [2022-09-30 13:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][600/1251] eta 0:08:28 lr 0.000737 time 0.8529 (0.7813) loss 2.7108 (3.6652) grad_norm 1.1226 (1.2409) [2022-09-30 13:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][700/1251] eta 0:07:10 lr 0.000736 time 0.7672 (0.7807) loss 2.5809 (3.6800) grad_norm 1.3690 (1.2463) [2022-09-30 13:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][800/1251] eta 0:05:51 lr 0.000736 time 0.7788 (0.7798) loss 3.9999 (3.6727) grad_norm 1.4272 (1.2446) [2022-09-30 13:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][900/1251] eta 0:04:33 lr 0.000736 time 0.6310 (0.7784) loss 3.8987 (3.6722) grad_norm 1.4156 (1.2469) [2022-09-30 13:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1000/1251] eta 0:03:15 lr 0.000735 time 0.8159 (0.7780) loss 3.7225 (3.6863) grad_norm 1.1723 (1.2457) [2022-09-30 13:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1100/1251] eta 0:01:57 lr 0.000735 time 0.8072 (0.7781) loss 4.0284 (3.6853) grad_norm 1.5140 (1.2437) [2022-09-30 13:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1200/1251] eta 0:00:39 lr 0.000735 time 0.7098 (0.7773) loss 4.3297 (3.6921) grad_norm 1.4552 (1.2433) [2022-09-30 13:38:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 103 training takes 0:16:13 [2022-09-30 13:38:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.005 (4.005) Loss 1.0886 (1.0886) Acc@1 75.098 (75.098) Acc@5 93.262 (93.262) [2022-09-30 13:39:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.732 Acc@5 92.166 [2022-09-30 13:39:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-09-30 13:39:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 13:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][0/1251] eta 1:30:25 lr 0.000734 time 4.3367 (4.3367) loss 3.6991 (3.6991) grad_norm 1.3850 (1.3850) [2022-09-30 13:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][100/1251] eta 0:15:35 lr 0.000734 time 0.8305 (0.8127) loss 3.9376 (3.6529) grad_norm 1.3763 (1.2576) [2022-09-30 13:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][200/1251] eta 0:13:56 lr 0.000734 time 0.6946 (0.7958) loss 3.2213 (3.6960) grad_norm 1.3441 (1.2415) [2022-09-30 13:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][300/1251] eta 0:12:28 lr 0.000733 time 0.8515 (0.7875) loss 4.0366 (3.6996) grad_norm 1.1407 (1.2332) [2022-09-30 13:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][400/1251] eta 0:11:05 lr 0.000733 time 0.6477 (0.7824) loss 3.5314 (3.6949) grad_norm 1.1301 (1.2381) [2022-09-30 13:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][500/1251] eta 0:09:46 lr 0.000732 time 0.8133 (0.7811) loss 3.9033 (3.6903) grad_norm 1.1926 (1.2355) [2022-09-30 13:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][600/1251] eta 0:08:27 lr 0.000732 time 0.6806 (0.7797) loss 3.5768 (3.6940) grad_norm 1.1254 (1.2375) [2022-09-30 13:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][700/1251] eta 0:07:08 lr 0.000732 time 0.8714 (0.7776) loss 3.8884 (3.6898) grad_norm 1.3187 (1.2395) [2022-09-30 13:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][800/1251] eta 0:05:50 lr 0.000731 time 0.7772 (0.7768) loss 4.3045 (3.6919) grad_norm 1.3107 (1.2404) [2022-09-30 13:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][900/1251] eta 0:04:32 lr 0.000731 time 0.8569 (0.7769) loss 4.0207 (3.6896) grad_norm 1.1527 (1.2454) [2022-09-30 13:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1000/1251] eta 0:03:14 lr 0.000731 time 0.8026 (0.7759) loss 3.6700 (3.6814) grad_norm 1.1543 (1.2455) [2022-09-30 13:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1100/1251] eta 0:01:57 lr 0.000730 time 0.7930 (0.7751) loss 4.3293 (3.6809) grad_norm 1.1063 (1.2452) [2022-09-30 13:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1200/1251] eta 0:00:39 lr 0.000730 time 0.7119 (0.7752) loss 2.6558 (3.6800) grad_norm 1.1101 (1.2448) [2022-09-30 13:55:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 104 training takes 0:16:10 [2022-09-30 13:55:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.860 (3.860) Loss 1.1071 (1.1071) Acc@1 75.488 (75.488) Acc@5 92.871 (92.871) [2022-09-30 13:55:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.704 Acc@5 92.194 [2022-09-30 13:55:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-09-30 13:55:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.74% [2022-09-30 13:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][0/1251] eta 1:45:18 lr 0.000730 time 5.0507 (5.0507) loss 3.7611 (3.7611) grad_norm 1.5626 (1.5626) [2022-09-30 13:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][100/1251] eta 0:15:39 lr 0.000729 time 0.7734 (0.8166) loss 4.2350 (3.6170) grad_norm 1.1332 (1.2695) [2022-09-30 13:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][200/1251] eta 0:13:58 lr 0.000729 time 0.8483 (0.7976) loss 4.0833 (3.6223) grad_norm 1.2411 (1.2568) [2022-09-30 13:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][300/1251] eta 0:12:28 lr 0.000729 time 0.8258 (0.7875) loss 4.1328 (3.6527) grad_norm 1.3195 (1.2602) [2022-09-30 14:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][400/1251] eta 0:11:07 lr 0.000728 time 0.7896 (0.7848) loss 3.3564 (3.6423) grad_norm 1.2019 (1.2596) [2022-09-30 14:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][500/1251] eta 0:09:49 lr 0.000728 time 0.7373 (0.7845) loss 2.9483 (3.6493) grad_norm 1.2700 (1.2616) [2022-09-30 14:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][600/1251] eta 0:08:29 lr 0.000728 time 0.7124 (0.7828) loss 3.4788 (3.6419) grad_norm 1.2801 (1.2591) [2022-09-30 14:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][700/1251] eta 0:07:10 lr 0.000727 time 0.7968 (0.7808) loss 4.1833 (3.6517) grad_norm 1.1696 (1.2592) [2022-09-30 14:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][800/1251] eta 0:05:50 lr 0.000727 time 0.8491 (0.7775) loss 4.1314 (3.6598) grad_norm 1.2970 (1.2591) [2022-09-30 14:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][900/1251] eta 0:04:32 lr 0.000726 time 0.8161 (0.7775) loss 3.4887 (3.6657) grad_norm 1.1572 (1.2603) [2022-09-30 14:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1000/1251] eta 0:03:15 lr 0.000726 time 0.8505 (0.7774) loss 3.8915 (3.6688) grad_norm 1.1528 (1.2583) [2022-09-30 14:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1100/1251] eta 0:01:57 lr 0.000726 time 0.7557 (0.7771) loss 3.0367 (3.6739) grad_norm 1.2665 (1.2577) [2022-09-30 14:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1200/1251] eta 0:00:39 lr 0.000725 time 0.8301 (0.7764) loss 3.9584 (3.6829) grad_norm 1.0531 (1.2562) [2022-09-30 14:11:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 105 training takes 0:16:11 [2022-09-30 14:12:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.631 (4.631) Loss 1.0857 (1.0857) Acc@1 73.633 (73.633) Acc@5 92.676 (92.676) [2022-09-30 14:12:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.944 Acc@5 92.308 [2022-09-30 14:12:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-09-30 14:12:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.94% [2022-09-30 14:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][0/1251] eta 1:36:41 lr 0.000725 time 4.6378 (4.6378) loss 3.8593 (3.8593) grad_norm 1.1105 (1.1105) [2022-09-30 14:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][100/1251] eta 0:15:42 lr 0.000725 time 0.8332 (0.8186) loss 3.6377 (3.6147) grad_norm 1.2808 (1.2551) [2022-09-30 14:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][200/1251] eta 0:13:53 lr 0.000724 time 0.7671 (0.7934) loss 3.9074 (3.6212) grad_norm 1.1511 (1.2454) [2022-09-30 14:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][300/1251] eta 0:12:30 lr 0.000724 time 0.7976 (0.7887) loss 3.5308 (3.6650) grad_norm 1.2817 (1.2464) [2022-09-30 14:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][400/1251] eta 0:11:07 lr 0.000724 time 0.8417 (0.7848) loss 3.3759 (3.6796) grad_norm 1.0947 (1.2497) [2022-09-30 14:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][500/1251] eta 0:09:48 lr 0.000723 time 0.8369 (0.7834) loss 4.0691 (3.6894) grad_norm 1.1256 (1.2511) [2022-09-30 14:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][600/1251] eta 0:08:28 lr 0.000723 time 0.8219 (0.7807) loss 3.4419 (3.6886) grad_norm 1.3056 (1.2568) [2022-09-30 14:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][700/1251] eta 0:07:09 lr 0.000722 time 0.9260 (0.7797) loss 2.7318 (3.7038) grad_norm 1.2422 (1.2568) [2022-09-30 14:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][800/1251] eta 0:05:51 lr 0.000722 time 0.8115 (0.7793) loss 4.0428 (3.7056) grad_norm 1.1306 (1.2591) [2022-09-30 14:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][900/1251] eta 0:04:33 lr 0.000722 time 0.8445 (0.7794) loss 3.7764 (3.7098) grad_norm 1.0308 (1.2581) [2022-09-30 14:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1000/1251] eta 0:03:15 lr 0.000721 time 0.8418 (0.7782) loss 3.9566 (3.7110) grad_norm 1.3119 (1.2572) [2022-09-30 14:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1100/1251] eta 0:01:57 lr 0.000721 time 0.7663 (0.7788) loss 3.9732 (3.7112) grad_norm 1.5767 (1.2583) [2022-09-30 14:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1200/1251] eta 0:00:39 lr 0.000721 time 0.8479 (0.7791) loss 2.5117 (3.7031) grad_norm 1.2241 (1.2575) [2022-09-30 14:28:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 106 training takes 0:16:14 [2022-09-30 14:28:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.440 (4.440) Loss 1.0662 (1.0662) Acc@1 75.293 (75.293) Acc@5 92.188 (92.188) [2022-09-30 14:28:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.996 Acc@5 92.270 [2022-09-30 14:28:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-09-30 14:28:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.00% [2022-09-30 14:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][0/1251] eta 1:32:05 lr 0.000720 time 4.4171 (4.4171) loss 2.6174 (2.6174) grad_norm 1.2858 (1.2858) [2022-09-30 14:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][100/1251] eta 0:15:21 lr 0.000720 time 0.8103 (0.8002) loss 3.8808 (3.6395) grad_norm 1.3388 (1.2457) [2022-09-30 14:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][200/1251] eta 0:13:51 lr 0.000720 time 0.6523 (0.7916) loss 3.0311 (3.6302) grad_norm 1.3546 (1.2559) [2022-09-30 14:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][300/1251] eta 0:12:29 lr 0.000719 time 0.7348 (0.7880) loss 3.8858 (3.6639) grad_norm 1.6058 (1.2583) [2022-09-30 14:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][400/1251] eta 0:11:06 lr 0.000719 time 0.9449 (0.7830) loss 4.4470 (3.6765) grad_norm 1.2060 (1.2578) [2022-09-30 14:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][500/1251] eta 0:09:45 lr 0.000719 time 0.6629 (0.7795) loss 2.8232 (3.6772) grad_norm 1.2990 (1.2549) [2022-09-30 14:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][600/1251] eta 0:08:27 lr 0.000718 time 0.7871 (0.7791) loss 3.0817 (3.6714) grad_norm 1.3420 (1.2551) [2022-09-30 14:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][700/1251] eta 0:07:08 lr 0.000718 time 0.8186 (0.7774) loss 3.9973 (3.6719) grad_norm 1.2259 (1.2573) [2022-09-30 14:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][800/1251] eta 0:05:50 lr 0.000717 time 0.7995 (0.7770) loss 3.8761 (3.6700) grad_norm 1.4457 (1.2553) [2022-09-30 14:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][900/1251] eta 0:04:33 lr 0.000717 time 0.8026 (0.7783) loss 3.4810 (3.6720) grad_norm 1.1669 (1.2587) [2022-09-30 14:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1000/1251] eta 0:03:14 lr 0.000717 time 0.6776 (0.7768) loss 3.9844 (3.6800) grad_norm 1.6542 (1.2597) [2022-09-30 14:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1100/1251] eta 0:01:57 lr 0.000716 time 0.8244 (0.7768) loss 4.3023 (3.6863) grad_norm 1.2749 (1.2604) [2022-09-30 14:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1200/1251] eta 0:00:39 lr 0.000716 time 0.8399 (0.7761) loss 4.0657 (3.6850) grad_norm 1.1735 (1.2643) [2022-09-30 14:45:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 107 training takes 0:16:10 [2022-09-30 14:45:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.829 (3.829) Loss 1.1318 (1.1318) Acc@1 73.535 (73.535) Acc@5 91.895 (91.895) [2022-09-30 14:45:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.024 Acc@5 92.338 [2022-09-30 14:45:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-09-30 14:45:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.02% [2022-09-30 14:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][0/1251] eta 1:43:24 lr 0.000716 time 4.9597 (4.9597) loss 4.1509 (4.1509) grad_norm 1.3333 (1.3333) [2022-09-30 14:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][100/1251] eta 0:15:42 lr 0.000715 time 0.6931 (0.8187) loss 4.0319 (3.6853) grad_norm 1.4593 (1.2791) [2022-09-30 14:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][200/1251] eta 0:13:53 lr 0.000715 time 0.6778 (0.7929) loss 4.2613 (3.7049) grad_norm 1.1672 (1.2736) [2022-09-30 14:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][300/1251] eta 0:12:26 lr 0.000715 time 0.8348 (0.7852) loss 4.2061 (3.7153) grad_norm 1.1842 (1.2645) [2022-09-30 14:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][400/1251] eta 0:11:07 lr 0.000714 time 0.7281 (0.7841) loss 4.0217 (3.6901) grad_norm 1.0956 (1.2626) [2022-09-30 14:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][500/1251] eta 0:09:47 lr 0.000714 time 0.7858 (0.7829) loss 2.7627 (3.7024) grad_norm 1.1349 (1.2614) [2022-09-30 14:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][600/1251] eta 0:08:29 lr 0.000714 time 0.9135 (0.7822) loss 2.9258 (3.7097) grad_norm 1.2423 (1.2603) [2022-09-30 14:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][700/1251] eta 0:07:10 lr 0.000713 time 0.8345 (0.7817) loss 2.8655 (3.7079) grad_norm 1.4339 (1.2617) [2022-09-30 14:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][800/1251] eta 0:05:52 lr 0.000713 time 0.7644 (0.7816) loss 4.0526 (3.7080) grad_norm 1.1444 (1.2582) [2022-09-30 14:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][900/1251] eta 0:04:34 lr 0.000712 time 0.7439 (0.7808) loss 3.0649 (3.7097) grad_norm 1.2537 (1.2579) [2022-09-30 14:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1000/1251] eta 0:03:15 lr 0.000712 time 0.6600 (0.7797) loss 3.8053 (3.7144) grad_norm 1.2098 (1.2635) [2022-09-30 14:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1100/1251] eta 0:01:57 lr 0.000712 time 0.8555 (0.7793) loss 3.4099 (3.7123) grad_norm 1.2572 (1.2636) [2022-09-30 15:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1200/1251] eta 0:00:39 lr 0.000711 time 0.7894 (0.7794) loss 3.2699 (3.7181) grad_norm 1.1146 (1.2649) [2022-09-30 15:01:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 108 training takes 0:16:15 [2022-09-30 15:01:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.168 (4.168) Loss 1.1553 (1.1553) Acc@1 73.047 (73.047) Acc@5 91.211 (91.211) [2022-09-30 15:02:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.926 Acc@5 92.404 [2022-09-30 15:02:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-09-30 15:02:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.02% [2022-09-30 15:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][0/1251] eta 1:51:19 lr 0.000711 time 5.3393 (5.3393) loss 3.5200 (3.5200) grad_norm 1.3209 (1.3209) [2022-09-30 15:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][100/1251] eta 0:15:57 lr 0.000711 time 0.8487 (0.8318) loss 3.2778 (3.6326) grad_norm 1.1817 (1.2429) [2022-09-30 15:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][200/1251] eta 0:14:00 lr 0.000710 time 0.8018 (0.7995) loss 3.6610 (3.5938) grad_norm 1.7466 (1.2535) [2022-09-30 15:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][300/1251] eta 0:12:31 lr 0.000710 time 0.7978 (0.7899) loss 3.9034 (3.6257) grad_norm 1.1744 (1.2613) [2022-09-30 15:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][400/1251] eta 0:11:09 lr 0.000710 time 0.7860 (0.7870) loss 2.8417 (3.6246) grad_norm 1.2078 (1.2620) [2022-09-30 15:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][500/1251] eta 0:09:47 lr 0.000709 time 0.7615 (0.7823) loss 4.4874 (3.6384) grad_norm 1.3130 (1.2626) [2022-09-30 15:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][600/1251] eta 0:08:27 lr 0.000709 time 0.6893 (0.7791) loss 3.6886 (3.6461) grad_norm 1.4183 (1.2637) [2022-09-30 15:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][700/1251] eta 0:07:08 lr 0.000708 time 0.8365 (0.7781) loss 3.8215 (3.6625) grad_norm 1.2892 (1.2671) [2022-09-30 15:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][800/1251] eta 0:05:50 lr 0.000708 time 0.8121 (0.7767) loss 4.4951 (3.6607) grad_norm 1.1755 (1.2684) [2022-09-30 15:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][900/1251] eta 0:04:32 lr 0.000708 time 0.8698 (0.7766) loss 3.3095 (3.6592) grad_norm 1.1938 (1.2678) [2022-09-30 15:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1000/1251] eta 0:03:15 lr 0.000707 time 0.7891 (0.7769) loss 2.8551 (3.6634) grad_norm 1.5480 (1.2692) [2022-09-30 15:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1100/1251] eta 0:01:57 lr 0.000707 time 0.8323 (0.7765) loss 3.8846 (3.6600) grad_norm 1.1803 (1.2676) [2022-09-30 15:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1200/1251] eta 0:00:39 lr 0.000707 time 0.6344 (0.7763) loss 3.0725 (3.6596) grad_norm 1.4827 (1.2664) [2022-09-30 15:18:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 109 training takes 0:16:11 [2022-09-30 15:18:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.382 (4.382) Loss 1.1379 (1.1379) Acc@1 74.609 (74.609) Acc@5 90.820 (90.820) [2022-09-30 15:18:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.106 Acc@5 92.378 [2022-09-30 15:18:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-09-30 15:18:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.11% [2022-09-30 15:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][0/1251] eta 1:39:31 lr 0.000706 time 4.7733 (4.7733) loss 4.1982 (4.1982) grad_norm 1.1288 (1.1288) [2022-09-30 15:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][100/1251] eta 0:15:45 lr 0.000706 time 0.8645 (0.8211) loss 3.4388 (3.6793) grad_norm 1.1349 (1.2669) [2022-09-30 15:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][200/1251] eta 0:13:57 lr 0.000706 time 0.8586 (0.7972) loss 4.0295 (3.6848) grad_norm 1.3157 (1.2796) [2022-09-30 15:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][300/1251] eta 0:12:32 lr 0.000705 time 0.7540 (0.7912) loss 3.0332 (3.6650) grad_norm 1.2493 (1.2737) [2022-09-30 15:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][400/1251] eta 0:11:08 lr 0.000705 time 0.6836 (0.7857) loss 3.3639 (3.6643) grad_norm 1.3747 (1.2735) [2022-09-30 15:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][500/1251] eta 0:09:48 lr 0.000704 time 0.8969 (0.7841) loss 3.8479 (3.6701) grad_norm 1.1545 (1.2742) [2022-09-30 15:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][600/1251] eta 0:08:29 lr 0.000704 time 0.7842 (0.7824) loss 4.3529 (3.6715) grad_norm 1.3526 (1.2762) [2022-09-30 15:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][700/1251] eta 0:07:11 lr 0.000704 time 0.8693 (0.7823) loss 4.1228 (3.6680) grad_norm 1.3884 (1.2711) [2022-09-30 15:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][800/1251] eta 0:05:52 lr 0.000703 time 0.8312 (0.7811) loss 3.0728 (3.6613) grad_norm 1.3402 (1.2698) [2022-09-30 15:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][900/1251] eta 0:04:33 lr 0.000703 time 0.7460 (0.7801) loss 4.0258 (3.6612) grad_norm 1.2251 (1.2743) [2022-09-30 15:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1000/1251] eta 0:03:15 lr 0.000703 time 0.8617 (0.7803) loss 4.2346 (3.6567) grad_norm 1.1903 (1.2718) [2022-09-30 15:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1100/1251] eta 0:01:57 lr 0.000702 time 0.6980 (0.7797) loss 4.1046 (3.6508) grad_norm 1.1909 (1.2724) [2022-09-30 15:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1200/1251] eta 0:00:39 lr 0.000702 time 0.7620 (0.7800) loss 3.9018 (3.6512) grad_norm 1.3510 (1.2737) [2022-09-30 15:34:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 110 training takes 0:16:14 [2022-09-30 15:34:48 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saving...... [2022-09-30 15:34:49 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saved !!! [2022-09-30 15:34:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.061 (4.061) Loss 1.0169 (1.0169) Acc@1 75.293 (75.293) Acc@5 92.773 (92.773) [2022-09-30 15:35:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.166 Acc@5 92.352 [2022-09-30 15:35:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-09-30 15:35:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.17% [2022-09-30 15:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][0/1251] eta 1:40:38 lr 0.000702 time 4.8268 (4.8268) loss 3.3263 (3.3263) grad_norm 1.2394 (1.2394) [2022-09-30 15:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][100/1251] eta 0:15:39 lr 0.000701 time 0.7719 (0.8161) loss 3.6910 (3.6507) grad_norm 1.5850 (1.2798) [2022-09-30 15:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][200/1251] eta 0:13:53 lr 0.000701 time 0.9068 (0.7932) loss 3.7889 (3.6362) grad_norm 1.2647 (1.2856) [2022-09-30 15:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][300/1251] eta 0:12:29 lr 0.000700 time 0.8566 (0.7886) loss 2.7358 (3.6461) grad_norm 1.3835 (1.2815) [2022-09-30 15:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][400/1251] eta 0:11:06 lr 0.000700 time 0.7850 (0.7831) loss 2.7991 (3.6432) grad_norm 1.0785 (1.2794) [2022-09-30 15:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][500/1251] eta 0:09:45 lr 0.000700 time 0.7632 (0.7796) loss 3.6812 (3.6561) grad_norm 1.3440 (1.2785) [2022-09-30 15:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][600/1251] eta 0:08:27 lr 0.000699 time 0.8342 (0.7789) loss 3.9200 (3.6652) grad_norm 1.4268 (1.2760) [2022-09-30 15:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][700/1251] eta 0:07:08 lr 0.000699 time 0.7849 (0.7784) loss 3.8201 (3.6724) grad_norm 1.3365 (1.2760) [2022-09-30 15:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][800/1251] eta 0:05:42 lr 0.000699 time 0.8637 (0.7599) loss 2.8533 (3.6688) grad_norm 1.4064 (1.2797) [2022-09-30 15:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][900/1251] eta 0:04:27 lr 0.000698 time 0.6259 (0.7624) loss 4.3509 (3.6670) grad_norm 1.2175 (1.2834) [2022-09-30 15:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1000/1251] eta 0:03:11 lr 0.000698 time 0.8367 (0.7640) loss 3.2254 (3.6771) grad_norm 1.1343 (1.2820) [2022-09-30 15:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1100/1251] eta 0:01:55 lr 0.000697 time 0.8043 (0.7649) loss 2.5438 (3.6776) grad_norm 1.4138 (1.2836) [2022-09-30 15:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1200/1251] eta 0:00:39 lr 0.000697 time 0.7474 (0.7647) loss 4.6037 (3.6793) grad_norm 1.2782 (1.2844) [2022-09-30 15:51:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 111 training takes 0:15:56 [2022-09-30 15:51:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.746 (4.746) Loss 1.1762 (1.1762) Acc@1 72.559 (72.559) Acc@5 91.113 (91.113) [2022-09-30 15:51:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.148 Acc@5 92.438 [2022-09-30 15:51:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-09-30 15:51:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.17% [2022-09-30 15:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][0/1251] eta 1:26:11 lr 0.000697 time 4.1335 (4.1335) loss 4.3368 (4.3368) grad_norm 1.2002 (1.2002) [2022-09-30 15:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][100/1251] eta 0:15:27 lr 0.000696 time 0.8393 (0.8057) loss 3.8839 (3.6642) grad_norm 1.3915 (1.2863) [2022-09-30 15:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][200/1251] eta 0:13:53 lr 0.000696 time 0.8044 (0.7926) loss 3.9635 (3.6478) grad_norm 1.2257 (1.2905) [2022-09-30 15:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][300/1251] eta 0:12:26 lr 0.000696 time 0.7518 (0.7851) loss 3.7913 (3.6467) grad_norm 1.3491 (1.2849) [2022-09-30 15:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][400/1251] eta 0:11:05 lr 0.000695 time 0.8738 (0.7825) loss 3.6702 (3.6681) grad_norm 1.2797 (1.2910) [2022-09-30 15:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][500/1251] eta 0:09:46 lr 0.000695 time 0.7966 (0.7805) loss 3.8215 (3.6692) grad_norm 1.1086 (1.2899) [2022-09-30 15:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][600/1251] eta 0:08:26 lr 0.000695 time 0.8526 (0.7780) loss 3.8683 (3.6721) grad_norm 1.3316 (1.2904) [2022-09-30 16:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][700/1251] eta 0:07:08 lr 0.000694 time 0.6891 (0.7773) loss 3.1863 (3.6693) grad_norm 1.2811 (1.2893) [2022-09-30 16:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][800/1251] eta 0:05:49 lr 0.000694 time 0.7067 (0.7760) loss 3.1544 (3.6648) grad_norm 1.4055 (1.2887) [2022-09-30 16:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][900/1251] eta 0:04:32 lr 0.000693 time 0.9150 (0.7753) loss 3.2905 (3.6655) grad_norm 1.2324 (1.2883) [2022-09-30 16:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1000/1251] eta 0:03:14 lr 0.000693 time 0.6922 (0.7743) loss 4.2857 (3.6728) grad_norm 1.1417 (1.2906) [2022-09-30 16:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1100/1251] eta 0:01:56 lr 0.000693 time 0.7901 (0.7744) loss 3.9324 (3.6653) grad_norm 1.4184 (1.2894) [2022-09-30 16:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1200/1251] eta 0:00:39 lr 0.000692 time 0.7766 (0.7747) loss 3.7207 (3.6554) grad_norm 1.1019 (1.2895) [2022-09-30 16:07:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 112 training takes 0:16:09 [2022-09-30 16:07:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.170 (4.170) Loss 1.0771 (1.0771) Acc@1 73.633 (73.633) Acc@5 92.383 (92.383) [2022-09-30 16:07:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.374 Acc@5 92.494 [2022-09-30 16:07:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-09-30 16:07:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.37% [2022-09-30 16:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][0/1251] eta 1:27:14 lr 0.000692 time 4.1841 (4.1841) loss 2.7821 (2.7821) grad_norm 1.0824 (1.0824) [2022-09-30 16:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][100/1251] eta 0:15:41 lr 0.000692 time 0.8970 (0.8184) loss 3.5660 (3.6027) grad_norm 1.2507 (1.3112) [2022-09-30 16:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][200/1251] eta 0:13:58 lr 0.000691 time 0.7211 (0.7973) loss 4.6684 (3.6301) grad_norm 1.3463 (1.2947) [2022-09-30 16:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][300/1251] eta 0:12:35 lr 0.000691 time 0.8471 (0.7940) loss 4.0950 (3.6399) grad_norm 1.3125 (1.2967) [2022-09-30 16:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][400/1251] eta 0:11:11 lr 0.000690 time 0.7647 (0.7887) loss 4.0093 (3.6659) grad_norm 1.2990 (1.2871) [2022-09-30 16:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][500/1251] eta 0:09:50 lr 0.000690 time 0.7927 (0.7865) loss 3.9979 (3.6630) grad_norm 1.1252 (1.2948) [2022-09-30 16:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][600/1251] eta 0:08:30 lr 0.000690 time 0.8130 (0.7849) loss 3.6221 (3.6563) grad_norm 1.5839 (1.2924) [2022-09-30 16:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][700/1251] eta 0:07:11 lr 0.000689 time 0.7960 (0.7828) loss 4.2561 (3.6496) grad_norm 1.1666 (1.2944) [2022-09-30 16:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][800/1251] eta 0:05:52 lr 0.000689 time 0.6637 (0.7811) loss 3.4546 (3.6567) grad_norm 1.3245 (1.2970) [2022-09-30 16:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][900/1251] eta 0:04:33 lr 0.000689 time 0.6985 (0.7799) loss 4.2012 (3.6542) grad_norm 1.1132 (1.2976) [2022-09-30 16:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1000/1251] eta 0:03:15 lr 0.000688 time 0.8374 (0.7798) loss 3.9377 (3.6546) grad_norm 1.4956 (1.2990) [2022-09-30 16:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1100/1251] eta 0:01:57 lr 0.000688 time 0.9357 (0.7800) loss 4.1171 (3.6557) grad_norm 1.5342 (1.2989) [2022-09-30 16:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1200/1251] eta 0:00:39 lr 0.000687 time 0.8090 (0.7791) loss 3.8075 (3.6571) grad_norm 1.2668 (1.2966) [2022-09-30 16:24:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 113 training takes 0:16:14 [2022-09-30 16:24:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.271 (4.271) Loss 1.0885 (1.0885) Acc@1 74.512 (74.512) Acc@5 92.090 (92.090) [2022-09-30 16:24:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.138 Acc@5 92.482 [2022-09-30 16:24:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-09-30 16:24:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.37% [2022-09-30 16:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][0/1251] eta 1:29:12 lr 0.000687 time 4.2788 (4.2788) loss 3.8173 (3.8173) grad_norm 1.2108 (1.2108) [2022-09-30 16:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][100/1251] eta 0:15:33 lr 0.000687 time 0.7108 (0.8112) loss 4.4927 (3.6083) grad_norm 1.4453 (1.2935) [2022-09-30 16:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][200/1251] eta 0:14:00 lr 0.000686 time 0.8121 (0.7994) loss 4.1590 (3.6485) grad_norm 1.1711 (1.2965) [2022-09-30 16:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][300/1251] eta 0:12:34 lr 0.000686 time 0.8089 (0.7929) loss 3.6352 (3.6540) grad_norm 1.5030 (1.2978) [2022-09-30 16:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][400/1251] eta 0:11:11 lr 0.000686 time 0.6820 (0.7895) loss 4.3470 (3.6589) grad_norm 1.2264 (1.2906) [2022-09-30 16:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][500/1251] eta 0:09:51 lr 0.000685 time 0.8003 (0.7874) loss 2.8485 (3.6375) grad_norm 1.5199 (1.2847) [2022-09-30 16:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][600/1251] eta 0:08:29 lr 0.000685 time 0.8314 (0.7834) loss 4.0993 (3.6224) grad_norm 1.1468 (1.2847) [2022-09-30 16:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][700/1251] eta 0:07:10 lr 0.000685 time 0.8208 (0.7813) loss 2.9693 (3.6371) grad_norm 1.1323 (1.2883) [2022-09-30 16:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][800/1251] eta 0:05:52 lr 0.000684 time 0.9505 (0.7806) loss 4.4519 (3.6417) grad_norm 1.3868 (1.2863) [2022-09-30 16:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][900/1251] eta 0:04:33 lr 0.000684 time 0.8216 (0.7802) loss 3.6691 (3.6505) grad_norm 1.2550 (1.2904) [2022-09-30 16:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1000/1251] eta 0:03:15 lr 0.000683 time 0.8263 (0.7795) loss 3.6648 (3.6491) grad_norm 1.1069 (1.2894) [2022-09-30 16:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1100/1251] eta 0:01:57 lr 0.000683 time 0.7436 (0.7786) loss 3.7590 (3.6445) grad_norm 1.2380 (1.2892) [2022-09-30 16:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1200/1251] eta 0:00:39 lr 0.000683 time 0.8082 (0.7788) loss 3.7799 (3.6479) grad_norm 1.2001 (1.2887) [2022-09-30 16:40:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 114 training takes 0:16:13 [2022-09-30 16:40:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.962 (4.962) Loss 1.1156 (1.1156) Acc@1 73.633 (73.633) Acc@5 92.188 (92.188) [2022-09-30 16:41:09 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.262 Acc@5 92.486 [2022-09-30 16:41:09 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-09-30 16:41:09 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.37% [2022-09-30 16:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][0/1251] eta 1:43:32 lr 0.000682 time 4.9656 (4.9656) loss 3.9149 (3.9149) grad_norm 1.2480 (1.2480) [2022-09-30 16:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][100/1251] eta 0:15:39 lr 0.000682 time 0.8292 (0.8164) loss 2.6154 (3.6096) grad_norm 1.4817 (1.2979) [2022-09-30 16:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][200/1251] eta 0:13:59 lr 0.000682 time 0.8276 (0.7991) loss 4.0942 (3.5968) grad_norm 1.3013 (1.2827) [2022-09-30 16:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][300/1251] eta 0:12:30 lr 0.000681 time 0.6259 (0.7897) loss 4.2840 (3.6383) grad_norm 1.3675 (1.2859) [2022-09-30 16:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][400/1251] eta 0:11:09 lr 0.000681 time 0.6782 (0.7872) loss 2.5648 (3.6339) grad_norm 1.1045 (1.2894) [2022-09-30 16:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][500/1251] eta 0:09:46 lr 0.000680 time 0.9014 (0.7815) loss 2.5723 (3.6252) grad_norm 1.1824 (1.2933) [2022-09-30 16:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][600/1251] eta 0:08:27 lr 0.000680 time 0.8049 (0.7798) loss 3.8756 (3.6252) grad_norm 1.1622 (1.2964) [2022-09-30 16:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][700/1251] eta 0:07:09 lr 0.000680 time 0.7082 (0.7804) loss 3.8093 (3.6321) grad_norm 1.2208 (1.2938) [2022-09-30 16:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][800/1251] eta 0:05:51 lr 0.000679 time 0.8212 (0.7794) loss 4.2924 (3.6408) grad_norm 1.3315 (1.2938) [2022-09-30 16:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][900/1251] eta 0:04:33 lr 0.000679 time 0.8339 (0.7794) loss 4.0240 (3.6294) grad_norm 1.3026 (1.2971) [2022-09-30 16:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1000/1251] eta 0:03:15 lr 0.000679 time 0.8561 (0.7795) loss 3.3823 (3.6307) grad_norm 1.3544 (1.3002) [2022-09-30 16:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1100/1251] eta 0:01:57 lr 0.000678 time 0.8128 (0.7789) loss 3.1913 (3.6295) grad_norm 1.4631 (1.2985) [2022-09-30 16:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1200/1251] eta 0:00:39 lr 0.000678 time 0.7514 (0.7786) loss 3.5099 (3.6282) grad_norm 1.1663 (1.3004) [2022-09-30 16:57:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 115 training takes 0:16:13 [2022-09-30 16:57:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.243 (4.243) Loss 1.0489 (1.0489) Acc@1 74.902 (74.902) Acc@5 93.164 (93.164) [2022-09-30 16:57:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.312 Acc@5 92.610 [2022-09-30 16:57:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-09-30 16:57:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.37% [2022-09-30 16:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][0/1251] eta 1:36:28 lr 0.000678 time 4.6270 (4.6270) loss 3.7780 (3.7780) grad_norm 1.2792 (1.2792) [2022-09-30 16:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][100/1251] eta 0:15:25 lr 0.000677 time 0.6274 (0.8042) loss 4.3141 (3.6636) grad_norm 1.4575 (1.3171) [2022-09-30 17:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][200/1251] eta 0:13:52 lr 0.000677 time 0.8547 (0.7919) loss 3.3960 (3.6507) grad_norm 1.2193 (1.3111) [2022-09-30 17:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][300/1251] eta 0:12:28 lr 0.000676 time 0.8032 (0.7868) loss 3.7885 (3.6332) grad_norm 1.5758 (1.3087) [2022-09-30 17:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][400/1251] eta 0:11:06 lr 0.000676 time 0.8352 (0.7837) loss 2.9750 (3.6420) grad_norm 1.1975 (1.3092) [2022-09-30 17:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][500/1251] eta 0:09:45 lr 0.000676 time 0.7546 (0.7801) loss 3.4317 (3.6470) grad_norm 1.1230 (1.3079) [2022-09-30 17:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][600/1251] eta 0:08:27 lr 0.000675 time 0.7179 (0.7802) loss 4.0380 (3.6455) grad_norm 1.1957 (1.3045) [2022-09-30 17:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][700/1251] eta 0:07:09 lr 0.000675 time 0.8519 (0.7799) loss 3.3611 (3.6477) grad_norm 1.1698 (1.3062) [2022-09-30 17:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][800/1251] eta 0:05:51 lr 0.000674 time 0.7969 (0.7800) loss 4.0697 (3.6527) grad_norm 1.2343 (1.3041) [2022-09-30 17:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][900/1251] eta 0:04:33 lr 0.000674 time 0.7619 (0.7804) loss 3.8283 (3.6546) grad_norm 1.2530 (1.3036) [2022-09-30 17:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1000/1251] eta 0:03:15 lr 0.000674 time 0.7432 (0.7795) loss 3.5927 (3.6568) grad_norm 1.2738 (1.3040) [2022-09-30 17:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1100/1251] eta 0:01:57 lr 0.000673 time 0.6409 (0.7794) loss 3.8833 (3.6604) grad_norm 1.3062 (1.3012) [2022-09-30 17:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1200/1251] eta 0:00:39 lr 0.000673 time 0.8648 (0.7791) loss 3.7287 (3.6542) grad_norm 1.4273 (1.3007) [2022-09-30 17:13:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 116 training takes 0:16:15 [2022-09-30 17:14:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.375 (4.375) Loss 1.0880 (1.0880) Acc@1 74.609 (74.609) Acc@5 92.773 (92.773) [2022-09-30 17:14:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.392 Acc@5 92.616 [2022-09-30 17:14:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-09-30 17:14:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.39% [2022-09-30 17:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][0/1251] eta 1:40:07 lr 0.000673 time 4.8025 (4.8025) loss 3.8386 (3.8386) grad_norm 1.1838 (1.1838) [2022-09-30 17:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][100/1251] eta 0:15:39 lr 0.000672 time 0.8239 (0.8158) loss 4.3868 (3.6846) grad_norm 1.4160 (1.3302) [2022-09-30 17:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][200/1251] eta 0:13:56 lr 0.000672 time 0.8240 (0.7959) loss 3.1454 (3.6921) grad_norm 1.2694 (1.3099) [2022-09-30 17:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][300/1251] eta 0:12:31 lr 0.000672 time 0.7319 (0.7903) loss 2.6259 (3.6582) grad_norm 1.2624 (1.3171) [2022-09-30 17:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][400/1251] eta 0:11:11 lr 0.000671 time 0.8965 (0.7886) loss 2.6804 (3.6528) grad_norm 1.2156 (1.3151) [2022-09-30 17:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][500/1251] eta 0:09:51 lr 0.000671 time 0.7654 (0.7877) loss 3.3722 (3.6701) grad_norm 1.1461 (1.3135) [2022-09-30 17:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][600/1251] eta 0:08:32 lr 0.000670 time 0.8368 (0.7873) loss 3.4488 (3.6663) grad_norm 1.2448 (1.3070) [2022-09-30 17:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][700/1251] eta 0:07:12 lr 0.000670 time 0.8039 (0.7855) loss 3.7405 (3.6541) grad_norm 1.2242 (1.3053) [2022-09-30 17:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][800/1251] eta 0:05:52 lr 0.000670 time 0.8024 (0.7824) loss 4.2633 (3.6607) grad_norm 1.1765 (1.3031) [2022-09-30 17:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][900/1251] eta 0:04:34 lr 0.000669 time 0.9168 (0.7813) loss 4.1486 (3.6583) grad_norm 1.5438 (1.3050) [2022-09-30 17:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1000/1251] eta 0:03:15 lr 0.000669 time 0.8514 (0.7801) loss 3.1939 (3.6628) grad_norm 1.3344 (1.3039) [2022-09-30 17:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1100/1251] eta 0:01:57 lr 0.000668 time 0.7527 (0.7800) loss 4.2595 (3.6602) grad_norm 1.2912 (1.3028) [2022-09-30 17:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1200/1251] eta 0:00:39 lr 0.000668 time 0.6584 (0.7796) loss 4.0020 (3.6567) grad_norm 1.3162 (1.3055) [2022-09-30 17:30:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 117 training takes 0:16:15 [2022-09-30 17:30:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.561 (4.561) Loss 1.0903 (1.0903) Acc@1 73.633 (73.633) Acc@5 92.773 (92.773) [2022-09-30 17:30:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.436 Acc@5 92.494 [2022-09-30 17:30:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-09-30 17:30:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.44% [2022-09-30 17:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][0/1251] eta 1:42:47 lr 0.000668 time 4.9299 (4.9299) loss 2.6125 (2.6125) grad_norm 1.2698 (1.2698) [2022-09-30 17:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][100/1251] eta 0:15:37 lr 0.000667 time 0.7652 (0.8143) loss 3.6720 (3.6502) grad_norm 1.2699 (1.3119) [2022-09-30 17:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][200/1251] eta 0:13:53 lr 0.000667 time 0.8204 (0.7930) loss 2.8490 (3.6720) grad_norm 1.4309 (1.2935) [2022-09-30 17:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][300/1251] eta 0:12:27 lr 0.000667 time 0.7617 (0.7861) loss 3.5027 (3.6333) grad_norm 1.2389 (1.2997) [2022-09-30 17:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][400/1251] eta 0:11:05 lr 0.000666 time 0.6835 (0.7820) loss 3.1058 (3.6366) grad_norm 1.2757 (1.3038) [2022-09-30 17:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][500/1251] eta 0:09:45 lr 0.000666 time 0.7338 (0.7791) loss 3.3452 (3.6365) grad_norm 1.3042 (1.3026) [2022-09-30 17:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][600/1251] eta 0:08:26 lr 0.000665 time 0.8462 (0.7785) loss 4.1661 (3.6476) grad_norm 1.2078 (1.3040) [2022-09-30 17:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][700/1251] eta 0:07:08 lr 0.000665 time 0.7930 (0.7784) loss 3.4162 (3.6465) grad_norm 1.5150 (1.3022) [2022-09-30 17:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][800/1251] eta 0:05:50 lr 0.000665 time 0.7959 (0.7774) loss 4.0747 (3.6487) grad_norm 1.1565 (1.3027) [2022-09-30 17:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][900/1251] eta 0:04:32 lr 0.000664 time 0.7639 (0.7771) loss 3.9752 (3.6420) grad_norm 1.2187 (1.3043) [2022-09-30 17:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1000/1251] eta 0:03:14 lr 0.000664 time 0.8545 (0.7768) loss 3.9610 (3.6404) grad_norm 1.1619 (1.3061) [2022-09-30 17:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1100/1251] eta 0:01:57 lr 0.000663 time 0.8015 (0.7764) loss 3.3276 (3.6395) grad_norm 1.2983 (1.3042) [2022-09-30 17:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1200/1251] eta 0:00:39 lr 0.000663 time 0.7975 (0.7754) loss 3.8906 (3.6436) grad_norm 1.1592 (1.3083) [2022-09-30 17:47:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 118 training takes 0:16:10 [2022-09-30 17:47:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.466 (4.466) Loss 1.0484 (1.0484) Acc@1 74.316 (74.316) Acc@5 92.969 (92.969) [2022-09-30 17:47:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.484 Acc@5 92.472 [2022-09-30 17:47:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-09-30 17:47:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.48% [2022-09-30 17:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][0/1251] eta 1:34:30 lr 0.000663 time 4.5330 (4.5330) loss 4.0838 (4.0838) grad_norm 1.3278 (1.3278) [2022-09-30 17:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][100/1251] eta 0:15:25 lr 0.000662 time 0.7278 (0.8041) loss 3.6866 (3.5812) grad_norm 1.3431 (1.3145) [2022-09-30 17:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][200/1251] eta 0:13:48 lr 0.000662 time 0.7288 (0.7886) loss 4.1184 (3.5872) grad_norm 1.2059 (1.3131) [2022-09-30 17:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][300/1251] eta 0:12:24 lr 0.000662 time 0.8087 (0.7831) loss 3.4192 (3.5960) grad_norm 1.3233 (1.3101) [2022-09-30 17:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][400/1251] eta 0:11:03 lr 0.000661 time 0.6653 (0.7799) loss 3.2832 (3.5930) grad_norm 1.1798 (1.3089) [2022-09-30 17:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][500/1251] eta 0:09:45 lr 0.000661 time 0.8105 (0.7796) loss 3.2334 (3.5971) grad_norm 1.2189 (1.3047) [2022-09-30 17:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][600/1251] eta 0:08:25 lr 0.000661 time 0.8330 (0.7764) loss 3.1350 (3.6142) grad_norm 1.3301 (1.3050) [2022-09-30 17:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][700/1251] eta 0:07:08 lr 0.000660 time 0.7616 (0.7768) loss 2.5596 (3.6305) grad_norm 1.2572 (1.3078) [2022-09-30 17:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][800/1251] eta 0:05:50 lr 0.000660 time 0.9188 (0.7762) loss 4.0196 (3.6220) grad_norm 1.2558 (1.3049) [2022-09-30 17:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][900/1251] eta 0:04:32 lr 0.000659 time 0.7987 (0.7769) loss 2.5929 (3.6284) grad_norm 1.2272 (1.3096) [2022-09-30 18:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1000/1251] eta 0:03:15 lr 0.000659 time 0.6579 (0.7774) loss 3.8493 (3.6253) grad_norm 1.3150 (1.3083) [2022-09-30 18:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1100/1251] eta 0:01:57 lr 0.000659 time 0.7775 (0.7781) loss 3.8361 (3.6347) grad_norm 1.2817 (1.3073) [2022-09-30 18:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1200/1251] eta 0:00:39 lr 0.000658 time 0.7955 (0.7773) loss 4.4393 (3.6419) grad_norm 1.1964 (1.3078) [2022-09-30 18:03:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 119 training takes 0:16:12 [2022-09-30 18:03:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.951 (3.951) Loss 1.0988 (1.0988) Acc@1 73.340 (73.340) Acc@5 92.480 (92.480) [2022-09-30 18:03:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.600 Acc@5 92.646 [2022-09-30 18:03:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-09-30 18:03:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.60% [2022-09-30 18:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][0/1251] eta 1:46:52 lr 0.000658 time 5.1263 (5.1263) loss 2.9098 (2.9098) grad_norm 1.3437 (1.3437) [2022-09-30 18:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][100/1251] eta 0:15:49 lr 0.000658 time 0.8151 (0.8251) loss 4.1293 (3.6207) grad_norm 1.1969 (1.3052) [2022-09-30 18:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][200/1251] eta 0:14:00 lr 0.000657 time 0.8430 (0.7995) loss 4.0619 (3.6292) grad_norm 1.2847 (1.3024) [2022-09-30 18:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][300/1251] eta 0:11:54 lr 0.000657 time 0.7279 (0.7518) loss 3.1115 (3.6554) grad_norm 1.4742 (1.3004) [2022-09-30 18:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][400/1251] eta 0:10:45 lr 0.000656 time 0.8418 (0.7587) loss 2.5725 (3.6655) grad_norm 1.3769 (1.3149) [2022-09-30 18:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][500/1251] eta 0:09:32 lr 0.000656 time 0.8689 (0.7629) loss 3.2875 (3.6618) grad_norm 1.2122 (1.3186) [2022-09-30 18:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][600/1251] eta 0:08:17 lr 0.000656 time 0.8255 (0.7637) loss 3.6293 (3.6607) grad_norm 1.3368 (1.3138) [2022-09-30 18:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][700/1251] eta 0:07:01 lr 0.000655 time 0.7828 (0.7649) loss 2.8289 (3.6546) grad_norm 1.2552 (1.3135) [2022-09-30 18:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][800/1251] eta 0:05:45 lr 0.000655 time 0.7686 (0.7665) loss 3.2029 (3.6485) grad_norm 1.2565 (1.3130) [2022-09-30 18:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][900/1251] eta 0:04:29 lr 0.000654 time 0.7340 (0.7669) loss 3.4948 (3.6566) grad_norm 1.4272 (1.3171) [2022-09-30 18:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1000/1251] eta 0:03:12 lr 0.000654 time 0.9092 (0.7678) loss 3.4072 (3.6617) grad_norm 1.2753 (1.3192) [2022-09-30 18:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1100/1251] eta 0:01:56 lr 0.000654 time 0.8117 (0.7687) loss 4.1482 (3.6594) grad_norm 1.2819 (1.3185) [2022-09-30 18:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1200/1251] eta 0:00:39 lr 0.000653 time 0.7948 (0.7685) loss 4.4134 (3.6534) grad_norm 1.2241 (1.3213) [2022-09-30 18:20:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 120 training takes 0:16:01 [2022-09-30 18:20:01 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saving...... [2022-09-30 18:20:02 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saved !!! [2022-09-30 18:20:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.822 (4.822) Loss 1.0472 (1.0472) Acc@1 75.586 (75.586) Acc@5 93.848 (93.848) [2022-09-30 18:20:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.306 Acc@5 92.468 [2022-09-30 18:20:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-09-30 18:20:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.60% [2022-09-30 18:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][0/1251] eta 1:42:24 lr 0.000653 time 4.9117 (4.9117) loss 3.2994 (3.2994) grad_norm 1.3423 (1.3423) [2022-09-30 18:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][100/1251] eta 0:15:37 lr 0.000653 time 0.7871 (0.8143) loss 3.4301 (3.6466) grad_norm 1.1364 (1.3393) [2022-09-30 18:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][200/1251] eta 0:13:53 lr 0.000652 time 0.8887 (0.7927) loss 3.4557 (3.5958) grad_norm 1.2399 (1.3345) [2022-09-30 18:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][300/1251] eta 0:12:28 lr 0.000652 time 0.6126 (0.7870) loss 3.4297 (3.6028) grad_norm 1.5118 (1.3311) [2022-09-30 18:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][400/1251] eta 0:11:09 lr 0.000651 time 0.6478 (0.7871) loss 3.6505 (3.6105) grad_norm 1.1798 (1.3312) [2022-09-30 18:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][500/1251] eta 0:09:49 lr 0.000651 time 0.8363 (0.7849) loss 3.9810 (3.6327) grad_norm 1.6504 (1.3255) [2022-09-30 18:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][600/1251] eta 0:08:29 lr 0.000651 time 0.8074 (0.7828) loss 3.5145 (3.6324) grad_norm 1.1196 (1.3239) [2022-09-30 18:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][700/1251] eta 0:07:10 lr 0.000650 time 0.8947 (0.7821) loss 4.0178 (3.6368) grad_norm 1.6002 (1.3244) [2022-09-30 18:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][800/1251] eta 0:05:51 lr 0.000650 time 0.7896 (0.7801) loss 3.2360 (3.6466) grad_norm 1.3428 (1.3248) [2022-09-30 18:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][900/1251] eta 0:04:33 lr 0.000649 time 0.8223 (0.7791) loss 4.1495 (3.6438) grad_norm 1.4020 (1.3258) [2022-09-30 18:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1000/1251] eta 0:03:15 lr 0.000649 time 0.8385 (0.7780) loss 3.5390 (3.6424) grad_norm 1.1444 (1.3250) [2022-09-30 18:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1100/1251] eta 0:01:57 lr 0.000649 time 0.6765 (0.7771) loss 3.3928 (3.6406) grad_norm 1.4809 (1.3251) [2022-09-30 18:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1200/1251] eta 0:00:39 lr 0.000648 time 0.8767 (0.7771) loss 2.7102 (3.6349) grad_norm 1.2567 (1.3256) [2022-09-30 18:36:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 121 training takes 0:16:12 [2022-09-30 18:36:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.970 (3.970) Loss 1.0747 (1.0747) Acc@1 75.586 (75.586) Acc@5 92.676 (92.676) [2022-09-30 18:36:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.680 Acc@5 92.644 [2022-09-30 18:36:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-09-30 18:36:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.68% [2022-09-30 18:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][0/1251] eta 1:31:43 lr 0.000648 time 4.3991 (4.3991) loss 3.8222 (3.8222) grad_norm 1.2226 (1.2226) [2022-09-30 18:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][100/1251] eta 0:15:34 lr 0.000648 time 0.7839 (0.8115) loss 2.9133 (3.5217) grad_norm 1.3027 (1.3439) [2022-09-30 18:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][200/1251] eta 0:13:54 lr 0.000647 time 0.7952 (0.7941) loss 3.3861 (3.5457) grad_norm 1.3452 (1.3437) [2022-09-30 18:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][300/1251] eta 0:12:28 lr 0.000647 time 0.8274 (0.7872) loss 3.2021 (3.5636) grad_norm 1.3650 (1.3321) [2022-09-30 18:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][400/1251] eta 0:11:07 lr 0.000646 time 0.8577 (0.7840) loss 3.7887 (3.5589) grad_norm 1.1704 (1.3282) [2022-09-30 18:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][500/1251] eta 0:09:48 lr 0.000646 time 0.8516 (0.7839) loss 4.1768 (3.5671) grad_norm 1.3386 (1.3248) [2022-09-30 18:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][600/1251] eta 0:08:29 lr 0.000646 time 0.7352 (0.7824) loss 2.7241 (3.5673) grad_norm 1.2668 (1.3257) [2022-09-30 18:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][700/1251] eta 0:07:10 lr 0.000645 time 0.6662 (0.7804) loss 4.1055 (3.5906) grad_norm 1.1653 (1.3241) [2022-09-30 18:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][800/1251] eta 0:05:51 lr 0.000645 time 0.8431 (0.7792) loss 3.7243 (3.6058) grad_norm 1.9935 (1.3291) [2022-09-30 18:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][900/1251] eta 0:04:33 lr 0.000644 time 0.8514 (0.7787) loss 4.1066 (3.6138) grad_norm 1.2981 (1.3317) [2022-09-30 18:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1000/1251] eta 0:03:15 lr 0.000644 time 0.7210 (0.7781) loss 3.9432 (3.6121) grad_norm 1.4016 (1.3333) [2022-09-30 18:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1100/1251] eta 0:01:57 lr 0.000644 time 0.8359 (0.7779) loss 3.6363 (3.6104) grad_norm 1.1023 (1.3322) [2022-09-30 18:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1200/1251] eta 0:00:39 lr 0.000643 time 0.8378 (0.7778) loss 3.9243 (3.6132) grad_norm 1.5066 (1.3313) [2022-09-30 18:53:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 122 training takes 0:16:13 [2022-09-30 18:53:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.146 (4.146) Loss 1.0252 (1.0252) Acc@1 75.000 (75.000) Acc@5 93.750 (93.750) [2022-09-30 18:53:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.786 Acc@5 92.742 [2022-09-30 18:53:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-09-30 18:53:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-09-30 18:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][0/1251] eta 1:29:23 lr 0.000643 time 4.2875 (4.2875) loss 3.8775 (3.8775) grad_norm 1.3243 (1.3243) [2022-09-30 18:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][100/1251] eta 0:15:32 lr 0.000643 time 0.9005 (0.8101) loss 3.5731 (3.5664) grad_norm 1.3439 (1.3404) [2022-09-30 18:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][200/1251] eta 0:13:54 lr 0.000642 time 0.7714 (0.7944) loss 3.9925 (3.6550) grad_norm 1.4693 (1.3394) [2022-09-30 18:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][300/1251] eta 0:12:30 lr 0.000642 time 0.8451 (0.7890) loss 3.7471 (3.6136) grad_norm 1.6083 (1.3329) [2022-09-30 18:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][400/1251] eta 0:11:07 lr 0.000642 time 0.8319 (0.7849) loss 4.0088 (3.6322) grad_norm 1.5619 (1.3307) [2022-09-30 19:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][500/1251] eta 0:09:48 lr 0.000641 time 0.7825 (0.7834) loss 4.0719 (3.6311) grad_norm 1.3513 (1.3274) [2022-09-30 19:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][600/1251] eta 0:08:28 lr 0.000641 time 0.7381 (0.7817) loss 4.0588 (3.6459) grad_norm 1.4663 (1.3294) [2022-09-30 19:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][700/1251] eta 0:07:09 lr 0.000640 time 0.8321 (0.7796) loss 3.3818 (3.6451) grad_norm 1.3231 (1.3333) [2022-09-30 19:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][800/1251] eta 0:05:51 lr 0.000640 time 0.6750 (0.7789) loss 3.9690 (3.6461) grad_norm 1.2092 (1.3350) [2022-09-30 19:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][900/1251] eta 0:04:33 lr 0.000640 time 0.8126 (0.7791) loss 3.8347 (3.6427) grad_norm 1.2544 (1.3365) [2022-09-30 19:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1000/1251] eta 0:03:15 lr 0.000639 time 0.8346 (0.7801) loss 3.6236 (3.6450) grad_norm 1.5526 (1.3355) [2022-09-30 19:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1100/1251] eta 0:01:57 lr 0.000639 time 0.8185 (0.7803) loss 4.1569 (3.6456) grad_norm 1.4824 (1.3371) [2022-09-30 19:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1200/1251] eta 0:00:39 lr 0.000638 time 0.7699 (0.7804) loss 4.0621 (3.6457) grad_norm 1.2401 (1.3351) [2022-09-30 19:09:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 123 training takes 0:16:15 [2022-09-30 19:09:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.604 (4.604) Loss 1.0552 (1.0552) Acc@1 75.293 (75.293) Acc@5 92.480 (92.480) [2022-09-30 19:10:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.720 Acc@5 92.774 [2022-09-30 19:10:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-09-30 19:10:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-09-30 19:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][0/1251] eta 1:25:41 lr 0.000638 time 4.1096 (4.1096) loss 3.7303 (3.7303) grad_norm 1.1748 (1.1748) [2022-09-30 19:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][100/1251] eta 0:15:31 lr 0.000638 time 0.8110 (0.8097) loss 4.2136 (3.5752) grad_norm 1.2987 (1.3404) [2022-09-30 19:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][200/1251] eta 0:13:53 lr 0.000637 time 0.7566 (0.7929) loss 4.3071 (3.5891) grad_norm 1.7344 (1.3467) [2022-09-30 19:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][300/1251] eta 0:12:28 lr 0.000637 time 0.8049 (0.7868) loss 3.1906 (3.6081) grad_norm 1.2209 (1.3529) [2022-09-30 19:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][400/1251] eta 0:11:08 lr 0.000637 time 0.8104 (0.7857) loss 3.8138 (3.6290) grad_norm 1.2887 (1.3496) [2022-09-30 19:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][500/1251] eta 0:09:48 lr 0.000636 time 0.8015 (0.7836) loss 3.2806 (3.6350) grad_norm 1.4170 (1.3443) [2022-09-30 19:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][600/1251] eta 0:08:28 lr 0.000636 time 0.7815 (0.7809) loss 3.9320 (3.6328) grad_norm 1.2497 (1.3397) [2022-09-30 19:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][700/1251] eta 0:07:09 lr 0.000635 time 0.7682 (0.7803) loss 3.7719 (3.6262) grad_norm 1.2172 (1.3391) [2022-09-30 19:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][800/1251] eta 0:05:52 lr 0.000635 time 0.9002 (0.7807) loss 4.1374 (3.6291) grad_norm 1.3756 (1.3382) [2022-09-30 19:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][900/1251] eta 0:04:33 lr 0.000635 time 0.7056 (0.7805) loss 4.1904 (3.6258) grad_norm 1.4459 (1.3386) [2022-09-30 19:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1000/1251] eta 0:03:15 lr 0.000634 time 0.6932 (0.7806) loss 3.9683 (3.6311) grad_norm 1.4115 (1.3383) [2022-09-30 19:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1100/1251] eta 0:01:57 lr 0.000634 time 0.7333 (0.7805) loss 3.9828 (3.6365) grad_norm 1.3681 (1.3427) [2022-09-30 19:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1200/1251] eta 0:00:39 lr 0.000633 time 0.6571 (0.7803) loss 2.7574 (3.6401) grad_norm 1.2261 (1.3438) [2022-09-30 19:26:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 124 training takes 0:16:16 [2022-09-30 19:26:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.566 (4.566) Loss 1.0185 (1.0185) Acc@1 76.270 (76.270) Acc@5 93.359 (93.359) [2022-09-30 19:26:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.908 Acc@5 92.642 [2022-09-30 19:26:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-09-30 19:26:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.91% [2022-09-30 19:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][0/1251] eta 1:44:08 lr 0.000633 time 4.9946 (4.9946) loss 3.5432 (3.5432) grad_norm 1.2773 (1.2773) [2022-09-30 19:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][100/1251] eta 0:15:36 lr 0.000633 time 0.8508 (0.8133) loss 3.5208 (3.5602) grad_norm 1.1907 (1.3289) [2022-09-30 19:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][200/1251] eta 0:13:53 lr 0.000632 time 0.8228 (0.7934) loss 3.9304 (3.5906) grad_norm 1.3913 (1.3365) [2022-09-30 19:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][300/1251] eta 0:12:25 lr 0.000632 time 0.8161 (0.7839) loss 3.4124 (3.6380) grad_norm 1.5396 (1.3350) [2022-09-30 19:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][400/1251] eta 0:11:05 lr 0.000632 time 0.8115 (0.7817) loss 4.0546 (3.6173) grad_norm 1.2597 (1.3320) [2022-09-30 19:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][500/1251] eta 0:09:45 lr 0.000631 time 0.8268 (0.7802) loss 4.3672 (3.6031) grad_norm 1.4061 (1.3442) [2022-09-30 19:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][600/1251] eta 0:08:28 lr 0.000631 time 0.7306 (0.7810) loss 3.2714 (3.6159) grad_norm 1.4447 (1.3446) [2022-09-30 19:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][700/1251] eta 0:07:10 lr 0.000630 time 0.8717 (0.7810) loss 3.6357 (3.6075) grad_norm 1.1140 (1.3457) [2022-09-30 19:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][800/1251] eta 0:05:51 lr 0.000630 time 0.7324 (0.7793) loss 3.7057 (3.6064) grad_norm 1.1557 (1.3459) [2022-09-30 19:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][900/1251] eta 0:04:33 lr 0.000630 time 0.8033 (0.7794) loss 4.0519 (3.6110) grad_norm 1.3889 (1.3457) [2022-09-30 19:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1000/1251] eta 0:03:15 lr 0.000629 time 0.8359 (0.7792) loss 3.4582 (3.6112) grad_norm 1.3838 (1.3451) [2022-09-30 19:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1100/1251] eta 0:01:57 lr 0.000629 time 0.8270 (0.7793) loss 4.0164 (3.6086) grad_norm 1.5785 (1.3448) [2022-09-30 19:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1200/1251] eta 0:00:39 lr 0.000628 time 0.7028 (0.7787) loss 3.5936 (3.6154) grad_norm 1.3766 (1.3420) [2022-09-30 19:42:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 125 training takes 0:16:13 [2022-09-30 19:43:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.560 (4.560) Loss 1.0879 (1.0879) Acc@1 74.609 (74.609) Acc@5 92.383 (92.383) [2022-09-30 19:43:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.754 Acc@5 92.836 [2022-09-30 19:43:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-09-30 19:43:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.91% [2022-09-30 19:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][0/1251] eta 1:27:13 lr 0.000628 time 4.1835 (4.1835) loss 3.8880 (3.8880) grad_norm 1.1589 (1.1589) [2022-09-30 19:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][100/1251] eta 0:15:44 lr 0.000628 time 0.7753 (0.8210) loss 4.1876 (3.6564) grad_norm 1.1886 (1.3460) [2022-09-30 19:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][200/1251] eta 0:14:00 lr 0.000627 time 0.9388 (0.7998) loss 4.4319 (3.6328) grad_norm 1.3132 (1.3600) [2022-09-30 19:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][300/1251] eta 0:12:33 lr 0.000627 time 0.7213 (0.7923) loss 2.9489 (3.6545) grad_norm 1.4909 (1.3561) [2022-09-30 19:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][400/1251] eta 0:11:11 lr 0.000626 time 0.6902 (0.7893) loss 3.5070 (3.6465) grad_norm 1.1513 (1.3511) [2022-09-30 19:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][500/1251] eta 0:09:51 lr 0.000626 time 0.8173 (0.7877) loss 3.1663 (3.6383) grad_norm 1.4930 (1.3487) [2022-09-30 19:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][600/1251] eta 0:08:31 lr 0.000626 time 0.7910 (0.7850) loss 3.2972 (3.6307) grad_norm 1.3067 (1.3492) [2022-09-30 19:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][700/1251] eta 0:07:12 lr 0.000625 time 0.8256 (0.7844) loss 3.4428 (3.6351) grad_norm 1.2876 (1.3486) [2022-09-30 19:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][800/1251] eta 0:05:53 lr 0.000625 time 0.8216 (0.7837) loss 4.0536 (3.6283) grad_norm 1.1546 (1.3506) [2022-09-30 19:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][900/1251] eta 0:04:34 lr 0.000624 time 0.8569 (0.7829) loss 3.9061 (3.6315) grad_norm 1.2959 (1.3503) [2022-09-30 19:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1000/1251] eta 0:03:16 lr 0.000624 time 0.7794 (0.7813) loss 3.8830 (3.6250) grad_norm 1.3506 (1.3532) [2022-09-30 19:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1100/1251] eta 0:01:57 lr 0.000624 time 0.7953 (0.7803) loss 3.0153 (3.6232) grad_norm 1.5006 (1.3518) [2022-09-30 19:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1200/1251] eta 0:00:39 lr 0.000623 time 0.8834 (0.7802) loss 4.1019 (3.6203) grad_norm 1.3337 (1.3550) [2022-09-30 19:59:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 126 training takes 0:16:15 [2022-09-30 19:59:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.118 (4.118) Loss 1.0314 (1.0314) Acc@1 77.832 (77.832) Acc@5 93.359 (93.359) [2022-09-30 19:59:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.886 Acc@5 92.780 [2022-09-30 19:59:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-09-30 19:59:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.91% [2022-09-30 20:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][0/1251] eta 1:23:08 lr 0.000623 time 3.9876 (3.9876) loss 4.3667 (4.3667) grad_norm 1.2745 (1.2745) [2022-09-30 20:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][100/1251] eta 0:15:40 lr 0.000623 time 0.8318 (0.8172) loss 3.5861 (3.5563) grad_norm 1.2892 (1.3456) [2022-09-30 20:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][200/1251] eta 0:13:57 lr 0.000622 time 0.8326 (0.7964) loss 3.8283 (3.6237) grad_norm 1.2392 (1.3418) [2022-09-30 20:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][300/1251] eta 0:12:29 lr 0.000622 time 0.7241 (0.7878) loss 3.7871 (3.6132) grad_norm 1.3341 (1.3453) [2022-09-30 20:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][400/1251] eta 0:11:09 lr 0.000621 time 0.9008 (0.7869) loss 3.4974 (3.6247) grad_norm 1.3013 (1.3536) [2022-09-30 20:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][500/1251] eta 0:09:49 lr 0.000621 time 0.8244 (0.7844) loss 3.4375 (3.6125) grad_norm 1.2319 (1.3514) [2022-09-30 20:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][600/1251] eta 0:08:28 lr 0.000621 time 0.6624 (0.7808) loss 3.8165 (3.6056) grad_norm 1.3495 (1.3541) [2022-09-30 20:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][700/1251] eta 0:07:10 lr 0.000620 time 0.8344 (0.7807) loss 4.0999 (3.6122) grad_norm 1.3848 (1.3535) [2022-09-30 20:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][800/1251] eta 0:05:51 lr 0.000620 time 0.8438 (0.7804) loss 2.7821 (3.6072) grad_norm 1.4631 (1.3564) [2022-09-30 20:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][900/1251] eta 0:04:33 lr 0.000619 time 0.7790 (0.7794) loss 3.5825 (3.6025) grad_norm 1.1853 (1.3584) [2022-09-30 20:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1000/1251] eta 0:03:15 lr 0.000619 time 0.8293 (0.7789) loss 3.5070 (3.6116) grad_norm 1.2000 (1.3566) [2022-09-30 20:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1100/1251] eta 0:01:57 lr 0.000619 time 0.7214 (0.7785) loss 4.0883 (3.6078) grad_norm 1.1449 (1.3525) [2022-09-30 20:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1200/1251] eta 0:00:39 lr 0.000618 time 0.6983 (0.7780) loss 3.1080 (3.6054) grad_norm 1.2395 (1.3495) [2022-09-30 20:16:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 127 training takes 0:16:13 [2022-09-30 20:16:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.730 (4.730) Loss 1.1307 (1.1307) Acc@1 73.926 (73.926) Acc@5 91.211 (91.211) [2022-09-30 20:16:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.910 Acc@5 92.726 [2022-09-30 20:16:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-09-30 20:16:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.91% [2022-09-30 20:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][0/1251] eta 1:36:40 lr 0.000618 time 4.6371 (4.6371) loss 3.6797 (3.6797) grad_norm 1.2762 (1.2762) [2022-09-30 20:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][100/1251] eta 0:15:41 lr 0.000618 time 0.8481 (0.8183) loss 4.2974 (3.6278) grad_norm 1.3160 (1.3460) [2022-09-30 20:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][200/1251] eta 0:13:58 lr 0.000617 time 0.7577 (0.7977) loss 4.5301 (3.6534) grad_norm 1.4076 (1.3573) [2022-09-30 20:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][300/1251] eta 0:12:34 lr 0.000617 time 0.6689 (0.7937) loss 3.5302 (3.6184) grad_norm 1.3189 (1.3494) [2022-09-30 20:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][400/1251] eta 0:11:11 lr 0.000616 time 0.6782 (0.7887) loss 4.0596 (3.6210) grad_norm 1.2992 (1.3575) [2022-09-30 20:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][500/1251] eta 0:09:50 lr 0.000616 time 0.7162 (0.7867) loss 4.0395 (3.6158) grad_norm 1.2308 (1.3576) [2022-09-30 20:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][600/1251] eta 0:08:30 lr 0.000616 time 0.9147 (0.7847) loss 3.2806 (3.6094) grad_norm 1.2750 (1.3636) [2022-09-30 20:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][700/1251] eta 0:07:12 lr 0.000615 time 0.8417 (0.7844) loss 3.7229 (3.6066) grad_norm 1.2827 (1.3634) [2022-09-30 20:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][800/1251] eta 0:05:53 lr 0.000615 time 0.8563 (0.7832) loss 3.3761 (3.6032) grad_norm 1.4501 (1.3617) [2022-09-30 20:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][900/1251] eta 0:04:34 lr 0.000614 time 0.8405 (0.7834) loss 3.8617 (3.6022) grad_norm 1.4224 (1.3628) [2022-09-30 20:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1000/1251] eta 0:03:16 lr 0.000614 time 0.7901 (0.7814) loss 3.9515 (3.6070) grad_norm 1.4640 (1.3639) [2022-09-30 20:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1100/1251] eta 0:01:56 lr 0.000614 time 0.6454 (0.7746) loss 4.1108 (3.6078) grad_norm 1.2570 (1.3628) [2022-09-30 20:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1200/1251] eta 0:00:39 lr 0.000613 time 0.8357 (0.7688) loss 3.7190 (3.6084) grad_norm 1.4364 (1.3650) [2022-09-30 20:32:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 128 training takes 0:16:03 [2022-09-30 20:32:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.163 (4.163) Loss 1.0840 (1.0840) Acc@1 75.977 (75.977) Acc@5 91.895 (91.895) [2022-09-30 20:32:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.818 Acc@5 92.732 [2022-09-30 20:32:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-09-30 20:32:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.91% [2022-09-30 20:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][0/1251] eta 1:49:15 lr 0.000613 time 5.2400 (5.2400) loss 4.1765 (4.1765) grad_norm 1.3214 (1.3214) [2022-09-30 20:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][100/1251] eta 0:15:46 lr 0.000613 time 0.8415 (0.8226) loss 3.9213 (3.5471) grad_norm 1.5764 (1.3240) [2022-09-30 20:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][200/1251] eta 0:13:59 lr 0.000612 time 0.8402 (0.7984) loss 3.9528 (3.5718) grad_norm 1.3936 (1.3415) [2022-09-30 20:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][300/1251] eta 0:12:33 lr 0.000612 time 0.9406 (0.7923) loss 3.6133 (3.5944) grad_norm 1.3881 (1.3552) [2022-09-30 20:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][400/1251] eta 0:11:08 lr 0.000611 time 0.7793 (0.7860) loss 3.2906 (3.6190) grad_norm 1.4439 (1.3574) [2022-09-30 20:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][500/1251] eta 0:09:49 lr 0.000611 time 0.8657 (0.7850) loss 3.9323 (3.6158) grad_norm 1.4028 (1.3619) [2022-09-30 20:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][600/1251] eta 0:08:30 lr 0.000611 time 0.8499 (0.7840) loss 3.3479 (3.6212) grad_norm 1.4126 (1.3652) [2022-09-30 20:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][700/1251] eta 0:07:11 lr 0.000610 time 0.8505 (0.7827) loss 4.0136 (3.6163) grad_norm 1.4287 (1.3677) [2022-09-30 20:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][800/1251] eta 0:05:52 lr 0.000610 time 0.9374 (0.7813) loss 4.3002 (3.6185) grad_norm 1.4260 (1.3696) [2022-09-30 20:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][900/1251] eta 0:04:33 lr 0.000609 time 0.7292 (0.7799) loss 3.4323 (3.6252) grad_norm 1.4546 (1.3674) [2022-09-30 20:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1000/1251] eta 0:03:15 lr 0.000609 time 0.6820 (0.7785) loss 3.9056 (3.6255) grad_norm 1.4191 (1.3686) [2022-09-30 20:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1100/1251] eta 0:01:57 lr 0.000609 time 0.7829 (0.7776) loss 3.9697 (3.6329) grad_norm 1.4048 (1.3654) [2022-09-30 20:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1200/1251] eta 0:00:39 lr 0.000608 time 0.7307 (0.7769) loss 3.8297 (3.6311) grad_norm 1.1944 (1.3655) [2022-09-30 20:49:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 129 training takes 0:16:11 [2022-09-30 20:49:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.149 (4.149) Loss 1.0478 (1.0478) Acc@1 74.414 (74.414) Acc@5 94.043 (94.043) [2022-09-30 20:49:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.166 Acc@5 93.000 [2022-09-30 20:49:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-09-30 20:49:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.17% [2022-09-30 20:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][0/1251] eta 1:46:49 lr 0.000608 time 5.1235 (5.1235) loss 3.9397 (3.9397) grad_norm 1.6035 (1.6035) [2022-09-30 20:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][100/1251] eta 0:15:38 lr 0.000608 time 0.8163 (0.8151) loss 3.2241 (3.6645) grad_norm 1.2993 (1.3755) [2022-09-30 20:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][200/1251] eta 0:13:54 lr 0.000607 time 0.7000 (0.7941) loss 4.0158 (3.6762) grad_norm 1.5242 (1.3813) [2022-09-30 20:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][300/1251] eta 0:12:27 lr 0.000607 time 0.7909 (0.7858) loss 4.1412 (3.6768) grad_norm 1.5299 (1.3740) [2022-09-30 20:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][400/1251] eta 0:11:06 lr 0.000606 time 0.5871 (0.7826) loss 3.6050 (3.6600) grad_norm 1.2685 (1.3640) [2022-09-30 20:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][500/1251] eta 0:09:46 lr 0.000606 time 0.8224 (0.7812) loss 4.2799 (3.6535) grad_norm 1.4195 (1.3707) [2022-09-30 20:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][600/1251] eta 0:08:29 lr 0.000605 time 0.7322 (0.7821) loss 3.4546 (3.6508) grad_norm 1.4015 (1.3666) [2022-09-30 20:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][700/1251] eta 0:07:09 lr 0.000605 time 0.7813 (0.7797) loss 3.9726 (3.6384) grad_norm 1.2771 (1.3657) [2022-09-30 20:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][800/1251] eta 0:05:51 lr 0.000605 time 0.8133 (0.7788) loss 4.1459 (3.6393) grad_norm 1.1351 (1.3681) [2022-09-30 21:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][900/1251] eta 0:04:33 lr 0.000604 time 0.8540 (0.7794) loss 3.3599 (3.6400) grad_norm 1.3020 (1.3646) [2022-09-30 21:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1000/1251] eta 0:03:15 lr 0.000604 time 0.7852 (0.7792) loss 3.7180 (3.6480) grad_norm 1.2999 (1.3660) [2022-09-30 21:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1100/1251] eta 0:01:57 lr 0.000603 time 0.8345 (0.7789) loss 3.9437 (3.6410) grad_norm 1.7521 (1.3675) [2022-09-30 21:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1200/1251] eta 0:00:39 lr 0.000603 time 0.8120 (0.7783) loss 4.5621 (3.6435) grad_norm 1.3620 (1.3686) [2022-09-30 21:05:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 130 training takes 0:16:13 [2022-09-30 21:05:43 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saving...... [2022-09-30 21:05:43 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saved !!! [2022-09-30 21:05:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.734 (4.734) Loss 1.0078 (1.0078) Acc@1 76.758 (76.758) Acc@5 93.848 (93.848) [2022-09-30 21:06:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.008 Acc@5 93.004 [2022-09-30 21:06:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-09-30 21:06:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.17% [2022-09-30 21:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][0/1251] eta 1:31:16 lr 0.000603 time 4.3777 (4.3777) loss 3.5588 (3.5588) grad_norm 1.1158 (1.1158) [2022-09-30 21:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][100/1251] eta 0:15:36 lr 0.000602 time 0.6566 (0.8137) loss 3.9619 (3.6527) grad_norm 1.4144 (1.3733) [2022-09-30 21:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][200/1251] eta 0:13:59 lr 0.000602 time 0.9002 (0.7985) loss 2.6901 (3.6140) grad_norm 1.3510 (1.3822) [2022-09-30 21:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][300/1251] eta 0:12:30 lr 0.000602 time 0.7021 (0.7890) loss 3.3271 (3.6191) grad_norm 1.2191 (1.3742) [2022-09-30 21:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][400/1251] eta 0:11:09 lr 0.000601 time 0.8178 (0.7866) loss 3.3949 (3.6064) grad_norm 1.3805 (1.3660) [2022-09-30 21:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][500/1251] eta 0:09:48 lr 0.000601 time 0.8189 (0.7838) loss 4.0503 (3.5858) grad_norm 1.2593 (1.3728) [2022-09-30 21:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][600/1251] eta 0:08:28 lr 0.000600 time 0.7147 (0.7813) loss 4.0217 (3.5888) grad_norm 1.3434 (1.3692) [2022-09-30 21:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][700/1251] eta 0:07:09 lr 0.000600 time 0.8305 (0.7788) loss 3.7180 (3.5892) grad_norm 1.3927 (1.3707) [2022-09-30 21:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][800/1251] eta 0:05:50 lr 0.000600 time 0.8282 (0.7777) loss 3.6521 (3.5917) grad_norm 1.7805 (1.3710) [2022-09-30 21:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][900/1251] eta 0:04:33 lr 0.000599 time 0.7665 (0.7788) loss 4.2143 (3.5976) grad_norm 1.3558 (1.3728) [2022-09-30 21:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1000/1251] eta 0:03:15 lr 0.000599 time 0.7133 (0.7775) loss 3.1431 (3.5934) grad_norm 1.3008 (1.3695) [2022-09-30 21:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1100/1251] eta 0:01:57 lr 0.000598 time 0.8007 (0.7777) loss 3.9250 (3.6032) grad_norm 1.5496 (1.3705) [2022-09-30 21:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1200/1251] eta 0:00:39 lr 0.000598 time 0.7731 (0.7766) loss 2.8733 (3.6065) grad_norm 1.4094 (1.3689) [2022-09-30 21:22:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 131 training takes 0:16:11 [2022-09-30 21:22:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.466 (4.466) Loss 1.1297 (1.1297) Acc@1 74.219 (74.219) Acc@5 92.676 (92.676) [2022-09-30 21:22:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.032 Acc@5 92.840 [2022-09-30 21:22:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-09-30 21:22:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.17% [2022-09-30 21:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][0/1251] eta 1:22:39 lr 0.000598 time 3.9643 (3.9643) loss 4.1556 (4.1556) grad_norm 1.4344 (1.4344) [2022-09-30 21:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][100/1251] eta 0:15:33 lr 0.000597 time 0.7537 (0.8113) loss 2.7505 (3.5251) grad_norm 1.5688 (1.3837) [2022-09-30 21:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][200/1251] eta 0:13:54 lr 0.000597 time 0.7311 (0.7942) loss 2.5585 (3.5696) grad_norm 1.3344 (1.3795) [2022-09-30 21:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][300/1251] eta 0:12:29 lr 0.000597 time 0.8146 (0.7886) loss 3.2729 (3.5777) grad_norm 1.3490 (1.3838) [2022-09-30 21:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][400/1251] eta 0:11:04 lr 0.000596 time 0.8133 (0.7803) loss 4.2729 (3.5957) grad_norm 1.2624 (1.3785) [2022-09-30 21:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][500/1251] eta 0:09:45 lr 0.000596 time 0.7503 (0.7795) loss 3.9260 (3.5939) grad_norm 1.1658 (1.3839) [2022-09-30 21:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][600/1251] eta 0:08:26 lr 0.000595 time 0.8425 (0.7786) loss 4.0613 (3.6073) grad_norm 1.3630 (1.3873) [2022-09-30 21:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][700/1251] eta 0:07:08 lr 0.000595 time 0.8295 (0.7775) loss 3.9698 (3.5969) grad_norm 1.1915 (1.3852) [2022-09-30 21:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][800/1251] eta 0:05:50 lr 0.000594 time 0.6918 (0.7777) loss 2.5102 (3.5941) grad_norm 1.4618 (1.3867) [2022-09-30 21:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][900/1251] eta 0:04:32 lr 0.000594 time 0.8226 (0.7772) loss 3.6461 (3.5910) grad_norm 1.4265 (1.3814) [2022-09-30 21:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1000/1251] eta 0:03:14 lr 0.000594 time 0.7869 (0.7762) loss 3.6151 (3.5950) grad_norm 1.4020 (1.3855) [2022-09-30 21:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1100/1251] eta 0:01:56 lr 0.000593 time 0.8367 (0.7746) loss 2.6903 (3.5948) grad_norm 1.3669 (1.3835) [2022-09-30 21:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1200/1251] eta 0:00:39 lr 0.000593 time 0.7843 (0.7751) loss 3.3355 (3.6030) grad_norm 1.4189 (1.3855) [2022-09-30 21:38:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 132 training takes 0:16:10 [2022-09-30 21:38:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.468 (4.468) Loss 1.0259 (1.0259) Acc@1 75.879 (75.879) Acc@5 92.090 (92.090) [2022-09-30 21:39:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.894 Acc@5 92.926 [2022-09-30 21:39:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-09-30 21:39:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.17% [2022-09-30 21:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][0/1251] eta 1:51:10 lr 0.000593 time 5.3321 (5.3321) loss 4.1993 (4.1993) grad_norm 1.3543 (1.3543) [2022-09-30 21:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][100/1251] eta 0:15:44 lr 0.000592 time 0.8587 (0.8208) loss 4.3268 (3.5455) grad_norm 1.4361 (1.3515) [2022-09-30 21:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][200/1251] eta 0:14:00 lr 0.000592 time 0.7625 (0.8000) loss 3.5903 (3.5954) grad_norm 1.2183 (1.3742) [2022-09-30 21:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][300/1251] eta 0:12:30 lr 0.000591 time 0.6724 (0.7897) loss 2.7072 (3.5748) grad_norm 1.1426 (1.3621) [2022-09-30 21:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][400/1251] eta 0:11:06 lr 0.000591 time 0.8132 (0.7836) loss 2.6791 (3.5918) grad_norm 1.4577 (1.3692) [2022-09-30 21:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][500/1251] eta 0:09:47 lr 0.000591 time 0.7450 (0.7820) loss 3.3193 (3.5975) grad_norm 1.4483 (1.3707) [2022-09-30 21:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][600/1251] eta 0:08:28 lr 0.000590 time 0.8887 (0.7815) loss 4.2268 (3.6047) grad_norm 1.3030 (1.3698) [2022-09-30 21:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][700/1251] eta 0:07:10 lr 0.000590 time 0.8232 (0.7812) loss 4.0799 (3.5843) grad_norm 1.1189 (1.3740) [2022-09-30 21:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][800/1251] eta 0:05:51 lr 0.000589 time 0.6983 (0.7804) loss 4.1163 (3.5882) grad_norm 1.6135 (1.3776) [2022-09-30 21:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][900/1251] eta 0:04:33 lr 0.000589 time 0.8194 (0.7802) loss 4.3293 (3.5898) grad_norm 1.2495 (1.3782) [2022-09-30 21:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1000/1251] eta 0:03:15 lr 0.000589 time 0.7241 (0.7798) loss 4.0435 (3.5906) grad_norm 1.5086 (1.3802) [2022-09-30 21:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1100/1251] eta 0:01:57 lr 0.000588 time 0.9138 (0.7796) loss 4.1574 (3.5922) grad_norm 1.3818 (1.3794) [2022-09-30 21:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1200/1251] eta 0:00:39 lr 0.000588 time 0.7015 (0.7792) loss 3.3073 (3.5970) grad_norm 1.8985 (1.3828) [2022-09-30 21:55:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 133 training takes 0:16:14 [2022-09-30 21:55:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.193 (4.193) Loss 1.1261 (1.1261) Acc@1 73.047 (73.047) Acc@5 92.578 (92.578) [2022-09-30 21:55:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.142 Acc@5 92.960 [2022-09-30 21:55:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-09-30 21:55:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.17% [2022-09-30 21:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][0/1251] eta 1:45:22 lr 0.000588 time 5.0539 (5.0539) loss 3.4247 (3.4247) grad_norm 1.3380 (1.3380) [2022-09-30 21:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][100/1251] eta 0:15:39 lr 0.000587 time 0.8200 (0.8162) loss 2.8082 (3.5506) grad_norm 1.2768 (1.3897) [2022-09-30 21:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][200/1251] eta 0:13:57 lr 0.000587 time 0.8343 (0.7971) loss 3.8330 (3.5288) grad_norm 1.4396 (1.3837) [2022-09-30 21:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][300/1251] eta 0:12:31 lr 0.000586 time 0.8012 (0.7903) loss 4.0129 (3.5771) grad_norm 1.3147 (1.3830) [2022-09-30 22:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][400/1251] eta 0:11:09 lr 0.000586 time 0.8581 (0.7866) loss 4.0037 (3.5762) grad_norm 1.4202 (1.3829) [2022-09-30 22:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][500/1251] eta 0:09:47 lr 0.000586 time 0.8243 (0.7828) loss 3.9937 (3.5587) grad_norm 1.3195 (1.3858) [2022-09-30 22:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][600/1251] eta 0:08:29 lr 0.000585 time 0.8421 (0.7830) loss 2.9061 (3.5697) grad_norm 1.2982 (1.3861) [2022-09-30 22:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][700/1251] eta 0:07:09 lr 0.000585 time 0.7585 (0.7801) loss 2.6451 (3.5762) grad_norm 1.4600 (1.3877) [2022-09-30 22:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][800/1251] eta 0:05:51 lr 0.000584 time 0.8711 (0.7798) loss 2.9300 (3.5853) grad_norm 1.3269 (1.3878) [2022-09-30 22:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][900/1251] eta 0:04:33 lr 0.000584 time 0.7509 (0.7806) loss 4.0071 (3.5920) grad_norm 1.1852 (1.3874) [2022-09-30 22:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1000/1251] eta 0:03:15 lr 0.000583 time 0.7502 (0.7797) loss 4.1981 (3.5865) grad_norm 1.3613 (1.3854) [2022-09-30 22:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1100/1251] eta 0:01:57 lr 0.000583 time 0.8604 (0.7791) loss 3.7699 (3.5897) grad_norm 1.4865 (1.3877) [2022-09-30 22:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1200/1251] eta 0:00:39 lr 0.000583 time 0.7681 (0.7781) loss 3.3526 (3.5911) grad_norm 1.3401 (1.3888) [2022-09-30 22:11:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 134 training takes 0:16:13 [2022-09-30 22:12:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.503 (4.503) Loss 1.0403 (1.0403) Acc@1 75.293 (75.293) Acc@5 92.480 (92.480) [2022-09-30 22:12:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.294 Acc@5 93.122 [2022-09-30 22:12:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-09-30 22:12:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.29% [2022-09-30 22:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][0/1251] eta 1:46:46 lr 0.000582 time 5.1208 (5.1208) loss 4.1295 (4.1295) grad_norm 1.3650 (1.3650) [2022-09-30 22:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][100/1251] eta 0:15:37 lr 0.000582 time 0.8443 (0.8144) loss 3.9856 (3.5889) grad_norm 1.3285 (1.4168) [2022-09-30 22:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][200/1251] eta 0:13:57 lr 0.000582 time 0.7879 (0.7967) loss 4.3058 (3.6616) grad_norm 1.2344 (1.3967) [2022-09-30 22:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][300/1251] eta 0:12:33 lr 0.000581 time 0.7161 (0.7922) loss 3.2840 (3.6221) grad_norm 1.2227 (1.3897) [2022-09-30 22:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][400/1251] eta 0:11:10 lr 0.000581 time 0.8583 (0.7876) loss 3.7257 (3.6233) grad_norm 1.2363 (1.3814) [2022-09-30 22:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][500/1251] eta 0:09:49 lr 0.000580 time 0.9480 (0.7851) loss 3.5112 (3.6241) grad_norm 1.4882 (1.3853) [2022-09-30 22:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][600/1251] eta 0:08:29 lr 0.000580 time 0.6174 (0.7824) loss 3.9055 (3.6262) grad_norm 1.4593 (1.3860) [2022-09-30 22:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][700/1251] eta 0:07:10 lr 0.000580 time 0.6638 (0.7815) loss 4.1386 (3.6168) grad_norm 1.5828 (1.3892) [2022-09-30 22:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][800/1251] eta 0:05:51 lr 0.000579 time 0.8212 (0.7799) loss 4.2482 (3.6212) grad_norm 1.2762 (1.3926) [2022-09-30 22:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][900/1251] eta 0:04:33 lr 0.000579 time 0.8353 (0.7793) loss 3.0668 (3.6149) grad_norm 1.4556 (1.3936) [2022-09-30 22:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1000/1251] eta 0:03:15 lr 0.000578 time 0.8628 (0.7789) loss 2.7151 (3.6111) grad_norm 1.4579 (1.3951) [2022-09-30 22:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1100/1251] eta 0:01:57 lr 0.000578 time 0.7153 (0.7794) loss 4.0096 (3.6087) grad_norm 1.5749 (1.3945) [2022-09-30 22:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1200/1251] eta 0:00:39 lr 0.000578 time 0.5975 (0.7791) loss 2.9182 (3.6056) grad_norm 1.2660 (1.3955) [2022-09-30 22:28:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 135 training takes 0:16:14 [2022-09-30 22:28:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.146 (4.146) Loss 1.1047 (1.1047) Acc@1 74.707 (74.707) Acc@5 92.676 (92.676) [2022-09-30 22:28:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.094 Acc@5 92.832 [2022-09-30 22:28:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-09-30 22:28:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.29% [2022-09-30 22:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][0/1251] eta 1:32:17 lr 0.000577 time 4.4263 (4.4263) loss 3.2926 (3.2926) grad_norm 1.3764 (1.3764) [2022-09-30 22:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][100/1251] eta 0:15:41 lr 0.000577 time 0.8439 (0.8179) loss 4.1299 (3.5146) grad_norm 1.3105 (1.4008) [2022-09-30 22:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][200/1251] eta 0:13:56 lr 0.000576 time 0.9375 (0.7963) loss 4.2507 (3.6246) grad_norm 1.2587 (1.3994) [2022-09-30 22:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][300/1251] eta 0:12:32 lr 0.000576 time 0.8841 (0.7911) loss 3.8992 (3.6203) grad_norm 1.3718 (1.3890) [2022-09-30 22:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][400/1251] eta 0:11:11 lr 0.000576 time 0.8279 (0.7890) loss 2.7079 (3.6162) grad_norm 1.4567 (1.3977) [2022-09-30 22:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][500/1251] eta 0:09:49 lr 0.000575 time 0.8705 (0.7849) loss 4.2194 (3.6141) grad_norm 1.4543 (1.3944) [2022-09-30 22:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][600/1251] eta 0:08:30 lr 0.000575 time 0.8331 (0.7849) loss 3.9056 (3.6199) grad_norm 1.3386 (1.4023) [2022-09-30 22:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][700/1251] eta 0:07:11 lr 0.000574 time 0.7323 (0.7836) loss 4.0743 (3.6222) grad_norm 1.3805 (1.4031) [2022-09-30 22:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][800/1251] eta 0:05:52 lr 0.000574 time 0.8437 (0.7825) loss 4.0288 (3.6198) grad_norm 1.4000 (1.4035) [2022-09-30 22:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][900/1251] eta 0:04:34 lr 0.000574 time 0.8232 (0.7815) loss 3.9226 (3.6162) grad_norm 1.3347 (1.3998) [2022-09-30 22:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1000/1251] eta 0:03:15 lr 0.000573 time 0.8410 (0.7804) loss 3.9917 (3.6070) grad_norm 1.5966 (1.3999) [2022-09-30 22:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1100/1251] eta 0:01:57 lr 0.000573 time 0.8223 (0.7805) loss 3.8153 (3.6061) grad_norm 1.3231 (1.3995) [2022-09-30 22:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1200/1251] eta 0:00:39 lr 0.000572 time 0.9075 (0.7791) loss 3.8772 (3.6074) grad_norm 1.3285 (1.4001) [2022-09-30 22:45:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 136 training takes 0:16:13 [2022-09-30 22:45:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.625 (4.625) Loss 0.9937 (0.9937) Acc@1 76.953 (76.953) Acc@5 93.945 (93.945) [2022-09-30 22:45:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.390 Acc@5 93.060 [2022-09-30 22:45:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-09-30 22:45:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.39% [2022-09-30 22:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][0/1251] eta 1:42:33 lr 0.000572 time 4.9185 (4.9185) loss 3.4725 (3.4725) grad_norm 1.3546 (1.3546) [2022-09-30 22:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][100/1251] eta 0:15:35 lr 0.000572 time 0.6459 (0.8130) loss 2.7335 (3.5770) grad_norm 1.3597 (1.3649) [2022-09-30 22:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][200/1251] eta 0:13:59 lr 0.000571 time 0.8333 (0.7987) loss 4.1694 (3.5951) grad_norm 1.3348 (1.3890) [2022-09-30 22:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][300/1251] eta 0:12:29 lr 0.000571 time 0.6266 (0.7884) loss 2.8610 (3.5817) grad_norm 1.2103 (1.4011) [2022-09-30 22:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][400/1251] eta 0:11:09 lr 0.000571 time 0.8499 (0.7873) loss 3.8591 (3.5732) grad_norm 1.3293 (1.3989) [2022-09-30 22:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][500/1251] eta 0:09:48 lr 0.000570 time 0.7759 (0.7835) loss 3.1652 (3.5699) grad_norm 1.5128 (1.4024) [2022-09-30 22:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][600/1251] eta 0:08:26 lr 0.000570 time 0.3040 (0.7774) loss 3.7581 (3.5578) grad_norm 1.3839 (1.4057) [2022-09-30 22:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][700/1251] eta 0:06:58 lr 0.000569 time 0.8335 (0.7595) loss 3.0024 (3.5676) grad_norm 1.2587 (1.4022) [2022-09-30 22:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][800/1251] eta 0:05:43 lr 0.000569 time 0.8392 (0.7612) loss 3.4865 (3.5596) grad_norm 1.4203 (1.4006) [2022-09-30 22:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][900/1251] eta 0:04:28 lr 0.000568 time 0.8700 (0.7638) loss 3.7521 (3.5569) grad_norm 1.2536 (1.4012) [2022-09-30 22:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1000/1251] eta 0:03:11 lr 0.000568 time 0.8317 (0.7648) loss 3.4777 (3.5688) grad_norm 1.3567 (1.4004) [2022-09-30 22:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1100/1251] eta 0:01:55 lr 0.000568 time 0.7607 (0.7663) loss 3.5333 (3.5798) grad_norm 1.4302 (1.4003) [2022-09-30 23:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1200/1251] eta 0:00:39 lr 0.000567 time 0.8314 (0.7677) loss 4.4857 (3.5791) grad_norm 1.3494 (1.4008) [2022-09-30 23:01:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 137 training takes 0:16:01 [2022-09-30 23:01:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 4.413 (4.413) Loss 1.0475 (1.0475) Acc@1 75.195 (75.195) Acc@5 92.480 (92.480) [2022-09-30 23:01:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.270 Acc@5 92.982 [2022-09-30 23:01:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-09-30 23:01:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.39% [2022-09-30 23:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][0/1251] eta 1:28:32 lr 0.000567 time 4.2463 (4.2463) loss 3.9446 (3.9446) grad_norm 1.5486 (1.5486) [2022-09-30 23:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][100/1251] eta 0:15:37 lr 0.000567 time 0.8214 (0.8148) loss 2.7038 (3.6048) grad_norm 1.5289 (1.3843) [2022-09-30 23:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][200/1251] eta 0:11:08 lr 0.000566 time 0.2943 (0.6362) loss 3.6606 (3.6285) grad_norm 1.4627 (1.4027) [2022-09-30 23:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][300/1251] eta 0:08:16 lr 0.000566 time 0.2936 (0.5222) loss 2.6387 (3.6209) grad_norm 1.5865 (1.4050) [2022-09-30 23:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][400/1251] eta 0:06:35 lr 0.000565 time 0.2928 (0.4651) loss 2.7530 (3.5806) grad_norm 1.4181 (1.4067) [2022-09-30 23:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][500/1251] eta 0:05:23 lr 0.000565 time 0.2952 (0.4307) loss 4.0681 (3.5679) grad_norm 1.3318 (1.4009) [2022-09-30 23:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][600/1251] eta 0:04:25 lr 0.000565 time 0.3884 (0.4080) loss 3.9230 (3.5528) grad_norm 1.3654 (1.4080) [2022-09-30 23:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][700/1251] eta 0:03:35 lr 0.000564 time 0.2954 (0.3917) loss 3.3313 (3.5595) grad_norm 1.3341 (1.4071) [2022-09-30 23:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][800/1251] eta 0:02:51 lr 0.000564 time 0.2926 (0.3794) loss 3.0860 (3.5584) grad_norm 1.3367 (1.4093) [2022-09-30 23:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][900/1251] eta 0:02:09 lr 0.000563 time 0.2956 (0.3699) loss 2.8059 (3.5716) grad_norm 1.5392 (1.4089) [2022-09-30 23:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1000/1251] eta 0:01:30 lr 0.000563 time 0.2882 (0.3622) loss 4.2647 (3.5775) grad_norm 1.3762 (1.4084) [2022-09-30 23:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1100/1251] eta 0:00:53 lr 0.000563 time 0.3871 (0.3560) loss 3.8490 (3.5809) grad_norm 1.4953 (1.4104) [2022-09-30 23:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1200/1251] eta 0:00:17 lr 0.000562 time 0.2929 (0.3508) loss 3.9126 (3.5804) grad_norm 1.5219 (1.4115) [2022-09-30 23:09:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 138 training takes 0:07:16 [2022-09-30 23:09:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.189 (3.189) Loss 1.0811 (1.0811) Acc@1 74.414 (74.414) Acc@5 92.383 (92.383) [2022-09-30 23:09:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.354 Acc@5 93.112 [2022-09-30 23:09:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-09-30 23:09:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.39% [2022-09-30 23:09:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][0/1251] eta 0:45:06 lr 0.000562 time 2.1635 (2.1635) loss 2.4570 (2.4570) grad_norm 1.4467 (1.4467) [2022-09-30 23:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][100/1251] eta 0:05:57 lr 0.000561 time 0.2843 (0.3103) loss 3.6563 (3.4976) grad_norm 1.3765 (1.4256) [2022-09-30 23:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][200/1251] eta 0:05:14 lr 0.000561 time 0.2935 (0.2989) loss 2.7799 (3.5647) grad_norm 1.4602 (1.4147) [2022-09-30 23:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][300/1251] eta 0:04:40 lr 0.000561 time 0.3770 (0.2953) loss 4.2318 (3.5643) grad_norm 1.4994 (1.4032) [2022-09-30 23:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][400/1251] eta 0:04:09 lr 0.000560 time 0.2871 (0.2934) loss 3.3144 (3.5681) grad_norm 1.3995 (1.4050) [2022-09-30 23:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][500/1251] eta 0:03:39 lr 0.000560 time 0.2877 (0.2922) loss 3.3304 (3.5694) grad_norm 1.6636 (1.4130) [2022-09-30 23:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][600/1251] eta 0:03:09 lr 0.000559 time 0.2863 (0.2913) loss 4.1876 (3.5715) grad_norm 1.8074 (1.4189) [2022-09-30 23:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][700/1251] eta 0:02:40 lr 0.000559 time 0.2863 (0.2907) loss 4.0984 (3.5746) grad_norm 1.4307 (1.4183) [2022-09-30 23:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][800/1251] eta 0:02:10 lr 0.000559 time 0.3767 (0.2904) loss 3.4694 (3.5650) grad_norm 1.3257 (1.4138) [2022-09-30 23:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][900/1251] eta 0:01:41 lr 0.000558 time 0.2861 (0.2901) loss 4.3676 (3.5599) grad_norm 1.7207 (1.4194) [2022-09-30 23:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1000/1251] eta 0:01:12 lr 0.000558 time 0.2856 (0.2899) loss 3.7932 (3.5565) grad_norm 1.2342 (1.4194) [2022-09-30 23:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1100/1251] eta 0:00:43 lr 0.000557 time 0.2872 (0.2897) loss 3.7470 (3.5566) grad_norm 1.2368 (1.4184) [2022-09-30 23:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1200/1251] eta 0:00:14 lr 0.000557 time 0.2846 (0.2895) loss 4.2559 (3.5618) grad_norm 1.4996 (1.4222) [2022-09-30 23:15:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 139 training takes 0:06:02 [2022-09-30 23:15:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.578 (2.578) Loss 1.0700 (1.0700) Acc@1 75.781 (75.781) Acc@5 93.164 (93.164) [2022-09-30 23:15:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.388 Acc@5 93.038 [2022-09-30 23:15:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-09-30 23:15:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.39% [2022-09-30 23:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][0/1251] eta 1:06:02 lr 0.000557 time 3.1678 (3.1678) loss 4.0897 (4.0897) grad_norm 1.3398 (1.3398) [2022-09-30 23:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][100/1251] eta 0:06:09 lr 0.000556 time 0.2930 (0.3214) loss 2.7766 (3.5853) grad_norm 1.3347 (1.4605) [2022-09-30 23:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][200/1251] eta 0:05:21 lr 0.000556 time 0.2910 (0.3061) loss 3.8724 (3.5762) grad_norm 1.4553 (1.4501) [2022-09-30 23:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][300/1251] eta 0:04:46 lr 0.000556 time 0.2913 (0.3009) loss 3.9985 (3.5659) grad_norm 1.3438 (1.4443) [2022-09-30 23:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][400/1251] eta 0:04:13 lr 0.000555 time 0.2906 (0.2984) loss 3.4801 (3.5777) grad_norm 1.4998 (1.4442) [2022-09-30 23:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][500/1251] eta 0:03:43 lr 0.000555 time 0.3852 (0.2970) loss 3.8392 (3.5833) grad_norm 1.2477 (1.4338) [2022-09-30 23:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][600/1251] eta 0:03:12 lr 0.000554 time 0.2911 (0.2959) loss 2.6764 (3.5851) grad_norm 1.2702 (1.4321) [2022-09-30 23:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][700/1251] eta 0:02:42 lr 0.000554 time 0.2921 (0.2951) loss 3.5283 (3.5925) grad_norm 1.6913 (1.4275) [2022-09-30 23:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][800/1251] eta 0:02:12 lr 0.000553 time 0.2892 (0.2945) loss 4.2461 (3.5831) grad_norm 1.4792 (1.4281) [2022-09-30 23:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][900/1251] eta 0:01:43 lr 0.000553 time 0.2928 (0.2941) loss 3.0091 (3.5899) grad_norm 1.2980 (1.4286) [2022-09-30 23:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1000/1251] eta 0:01:13 lr 0.000553 time 0.3837 (0.2938) loss 3.2367 (3.5843) grad_norm 1.3510 (1.4256) [2022-09-30 23:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1100/1251] eta 0:00:44 lr 0.000552 time 0.2903 (0.2934) loss 4.0149 (3.5822) grad_norm 1.3374 (1.4284) [2022-09-30 23:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1200/1251] eta 0:00:14 lr 0.000552 time 0.2934 (0.2932) loss 3.7634 (3.5874) grad_norm 1.3322 (1.4294) [2022-09-30 23:21:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 140 training takes 0:06:06 [2022-09-30 23:21:44 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saving...... [2022-09-30 23:21:44 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saved !!! [2022-09-30 23:21:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.759 (2.759) Loss 1.0804 (1.0804) Acc@1 74.219 (74.219) Acc@5 93.555 (93.555) [2022-09-30 23:21:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.756 Acc@5 93.020 [2022-09-30 23:21:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-09-30 23:21:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-09-30 23:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][0/1251] eta 1:08:12 lr 0.000552 time 3.2716 (3.2716) loss 2.8785 (2.8785) grad_norm 1.4003 (1.4003) [2022-09-30 23:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][100/1251] eta 0:06:08 lr 0.000551 time 0.2925 (0.3200) loss 2.9180 (3.5264) grad_norm 1.5021 (1.4458) [2022-09-30 23:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][200/1251] eta 0:05:21 lr 0.000551 time 0.3821 (0.3055) loss 3.6008 (3.4561) grad_norm 1.5430 (1.4369) [2022-09-30 23:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][300/1251] eta 0:04:45 lr 0.000550 time 0.2925 (0.3004) loss 2.6595 (3.4740) grad_norm 1.2549 (1.4346) [2022-09-30 23:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][400/1251] eta 0:04:13 lr 0.000550 time 0.2898 (0.2976) loss 3.8919 (3.4866) grad_norm 1.3129 (1.4385) [2022-09-30 23:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][500/1251] eta 0:03:42 lr 0.000550 time 0.2934 (0.2960) loss 3.2595 (3.4910) grad_norm 1.3307 (1.4411) [2022-09-30 23:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][600/1251] eta 0:03:12 lr 0.000549 time 0.2929 (0.2949) loss 4.2198 (3.5046) grad_norm 1.3360 (1.4350) [2022-09-30 23:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][700/1251] eta 0:02:42 lr 0.000549 time 0.3820 (0.2943) loss 3.9514 (3.5229) grad_norm 1.3538 (1.4380) [2022-09-30 23:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][800/1251] eta 0:02:12 lr 0.000548 time 0.2899 (0.2937) loss 3.3696 (3.5232) grad_norm 1.5255 (1.4386) [2022-09-30 23:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][900/1251] eta 0:01:42 lr 0.000548 time 0.2900 (0.2932) loss 3.2030 (3.5382) grad_norm 1.4701 (1.4377) [2022-09-30 23:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1000/1251] eta 0:01:13 lr 0.000547 time 0.2888 (0.2928) loss 3.7328 (3.5498) grad_norm 1.4028 (1.4397) [2022-09-30 23:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1100/1251] eta 0:00:44 lr 0.000547 time 0.2881 (0.2924) loss 2.9368 (3.5593) grad_norm 1.4318 (1.4351) [2022-09-30 23:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1200/1251] eta 0:00:14 lr 0.000547 time 0.3771 (0.2922) loss 3.2716 (3.5691) grad_norm 1.4839 (1.4357) [2022-09-30 23:28:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 141 training takes 0:06:05 [2022-09-30 23:28:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.082 (3.082) Loss 1.1108 (1.1108) Acc@1 73.828 (73.828) Acc@5 92.188 (92.188) [2022-09-30 23:28:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.330 Acc@5 93.004 [2022-09-30 23:28:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-09-30 23:28:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-09-30 23:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][0/1251] eta 0:58:15 lr 0.000546 time 2.7941 (2.7941) loss 2.8928 (2.8928) grad_norm 1.4102 (1.4102) [2022-09-30 23:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][100/1251] eta 0:06:04 lr 0.000546 time 0.2959 (0.3165) loss 3.6289 (3.5395) grad_norm 1.4183 (1.4586) [2022-09-30 23:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][200/1251] eta 0:05:18 lr 0.000546 time 0.2838 (0.3027) loss 3.9763 (3.5260) grad_norm 1.2376 (1.4585) [2022-09-30 23:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][300/1251] eta 0:04:43 lr 0.000545 time 0.2861 (0.2982) loss 3.7198 (3.5447) grad_norm 1.3271 (1.4499) [2022-09-30 23:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][400/1251] eta 0:04:12 lr 0.000545 time 0.3791 (0.2962) loss 3.8467 (3.5348) grad_norm 1.3513 (1.4445) [2022-09-30 23:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][500/1251] eta 0:03:41 lr 0.000544 time 0.2871 (0.2948) loss 3.3994 (3.5406) grad_norm 1.6038 (1.4431) [2022-09-30 23:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][600/1251] eta 0:03:11 lr 0.000544 time 0.2888 (0.2939) loss 4.0821 (3.5385) grad_norm 1.2786 (1.4382) [2022-09-30 23:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][700/1251] eta 0:02:41 lr 0.000544 time 0.2897 (0.2932) loss 3.5667 (3.5414) grad_norm 1.3917 (1.4376) [2022-09-30 23:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][800/1251] eta 0:02:12 lr 0.000543 time 0.2925 (0.2928) loss 4.0025 (3.5436) grad_norm 1.2905 (1.4400) [2022-09-30 23:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][900/1251] eta 0:01:42 lr 0.000543 time 0.3831 (0.2926) loss 3.5715 (3.5541) grad_norm 1.6852 (1.4373) [2022-09-30 23:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1000/1251] eta 0:01:13 lr 0.000542 time 0.2853 (0.2923) loss 3.6563 (3.5617) grad_norm 1.5137 (1.4367) [2022-09-30 23:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1100/1251] eta 0:00:44 lr 0.000542 time 0.2876 (0.2921) loss 3.9420 (3.5612) grad_norm 1.4635 (1.4350) [2022-09-30 23:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1200/1251] eta 0:00:14 lr 0.000541 time 0.2869 (0.2919) loss 2.5175 (3.5589) grad_norm 1.3511 (1.4342) [2022-09-30 23:34:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 142 training takes 0:06:05 [2022-09-30 23:34:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.887 (2.887) Loss 1.0662 (1.0662) Acc@1 74.219 (74.219) Acc@5 93.359 (93.359) [2022-09-30 23:34:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.664 Acc@5 93.168 [2022-09-30 23:34:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-09-30 23:34:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-09-30 23:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][0/1251] eta 0:53:29 lr 0.000541 time 2.5655 (2.5655) loss 2.9692 (2.9692) grad_norm 1.3569 (1.3569) [2022-09-30 23:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][100/1251] eta 0:06:08 lr 0.000541 time 0.3917 (0.3198) loss 3.7455 (3.5921) grad_norm 1.4027 (1.4438) [2022-09-30 23:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][200/1251] eta 0:05:21 lr 0.000540 time 0.2920 (0.3059) loss 2.9239 (3.5825) grad_norm 1.3358 (1.4348) [2022-09-30 23:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][300/1251] eta 0:04:46 lr 0.000540 time 0.2969 (0.3010) loss 3.3718 (3.5522) grad_norm 1.2993 (1.4405) [2022-09-30 23:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][400/1251] eta 0:04:14 lr 0.000540 time 0.2885 (0.2986) loss 4.2331 (3.5394) grad_norm 1.3517 (1.4393) [2022-09-30 23:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][500/1251] eta 0:03:43 lr 0.000539 time 0.2911 (0.2971) loss 3.5922 (3.5257) grad_norm 1.5077 (1.4413) [2022-09-30 23:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][600/1251] eta 0:03:12 lr 0.000539 time 0.3810 (0.2962) loss 4.0263 (3.5444) grad_norm 1.4888 (1.4433) [2022-09-30 23:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][700/1251] eta 0:02:42 lr 0.000538 time 0.2890 (0.2955) loss 3.8089 (3.5313) grad_norm 1.2214 (1.4458) [2022-09-30 23:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][800/1251] eta 0:02:13 lr 0.000538 time 0.2891 (0.2950) loss 4.1521 (3.5250) grad_norm 1.4278 (1.4416) [2022-09-30 23:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][900/1251] eta 0:01:43 lr 0.000538 time 0.2915 (0.2946) loss 3.9290 (3.5285) grad_norm 1.3997 (1.4377) [2022-09-30 23:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1000/1251] eta 0:01:13 lr 0.000537 time 0.2884 (0.2942) loss 2.2518 (3.5275) grad_norm 1.3484 (1.4387) [2022-09-30 23:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1100/1251] eta 0:00:44 lr 0.000537 time 0.3787 (0.2940) loss 3.9061 (3.5255) grad_norm 1.4725 (1.4406) [2022-09-30 23:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1200/1251] eta 0:00:14 lr 0.000536 time 0.2896 (0.2937) loss 4.1695 (3.5346) grad_norm 1.4376 (1.4428) [2022-09-30 23:40:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 143 training takes 0:06:07 [2022-09-30 23:40:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.987 (2.987) Loss 1.1079 (1.1079) Acc@1 74.805 (74.805) Acc@5 92.285 (92.285) [2022-09-30 23:40:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.456 Acc@5 93.004 [2022-09-30 23:40:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-09-30 23:40:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-09-30 23:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][0/1251] eta 1:05:56 lr 0.000536 time 3.1630 (3.1630) loss 3.8614 (3.8614) grad_norm 1.3624 (1.3624) [2022-09-30 23:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][100/1251] eta 0:06:07 lr 0.000536 time 0.2915 (0.3194) loss 4.1311 (3.5488) grad_norm 1.5305 (1.4540) [2022-09-30 23:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][200/1251] eta 0:05:20 lr 0.000535 time 0.2941 (0.3050) loss 3.3212 (3.5012) grad_norm 1.5506 (1.4524) [2022-09-30 23:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][300/1251] eta 0:04:45 lr 0.000535 time 0.3802 (0.3003) loss 3.7528 (3.5125) grad_norm 1.2658 (1.4524) [2022-09-30 23:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][400/1251] eta 0:04:13 lr 0.000534 time 0.2920 (0.2977) loss 4.0122 (3.5230) grad_norm 1.3748 (1.4519) [2022-09-30 23:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][500/1251] eta 0:03:42 lr 0.000534 time 0.2893 (0.2962) loss 3.0011 (3.5230) grad_norm 1.3322 (1.4507) [2022-09-30 23:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][600/1251] eta 0:03:12 lr 0.000534 time 0.2946 (0.2952) loss 3.1839 (3.5367) grad_norm 1.4866 (1.4538) [2022-09-30 23:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][700/1251] eta 0:02:42 lr 0.000533 time 0.2897 (0.2944) loss 3.8504 (3.5276) grad_norm 1.3866 (1.4552) [2022-09-30 23:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][800/1251] eta 0:02:12 lr 0.000533 time 0.3881 (0.2939) loss 4.0820 (3.5309) grad_norm 1.3615 (1.4525) [2022-09-30 23:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][900/1251] eta 0:01:42 lr 0.000532 time 0.2884 (0.2933) loss 4.0685 (3.5274) grad_norm 1.4248 (1.4526) [2022-09-30 23:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1000/1251] eta 0:01:13 lr 0.000532 time 0.2938 (0.2928) loss 3.8090 (3.5269) grad_norm 1.6400 (1.4491) [2022-09-30 23:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1100/1251] eta 0:00:44 lr 0.000532 time 0.2860 (0.2924) loss 3.4697 (3.5311) grad_norm 1.5509 (1.4472) [2022-09-30 23:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1200/1251] eta 0:00:14 lr 0.000531 time 0.2964 (0.2921) loss 4.2919 (3.5333) grad_norm 1.6673 (1.4473) [2022-09-30 23:46:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 144 training takes 0:06:05 [2022-09-30 23:47:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.837 (2.837) Loss 1.0651 (1.0651) Acc@1 75.781 (75.781) Acc@5 92.773 (92.773) [2022-09-30 23:47:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.540 Acc@5 93.146 [2022-09-30 23:47:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-09-30 23:47:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-09-30 23:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][0/1251] eta 1:03:26 lr 0.000531 time 3.0427 (3.0427) loss 2.6171 (2.6171) grad_norm 1.5271 (1.5271) [2022-09-30 23:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][100/1251] eta 0:06:07 lr 0.000530 time 0.2883 (0.3190) loss 2.8285 (3.5217) grad_norm 1.4262 (1.4494) [2022-09-30 23:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][200/1251] eta 0:05:20 lr 0.000530 time 0.2891 (0.3046) loss 2.7100 (3.5567) grad_norm 1.5033 (1.4377) [2022-09-30 23:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][300/1251] eta 0:04:44 lr 0.000530 time 0.2882 (0.2995) loss 4.2658 (3.5824) grad_norm 1.3674 (1.4491) [2022-09-30 23:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][400/1251] eta 0:04:12 lr 0.000529 time 0.2912 (0.2968) loss 3.6940 (3.5752) grad_norm 1.3819 (1.4528) [2022-09-30 23:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][500/1251] eta 0:03:41 lr 0.000529 time 0.3873 (0.2954) loss 2.9295 (3.5564) grad_norm 1.3995 (1.4496) [2022-09-30 23:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][600/1251] eta 0:03:11 lr 0.000528 time 0.2907 (0.2943) loss 3.8628 (3.5657) grad_norm 1.5285 (1.4460) [2022-09-30 23:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][700/1251] eta 0:02:41 lr 0.000528 time 0.2857 (0.2937) loss 3.4939 (3.5595) grad_norm 1.3562 (1.4528) [2022-09-30 23:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][800/1251] eta 0:02:12 lr 0.000528 time 0.2872 (0.2931) loss 2.8743 (3.5588) grad_norm 1.5839 (1.4561) [2022-09-30 23:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][900/1251] eta 0:01:42 lr 0.000527 time 0.2933 (0.2927) loss 3.8431 (3.5523) grad_norm 1.3132 (1.4519) [2022-09-30 23:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1000/1251] eta 0:01:13 lr 0.000527 time 0.3971 (0.2925) loss 2.8241 (3.5500) grad_norm 1.4268 (1.4512) [2022-09-30 23:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1100/1251] eta 0:00:44 lr 0.000526 time 0.2866 (0.2923) loss 3.7753 (3.5473) grad_norm 1.4517 (1.4530) [2022-09-30 23:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1200/1251] eta 0:00:14 lr 0.000526 time 0.2888 (0.2921) loss 4.0391 (3.5501) grad_norm 1.5632 (1.4543) [2022-09-30 23:53:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 145 training takes 0:06:05 [2022-09-30 23:53:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.226 (2.226) Loss 0.9971 (0.9971) Acc@1 77.148 (77.148) Acc@5 93.945 (93.945) [2022-09-30 23:53:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.814 Acc@5 93.350 [2022-09-30 23:53:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-09-30 23:53:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.81% [2022-09-30 23:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][0/1251] eta 1:05:39 lr 0.000526 time 3.1495 (3.1495) loss 4.2784 (4.2784) grad_norm 1.3964 (1.3964) [2022-09-30 23:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][100/1251] eta 0:06:05 lr 0.000525 time 0.2890 (0.3179) loss 2.7725 (3.4378) grad_norm 1.3917 (1.4386) [2022-09-30 23:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][200/1251] eta 0:05:19 lr 0.000525 time 0.3814 (0.3040) loss 2.6611 (3.4782) grad_norm 1.4158 (1.4524) [2022-09-30 23:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][300/1251] eta 0:04:44 lr 0.000524 time 0.2869 (0.2990) loss 3.1667 (3.4889) grad_norm 1.3706 (1.4522) [2022-09-30 23:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][400/1251] eta 0:04:12 lr 0.000524 time 0.2880 (0.2964) loss 4.3558 (3.5009) grad_norm 1.2851 (1.4602) [2022-09-30 23:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][500/1251] eta 0:03:41 lr 0.000524 time 0.2891 (0.2949) loss 2.5204 (3.5191) grad_norm 1.3117 (1.4645) [2022-09-30 23:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][600/1251] eta 0:03:11 lr 0.000523 time 0.2908 (0.2939) loss 4.0890 (3.5215) grad_norm 1.2899 (1.4618) [2022-09-30 23:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][700/1251] eta 0:02:41 lr 0.000523 time 0.3801 (0.2933) loss 2.4125 (3.5221) grad_norm 1.5767 (1.4561) [2022-09-30 23:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][800/1251] eta 0:02:11 lr 0.000522 time 0.2882 (0.2927) loss 3.6424 (3.5218) grad_norm 1.4623 (1.4542) [2022-09-30 23:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][900/1251] eta 0:01:42 lr 0.000522 time 0.2874 (0.2921) loss 2.9898 (3.5241) grad_norm 1.3849 (1.4549) [2022-09-30 23:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1000/1251] eta 0:01:13 lr 0.000522 time 0.2876 (0.2917) loss 3.9743 (3.5303) grad_norm 1.5193 (1.4564) [2022-09-30 23:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1100/1251] eta 0:00:43 lr 0.000521 time 0.2879 (0.2914) loss 2.2388 (3.5290) grad_norm 1.3759 (1.4593) [2022-09-30 23:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1200/1251] eta 0:00:14 lr 0.000521 time 0.3816 (0.2912) loss 4.2474 (3.5300) grad_norm 1.3982 (1.4606) [2022-09-30 23:59:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 146 training takes 0:06:04 [2022-09-30 23:59:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.280 (3.280) Loss 1.1002 (1.1002) Acc@1 73.828 (73.828) Acc@5 91.602 (91.602) [2022-09-30 23:59:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.764 Acc@5 93.132 [2022-09-30 23:59:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-09-30 23:59:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.81% [2022-09-30 23:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][0/1251] eta 1:10:02 lr 0.000521 time 3.3595 (3.3595) loss 3.7290 (3.7290) grad_norm 1.2797 (1.2797) [2022-10-01 00:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][100/1251] eta 0:06:10 lr 0.000520 time 0.2972 (0.3222) loss 3.9807 (3.5313) grad_norm 1.4759 (1.5082) [2022-10-01 00:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][200/1251] eta 0:05:22 lr 0.000520 time 0.2993 (0.3070) loss 2.9424 (3.5060) grad_norm 1.7418 (1.4833) [2022-10-01 00:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][300/1251] eta 0:04:47 lr 0.000519 time 0.2964 (0.3018) loss 3.6028 (3.4991) grad_norm 1.4968 (1.4672) [2022-10-01 00:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][400/1251] eta 0:04:14 lr 0.000519 time 0.3826 (0.2993) loss 2.3007 (3.4904) grad_norm 1.3547 (1.4686) [2022-10-01 00:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][500/1251] eta 0:03:43 lr 0.000518 time 0.2937 (0.2976) loss 3.4923 (3.5084) grad_norm 1.3691 (1.4694) [2022-10-01 00:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][600/1251] eta 0:03:12 lr 0.000518 time 0.2906 (0.2964) loss 2.8386 (3.5175) grad_norm 1.3394 (1.4640) [2022-10-01 00:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][700/1251] eta 0:02:42 lr 0.000518 time 0.2904 (0.2956) loss 4.1261 (3.5099) grad_norm 1.3337 (1.4625) [2022-10-01 00:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][800/1251] eta 0:02:13 lr 0.000517 time 0.2961 (0.2950) loss 4.1676 (3.5283) grad_norm 1.3024 (1.4597) [2022-10-01 00:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][900/1251] eta 0:01:43 lr 0.000517 time 0.3860 (0.2945) loss 3.8361 (3.5374) grad_norm 1.3359 (1.4609) [2022-10-01 00:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1000/1251] eta 0:01:13 lr 0.000516 time 0.2892 (0.2939) loss 3.1970 (3.5278) grad_norm 1.4989 (1.4616) [2022-10-01 00:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1100/1251] eta 0:00:44 lr 0.000516 time 0.2913 (0.2935) loss 3.7139 (3.5282) grad_norm 1.4597 (1.4604) [2022-10-01 00:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1200/1251] eta 0:00:14 lr 0.000516 time 0.2901 (0.2931) loss 3.4974 (3.5282) grad_norm 1.4353 (1.4594) [2022-10-01 00:05:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 147 training takes 0:06:06 [2022-10-01 00:05:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.469 (2.469) Loss 1.0388 (1.0388) Acc@1 74.902 (74.902) Acc@5 93.555 (93.555) [2022-10-01 00:06:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.608 Acc@5 93.344 [2022-10-01 00:06:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-01 00:06:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.81% [2022-10-01 00:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][0/1251] eta 0:50:20 lr 0.000515 time 2.4145 (2.4145) loss 3.5701 (3.5701) grad_norm 1.7381 (1.7381) [2022-10-01 00:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][100/1251] eta 0:06:05 lr 0.000515 time 0.3764 (0.3178) loss 2.4327 (3.5790) grad_norm 1.4235 (1.5014) [2022-10-01 00:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][200/1251] eta 0:05:19 lr 0.000515 time 0.2921 (0.3042) loss 3.9428 (3.5201) grad_norm 1.3915 (1.4787) [2022-10-01 00:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][300/1251] eta 0:04:44 lr 0.000514 time 0.2918 (0.2994) loss 3.9730 (3.5141) grad_norm 1.4041 (1.4768) [2022-10-01 00:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][400/1251] eta 0:04:12 lr 0.000514 time 0.2852 (0.2971) loss 4.2722 (3.5303) grad_norm 1.4283 (1.4747) [2022-10-01 00:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][500/1251] eta 0:03:42 lr 0.000513 time 0.2906 (0.2957) loss 3.4822 (3.5361) grad_norm 1.7028 (1.4724) [2022-10-01 00:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][600/1251] eta 0:03:12 lr 0.000513 time 0.3852 (0.2952) loss 4.3757 (3.5485) grad_norm 1.6046 (1.4732) [2022-10-01 00:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][700/1251] eta 0:02:42 lr 0.000512 time 0.2941 (0.2945) loss 3.7946 (3.5455) grad_norm 1.7041 (1.4727) [2022-10-01 00:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][800/1251] eta 0:02:12 lr 0.000512 time 0.2918 (0.2940) loss 4.0218 (3.5377) grad_norm 1.3695 (1.4747) [2022-10-01 00:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][900/1251] eta 0:01:43 lr 0.000512 time 0.2920 (0.2936) loss 3.0769 (3.5450) grad_norm 1.3083 (1.4738) [2022-10-01 00:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1000/1251] eta 0:01:13 lr 0.000511 time 0.2906 (0.2933) loss 2.1658 (3.5513) grad_norm 1.4781 (1.4743) [2022-10-01 00:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1100/1251] eta 0:00:44 lr 0.000511 time 0.3835 (0.2932) loss 3.7329 (3.5509) grad_norm 1.4720 (1.4862) [2022-10-01 00:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1200/1251] eta 0:00:14 lr 0.000510 time 0.2921 (0.2929) loss 3.4278 (3.5498) grad_norm 1.4947 (1.4890) [2022-10-01 00:12:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 148 training takes 0:06:06 [2022-10-01 00:12:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.232 (2.232) Loss 0.9665 (0.9665) Acc@1 77.246 (77.246) Acc@5 93.750 (93.750) [2022-10-01 00:12:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.922 Acc@5 93.234 [2022-10-01 00:12:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-01 00:12:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.92% [2022-10-01 00:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][0/1251] eta 1:08:15 lr 0.000510 time 3.2737 (3.2737) loss 3.7887 (3.7887) grad_norm 1.4942 (1.4942) [2022-10-01 00:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][100/1251] eta 0:06:07 lr 0.000510 time 0.2873 (0.3192) loss 3.6059 (3.5191) grad_norm 1.7268 (1.4883) [2022-10-01 00:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][200/1251] eta 0:05:19 lr 0.000509 time 0.2862 (0.3038) loss 3.2073 (3.5309) grad_norm 1.6674 (1.4965) [2022-10-01 00:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][300/1251] eta 0:04:44 lr 0.000509 time 0.3811 (0.2988) loss 2.9250 (3.5469) grad_norm 1.4914 (1.4977) [2022-10-01 00:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][400/1251] eta 0:04:12 lr 0.000509 time 0.2902 (0.2963) loss 4.0673 (3.5508) grad_norm 1.5692 (1.5053) [2022-10-01 00:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][500/1251] eta 0:03:41 lr 0.000508 time 0.2885 (0.2948) loss 4.2274 (3.5520) grad_norm 1.7733 (1.5035) [2022-10-01 00:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][600/1251] eta 0:03:11 lr 0.000508 time 0.2890 (0.2938) loss 3.6941 (3.5456) grad_norm 1.3712 (1.4944) [2022-10-01 00:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][700/1251] eta 0:02:41 lr 0.000507 time 0.2860 (0.2931) loss 2.5996 (3.5358) grad_norm 1.4803 (1.4889) [2022-10-01 00:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][800/1251] eta 0:02:11 lr 0.000507 time 0.3784 (0.2926) loss 3.7265 (3.5241) grad_norm 1.3124 (1.4834) [2022-10-01 00:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][900/1251] eta 0:01:42 lr 0.000506 time 0.2862 (0.2922) loss 3.8933 (3.5205) grad_norm 1.3800 (1.4838) [2022-10-01 00:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1000/1251] eta 0:01:13 lr 0.000506 time 0.2854 (0.2917) loss 2.9095 (3.5260) grad_norm 1.5403 (1.4820) [2022-10-01 00:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1100/1251] eta 0:00:43 lr 0.000506 time 0.2832 (0.2914) loss 3.5322 (3.5267) grad_norm 1.5979 (1.4852) [2022-10-01 00:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1200/1251] eta 0:00:14 lr 0.000505 time 0.2929 (0.2911) loss 3.7110 (3.5252) grad_norm 1.4605 (1.4813) [2022-10-01 00:18:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 149 training takes 0:06:04 [2022-10-01 00:18:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.304 (2.304) Loss 1.0863 (1.0863) Acc@1 73.438 (73.438) Acc@5 92.188 (92.188) [2022-10-01 00:18:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.820 Acc@5 93.170 [2022-10-01 00:18:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-01 00:18:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.92% [2022-10-01 00:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][0/1251] eta 1:03:43 lr 0.000505 time 3.0568 (3.0568) loss 3.9831 (3.9831) grad_norm 1.4653 (1.4653) [2022-10-01 00:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][100/1251] eta 0:06:07 lr 0.000505 time 0.2898 (0.3189) loss 2.5828 (3.5481) grad_norm 1.5092 (1.4711) [2022-10-01 00:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][200/1251] eta 0:05:19 lr 0.000504 time 0.2915 (0.3042) loss 2.9304 (3.5052) grad_norm 1.5515 (1.4643) [2022-10-01 00:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][300/1251] eta 0:04:44 lr 0.000504 time 0.2894 (0.2992) loss 4.1076 (3.4998) grad_norm 1.5694 (1.4757) [2022-10-01 00:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][400/1251] eta 0:04:12 lr 0.000503 time 0.2859 (0.2967) loss 3.0873 (3.4923) grad_norm 1.7016 (1.4808) [2022-10-01 00:21:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][500/1251] eta 0:03:41 lr 0.000503 time 0.3786 (0.2953) loss 4.2368 (3.4744) grad_norm 1.4399 (1.4795) [2022-10-01 00:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][600/1251] eta 0:03:11 lr 0.000503 time 0.2913 (0.2942) loss 2.7154 (3.4832) grad_norm 1.4984 (1.4769) [2022-10-01 00:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][700/1251] eta 0:02:41 lr 0.000502 time 0.2870 (0.2934) loss 3.2242 (3.4943) grad_norm 1.5412 (1.4754) [2022-10-01 00:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][800/1251] eta 0:02:12 lr 0.000502 time 0.2887 (0.2929) loss 3.1362 (3.5001) grad_norm 1.5049 (1.4729) [2022-10-01 00:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][900/1251] eta 0:01:42 lr 0.000501 time 0.2851 (0.2924) loss 3.6748 (3.5070) grad_norm 1.3814 (1.4735) [2022-10-01 00:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1000/1251] eta 0:01:13 lr 0.000501 time 0.3823 (0.2922) loss 3.9026 (3.5048) grad_norm 1.5523 (1.4740) [2022-10-01 00:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1100/1251] eta 0:00:44 lr 0.000500 time 0.2868 (0.2919) loss 3.6907 (3.5106) grad_norm 1.2908 (1.4737) [2022-10-01 00:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1200/1251] eta 0:00:14 lr 0.000500 time 0.2906 (0.2916) loss 3.8199 (3.5060) grad_norm 1.3707 (1.4755) [2022-10-01 00:24:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 150 training takes 0:06:04 [2022-10-01 00:24:49 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saving...... [2022-10-01 00:24:49 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saved !!! [2022-10-01 00:24:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.540 (2.540) Loss 1.0125 (1.0125) Acc@1 75.879 (75.879) Acc@5 93.359 (93.359) [2022-10-01 00:25:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.890 Acc@5 93.304 [2022-10-01 00:25:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-01 00:25:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.92% [2022-10-01 00:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][0/1251] eta 0:45:52 lr 0.000500 time 2.2006 (2.2006) loss 3.9213 (3.9213) grad_norm 1.6472 (1.6472) [2022-10-01 00:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][100/1251] eta 0:06:03 lr 0.000499 time 0.2907 (0.3156) loss 3.0067 (3.5527) grad_norm 1.4458 (1.5205) [2022-10-01 00:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][200/1251] eta 0:05:17 lr 0.000499 time 0.3791 (0.3024) loss 3.1045 (3.5560) grad_norm 1.4709 (1.4937) [2022-10-01 00:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][300/1251] eta 0:04:43 lr 0.000499 time 0.2912 (0.2978) loss 3.9649 (3.5483) grad_norm 1.8143 (1.4876) [2022-10-01 00:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][400/1251] eta 0:04:11 lr 0.000498 time 0.2840 (0.2955) loss 2.5300 (3.5529) grad_norm 1.5002 (1.4901) [2022-10-01 00:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][500/1251] eta 0:03:40 lr 0.000498 time 0.2918 (0.2941) loss 3.7995 (3.5466) grad_norm 1.3540 (1.4866) [2022-10-01 00:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][600/1251] eta 0:03:10 lr 0.000497 time 0.2877 (0.2931) loss 2.9350 (3.5534) grad_norm 1.5398 (1.4893) [2022-10-01 00:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][700/1251] eta 0:02:41 lr 0.000497 time 0.3901 (0.2926) loss 3.1410 (3.5460) grad_norm 1.4593 (1.4946) [2022-10-01 00:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][800/1251] eta 0:02:11 lr 0.000497 time 0.2882 (0.2921) loss 3.8710 (3.5546) grad_norm 1.2321 (1.4945) [2022-10-01 00:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][900/1251] eta 0:01:42 lr 0.000496 time 0.2898 (0.2918) loss 4.1231 (3.5543) grad_norm 1.4741 (1.4961) [2022-10-01 00:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1000/1251] eta 0:01:13 lr 0.000496 time 0.2887 (0.2914) loss 3.7701 (3.5515) grad_norm 1.7988 (1.4993) [2022-10-01 00:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1100/1251] eta 0:00:43 lr 0.000495 time 0.2917 (0.2912) loss 3.4047 (3.5597) grad_norm 1.4551 (1.5003) [2022-10-01 00:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1200/1251] eta 0:00:14 lr 0.000495 time 0.3776 (0.2910) loss 2.4903 (3.5566) grad_norm 1.3039 (1.4980) [2022-10-01 00:31:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 151 training takes 0:06:04 [2022-10-01 00:31:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.154 (3.154) Loss 1.0543 (1.0543) Acc@1 76.172 (76.172) Acc@5 92.871 (92.871) [2022-10-01 00:31:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.196 Acc@5 93.282 [2022-10-01 00:31:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-01 00:31:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.20% [2022-10-01 00:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][0/1251] eta 1:06:40 lr 0.000495 time 3.1978 (3.1978) loss 4.2245 (4.2245) grad_norm 1.4298 (1.4298) [2022-10-01 00:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][100/1251] eta 0:06:07 lr 0.000494 time 0.2886 (0.3192) loss 4.1377 (3.5316) grad_norm 1.3202 (1.5232) [2022-10-01 00:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][200/1251] eta 0:05:20 lr 0.000494 time 0.2868 (0.3051) loss 2.3505 (3.5550) grad_norm 1.3815 (1.5181) [2022-10-01 00:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][300/1251] eta 0:04:45 lr 0.000493 time 0.2904 (0.3001) loss 2.5759 (3.5509) grad_norm 1.5996 (1.5285) [2022-10-01 00:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][400/1251] eta 0:04:13 lr 0.000493 time 0.3727 (0.2980) loss 3.5567 (3.5365) grad_norm 2.1978 (1.5285) [2022-10-01 00:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][500/1251] eta 0:03:42 lr 0.000493 time 0.2878 (0.2965) loss 3.6735 (3.5315) grad_norm 1.3056 (1.5240) [2022-10-01 00:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][600/1251] eta 0:03:12 lr 0.000492 time 0.2883 (0.2954) loss 3.6986 (3.5132) grad_norm 1.6139 (1.5189) [2022-10-01 00:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][700/1251] eta 0:02:42 lr 0.000492 time 0.2916 (0.2947) loss 2.9933 (3.5229) grad_norm 1.4195 (1.5212) [2022-10-01 00:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][800/1251] eta 0:02:12 lr 0.000491 time 0.2867 (0.2942) loss 3.8110 (3.5192) grad_norm 1.4397 (1.5194) [2022-10-01 00:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][900/1251] eta 0:01:43 lr 0.000491 time 0.3866 (0.2939) loss 3.6215 (3.5347) grad_norm 1.4048 (1.5166) [2022-10-01 00:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1000/1251] eta 0:01:13 lr 0.000490 time 0.2883 (0.2936) loss 3.7845 (3.5363) grad_norm 1.6135 (1.5144) [2022-10-01 00:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1100/1251] eta 0:00:44 lr 0.000490 time 0.2898 (0.2932) loss 4.0692 (3.5297) grad_norm 1.5793 (1.5169) [2022-10-01 00:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1200/1251] eta 0:00:14 lr 0.000490 time 0.2893 (0.2930) loss 4.2551 (3.5360) grad_norm 1.6015 (1.5199) [2022-10-01 00:37:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 152 training takes 0:06:06 [2022-10-01 00:37:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.275 (2.275) Loss 1.0617 (1.0617) Acc@1 76.367 (76.367) Acc@5 91.992 (91.992) [2022-10-01 00:37:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.042 Acc@5 93.406 [2022-10-01 00:37:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-01 00:37:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.20% [2022-10-01 00:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][0/1251] eta 1:06:49 lr 0.000489 time 3.2054 (3.2054) loss 3.4175 (3.4175) grad_norm 1.4616 (1.4616) [2022-10-01 00:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][100/1251] eta 0:06:05 lr 0.000489 time 0.3802 (0.3179) loss 3.0847 (3.4369) grad_norm 1.4695 (1.5311) [2022-10-01 00:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][200/1251] eta 0:05:18 lr 0.000489 time 0.2874 (0.3029) loss 3.3147 (3.4563) grad_norm 1.9172 (1.5185) [2022-10-01 00:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][300/1251] eta 0:04:43 lr 0.000488 time 0.2866 (0.2977) loss 3.7685 (3.4932) grad_norm 1.5923 (1.5340) [2022-10-01 00:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][400/1251] eta 0:04:11 lr 0.000488 time 0.2868 (0.2953) loss 3.4468 (3.5056) grad_norm 1.3633 (1.5329) [2022-10-01 00:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][500/1251] eta 0:03:40 lr 0.000487 time 0.2852 (0.2938) loss 3.5443 (3.5068) grad_norm 1.4533 (1.5281) [2022-10-01 00:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][600/1251] eta 0:03:10 lr 0.000487 time 0.3774 (0.2928) loss 3.1928 (3.5109) grad_norm 1.4339 (1.5261) [2022-10-01 00:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][700/1251] eta 0:02:40 lr 0.000487 time 0.2889 (0.2921) loss 3.6048 (3.5067) grad_norm 1.3799 (1.5235) [2022-10-01 00:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][800/1251] eta 0:02:11 lr 0.000486 time 0.2898 (0.2914) loss 3.7349 (3.5092) grad_norm 1.4061 (1.5239) [2022-10-01 00:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][900/1251] eta 0:01:42 lr 0.000486 time 0.2849 (0.2909) loss 3.7142 (3.5022) grad_norm 1.6027 (1.5202) [2022-10-01 00:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1000/1251] eta 0:01:12 lr 0.000485 time 0.2868 (0.2905) loss 3.4906 (3.5069) grad_norm 1.3500 (1.5148) [2022-10-01 00:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1100/1251] eta 0:00:43 lr 0.000485 time 0.3764 (0.2903) loss 4.1281 (3.5085) grad_norm 1.4176 (1.5124) [2022-10-01 00:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1200/1251] eta 0:00:14 lr 0.000484 time 0.2857 (0.2900) loss 3.3475 (3.5013) grad_norm 1.6040 (1.5111) [2022-10-01 00:43:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 153 training takes 0:06:02 [2022-10-01 00:43:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.902 (2.902) Loss 0.9594 (0.9594) Acc@1 75.684 (75.684) Acc@5 94.629 (94.629) [2022-10-01 00:43:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.058 Acc@5 93.356 [2022-10-01 00:43:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-01 00:43:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.20% [2022-10-01 00:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][0/1251] eta 1:07:34 lr 0.000484 time 3.2406 (3.2406) loss 3.2046 (3.2046) grad_norm 1.4644 (1.4644) [2022-10-01 00:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][100/1251] eta 0:06:07 lr 0.000484 time 0.2872 (0.3190) loss 4.2176 (3.4461) grad_norm 1.4422 (1.5033) [2022-10-01 00:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][200/1251] eta 0:05:19 lr 0.000483 time 0.2847 (0.3039) loss 3.9263 (3.5116) grad_norm 1.6661 (1.5136) [2022-10-01 00:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][300/1251] eta 0:04:44 lr 0.000483 time 0.3786 (0.2989) loss 3.7568 (3.4990) grad_norm 1.4480 (1.5185) [2022-10-01 00:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][400/1251] eta 0:04:12 lr 0.000483 time 0.2868 (0.2965) loss 3.9014 (3.4942) grad_norm 1.4148 (1.5242) [2022-10-01 00:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][500/1251] eta 0:03:41 lr 0.000482 time 0.2865 (0.2949) loss 3.6336 (3.4901) grad_norm 1.5508 (1.5183) [2022-10-01 00:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][600/1251] eta 0:03:11 lr 0.000482 time 0.2874 (0.2939) loss 2.6983 (3.4894) grad_norm 1.3151 (1.5169) [2022-10-01 00:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][700/1251] eta 0:02:41 lr 0.000481 time 0.2867 (0.2931) loss 3.8468 (3.5011) grad_norm 1.4588 (1.5121) [2022-10-01 00:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][800/1251] eta 0:02:11 lr 0.000481 time 0.3808 (0.2926) loss 4.1803 (3.5009) grad_norm 1.7089 (1.5146) [2022-10-01 00:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][900/1251] eta 0:01:42 lr 0.000481 time 0.2890 (0.2922) loss 2.9245 (3.4928) grad_norm 1.3606 (1.5134) [2022-10-01 00:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1000/1251] eta 0:01:13 lr 0.000480 time 0.2874 (0.2918) loss 4.0959 (3.4896) grad_norm 1.7288 (1.5120) [2022-10-01 00:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1100/1251] eta 0:00:44 lr 0.000480 time 0.2881 (0.2915) loss 4.0594 (3.4940) grad_norm 1.3605 (1.5131) [2022-10-01 00:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1200/1251] eta 0:00:14 lr 0.000479 time 0.2845 (0.2912) loss 2.9205 (3.4989) grad_norm 1.5845 (1.5145) [2022-10-01 00:49:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 154 training takes 0:06:04 [2022-10-01 00:50:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.483 (2.483) Loss 0.9083 (0.9083) Acc@1 79.004 (79.004) Acc@5 94.336 (94.336) [2022-10-01 00:50:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.322 Acc@5 93.286 [2022-10-01 00:50:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-01 00:50:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.32% [2022-10-01 00:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][0/1251] eta 1:05:46 lr 0.000479 time 3.1547 (3.1547) loss 3.3923 (3.3923) grad_norm 1.3534 (1.3534) [2022-10-01 00:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][100/1251] eta 0:06:09 lr 0.000479 time 0.2907 (0.3212) loss 3.9821 (3.4750) grad_norm 1.4350 (1.5598) [2022-10-01 00:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][200/1251] eta 0:05:21 lr 0.000478 time 0.2880 (0.3057) loss 3.2619 (3.5036) grad_norm 1.4053 (1.5336) [2022-10-01 00:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][300/1251] eta 0:04:45 lr 0.000478 time 0.2893 (0.3006) loss 3.3850 (3.5086) grad_norm 1.5291 (1.5288) [2022-10-01 00:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][400/1251] eta 0:04:13 lr 0.000477 time 0.2858 (0.2980) loss 3.8117 (3.5031) grad_norm 1.7190 (1.5270) [2022-10-01 00:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][500/1251] eta 0:03:42 lr 0.000477 time 0.3890 (0.2966) loss 2.8772 (3.5041) grad_norm 1.4583 (1.5218) [2022-10-01 00:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][600/1251] eta 0:03:12 lr 0.000477 time 0.2872 (0.2954) loss 3.9342 (3.5010) grad_norm 1.7169 (1.5202) [2022-10-01 00:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][700/1251] eta 0:02:42 lr 0.000476 time 0.2863 (0.2945) loss 2.8366 (3.4994) grad_norm 1.6611 (1.5196) [2022-10-01 00:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][800/1251] eta 0:02:12 lr 0.000476 time 0.2856 (0.2939) loss 3.4290 (3.5077) grad_norm 1.4424 (1.5207) [2022-10-01 00:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][900/1251] eta 0:01:42 lr 0.000475 time 0.2887 (0.2932) loss 2.4092 (3.5182) grad_norm 1.6960 (1.5177) [2022-10-01 00:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1000/1251] eta 0:01:13 lr 0.000475 time 0.3770 (0.2928) loss 3.9527 (3.5193) grad_norm 1.5114 (1.5180) [2022-10-01 00:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1100/1251] eta 0:00:44 lr 0.000475 time 0.2869 (0.2925) loss 2.7206 (3.5180) grad_norm 1.4264 (1.5178) [2022-10-01 00:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1200/1251] eta 0:00:14 lr 0.000474 time 0.2943 (0.2921) loss 3.0104 (3.5176) grad_norm 1.8070 (1.5197) [2022-10-01 00:56:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 155 training takes 0:06:05 [2022-10-01 00:56:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.783 (2.783) Loss 0.9951 (0.9951) Acc@1 75.391 (75.391) Acc@5 93.066 (93.066) [2022-10-01 00:56:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.390 Acc@5 93.412 [2022-10-01 00:56:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 00:56:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.39% [2022-10-01 00:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][0/1251] eta 1:00:57 lr 0.000474 time 2.9239 (2.9239) loss 4.0678 (4.0678) grad_norm 1.5828 (1.5828) [2022-10-01 00:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][100/1251] eta 0:06:03 lr 0.000474 time 0.2914 (0.3161) loss 4.5083 (3.5742) grad_norm 1.4593 (1.5241) [2022-10-01 00:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][200/1251] eta 0:05:18 lr 0.000473 time 0.3776 (0.3026) loss 3.5827 (3.5215) grad_norm 1.4007 (1.5250) [2022-10-01 00:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][300/1251] eta 0:04:43 lr 0.000473 time 0.2871 (0.2977) loss 3.5381 (3.5039) grad_norm 1.4342 (1.5279) [2022-10-01 00:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][400/1251] eta 0:04:11 lr 0.000472 time 0.2891 (0.2952) loss 3.9354 (3.4898) grad_norm 1.4765 (1.5274) [2022-10-01 00:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][500/1251] eta 0:03:40 lr 0.000472 time 0.2858 (0.2937) loss 3.7101 (3.5063) grad_norm 1.5472 (1.5255) [2022-10-01 00:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][600/1251] eta 0:03:10 lr 0.000471 time 0.2953 (0.2928) loss 3.6650 (3.5079) grad_norm 1.3856 (1.5270) [2022-10-01 00:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][700/1251] eta 0:02:40 lr 0.000471 time 0.3832 (0.2922) loss 3.7277 (3.5122) grad_norm 1.4912 (1.5220) [2022-10-01 01:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][800/1251] eta 0:02:11 lr 0.000471 time 0.2884 (0.2917) loss 3.8077 (3.5137) grad_norm 1.5940 (1.5249) [2022-10-01 01:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][900/1251] eta 0:01:42 lr 0.000470 time 0.2875 (0.2913) loss 3.8219 (3.5224) grad_norm 1.6706 (1.5264) [2022-10-01 01:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1000/1251] eta 0:01:13 lr 0.000470 time 0.2862 (0.2910) loss 4.1561 (3.5212) grad_norm 1.7099 (1.5256) [2022-10-01 01:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1100/1251] eta 0:00:43 lr 0.000469 time 0.2855 (0.2907) loss 3.5964 (3.5243) grad_norm 1.6072 (1.5271) [2022-10-01 01:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1200/1251] eta 0:00:14 lr 0.000469 time 0.3817 (0.2906) loss 2.6950 (3.5226) grad_norm 1.2650 (1.5281) [2022-10-01 01:02:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 156 training takes 0:06:03 [2022-10-01 01:02:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.256 (3.256) Loss 1.0279 (1.0279) Acc@1 76.270 (76.270) Acc@5 94.238 (94.238) [2022-10-01 01:02:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.330 Acc@5 93.512 [2022-10-01 01:02:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-01 01:02:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.39% [2022-10-01 01:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][0/1251] eta 1:07:54 lr 0.000469 time 3.2572 (3.2572) loss 2.1069 (2.1069) grad_norm 1.4889 (1.4889) [2022-10-01 01:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][100/1251] eta 0:06:08 lr 0.000468 time 0.2898 (0.3204) loss 3.3517 (3.5247) grad_norm 1.4797 (1.5534) [2022-10-01 01:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][200/1251] eta 0:05:21 lr 0.000468 time 0.3384 (0.3057) loss 3.9276 (3.5015) grad_norm 1.6090 (1.5448) [2022-10-01 01:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][300/1251] eta 0:04:46 lr 0.000468 time 0.2910 (0.3008) loss 2.2401 (3.5138) grad_norm 1.4764 (1.5477) [2022-10-01 01:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][400/1251] eta 0:04:14 lr 0.000467 time 0.3893 (0.2985) loss 3.6047 (3.5184) grad_norm 1.4814 (1.5404) [2022-10-01 01:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][500/1251] eta 0:03:43 lr 0.000467 time 0.2906 (0.2970) loss 3.7680 (3.5058) grad_norm 1.4958 (1.5425) [2022-10-01 01:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][600/1251] eta 0:03:12 lr 0.000466 time 0.2894 (0.2959) loss 3.9417 (3.5023) grad_norm 1.3752 (1.5469) [2022-10-01 01:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][700/1251] eta 0:02:42 lr 0.000466 time 0.2907 (0.2952) loss 3.5959 (3.5111) grad_norm 1.5541 (1.5545) [2022-10-01 01:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][800/1251] eta 0:02:12 lr 0.000465 time 0.2900 (0.2946) loss 2.4743 (3.5026) grad_norm 1.4948 (1.5542) [2022-10-01 01:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][900/1251] eta 0:01:43 lr 0.000465 time 0.3891 (0.2942) loss 3.6599 (3.5089) grad_norm 1.6374 (1.5514) [2022-10-01 01:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1000/1251] eta 0:01:13 lr 0.000465 time 0.2929 (0.2937) loss 3.8642 (3.5036) grad_norm 1.5015 (1.5504) [2022-10-01 01:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1100/1251] eta 0:00:44 lr 0.000464 time 0.2938 (0.2933) loss 4.1874 (3.4969) grad_norm 1.6367 (1.5481) [2022-10-01 01:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1200/1251] eta 0:00:14 lr 0.000464 time 0.2905 (0.2931) loss 3.5530 (3.4964) grad_norm 1.3290 (1.5456) [2022-10-01 01:08:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 157 training takes 0:06:06 [2022-10-01 01:08:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.941 (2.941) Loss 0.9861 (0.9861) Acc@1 76.270 (76.270) Acc@5 93.457 (93.457) [2022-10-01 01:09:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.362 Acc@5 93.390 [2022-10-01 01:09:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 01:09:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.39% [2022-10-01 01:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][0/1251] eta 1:09:06 lr 0.000464 time 3.3149 (3.3149) loss 2.8137 (2.8137) grad_norm 1.4791 (1.4791) [2022-10-01 01:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][100/1251] eta 0:06:09 lr 0.000463 time 0.3903 (0.3214) loss 3.9805 (3.5770) grad_norm 1.6041 (1.5611) [2022-10-01 01:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][200/1251] eta 0:05:21 lr 0.000463 time 0.2880 (0.3058) loss 3.3332 (3.5432) grad_norm 1.7900 (1.5512) [2022-10-01 01:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][300/1251] eta 0:04:46 lr 0.000462 time 0.2923 (0.3007) loss 3.9646 (3.5003) grad_norm 1.4380 (1.5347) [2022-10-01 01:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][400/1251] eta 0:04:13 lr 0.000462 time 0.2883 (0.2980) loss 3.3401 (3.5016) grad_norm 1.5284 (1.5400) [2022-10-01 01:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][500/1251] eta 0:03:42 lr 0.000462 time 0.2898 (0.2964) loss 3.6408 (3.5061) grad_norm 1.4408 (1.5298) [2022-10-01 01:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][600/1251] eta 0:03:12 lr 0.000461 time 0.3790 (0.2954) loss 3.9035 (3.5165) grad_norm 1.5510 (1.5312) [2022-10-01 01:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][700/1251] eta 0:02:42 lr 0.000461 time 0.2927 (0.2947) loss 2.8966 (3.5275) grad_norm 1.7087 (1.5348) [2022-10-01 01:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][800/1251] eta 0:02:12 lr 0.000460 time 0.2871 (0.2941) loss 2.4120 (3.5218) grad_norm 1.4868 (1.5365) [2022-10-01 01:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][900/1251] eta 0:01:43 lr 0.000460 time 0.2893 (0.2936) loss 4.3113 (3.5265) grad_norm 1.8434 (1.5333) [2022-10-01 01:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1000/1251] eta 0:01:13 lr 0.000459 time 0.2883 (0.2932) loss 3.8769 (3.5276) grad_norm 1.5027 (1.5335) [2022-10-01 01:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1100/1251] eta 0:00:44 lr 0.000459 time 0.3854 (0.2931) loss 4.2751 (3.5308) grad_norm 1.3767 (1.5339) [2022-10-01 01:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1200/1251] eta 0:00:14 lr 0.000459 time 0.2867 (0.2928) loss 3.8249 (3.5272) grad_norm 1.3226 (1.5345) [2022-10-01 01:15:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 158 training takes 0:06:06 [2022-10-01 01:15:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.333 (3.333) Loss 0.9639 (0.9639) Acc@1 77.441 (77.441) Acc@5 93.652 (93.652) [2022-10-01 01:15:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.274 Acc@5 93.580 [2022-10-01 01:15:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-01 01:15:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.39% [2022-10-01 01:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][0/1251] eta 0:57:57 lr 0.000458 time 2.7795 (2.7795) loss 3.7760 (3.7760) grad_norm 1.3520 (1.3520) [2022-10-01 01:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][100/1251] eta 0:06:04 lr 0.000458 time 0.2921 (0.3169) loss 4.2678 (3.5063) grad_norm 1.4465 (1.5157) [2022-10-01 01:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][200/1251] eta 0:05:19 lr 0.000458 time 0.2870 (0.3036) loss 3.2900 (3.4993) grad_norm 1.7493 (1.5358) [2022-10-01 01:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][300/1251] eta 0:04:44 lr 0.000457 time 0.3865 (0.2994) loss 4.0662 (3.5118) grad_norm 1.5320 (1.5488) [2022-10-01 01:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][400/1251] eta 0:04:12 lr 0.000457 time 0.2885 (0.2969) loss 3.1555 (3.5011) grad_norm 1.4528 (1.5454) [2022-10-01 01:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][500/1251] eta 0:03:41 lr 0.000456 time 0.2879 (0.2953) loss 2.3800 (3.4801) grad_norm 1.4196 (1.5408) [2022-10-01 01:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][600/1251] eta 0:03:11 lr 0.000456 time 0.2902 (0.2941) loss 3.0156 (3.4761) grad_norm 1.5301 (1.5483) [2022-10-01 01:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][700/1251] eta 0:02:41 lr 0.000456 time 0.2862 (0.2933) loss 2.9774 (3.4804) grad_norm 1.3637 (1.5454) [2022-10-01 01:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][800/1251] eta 0:02:12 lr 0.000455 time 0.3806 (0.2928) loss 3.7808 (3.4888) grad_norm 1.8975 (1.5427) [2022-10-01 01:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][900/1251] eta 0:01:42 lr 0.000455 time 0.2870 (0.2922) loss 4.5156 (3.4907) grad_norm 1.7333 (1.5412) [2022-10-01 01:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1000/1251] eta 0:01:13 lr 0.000454 time 0.2852 (0.2918) loss 4.0294 (3.4921) grad_norm 1.5027 (1.5429) [2022-10-01 01:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1100/1251] eta 0:00:44 lr 0.000454 time 0.2888 (0.2914) loss 3.7209 (3.4951) grad_norm 1.3488 (1.5422) [2022-10-01 01:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1200/1251] eta 0:00:14 lr 0.000453 time 0.2925 (0.2911) loss 3.1142 (3.4938) grad_norm 1.4949 (1.5423) [2022-10-01 01:21:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 159 training takes 0:06:04 [2022-10-01 01:21:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.794 (2.794) Loss 0.9583 (0.9583) Acc@1 77.246 (77.246) Acc@5 93.652 (93.652) [2022-10-01 01:21:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.442 Acc@5 93.600 [2022-10-01 01:21:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 01:21:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.44% [2022-10-01 01:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][0/1251] eta 1:01:36 lr 0.000453 time 2.9551 (2.9551) loss 4.2477 (4.2477) grad_norm 1.5758 (1.5758) [2022-10-01 01:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][100/1251] eta 0:06:05 lr 0.000453 time 0.2887 (0.3172) loss 4.0315 (3.5043) grad_norm 1.4402 (1.5957) [2022-10-01 01:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][200/1251] eta 0:05:19 lr 0.000452 time 0.2890 (0.3036) loss 4.1256 (3.4890) grad_norm 1.6449 (1.5661) [2022-10-01 01:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][300/1251] eta 0:04:44 lr 0.000452 time 0.2908 (0.2988) loss 2.5383 (3.4886) grad_norm 1.4841 (1.5678) [2022-10-01 01:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][400/1251] eta 0:04:12 lr 0.000452 time 0.2883 (0.2966) loss 4.1821 (3.4869) grad_norm 1.6248 (1.5704) [2022-10-01 01:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][500/1251] eta 0:03:41 lr 0.000451 time 0.3862 (0.2953) loss 3.0368 (3.4732) grad_norm 1.3237 (1.5646) [2022-10-01 01:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][600/1251] eta 0:03:11 lr 0.000451 time 0.2872 (0.2944) loss 4.0596 (3.4697) grad_norm 1.6817 (1.5650) [2022-10-01 01:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][700/1251] eta 0:02:41 lr 0.000450 time 0.2903 (0.2937) loss 2.2966 (3.4673) grad_norm 1.6268 (1.5645) [2022-10-01 01:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][800/1251] eta 0:02:12 lr 0.000450 time 0.2875 (0.2932) loss 3.9086 (3.4628) grad_norm 1.5822 (1.5588) [2022-10-01 01:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][900/1251] eta 0:01:42 lr 0.000450 time 0.2885 (0.2927) loss 4.0320 (3.4608) grad_norm 1.5763 (1.5569) [2022-10-01 01:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1000/1251] eta 0:01:13 lr 0.000449 time 0.3844 (0.2925) loss 4.2314 (3.4621) grad_norm 1.3923 (1.5578) [2022-10-01 01:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1100/1251] eta 0:00:44 lr 0.000449 time 0.2941 (0.2923) loss 4.1163 (3.4541) grad_norm 1.5456 (1.5558) [2022-10-01 01:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1200/1251] eta 0:00:14 lr 0.000448 time 0.2876 (0.2920) loss 3.9241 (3.4588) grad_norm 1.6040 (1.5545) [2022-10-01 01:27:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 160 training takes 0:06:05 [2022-10-01 01:27:48 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saving...... [2022-10-01 01:27:48 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saved !!! [2022-10-01 01:27:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.979 (2.979) Loss 1.0083 (1.0083) Acc@1 75.977 (75.977) Acc@5 93.164 (93.164) [2022-10-01 01:28:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.390 Acc@5 93.640 [2022-10-01 01:28:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 01:28:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.44% [2022-10-01 01:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][0/1251] eta 1:07:59 lr 0.000448 time 3.2609 (3.2609) loss 3.6052 (3.6052) grad_norm 1.4266 (1.4266) [2022-10-01 01:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][100/1251] eta 0:06:09 lr 0.000448 time 0.2915 (0.3207) loss 3.5499 (3.4143) grad_norm 1.5232 (1.5562) [2022-10-01 01:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][200/1251] eta 0:05:21 lr 0.000447 time 0.3856 (0.3058) loss 3.4861 (3.4120) grad_norm 1.6990 (1.5838) [2022-10-01 01:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][300/1251] eta 0:04:45 lr 0.000447 time 0.2904 (0.3006) loss 4.0273 (3.4432) grad_norm 1.6125 (1.5827) [2022-10-01 01:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][400/1251] eta 0:04:13 lr 0.000446 time 0.2893 (0.2980) loss 3.9517 (3.4680) grad_norm 1.5323 (1.5649) [2022-10-01 01:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][500/1251] eta 0:03:42 lr 0.000446 time 0.2863 (0.2964) loss 2.9821 (3.4754) grad_norm 1.4297 (1.5683) [2022-10-01 01:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][600/1251] eta 0:03:12 lr 0.000446 time 0.2928 (0.2952) loss 4.2313 (3.4759) grad_norm 1.6575 (1.5657) [2022-10-01 01:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][700/1251] eta 0:02:42 lr 0.000445 time 0.3802 (0.2946) loss 4.3415 (3.4827) grad_norm 1.5867 (1.5601) [2022-10-01 01:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][800/1251] eta 0:02:12 lr 0.000445 time 0.2902 (0.2940) loss 3.8395 (3.4900) grad_norm 1.4421 (1.5574) [2022-10-01 01:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][900/1251] eta 0:01:42 lr 0.000444 time 0.2864 (0.2934) loss 3.8847 (3.4934) grad_norm 1.5518 (1.5590) [2022-10-01 01:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1000/1251] eta 0:01:13 lr 0.000444 time 0.2898 (0.2930) loss 3.9020 (3.4954) grad_norm 1.3170 (1.5601) [2022-10-01 01:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1100/1251] eta 0:00:44 lr 0.000444 time 0.2910 (0.2926) loss 4.0911 (3.4943) grad_norm 1.5149 (1.5574) [2022-10-01 01:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1200/1251] eta 0:00:14 lr 0.000443 time 0.3798 (0.2923) loss 3.9670 (3.4948) grad_norm 2.0226 (1.5596) [2022-10-01 01:34:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 161 training takes 0:06:05 [2022-10-01 01:34:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.922 (2.922) Loss 1.0703 (1.0703) Acc@1 75.488 (75.488) Acc@5 93.066 (93.066) [2022-10-01 01:34:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.368 Acc@5 93.542 [2022-10-01 01:34:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 01:34:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.44% [2022-10-01 01:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][0/1251] eta 0:58:52 lr 0.000443 time 2.8235 (2.8235) loss 2.8564 (2.8564) grad_norm 1.6822 (1.6822) [2022-10-01 01:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][100/1251] eta 0:06:05 lr 0.000443 time 0.2926 (0.3174) loss 3.2477 (3.4069) grad_norm 1.3553 (1.5176) [2022-10-01 01:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][200/1251] eta 0:05:19 lr 0.000442 time 0.2900 (0.3040) loss 3.5608 (3.4237) grad_norm 1.7308 (1.5485) [2022-10-01 01:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][300/1251] eta 0:04:44 lr 0.000442 time 0.2911 (0.2996) loss 3.7546 (3.4167) grad_norm 1.4770 (1.5544) [2022-10-01 01:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][400/1251] eta 0:04:13 lr 0.000441 time 0.3820 (0.2976) loss 3.3536 (3.4419) grad_norm 2.5637 (1.5590) [2022-10-01 01:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][500/1251] eta 0:03:42 lr 0.000441 time 0.2900 (0.2962) loss 3.7235 (3.4521) grad_norm 1.5622 (1.5583) [2022-10-01 01:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][600/1251] eta 0:03:12 lr 0.000440 time 0.2881 (0.2952) loss 2.9091 (3.4660) grad_norm 1.4694 (1.5595) [2022-10-01 01:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][700/1251] eta 0:02:42 lr 0.000440 time 0.2930 (0.2946) loss 3.5882 (3.4774) grad_norm 1.7530 (1.5563) [2022-10-01 01:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][800/1251] eta 0:02:12 lr 0.000440 time 0.2839 (0.2940) loss 3.6829 (3.4831) grad_norm 1.7086 (1.5611) [2022-10-01 01:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][900/1251] eta 0:01:43 lr 0.000439 time 0.3844 (0.2937) loss 3.6636 (3.4837) grad_norm 1.3925 (1.5615) [2022-10-01 01:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1000/1251] eta 0:01:13 lr 0.000439 time 0.2871 (0.2933) loss 2.5950 (3.4823) grad_norm 1.5465 (1.5639) [2022-10-01 01:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1100/1251] eta 0:00:44 lr 0.000438 time 0.2899 (0.2930) loss 4.1655 (3.4872) grad_norm 1.5312 (1.5631) [2022-10-01 01:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1200/1251] eta 0:00:14 lr 0.000438 time 0.2833 (0.2927) loss 4.1699 (3.4992) grad_norm 1.7418 (1.5642) [2022-10-01 01:40:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 162 training takes 0:06:06 [2022-10-01 01:40:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.525 (2.525) Loss 1.0064 (1.0064) Acc@1 77.051 (77.051) Acc@5 92.871 (92.871) [2022-10-01 01:40:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.500 Acc@5 93.632 [2022-10-01 01:40:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-01 01:40:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.50% [2022-10-01 01:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][0/1251] eta 1:02:19 lr 0.000438 time 2.9888 (2.9888) loss 3.9161 (3.9161) grad_norm 1.6606 (1.6606) [2022-10-01 01:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][100/1251] eta 0:06:07 lr 0.000437 time 0.3831 (0.3191) loss 2.4167 (3.4838) grad_norm 1.4019 (1.5782) [2022-10-01 01:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][200/1251] eta 0:05:20 lr 0.000437 time 0.2936 (0.3048) loss 3.8893 (3.4846) grad_norm 1.4019 (1.5666) [2022-10-01 01:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][300/1251] eta 0:04:45 lr 0.000437 time 0.2883 (0.3002) loss 2.3485 (3.4685) grad_norm 1.6408 (1.5657) [2022-10-01 01:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][400/1251] eta 0:04:13 lr 0.000436 time 0.2949 (0.2979) loss 3.3350 (3.4864) grad_norm 1.3893 (1.5721) [2022-10-01 01:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][500/1251] eta 0:03:42 lr 0.000436 time 0.2882 (0.2964) loss 2.4309 (3.4780) grad_norm 1.4641 (1.5709) [2022-10-01 01:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][600/1251] eta 0:03:12 lr 0.000435 time 0.3871 (0.2955) loss 3.2401 (3.4839) grad_norm 1.7591 (1.5792) [2022-10-01 01:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][700/1251] eta 0:02:42 lr 0.000435 time 0.2897 (0.2948) loss 3.9946 (3.4889) grad_norm 1.6487 (1.5798) [2022-10-01 01:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][800/1251] eta 0:02:12 lr 0.000435 time 0.2918 (0.2942) loss 3.5708 (3.4810) grad_norm 1.5634 (1.5762) [2022-10-01 01:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][900/1251] eta 0:01:43 lr 0.000434 time 0.2854 (0.2938) loss 4.0988 (3.4893) grad_norm 1.4155 (1.5798) [2022-10-01 01:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1000/1251] eta 0:01:13 lr 0.000434 time 0.2909 (0.2934) loss 2.4268 (3.4900) grad_norm 1.7720 (1.5810) [2022-10-01 01:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1100/1251] eta 0:00:44 lr 0.000433 time 0.3805 (0.2932) loss 4.1862 (3.4865) grad_norm 1.6555 (1.5879) [2022-10-01 01:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1200/1251] eta 0:00:14 lr 0.000433 time 0.2882 (0.2930) loss 3.8288 (3.4852) grad_norm 1.3522 (1.5871) [2022-10-01 01:46:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 163 training takes 0:06:06 [2022-10-01 01:46:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.330 (3.330) Loss 0.9378 (0.9378) Acc@1 78.027 (78.027) Acc@5 94.629 (94.629) [2022-10-01 01:46:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.436 Acc@5 93.568 [2022-10-01 01:46:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-01 01:46:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.50% [2022-10-01 01:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][0/1251] eta 1:07:38 lr 0.000433 time 3.2445 (3.2445) loss 3.8358 (3.8358) grad_norm 1.4367 (1.4367) [2022-10-01 01:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][100/1251] eta 0:06:07 lr 0.000432 time 0.2912 (0.3193) loss 2.8218 (3.5641) grad_norm 1.8454 (1.5965) [2022-10-01 01:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][200/1251] eta 0:05:20 lr 0.000432 time 0.2852 (0.3049) loss 3.3442 (3.5414) grad_norm 1.3982 (1.6133) [2022-10-01 01:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][300/1251] eta 0:04:45 lr 0.000431 time 0.3786 (0.3001) loss 2.6855 (3.5061) grad_norm 1.5061 (1.6039) [2022-10-01 01:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][400/1251] eta 0:04:13 lr 0.000431 time 0.2878 (0.2973) loss 3.7551 (3.4927) grad_norm 1.3540 (1.5971) [2022-10-01 01:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][500/1251] eta 0:03:42 lr 0.000431 time 0.2895 (0.2957) loss 3.1011 (3.4901) grad_norm 1.5944 (1.5918) [2022-10-01 01:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][600/1251] eta 0:03:11 lr 0.000430 time 0.2859 (0.2945) loss 4.2119 (3.4911) grad_norm 1.7433 (1.5967) [2022-10-01 01:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][700/1251] eta 0:02:41 lr 0.000430 time 0.2874 (0.2937) loss 3.8707 (3.4891) grad_norm 1.5574 (1.5914) [2022-10-01 01:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][800/1251] eta 0:02:12 lr 0.000429 time 0.3769 (0.2932) loss 3.2597 (3.4980) grad_norm 1.4632 (1.5869) [2022-10-01 01:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][900/1251] eta 0:01:42 lr 0.000429 time 0.2875 (0.2927) loss 3.8664 (3.5006) grad_norm 1.4421 (1.5818) [2022-10-01 01:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1000/1251] eta 0:01:13 lr 0.000429 time 0.2856 (0.2922) loss 2.2258 (3.4908) grad_norm 1.4821 (1.5805) [2022-10-01 01:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1100/1251] eta 0:00:44 lr 0.000428 time 0.2867 (0.2918) loss 3.9810 (3.4894) grad_norm 1.7933 (1.5795) [2022-10-01 01:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1200/1251] eta 0:00:14 lr 0.000428 time 0.2887 (0.2915) loss 3.0927 (3.4926) grad_norm 1.4360 (1.5828) [2022-10-01 01:53:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 164 training takes 0:06:04 [2022-10-01 01:53:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.795 (2.795) Loss 0.9331 (0.9331) Acc@1 78.809 (78.809) Acc@5 94.141 (94.141) [2022-10-01 01:53:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.644 Acc@5 93.628 [2022-10-01 01:53:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-01 01:53:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.64% [2022-10-01 01:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][0/1251] eta 0:59:46 lr 0.000428 time 2.8667 (2.8667) loss 3.2168 (3.2168) grad_norm 1.5885 (1.5885) [2022-10-01 01:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][100/1251] eta 0:06:05 lr 0.000427 time 0.2930 (0.3174) loss 3.8079 (3.5139) grad_norm 1.6610 (1.5833) [2022-10-01 01:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][200/1251] eta 0:05:19 lr 0.000427 time 0.2868 (0.3038) loss 3.2153 (3.4765) grad_norm 1.7818 (1.5929) [2022-10-01 01:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][300/1251] eta 0:04:44 lr 0.000426 time 0.2950 (0.2993) loss 2.6722 (3.4900) grad_norm 1.5339 (1.5936) [2022-10-01 01:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][400/1251] eta 0:04:12 lr 0.000426 time 0.2874 (0.2970) loss 3.0914 (3.4813) grad_norm 1.5371 (1.5938) [2022-10-01 01:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][500/1251] eta 0:03:42 lr 0.000426 time 0.3927 (0.2958) loss 3.9701 (3.4874) grad_norm 1.7487 (1.5999) [2022-10-01 01:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][600/1251] eta 0:03:11 lr 0.000425 time 0.2851 (0.2948) loss 3.9746 (3.4909) grad_norm 1.8870 (1.6017) [2022-10-01 01:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][700/1251] eta 0:02:42 lr 0.000425 time 0.2904 (0.2941) loss 3.8836 (3.5000) grad_norm 1.5509 (1.5987) [2022-10-01 01:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][800/1251] eta 0:02:12 lr 0.000424 time 0.2854 (0.2936) loss 3.5367 (3.4987) grad_norm 1.9108 (1.6006) [2022-10-01 01:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][900/1251] eta 0:01:42 lr 0.000424 time 0.2927 (0.2932) loss 3.1249 (3.4924) grad_norm 1.9850 (1.6028) [2022-10-01 01:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1000/1251] eta 0:01:13 lr 0.000423 time 0.3807 (0.2930) loss 3.9153 (3.4825) grad_norm 1.6173 (1.6032) [2022-10-01 01:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1100/1251] eta 0:00:44 lr 0.000423 time 0.2905 (0.2927) loss 3.5236 (3.4799) grad_norm 1.6679 (1.6033) [2022-10-01 01:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1200/1251] eta 0:00:14 lr 0.000423 time 0.2896 (0.2926) loss 4.2631 (3.4866) grad_norm 1.7853 (1.6039) [2022-10-01 01:59:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 165 training takes 0:06:06 [2022-10-01 01:59:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.234 (3.234) Loss 1.0646 (1.0646) Acc@1 76.367 (76.367) Acc@5 92.969 (92.969) [2022-10-01 01:59:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.560 Acc@5 93.682 [2022-10-01 01:59:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-01 01:59:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.64% [2022-10-01 01:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][0/1251] eta 1:04:41 lr 0.000422 time 3.1028 (3.1028) loss 4.0448 (4.0448) grad_norm 1.7292 (1.7292) [2022-10-01 02:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][100/1251] eta 0:06:06 lr 0.000422 time 0.2890 (0.3182) loss 3.8122 (3.4998) grad_norm 1.5583 (1.6161) [2022-10-01 02:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][200/1251] eta 0:05:19 lr 0.000422 time 0.3817 (0.3042) loss 3.7151 (3.4673) grad_norm 1.6190 (1.6279) [2022-10-01 02:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][300/1251] eta 0:04:44 lr 0.000421 time 0.2913 (0.2992) loss 4.0051 (3.4676) grad_norm 1.8597 (1.6332) [2022-10-01 02:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][400/1251] eta 0:04:12 lr 0.000421 time 0.2884 (0.2967) loss 3.1681 (3.4639) grad_norm 1.5022 (1.6254) [2022-10-01 02:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][500/1251] eta 0:03:41 lr 0.000420 time 0.2887 (0.2953) loss 4.0015 (3.4607) grad_norm 1.4296 (1.6174) [2022-10-01 02:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][600/1251] eta 0:03:11 lr 0.000420 time 0.2883 (0.2942) loss 3.5914 (3.4643) grad_norm 1.7817 (1.6186) [2022-10-01 02:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][700/1251] eta 0:02:41 lr 0.000420 time 0.3823 (0.2937) loss 3.9003 (3.4693) grad_norm 1.4588 (1.6177) [2022-10-01 02:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][800/1251] eta 0:02:12 lr 0.000419 time 0.2888 (0.2931) loss 2.7882 (3.4632) grad_norm 1.6325 (1.6206) [2022-10-01 02:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][900/1251] eta 0:01:42 lr 0.000419 time 0.2850 (0.2927) loss 3.9496 (3.4598) grad_norm 1.5215 (1.6140) [2022-10-01 02:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1000/1251] eta 0:01:13 lr 0.000418 time 0.2886 (0.2923) loss 3.9345 (3.4657) grad_norm 1.5107 (1.6106) [2022-10-01 02:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1100/1251] eta 0:00:44 lr 0.000418 time 0.2887 (0.2920) loss 3.9170 (3.4627) grad_norm 1.3711 (1.6114) [2022-10-01 02:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1200/1251] eta 0:00:14 lr 0.000418 time 0.3814 (0.2918) loss 3.8486 (3.4711) grad_norm 1.7654 (1.6105) [2022-10-01 02:05:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 166 training takes 0:06:05 [2022-10-01 02:05:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.637 (2.637) Loss 0.9272 (0.9272) Acc@1 77.539 (77.539) Acc@5 95.020 (95.020) [2022-10-01 02:05:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.612 Acc@5 93.676 [2022-10-01 02:05:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-01 02:05:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.64% [2022-10-01 02:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][0/1251] eta 1:08:17 lr 0.000417 time 3.2751 (3.2751) loss 3.9028 (3.9028) grad_norm 1.5985 (1.5985) [2022-10-01 02:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][100/1251] eta 0:06:08 lr 0.000417 time 0.2887 (0.3200) loss 3.8594 (3.4579) grad_norm 1.6049 (1.5976) [2022-10-01 02:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][200/1251] eta 0:05:20 lr 0.000417 time 0.2878 (0.3050) loss 3.8604 (3.5386) grad_norm 1.7303 (1.6142) [2022-10-01 02:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][300/1251] eta 0:04:45 lr 0.000416 time 0.2892 (0.2999) loss 3.9914 (3.5437) grad_norm 1.7938 (1.6355) [2022-10-01 02:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][400/1251] eta 0:04:13 lr 0.000416 time 0.3736 (0.2977) loss 3.9553 (3.5071) grad_norm 1.5045 (1.6335) [2022-10-01 02:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][500/1251] eta 0:03:42 lr 0.000415 time 0.2860 (0.2960) loss 4.0369 (3.5032) grad_norm 1.5990 (1.6290) [2022-10-01 02:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][600/1251] eta 0:03:11 lr 0.000415 time 0.2874 (0.2948) loss 4.1190 (3.5052) grad_norm 1.6120 (1.6256) [2022-10-01 02:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][700/1251] eta 0:02:42 lr 0.000414 time 0.2874 (0.2940) loss 4.0449 (3.5064) grad_norm 1.5966 (1.6221) [2022-10-01 02:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][800/1251] eta 0:02:12 lr 0.000414 time 0.2894 (0.2934) loss 3.4938 (3.5028) grad_norm 1.5673 (1.6206) [2022-10-01 02:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][900/1251] eta 0:01:42 lr 0.000414 time 0.3780 (0.2931) loss 2.5747 (3.4945) grad_norm 1.5664 (1.6208) [2022-10-01 02:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1000/1251] eta 0:01:13 lr 0.000413 time 0.2871 (0.2927) loss 2.4386 (3.4919) grad_norm 1.5165 (1.6185) [2022-10-01 02:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1100/1251] eta 0:00:44 lr 0.000413 time 0.2873 (0.2923) loss 3.4084 (3.4888) grad_norm 1.5039 (1.6182) [2022-10-01 02:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1200/1251] eta 0:00:14 lr 0.000412 time 0.2860 (0.2919) loss 4.2315 (3.4845) grad_norm 1.7947 (1.6171) [2022-10-01 02:11:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 167 training takes 0:06:05 [2022-10-01 02:12:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.279 (2.279) Loss 0.9936 (0.9936) Acc@1 76.758 (76.758) Acc@5 93.066 (93.066) [2022-10-01 02:12:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.690 Acc@5 93.806 [2022-10-01 02:12:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-10-01 02:12:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.69% [2022-10-01 02:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][0/1251] eta 1:07:12 lr 0.000412 time 3.2237 (3.2237) loss 3.6873 (3.6873) grad_norm 1.7311 (1.7311) [2022-10-01 02:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][100/1251] eta 0:06:07 lr 0.000412 time 0.3803 (0.3196) loss 3.9719 (3.5196) grad_norm 1.6642 (1.6189) [2022-10-01 02:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][200/1251] eta 0:05:19 lr 0.000411 time 0.2906 (0.3042) loss 3.0542 (3.4474) grad_norm 1.6500 (1.6156) [2022-10-01 02:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][300/1251] eta 0:04:44 lr 0.000411 time 0.2889 (0.2991) loss 3.1238 (3.4367) grad_norm 1.3934 (1.6184) [2022-10-01 02:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][400/1251] eta 0:04:12 lr 0.000411 time 0.2871 (0.2968) loss 3.6646 (3.4550) grad_norm 1.5222 (1.6210) [2022-10-01 02:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][500/1251] eta 0:03:41 lr 0.000410 time 0.2854 (0.2952) loss 3.2762 (3.4477) grad_norm 1.8175 (1.6270) [2022-10-01 02:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][600/1251] eta 0:03:11 lr 0.000410 time 0.3750 (0.2942) loss 3.7653 (3.4446) grad_norm 1.7236 (1.6267) [2022-10-01 02:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][700/1251] eta 0:02:41 lr 0.000409 time 0.2868 (0.2934) loss 2.6031 (3.4338) grad_norm 1.6967 (1.6322) [2022-10-01 02:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][800/1251] eta 0:02:12 lr 0.000409 time 0.2886 (0.2928) loss 2.8574 (3.4388) grad_norm 1.5308 (1.6310) [2022-10-01 02:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][900/1251] eta 0:01:42 lr 0.000409 time 0.2874 (0.2923) loss 3.6936 (3.4562) grad_norm 1.7890 (1.6299) [2022-10-01 02:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1000/1251] eta 0:01:13 lr 0.000408 time 0.2883 (0.2919) loss 2.4661 (3.4573) grad_norm 1.7107 (1.6261) [2022-10-01 02:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1100/1251] eta 0:00:44 lr 0.000408 time 0.3815 (0.2918) loss 3.4542 (3.4580) grad_norm 1.5638 (1.6283) [2022-10-01 02:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1200/1251] eta 0:00:14 lr 0.000407 time 0.2869 (0.2916) loss 3.4843 (3.4443) grad_norm 1.3813 (1.6279) [2022-10-01 02:18:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 168 training takes 0:06:05 [2022-10-01 02:18:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.703 (2.703) Loss 1.1382 (1.1382) Acc@1 73.633 (73.633) Acc@5 92.090 (92.090) [2022-10-01 02:18:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.632 Acc@5 93.794 [2022-10-01 02:18:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-01 02:18:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.69% [2022-10-01 02:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][0/1251] eta 1:08:59 lr 0.000407 time 3.3086 (3.3086) loss 3.6247 (3.6247) grad_norm 1.4918 (1.4918) [2022-10-01 02:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][100/1251] eta 0:06:08 lr 0.000407 time 0.2904 (0.3203) loss 3.9169 (3.4113) grad_norm 1.9845 (1.6447) [2022-10-01 02:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][200/1251] eta 0:05:20 lr 0.000406 time 0.2918 (0.3051) loss 3.9677 (3.4017) grad_norm 1.4434 (1.6384) [2022-10-01 02:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][300/1251] eta 0:04:45 lr 0.000406 time 0.3840 (0.3005) loss 3.5270 (3.4239) grad_norm 1.3335 (1.6386) [2022-10-01 02:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][400/1251] eta 0:04:13 lr 0.000406 time 0.2905 (0.2979) loss 3.3380 (3.4350) grad_norm 1.5846 (1.6382) [2022-10-01 02:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][500/1251] eta 0:03:42 lr 0.000405 time 0.2887 (0.2963) loss 3.7890 (3.4499) grad_norm 1.6695 (1.6387) [2022-10-01 02:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][600/1251] eta 0:03:12 lr 0.000405 time 0.2935 (0.2952) loss 2.8230 (3.4647) grad_norm 1.4315 (1.6350) [2022-10-01 02:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][700/1251] eta 0:02:42 lr 0.000404 time 0.2913 (0.2944) loss 3.9559 (3.4649) grad_norm 1.3528 (1.6373) [2022-10-01 02:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][800/1251] eta 0:02:12 lr 0.000404 time 0.3798 (0.2939) loss 4.1380 (3.4677) grad_norm 1.6955 (1.6381) [2022-10-01 02:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][900/1251] eta 0:01:42 lr 0.000404 time 0.2900 (0.2933) loss 4.1587 (3.4620) grad_norm 2.0261 (1.6369) [2022-10-01 02:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1000/1251] eta 0:01:13 lr 0.000403 time 0.2892 (0.2928) loss 4.2553 (3.4668) grad_norm 1.4555 (1.6379) [2022-10-01 02:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1100/1251] eta 0:00:44 lr 0.000403 time 0.2879 (0.2923) loss 3.9331 (3.4679) grad_norm 1.7221 (1.6374) [2022-10-01 02:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1200/1251] eta 0:00:14 lr 0.000402 time 0.2882 (0.2920) loss 2.7251 (3.4662) grad_norm 1.6310 (1.6351) [2022-10-01 02:24:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 169 training takes 0:06:05 [2022-10-01 02:24:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.277 (2.277) Loss 0.9667 (0.9667) Acc@1 77.637 (77.637) Acc@5 94.629 (94.629) [2022-10-01 02:24:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.688 Acc@5 93.760 [2022-10-01 02:24:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-10-01 02:24:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.69% [2022-10-01 02:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][0/1251] eta 1:02:53 lr 0.000402 time 3.0161 (3.0161) loss 2.5857 (2.5857) grad_norm 1.6576 (1.6576) [2022-10-01 02:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][100/1251] eta 0:06:08 lr 0.000402 time 0.2996 (0.3205) loss 4.0655 (3.4649) grad_norm 1.6214 (1.5996) [2022-10-01 02:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][200/1251] eta 0:05:21 lr 0.000401 time 0.2856 (0.3058) loss 2.9515 (3.4821) grad_norm 1.7566 (1.6139) [2022-10-01 02:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][300/1251] eta 0:04:46 lr 0.000401 time 0.2956 (0.3009) loss 3.7046 (3.4704) grad_norm 1.5365 (1.6213) [2022-10-01 02:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][400/1251] eta 0:04:13 lr 0.000400 time 0.2895 (0.2984) loss 4.2334 (3.4615) grad_norm 1.9496 (1.6287) [2022-10-01 02:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][500/1251] eta 0:03:43 lr 0.000400 time 0.3905 (0.2972) loss 3.7281 (3.4603) grad_norm 1.6319 (1.6272) [2022-10-01 02:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][600/1251] eta 0:03:12 lr 0.000400 time 0.2908 (0.2961) loss 3.4884 (3.4581) grad_norm 1.9509 (1.6285) [2022-10-01 02:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][700/1251] eta 0:02:42 lr 0.000399 time 0.2996 (0.2953) loss 3.8981 (3.4605) grad_norm 1.7961 (1.6306) [2022-10-01 02:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][800/1251] eta 0:02:12 lr 0.000399 time 0.2871 (0.2947) loss 3.7437 (3.4642) grad_norm 1.5273 (1.6294) [2022-10-01 02:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][900/1251] eta 0:01:43 lr 0.000398 time 0.2952 (0.2942) loss 2.7494 (3.4726) grad_norm 1.5993 (1.6349) [2022-10-01 02:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1000/1251] eta 0:01:13 lr 0.000398 time 0.3781 (0.2939) loss 3.8314 (3.4687) grad_norm 1.6403 (1.6363) [2022-10-01 02:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1100/1251] eta 0:00:44 lr 0.000398 time 0.2908 (0.2934) loss 3.8814 (3.4607) grad_norm 1.6514 (1.6378) [2022-10-01 02:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1200/1251] eta 0:00:14 lr 0.000397 time 0.2861 (0.2930) loss 3.6875 (3.4626) grad_norm 1.5756 (1.6374) [2022-10-01 02:30:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 170 training takes 0:06:06 [2022-10-01 02:30:54 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saving...... [2022-10-01 02:30:54 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saved !!! [2022-10-01 02:30:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.746 (2.746) Loss 1.0105 (1.0105) Acc@1 75.293 (75.293) Acc@5 93.457 (93.457) [2022-10-01 02:31:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.938 Acc@5 93.856 [2022-10-01 02:31:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-01 02:31:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.94% [2022-10-01 02:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][0/1251] eta 0:56:58 lr 0.000397 time 2.7328 (2.7328) loss 4.1509 (4.1509) grad_norm 1.5564 (1.5564) [2022-10-01 02:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][100/1251] eta 0:06:03 lr 0.000397 time 0.2922 (0.3158) loss 2.3684 (3.3697) grad_norm 1.5353 (1.6254) [2022-10-01 02:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][200/1251] eta 0:05:18 lr 0.000396 time 0.3808 (0.3031) loss 3.2099 (3.4217) grad_norm 1.5005 (1.6457) [2022-10-01 02:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][300/1251] eta 0:04:44 lr 0.000396 time 0.2866 (0.2987) loss 3.5218 (3.4068) grad_norm 1.6101 (1.6397) [2022-10-01 02:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][400/1251] eta 0:04:12 lr 0.000395 time 0.2892 (0.2964) loss 3.8941 (3.4269) grad_norm 1.5894 (1.6367) [2022-10-01 02:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][500/1251] eta 0:03:41 lr 0.000395 time 0.2907 (0.2950) loss 3.5255 (3.4256) grad_norm 1.7237 (1.6311) [2022-10-01 02:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][600/1251] eta 0:03:11 lr 0.000395 time 0.2873 (0.2941) loss 2.5041 (3.4414) grad_norm 1.6755 (1.6337) [2022-10-01 02:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][700/1251] eta 0:02:41 lr 0.000394 time 0.3797 (0.2936) loss 3.7839 (3.4510) grad_norm 1.6313 (1.6338) [2022-10-01 02:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][800/1251] eta 0:02:12 lr 0.000394 time 0.2910 (0.2932) loss 3.8945 (3.4504) grad_norm 1.6633 (1.6395) [2022-10-01 02:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][900/1251] eta 0:01:42 lr 0.000393 time 0.2913 (0.2929) loss 3.9896 (3.4548) grad_norm 1.5367 (1.6425) [2022-10-01 02:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1000/1251] eta 0:01:13 lr 0.000393 time 0.2914 (0.2927) loss 3.0303 (3.4491) grad_norm 1.5811 (1.6429) [2022-10-01 02:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1100/1251] eta 0:00:44 lr 0.000393 time 0.2915 (0.2924) loss 2.8601 (3.4476) grad_norm 1.6317 (1.6478) [2022-10-01 02:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1200/1251] eta 0:00:14 lr 0.000392 time 0.3891 (0.2923) loss 4.1998 (3.4513) grad_norm 1.5828 (1.6487) [2022-10-01 02:37:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 171 training takes 0:06:05 [2022-10-01 02:37:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.798 (2.798) Loss 0.9326 (0.9326) Acc@1 77.539 (77.539) Acc@5 94.922 (94.922) [2022-10-01 02:37:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.872 Acc@5 93.868 [2022-10-01 02:37:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-01 02:37:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.94% [2022-10-01 02:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][0/1251] eta 0:52:50 lr 0.000392 time 2.5340 (2.5340) loss 2.5183 (2.5183) grad_norm 1.6396 (1.6396) [2022-10-01 02:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][100/1251] eta 0:06:01 lr 0.000392 time 0.2882 (0.3138) loss 3.4522 (3.3693) grad_norm 1.6537 (1.5772) [2022-10-01 02:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][200/1251] eta 0:05:16 lr 0.000391 time 0.2858 (0.3008) loss 3.2237 (3.3617) grad_norm 1.7985 (1.6110) [2022-10-01 02:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][300/1251] eta 0:04:41 lr 0.000391 time 0.2902 (0.2965) loss 4.0199 (3.3734) grad_norm 1.4845 (1.6257) [2022-10-01 02:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][400/1251] eta 0:04:10 lr 0.000390 time 0.3788 (0.2947) loss 3.9520 (3.3756) grad_norm 1.6112 (1.6296) [2022-10-01 02:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][500/1251] eta 0:03:40 lr 0.000390 time 0.2910 (0.2933) loss 3.4870 (3.3929) grad_norm 1.4964 (1.6297) [2022-10-01 02:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][600/1251] eta 0:03:10 lr 0.000390 time 0.2885 (0.2925) loss 3.5781 (3.4007) grad_norm 1.5765 (1.6340) [2022-10-01 02:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][700/1251] eta 0:02:40 lr 0.000389 time 0.2881 (0.2918) loss 4.1029 (3.4049) grad_norm 1.7901 (1.6380) [2022-10-01 02:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][800/1251] eta 0:02:11 lr 0.000389 time 0.2852 (0.2913) loss 3.9431 (3.4129) grad_norm 1.8461 (1.6401) [2022-10-01 02:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][900/1251] eta 0:01:42 lr 0.000388 time 0.3784 (0.2910) loss 3.8410 (3.4131) grad_norm 1.6718 (1.6406) [2022-10-01 02:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1000/1251] eta 0:01:12 lr 0.000388 time 0.2873 (0.2906) loss 3.7352 (3.4203) grad_norm 1.7167 (1.6427) [2022-10-01 02:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1100/1251] eta 0:00:43 lr 0.000388 time 0.2860 (0.2903) loss 3.6521 (3.4215) grad_norm 1.7246 (1.6446) [2022-10-01 02:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1200/1251] eta 0:00:14 lr 0.000387 time 0.2898 (0.2900) loss 4.2245 (3.4201) grad_norm 1.7177 (1.6475) [2022-10-01 02:43:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 172 training takes 0:06:03 [2022-10-01 02:43:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.012 (3.012) Loss 1.0141 (1.0141) Acc@1 76.562 (76.562) Acc@5 93.262 (93.262) [2022-10-01 02:43:41 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.870 Acc@5 93.850 [2022-10-01 02:43:41 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-01 02:43:41 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.94% [2022-10-01 02:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][0/1251] eta 1:05:57 lr 0.000387 time 3.1631 (3.1631) loss 4.0313 (4.0313) grad_norm 2.0195 (2.0195) [2022-10-01 02:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][100/1251] eta 0:06:07 lr 0.000387 time 0.3781 (0.3197) loss 3.0014 (3.4290) grad_norm 1.6011 (1.6496) [2022-10-01 02:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][200/1251] eta 0:05:20 lr 0.000386 time 0.2921 (0.3052) loss 2.6196 (3.4600) grad_norm 1.5116 (1.6504) [2022-10-01 02:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][300/1251] eta 0:04:45 lr 0.000386 time 0.2862 (0.3001) loss 3.6609 (3.4616) grad_norm 1.7928 (1.6561) [2022-10-01 02:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][400/1251] eta 0:04:13 lr 0.000385 time 0.2920 (0.2976) loss 3.7228 (3.4493) grad_norm 1.8803 (1.6594) [2022-10-01 02:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][500/1251] eta 0:03:42 lr 0.000385 time 0.2873 (0.2961) loss 3.4779 (3.4534) grad_norm 1.5255 (1.6608) [2022-10-01 02:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][600/1251] eta 0:03:12 lr 0.000385 time 0.3828 (0.2951) loss 3.2473 (3.4517) grad_norm 1.8008 (1.6620) [2022-10-01 02:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][700/1251] eta 0:02:42 lr 0.000384 time 0.2873 (0.2943) loss 2.7732 (3.4612) grad_norm 1.7439 (1.6637) [2022-10-01 02:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][800/1251] eta 0:02:12 lr 0.000384 time 0.2933 (0.2937) loss 3.6412 (3.4560) grad_norm 1.6777 (1.6635) [2022-10-01 02:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][900/1251] eta 0:01:42 lr 0.000383 time 0.2898 (0.2931) loss 2.5796 (3.4617) grad_norm 1.4891 (1.6656) [2022-10-01 02:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1000/1251] eta 0:01:13 lr 0.000383 time 0.2882 (0.2927) loss 3.9103 (3.4678) grad_norm 1.6188 (1.6634) [2022-10-01 02:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1100/1251] eta 0:00:44 lr 0.000383 time 0.3762 (0.2924) loss 2.8262 (3.4623) grad_norm 1.6405 (1.6639) [2022-10-01 02:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1200/1251] eta 0:00:14 lr 0.000382 time 0.2946 (0.2921) loss 3.4295 (3.4583) grad_norm 1.6742 (1.6606) [2022-10-01 02:49:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 173 training takes 0:06:05 [2022-10-01 02:49:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.554 (2.554) Loss 0.9693 (0.9693) Acc@1 77.246 (77.246) Acc@5 94.238 (94.238) [2022-10-01 02:49:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.136 Acc@5 93.882 [2022-10-01 02:49:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-01 02:49:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.14% [2022-10-01 02:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][0/1251] eta 1:04:26 lr 0.000382 time 3.0911 (3.0911) loss 3.0359 (3.0359) grad_norm 1.5930 (1.5930) [2022-10-01 02:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][100/1251] eta 0:06:05 lr 0.000381 time 0.2934 (0.3172) loss 3.4496 (3.4235) grad_norm 1.8011 (1.6610) [2022-10-01 02:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][200/1251] eta 0:05:18 lr 0.000381 time 0.2859 (0.3032) loss 4.0189 (3.4530) grad_norm 1.5925 (1.6721) [2022-10-01 02:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][300/1251] eta 0:04:44 lr 0.000381 time 0.3824 (0.2987) loss 2.5391 (3.4417) grad_norm 1.5837 (1.6721) [2022-10-01 02:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][400/1251] eta 0:04:12 lr 0.000380 time 0.2878 (0.2963) loss 3.8825 (3.4549) grad_norm 1.4895 (1.6648) [2022-10-01 02:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][500/1251] eta 0:03:41 lr 0.000380 time 0.2879 (0.2949) loss 3.3604 (3.4639) grad_norm 2.0737 (1.6650) [2022-10-01 02:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][600/1251] eta 0:03:11 lr 0.000379 time 0.2876 (0.2938) loss 3.2875 (3.4547) grad_norm 1.6609 (1.6664) [2022-10-01 02:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][700/1251] eta 0:02:41 lr 0.000379 time 0.2882 (0.2931) loss 2.2741 (3.4420) grad_norm 1.7849 (1.6644) [2022-10-01 02:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][800/1251] eta 0:02:12 lr 0.000379 time 0.3841 (0.2928) loss 2.7957 (3.4397) grad_norm 1.6190 (1.6640) [2022-10-01 02:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][900/1251] eta 0:01:42 lr 0.000378 time 0.2908 (0.2924) loss 3.9534 (3.4459) grad_norm 1.6160 (1.6642) [2022-10-01 02:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1000/1251] eta 0:01:13 lr 0.000378 time 0.2919 (0.2921) loss 3.5441 (3.4445) grad_norm 1.6866 (1.6633) [2022-10-01 02:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1100/1251] eta 0:00:44 lr 0.000377 time 0.2878 (0.2918) loss 2.6921 (3.4447) grad_norm 1.5581 (1.6603) [2022-10-01 02:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1200/1251] eta 0:00:14 lr 0.000377 time 0.2887 (0.2915) loss 3.3371 (3.4475) grad_norm 1.6455 (1.6615) [2022-10-01 02:56:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 174 training takes 0:06:04 [2022-10-01 02:56:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.594 (2.594) Loss 0.9797 (0.9797) Acc@1 77.344 (77.344) Acc@5 93.750 (93.750) [2022-10-01 02:56:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.174 Acc@5 93.826 [2022-10-01 02:56:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-10-01 02:56:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-01 02:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][0/1251] eta 1:02:11 lr 0.000377 time 2.9831 (2.9831) loss 3.9649 (3.9649) grad_norm 1.4530 (1.4530) [2022-10-01 02:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][100/1251] eta 0:06:05 lr 0.000376 time 0.2886 (0.3175) loss 3.8544 (3.5053) grad_norm 1.7730 (1.6657) [2022-10-01 02:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][200/1251] eta 0:05:18 lr 0.000376 time 0.2884 (0.3034) loss 3.8750 (3.4503) grad_norm 1.9484 (1.6734) [2022-10-01 02:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][300/1251] eta 0:04:43 lr 0.000376 time 0.2927 (0.2986) loss 3.3631 (3.4426) grad_norm 2.0296 (1.6756) [2022-10-01 02:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][400/1251] eta 0:04:12 lr 0.000375 time 0.2847 (0.2962) loss 2.7530 (3.4315) grad_norm 1.6054 (1.6740) [2022-10-01 02:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][500/1251] eta 0:03:41 lr 0.000375 time 0.3860 (0.2951) loss 3.5536 (3.4392) grad_norm 1.5547 (1.6722) [2022-10-01 02:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][600/1251] eta 0:03:11 lr 0.000374 time 0.2909 (0.2941) loss 3.2561 (3.4465) grad_norm 1.4638 (1.6759) [2022-10-01 02:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][700/1251] eta 0:02:41 lr 0.000374 time 0.2885 (0.2934) loss 3.3833 (3.4345) grad_norm 1.5935 (1.6828) [2022-10-01 03:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][800/1251] eta 0:02:12 lr 0.000374 time 0.2864 (0.2929) loss 3.5893 (3.4302) grad_norm 1.9655 (1.6863) [2022-10-01 03:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][900/1251] eta 0:01:42 lr 0.000373 time 0.2884 (0.2925) loss 4.0309 (3.4330) grad_norm 2.0527 (1.6891) [2022-10-01 03:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1000/1251] eta 0:01:13 lr 0.000373 time 0.3760 (0.2922) loss 3.7874 (3.4267) grad_norm 1.5737 (1.6916) [2022-10-01 03:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1100/1251] eta 0:00:44 lr 0.000372 time 0.2884 (0.2918) loss 3.6081 (3.4291) grad_norm 1.7871 (1.6892) [2022-10-01 03:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1200/1251] eta 0:00:14 lr 0.000372 time 0.2851 (0.2915) loss 3.6300 (3.4261) grad_norm 1.7634 (1.6908) [2022-10-01 03:02:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 175 training takes 0:06:04 [2022-10-01 03:02:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.303 (3.303) Loss 0.9109 (0.9109) Acc@1 78.906 (78.906) Acc@5 93.848 (93.848) [2022-10-01 03:02:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.970 Acc@5 93.868 [2022-10-01 03:02:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-01 03:02:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-01 03:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][0/1251] eta 0:56:50 lr 0.000372 time 2.7263 (2.7263) loss 3.3639 (3.3639) grad_norm 1.6423 (1.6423) [2022-10-01 03:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][100/1251] eta 0:06:05 lr 0.000371 time 0.2866 (0.3172) loss 4.0208 (3.4698) grad_norm 1.8096 (1.6972) [2022-10-01 03:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][200/1251] eta 0:05:20 lr 0.000371 time 0.3831 (0.3045) loss 3.6223 (3.4112) grad_norm 1.6390 (1.6823) [2022-10-01 03:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][300/1251] eta 0:04:45 lr 0.000371 time 0.2874 (0.3002) loss 3.5502 (3.4202) grad_norm 1.6493 (1.6948) [2022-10-01 03:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][400/1251] eta 0:04:13 lr 0.000370 time 0.2910 (0.2981) loss 3.1714 (3.4498) grad_norm 2.0559 (1.7001) [2022-10-01 03:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][500/1251] eta 0:03:42 lr 0.000370 time 0.2885 (0.2966) loss 3.2525 (3.4636) grad_norm 1.5611 (1.7023) [2022-10-01 03:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][600/1251] eta 0:03:12 lr 0.000369 time 0.2880 (0.2957) loss 3.1107 (3.4560) grad_norm 1.6637 (1.7064) [2022-10-01 03:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][700/1251] eta 0:02:42 lr 0.000369 time 0.3788 (0.2951) loss 2.4223 (3.4467) grad_norm 1.9463 (1.7101) [2022-10-01 03:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][800/1251] eta 0:02:12 lr 0.000369 time 0.2866 (0.2945) loss 2.2805 (3.4425) grad_norm 1.7784 (1.7058) [2022-10-01 03:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][900/1251] eta 0:01:43 lr 0.000368 time 0.2880 (0.2940) loss 3.2523 (3.4332) grad_norm 1.6282 (1.7055) [2022-10-01 03:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1000/1251] eta 0:01:13 lr 0.000368 time 0.2866 (0.2936) loss 3.7616 (3.4346) grad_norm 1.5971 (1.7030) [2022-10-01 03:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1100/1251] eta 0:00:44 lr 0.000368 time 0.2865 (0.2932) loss 3.3309 (3.4349) grad_norm 1.7465 (1.7069) [2022-10-01 03:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1200/1251] eta 0:00:14 lr 0.000367 time 0.3760 (0.2931) loss 2.7543 (3.4307) grad_norm 1.8801 (1.7083) [2022-10-01 03:08:42 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 176 training takes 0:06:06 [2022-10-01 03:08:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.248 (2.248) Loss 1.0457 (1.0457) Acc@1 77.246 (77.246) Acc@5 92.578 (92.578) [2022-10-01 03:08:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.988 Acc@5 93.832 [2022-10-01 03:08:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-01 03:08:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-01 03:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][0/1251] eta 1:07:29 lr 0.000367 time 3.2367 (3.2367) loss 4.3660 (4.3660) grad_norm 1.9928 (1.9928) [2022-10-01 03:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][100/1251] eta 0:06:08 lr 0.000367 time 0.2901 (0.3205) loss 3.2281 (3.3753) grad_norm 1.4977 (1.6960) [2022-10-01 03:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][200/1251] eta 0:05:21 lr 0.000366 time 0.2974 (0.3060) loss 3.5786 (3.4004) grad_norm 1.6224 (1.7021) [2022-10-01 03:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][300/1251] eta 0:04:46 lr 0.000366 time 0.2880 (0.3009) loss 3.7523 (3.4447) grad_norm 1.7818 (1.7331) [2022-10-01 03:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][400/1251] eta 0:04:14 lr 0.000365 time 0.3855 (0.2986) loss 3.8733 (3.3970) grad_norm 1.7203 (1.7371) [2022-10-01 03:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][500/1251] eta 0:03:43 lr 0.000365 time 0.2879 (0.2970) loss 4.0148 (3.3961) grad_norm 1.6098 (1.7243) [2022-10-01 03:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][600/1251] eta 0:03:12 lr 0.000365 time 0.2870 (0.2957) loss 4.1931 (3.4245) grad_norm 1.8028 (1.7238) [2022-10-01 03:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][700/1251] eta 0:02:42 lr 0.000364 time 0.2922 (0.2950) loss 2.2797 (3.4313) grad_norm 1.7362 (1.7318) [2022-10-01 03:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][800/1251] eta 0:02:12 lr 0.000364 time 0.2928 (0.2945) loss 3.5481 (3.4288) grad_norm 1.5985 (1.7300) [2022-10-01 03:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][900/1251] eta 0:01:43 lr 0.000363 time 0.3847 (0.2941) loss 3.9846 (3.4250) grad_norm 2.0231 (1.7310) [2022-10-01 03:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1000/1251] eta 0:01:13 lr 0.000363 time 0.2941 (0.2938) loss 4.3411 (3.4267) grad_norm 1.7679 (1.7343) [2022-10-01 03:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1100/1251] eta 0:00:44 lr 0.000363 time 0.2891 (0.2935) loss 2.7051 (3.4263) grad_norm 1.5780 (1.7300) [2022-10-01 03:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1200/1251] eta 0:00:14 lr 0.000362 time 0.2932 (0.2933) loss 3.4709 (3.4207) grad_norm 1.9239 (1.7300) [2022-10-01 03:15:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 177 training takes 0:06:07 [2022-10-01 03:15:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.282 (3.282) Loss 1.0024 (1.0024) Acc@1 76.953 (76.953) Acc@5 93.945 (93.945) [2022-10-01 03:15:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.376 Acc@5 94.070 [2022-10-01 03:15:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-01 03:15:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.38% [2022-10-01 03:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][0/1251] eta 0:47:43 lr 0.000362 time 2.2886 (2.2886) loss 2.5125 (2.5125) grad_norm 2.0942 (2.0942) [2022-10-01 03:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][100/1251] eta 0:06:03 lr 0.000362 time 0.3858 (0.3156) loss 4.2272 (3.4365) grad_norm 2.5825 (1.7013) [2022-10-01 03:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][200/1251] eta 0:05:18 lr 0.000361 time 0.2923 (0.3030) loss 3.6470 (3.4581) grad_norm 1.7224 (1.7260) [2022-10-01 03:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][300/1251] eta 0:04:43 lr 0.000361 time 0.2873 (0.2986) loss 3.7894 (3.4473) grad_norm 1.5812 (1.7095) [2022-10-01 03:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][400/1251] eta 0:04:12 lr 0.000360 time 0.2918 (0.2965) loss 3.1154 (3.4395) grad_norm 1.5860 (1.7149) [2022-10-01 03:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][500/1251] eta 0:03:41 lr 0.000360 time 0.2885 (0.2952) loss 2.4313 (3.4344) grad_norm 1.7083 (1.7192) [2022-10-01 03:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][600/1251] eta 0:03:11 lr 0.000360 time 0.3855 (0.2945) loss 3.7060 (3.4245) grad_norm 1.5258 (1.7254) [2022-10-01 03:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][700/1251] eta 0:02:41 lr 0.000359 time 0.2923 (0.2939) loss 3.0792 (3.4263) grad_norm 1.7718 (1.7239) [2022-10-01 03:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][800/1251] eta 0:02:12 lr 0.000359 time 0.2943 (0.2934) loss 3.1722 (3.4369) grad_norm 1.8370 (1.7178) [2022-10-01 03:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][900/1251] eta 0:01:42 lr 0.000358 time 0.2882 (0.2930) loss 3.7485 (3.4439) grad_norm 1.8464 (1.7162) [2022-10-01 03:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1000/1251] eta 0:01:13 lr 0.000358 time 0.2924 (0.2927) loss 2.6042 (3.4322) grad_norm 1.7265 (1.7198) [2022-10-01 03:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1100/1251] eta 0:00:44 lr 0.000358 time 0.3741 (0.2926) loss 3.9115 (3.4359) grad_norm 1.7428 (1.7209) [2022-10-01 03:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1200/1251] eta 0:00:14 lr 0.000357 time 0.2900 (0.2923) loss 3.7650 (3.4331) grad_norm 2.1493 (1.7242) [2022-10-01 03:21:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 178 training takes 0:06:05 [2022-10-01 03:21:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.943 (2.943) Loss 0.9164 (0.9164) Acc@1 77.344 (77.344) Acc@5 94.141 (94.141) [2022-10-01 03:21:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.102 Acc@5 93.874 [2022-10-01 03:21:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-01 03:21:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.38% [2022-10-01 03:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][0/1251] eta 0:57:19 lr 0.000357 time 2.7493 (2.7493) loss 3.6314 (3.6314) grad_norm 2.2569 (2.2569) [2022-10-01 03:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][100/1251] eta 0:06:02 lr 0.000357 time 0.2856 (0.3151) loss 3.0889 (3.3946) grad_norm 1.7904 (1.7217) [2022-10-01 03:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][200/1251] eta 0:05:17 lr 0.000356 time 0.2893 (0.3019) loss 2.5636 (3.4170) grad_norm 1.6831 (1.6943) [2022-10-01 03:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][300/1251] eta 0:04:43 lr 0.000356 time 0.3845 (0.2978) loss 2.6928 (3.4043) grad_norm 1.6668 (1.7016) [2022-10-01 03:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][400/1251] eta 0:04:11 lr 0.000355 time 0.2864 (0.2955) loss 2.9925 (3.4009) grad_norm 1.6219 (1.7017) [2022-10-01 03:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][500/1251] eta 0:03:40 lr 0.000355 time 0.2905 (0.2940) loss 3.7305 (3.4000) grad_norm 2.0127 (1.7024) [2022-10-01 03:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][600/1251] eta 0:03:10 lr 0.000355 time 0.2867 (0.2931) loss 3.4870 (3.4087) grad_norm 1.5938 (1.7049) [2022-10-01 03:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][700/1251] eta 0:02:41 lr 0.000354 time 0.2858 (0.2924) loss 3.2292 (3.4143) grad_norm 1.6374 (1.7114) [2022-10-01 03:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][800/1251] eta 0:02:11 lr 0.000354 time 0.3806 (0.2919) loss 3.7624 (3.4115) grad_norm 1.8432 (1.7130) [2022-10-01 03:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][900/1251] eta 0:01:42 lr 0.000353 time 0.2887 (0.2914) loss 3.8939 (3.4040) grad_norm 1.7047 (1.7177) [2022-10-01 03:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1000/1251] eta 0:01:13 lr 0.000353 time 0.2861 (0.2910) loss 3.5628 (3.4047) grad_norm 1.5711 (1.7199) [2022-10-01 03:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1100/1251] eta 0:00:43 lr 0.000353 time 0.2882 (0.2907) loss 4.2116 (3.4018) grad_norm 1.8898 (1.7225) [2022-10-01 03:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1200/1251] eta 0:00:14 lr 0.000352 time 0.2866 (0.2904) loss 3.7255 (3.4007) grad_norm 1.5135 (1.7209) [2022-10-01 03:27:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 179 training takes 0:06:03 [2022-10-01 03:27:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.404 (2.404) Loss 0.9245 (0.9245) Acc@1 76.953 (76.953) Acc@5 94.531 (94.531) [2022-10-01 03:27:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.114 Acc@5 94.036 [2022-10-01 03:27:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-01 03:27:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.38% [2022-10-01 03:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][0/1251] eta 1:01:14 lr 0.000352 time 2.9371 (2.9371) loss 3.2430 (3.2430) grad_norm 1.6214 (1.6214) [2022-10-01 03:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][100/1251] eta 0:06:05 lr 0.000352 time 0.2859 (0.3177) loss 2.9322 (3.3861) grad_norm 2.1445 (1.7681) [2022-10-01 03:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][200/1251] eta 0:05:19 lr 0.000351 time 0.2951 (0.3037) loss 3.2813 (3.3813) grad_norm 1.6815 (1.7587) [2022-10-01 03:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][300/1251] eta 0:04:44 lr 0.000351 time 0.2859 (0.2990) loss 3.6699 (3.3757) grad_norm 2.2133 (1.7405) [2022-10-01 03:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][400/1251] eta 0:04:12 lr 0.000350 time 0.2937 (0.2966) loss 3.7433 (3.3884) grad_norm 1.6341 (1.7362) [2022-10-01 03:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][500/1251] eta 0:03:41 lr 0.000350 time 0.3824 (0.2954) loss 2.3810 (3.3847) grad_norm 1.7432 (1.7387) [2022-10-01 03:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][600/1251] eta 0:03:11 lr 0.000350 time 0.2897 (0.2947) loss 3.0126 (3.3935) grad_norm 1.7290 (1.7395) [2022-10-01 03:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][700/1251] eta 0:02:42 lr 0.000349 time 0.2893 (0.2941) loss 3.9242 (3.4002) grad_norm 2.2383 (1.7370) [2022-10-01 03:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][800/1251] eta 0:02:12 lr 0.000349 time 0.2902 (0.2936) loss 3.3749 (3.4065) grad_norm 1.8731 (1.7347) [2022-10-01 03:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][900/1251] eta 0:01:42 lr 0.000348 time 0.2882 (0.2932) loss 3.9545 (3.4113) grad_norm 1.6192 (1.7360) [2022-10-01 03:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1000/1251] eta 0:01:13 lr 0.000348 time 0.3851 (0.2929) loss 3.4514 (3.4062) grad_norm 1.6748 (1.7357) [2022-10-01 03:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1100/1251] eta 0:00:44 lr 0.000348 time 0.2932 (0.2927) loss 3.6863 (3.4094) grad_norm 1.5946 (1.7373) [2022-10-01 03:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1200/1251] eta 0:00:14 lr 0.000347 time 0.2898 (0.2924) loss 4.0618 (3.4087) grad_norm 1.5077 (1.7382) [2022-10-01 03:33:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 180 training takes 0:06:06 [2022-10-01 03:33:56 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saving...... [2022-10-01 03:33:56 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saved !!! [2022-10-01 03:33:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.796 (2.796) Loss 0.9589 (0.9589) Acc@1 78.809 (78.809) Acc@5 93.457 (93.457) [2022-10-01 03:34:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.232 Acc@5 94.026 [2022-10-01 03:34:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-10-01 03:34:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.38% [2022-10-01 03:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][0/1251] eta 1:10:20 lr 0.000347 time 3.3735 (3.3735) loss 3.8234 (3.8234) grad_norm 1.5865 (1.5865) [2022-10-01 03:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][100/1251] eta 0:06:08 lr 0.000347 time 0.2881 (0.3200) loss 2.8979 (3.5043) grad_norm 1.5687 (1.7607) [2022-10-01 03:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][200/1251] eta 0:05:20 lr 0.000346 time 0.3821 (0.3049) loss 3.9926 (3.4858) grad_norm 1.8038 (1.7655) [2022-10-01 03:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][300/1251] eta 0:04:44 lr 0.000346 time 0.2872 (0.2994) loss 2.4596 (3.4595) grad_norm 1.5882 (1.7493) [2022-10-01 03:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][400/1251] eta 0:04:12 lr 0.000346 time 0.2918 (0.2968) loss 3.0833 (3.4294) grad_norm 1.6194 (1.7738) [2022-10-01 03:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][500/1251] eta 0:03:41 lr 0.000345 time 0.2874 (0.2951) loss 3.4972 (3.4430) grad_norm 1.8683 (1.7831) [2022-10-01 03:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][600/1251] eta 0:03:11 lr 0.000345 time 0.2907 (0.2940) loss 3.6508 (3.4378) grad_norm 1.5758 (1.7800) [2022-10-01 03:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][700/1251] eta 0:02:41 lr 0.000344 time 0.3748 (0.2933) loss 3.8179 (3.4364) grad_norm 1.7389 (1.7780) [2022-10-01 03:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][800/1251] eta 0:02:12 lr 0.000344 time 0.2893 (0.2927) loss 3.6410 (3.4262) grad_norm 1.7538 (1.7754) [2022-10-01 03:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][900/1251] eta 0:01:42 lr 0.000344 time 0.2866 (0.2923) loss 3.5947 (3.4205) grad_norm 1.8047 (1.7728) [2022-10-01 03:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1000/1251] eta 0:01:13 lr 0.000343 time 0.2884 (0.2919) loss 2.9863 (3.4144) grad_norm 2.5793 (1.7735) [2022-10-01 03:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1100/1251] eta 0:00:44 lr 0.000343 time 0.2860 (0.2916) loss 2.8762 (3.4171) grad_norm 1.7243 (1.7687) [2022-10-01 03:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1200/1251] eta 0:00:14 lr 0.000342 time 0.3850 (0.2914) loss 3.0019 (3.4166) grad_norm 1.9874 (1.7688) [2022-10-01 03:40:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 181 training takes 0:06:04 [2022-10-01 03:40:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.737 (2.737) Loss 0.9309 (0.9309) Acc@1 78.223 (78.223) Acc@5 93.652 (93.652) [2022-10-01 03:40:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.554 Acc@5 94.068 [2022-10-01 03:40:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-01 03:40:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.55% [2022-10-01 03:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][0/1251] eta 0:49:20 lr 0.000342 time 2.3663 (2.3663) loss 2.5747 (2.5747) grad_norm 1.6729 (1.6729) [2022-10-01 03:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][100/1251] eta 0:06:05 lr 0.000342 time 0.2927 (0.3176) loss 3.6626 (3.3489) grad_norm 1.6332 (1.7304) [2022-10-01 03:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][200/1251] eta 0:05:19 lr 0.000341 time 0.2906 (0.3040) loss 3.6504 (3.3628) grad_norm 1.8590 (1.7583) [2022-10-01 03:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][300/1251] eta 0:04:44 lr 0.000341 time 0.2925 (0.2995) loss 4.0616 (3.3840) grad_norm 1.6482 (1.7583) [2022-10-01 03:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][400/1251] eta 0:04:13 lr 0.000341 time 0.3777 (0.2974) loss 3.8798 (3.4236) grad_norm 1.7381 (1.7581) [2022-10-01 03:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][500/1251] eta 0:03:42 lr 0.000340 time 0.2917 (0.2959) loss 3.6836 (3.4181) grad_norm 1.7562 (1.7550) [2022-10-01 03:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][600/1251] eta 0:03:11 lr 0.000340 time 0.2943 (0.2949) loss 3.4216 (3.4198) grad_norm 1.6438 (1.7557) [2022-10-01 03:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][700/1251] eta 0:02:42 lr 0.000339 time 0.2980 (0.2942) loss 3.5314 (3.4198) grad_norm 1.7838 (1.7594) [2022-10-01 03:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][800/1251] eta 0:02:12 lr 0.000339 time 0.2944 (0.2937) loss 2.4598 (3.4024) grad_norm 1.7223 (1.7576) [2022-10-01 03:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][900/1251] eta 0:01:42 lr 0.000339 time 0.3884 (0.2934) loss 3.5814 (3.4009) grad_norm 2.0146 (1.7561) [2022-10-01 03:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1000/1251] eta 0:01:13 lr 0.000338 time 0.2872 (0.2930) loss 3.5631 (3.4075) grad_norm 1.5618 (1.7556) [2022-10-01 03:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1100/1251] eta 0:00:44 lr 0.000338 time 0.2967 (0.2927) loss 3.3315 (3.4149) grad_norm 1.8170 (1.7571) [2022-10-01 03:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1200/1251] eta 0:00:14 lr 0.000338 time 0.2842 (0.2924) loss 3.3269 (3.4103) grad_norm 1.9738 (1.7571) [2022-10-01 03:46:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 182 training takes 0:06:05 [2022-10-01 03:46:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.834 (2.834) Loss 0.9825 (0.9825) Acc@1 76.660 (76.660) Acc@5 94.336 (94.336) [2022-10-01 03:46:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.304 Acc@5 94.070 [2022-10-01 03:46:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-01 03:46:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.55% [2022-10-01 03:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][0/1251] eta 1:08:20 lr 0.000337 time 3.2776 (3.2776) loss 3.0582 (3.0582) grad_norm 1.7234 (1.7234) [2022-10-01 03:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][100/1251] eta 0:06:07 lr 0.000337 time 0.3811 (0.3191) loss 3.0309 (3.3556) grad_norm 1.9302 (1.7851) [2022-10-01 03:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][200/1251] eta 0:05:19 lr 0.000337 time 0.2893 (0.3041) loss 3.6977 (3.4494) grad_norm 1.9090 (1.7670) [2022-10-01 03:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][300/1251] eta 0:04:44 lr 0.000336 time 0.2866 (0.2990) loss 3.1559 (3.4416) grad_norm 1.7165 (1.7584) [2022-10-01 03:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][400/1251] eta 0:04:12 lr 0.000336 time 0.2862 (0.2966) loss 3.8231 (3.4254) grad_norm 1.6968 (1.7619) [2022-10-01 03:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][500/1251] eta 0:03:41 lr 0.000335 time 0.2890 (0.2954) loss 3.6799 (3.4029) grad_norm 1.7831 (1.7622) [2022-10-01 03:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][600/1251] eta 0:03:11 lr 0.000335 time 0.3817 (0.2947) loss 3.7921 (3.4079) grad_norm 1.6395 (1.7579) [2022-10-01 03:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][700/1251] eta 0:02:42 lr 0.000335 time 0.2887 (0.2940) loss 3.6538 (3.4162) grad_norm 1.6420 (1.7543) [2022-10-01 03:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][800/1251] eta 0:02:12 lr 0.000334 time 0.2887 (0.2935) loss 3.7539 (3.4108) grad_norm 1.9951 (1.7575) [2022-10-01 03:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][900/1251] eta 0:01:42 lr 0.000334 time 0.2884 (0.2931) loss 3.3452 (3.4104) grad_norm 1.5309 (1.7608) [2022-10-01 03:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1000/1251] eta 0:01:13 lr 0.000333 time 0.2895 (0.2928) loss 2.5099 (3.4113) grad_norm 1.6998 (1.7589) [2022-10-01 03:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1100/1251] eta 0:00:44 lr 0.000333 time 0.3871 (0.2926) loss 3.4651 (3.4124) grad_norm 2.0119 (1.7612) [2022-10-01 03:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1200/1251] eta 0:00:14 lr 0.000333 time 0.2898 (0.2923) loss 3.6594 (3.4025) grad_norm 1.6958 (1.7648) [2022-10-01 03:52:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 183 training takes 0:06:05 [2022-10-01 03:52:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.195 (3.195) Loss 0.9267 (0.9267) Acc@1 77.344 (77.344) Acc@5 94.336 (94.336) [2022-10-01 03:53:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.402 Acc@5 94.006 [2022-10-01 03:53:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-01 03:53:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.55% [2022-10-01 03:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][0/1251] eta 0:52:59 lr 0.000332 time 2.5418 (2.5418) loss 2.6470 (2.6470) grad_norm 1.7579 (1.7579) [2022-10-01 03:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][100/1251] eta 0:06:04 lr 0.000332 time 0.2985 (0.3167) loss 2.1166 (3.3602) grad_norm 1.5798 (1.7803) [2022-10-01 03:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][200/1251] eta 0:05:19 lr 0.000332 time 0.2915 (0.3039) loss 2.9901 (3.3848) grad_norm 1.6092 (1.8033) [2022-10-01 03:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][300/1251] eta 0:04:45 lr 0.000331 time 0.3862 (0.2999) loss 3.1363 (3.4161) grad_norm 2.0103 (1.8041) [2022-10-01 03:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][400/1251] eta 0:04:13 lr 0.000331 time 0.2885 (0.2978) loss 3.6147 (3.4211) grad_norm 1.6468 (1.7889) [2022-10-01 03:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][500/1251] eta 0:03:42 lr 0.000331 time 0.2945 (0.2964) loss 3.4162 (3.4242) grad_norm 1.9641 (1.7887) [2022-10-01 03:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][600/1251] eta 0:03:12 lr 0.000330 time 0.2917 (0.2954) loss 3.5595 (3.4366) grad_norm 1.6770 (1.7801) [2022-10-01 03:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][700/1251] eta 0:02:42 lr 0.000330 time 0.2939 (0.2946) loss 3.9354 (3.4355) grad_norm 1.7906 (1.7769) [2022-10-01 03:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][800/1251] eta 0:02:12 lr 0.000329 time 0.3763 (0.2942) loss 3.6431 (3.4318) grad_norm 1.7920 (1.7851) [2022-10-01 03:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][900/1251] eta 0:01:43 lr 0.000329 time 0.2989 (0.2938) loss 3.8281 (3.4276) grad_norm 1.5794 (1.7809) [2022-10-01 03:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1000/1251] eta 0:01:13 lr 0.000329 time 0.2886 (0.2933) loss 2.4413 (3.4172) grad_norm 1.5976 (1.7771) [2022-10-01 03:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1100/1251] eta 0:00:44 lr 0.000328 time 0.2934 (0.2930) loss 3.0708 (3.4159) grad_norm 1.5568 (1.7740) [2022-10-01 03:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1200/1251] eta 0:00:14 lr 0.000328 time 0.2865 (0.2926) loss 3.5200 (3.4194) grad_norm 1.7654 (1.7717) [2022-10-01 03:59:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 184 training takes 0:06:06 [2022-10-01 03:59:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.284 (3.284) Loss 1.0022 (1.0022) Acc@1 77.051 (77.051) Acc@5 93.066 (93.066) [2022-10-01 03:59:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.688 Acc@5 93.976 [2022-10-01 03:59:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-01 03:59:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.69% [2022-10-01 03:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][0/1251] eta 1:02:14 lr 0.000328 time 2.9856 (2.9856) loss 3.3017 (3.3017) grad_norm 1.7506 (1.7506) [2022-10-01 03:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][100/1251] eta 0:06:08 lr 0.000327 time 0.2925 (0.3201) loss 2.3240 (3.3633) grad_norm 1.8324 (1.7988) [2022-10-01 04:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][200/1251] eta 0:05:20 lr 0.000327 time 0.2898 (0.3050) loss 3.4086 (3.3635) grad_norm 1.9140 (1.7968) [2022-10-01 04:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][300/1251] eta 0:04:45 lr 0.000326 time 0.2916 (0.2999) loss 3.1810 (3.3826) grad_norm 1.7793 (1.7952) [2022-10-01 04:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][400/1251] eta 0:04:13 lr 0.000326 time 0.2925 (0.2974) loss 3.6700 (3.3814) grad_norm 1.6085 (1.8007) [2022-10-01 04:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][500/1251] eta 0:03:42 lr 0.000326 time 0.3833 (0.2960) loss 3.4465 (3.3783) grad_norm 1.6980 (1.8018) [2022-10-01 04:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][600/1251] eta 0:03:11 lr 0.000325 time 0.2891 (0.2948) loss 3.3539 (3.3786) grad_norm 1.5728 (1.8026) [2022-10-01 04:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][700/1251] eta 0:02:42 lr 0.000325 time 0.2871 (0.2943) loss 3.2463 (3.3934) grad_norm 1.6741 (1.7994) [2022-10-01 04:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][800/1251] eta 0:02:12 lr 0.000325 time 0.2882 (0.2943) loss 3.5723 (3.3975) grad_norm 1.7905 (1.8017) [2022-10-01 04:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][900/1251] eta 0:01:43 lr 0.000324 time 0.2925 (0.2936) loss 3.3958 (3.3931) grad_norm 1.5295 (1.8005) [2022-10-01 04:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1000/1251] eta 0:01:13 lr 0.000324 time 0.3785 (0.2931) loss 3.4043 (3.3973) grad_norm 1.8543 (1.7999) [2022-10-01 04:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1100/1251] eta 0:00:44 lr 0.000323 time 0.2891 (0.2929) loss 2.2053 (3.3967) grad_norm 1.8665 (1.7967) [2022-10-01 04:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1200/1251] eta 0:00:14 lr 0.000323 time 0.2904 (0.2925) loss 3.7833 (3.3976) grad_norm 1.6204 (1.7967) [2022-10-01 04:05:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 185 training takes 0:06:06 [2022-10-01 04:05:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.638 (2.638) Loss 0.8527 (0.8527) Acc@1 79.590 (79.590) Acc@5 95.508 (95.508) [2022-10-01 04:05:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.310 Acc@5 93.964 [2022-10-01 04:05:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-01 04:05:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.69% [2022-10-01 04:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][0/1251] eta 1:03:51 lr 0.000323 time 3.0627 (3.0627) loss 2.5346 (2.5346) grad_norm 1.5927 (1.5927) [2022-10-01 04:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][100/1251] eta 0:06:07 lr 0.000322 time 0.2876 (0.3192) loss 2.3236 (3.3954) grad_norm 1.5241 (1.7335) [2022-10-01 04:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][200/1251] eta 0:05:20 lr 0.000322 time 0.3857 (0.3054) loss 3.4654 (3.3845) grad_norm 1.5933 (1.7632) [2022-10-01 04:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][300/1251] eta 0:04:45 lr 0.000322 time 0.2919 (0.3006) loss 3.5713 (3.3870) grad_norm 1.7965 (1.7791) [2022-10-01 04:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][400/1251] eta 0:04:13 lr 0.000321 time 0.2886 (0.2982) loss 3.7367 (3.4052) grad_norm 1.9535 (1.7710) [2022-10-01 04:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][500/1251] eta 0:03:42 lr 0.000321 time 0.2891 (0.2968) loss 2.9825 (3.3948) grad_norm 1.7328 (1.7677) [2022-10-01 04:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][600/1251] eta 0:03:12 lr 0.000320 time 0.2902 (0.2958) loss 3.7633 (3.3891) grad_norm 1.5741 (1.7633) [2022-10-01 04:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][700/1251] eta 0:02:42 lr 0.000320 time 0.3872 (0.2952) loss 3.7886 (3.3851) grad_norm 1.7398 (1.7657) [2022-10-01 04:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][800/1251] eta 0:02:12 lr 0.000320 time 0.2902 (0.2948) loss 4.0052 (3.3910) grad_norm 1.7706 (1.7697) [2022-10-01 04:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][900/1251] eta 0:01:43 lr 0.000319 time 0.2883 (0.2943) loss 3.4913 (3.3933) grad_norm 1.5848 (1.7670) [2022-10-01 04:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1000/1251] eta 0:01:13 lr 0.000319 time 0.2906 (0.2940) loss 2.6168 (3.3845) grad_norm 1.7650 (1.7648) [2022-10-01 04:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1100/1251] eta 0:00:44 lr 0.000319 time 0.2906 (0.2936) loss 3.7175 (3.3783) grad_norm 1.6948 (1.7626) [2022-10-01 04:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1200/1251] eta 0:00:14 lr 0.000318 time 0.3800 (0.2935) loss 3.6856 (3.3740) grad_norm 1.8695 (1.7679) [2022-10-01 04:11:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 186 training takes 0:06:07 [2022-10-01 04:11:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.862 (2.862) Loss 0.8981 (0.8981) Acc@1 78.613 (78.613) Acc@5 95.508 (95.508) [2022-10-01 04:12:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.596 Acc@5 94.094 [2022-10-01 04:12:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-01 04:12:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.69% [2022-10-01 04:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][0/1251] eta 1:10:59 lr 0.000318 time 3.4047 (3.4047) loss 3.2715 (3.2715) grad_norm 1.5485 (1.5485) [2022-10-01 04:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][100/1251] eta 0:06:08 lr 0.000318 time 0.2886 (0.3200) loss 3.8722 (3.3892) grad_norm 1.6186 (1.8032) [2022-10-01 04:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][200/1251] eta 0:05:20 lr 0.000317 time 0.2911 (0.3046) loss 3.8322 (3.3944) grad_norm 1.7816 (1.7930) [2022-10-01 04:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][300/1251] eta 0:04:44 lr 0.000317 time 0.2901 (0.2995) loss 3.6450 (3.4026) grad_norm 1.6671 (1.7903) [2022-10-01 04:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][400/1251] eta 0:04:12 lr 0.000316 time 0.3834 (0.2972) loss 4.1294 (3.3846) grad_norm 1.7271 (1.7987) [2022-10-01 04:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][500/1251] eta 0:03:42 lr 0.000316 time 0.2917 (0.2958) loss 2.7219 (3.3926) grad_norm 1.5613 (1.8028) [2022-10-01 04:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][600/1251] eta 0:03:11 lr 0.000316 time 0.2950 (0.2948) loss 3.9548 (3.4057) grad_norm 1.8858 (1.7963) [2022-10-01 04:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][700/1251] eta 0:02:42 lr 0.000315 time 0.2887 (0.2942) loss 3.0493 (3.3895) grad_norm 1.8946 (1.7903) [2022-10-01 04:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][800/1251] eta 0:02:12 lr 0.000315 time 0.2957 (0.2936) loss 2.5313 (3.3778) grad_norm 1.7682 (1.7892) [2022-10-01 04:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][900/1251] eta 0:01:42 lr 0.000315 time 0.3846 (0.2933) loss 2.2177 (3.3651) grad_norm 1.7702 (1.7911) [2022-10-01 04:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1000/1251] eta 0:01:13 lr 0.000314 time 0.2925 (0.2929) loss 3.7004 (3.3758) grad_norm 1.6006 (1.7899) [2022-10-01 04:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1100/1251] eta 0:00:44 lr 0.000314 time 0.2907 (0.2926) loss 3.7335 (3.3765) grad_norm 1.6808 (1.7918) [2022-10-01 04:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1200/1251] eta 0:00:14 lr 0.000313 time 0.2927 (0.2923) loss 3.7260 (3.3803) grad_norm 1.8371 (1.7918) [2022-10-01 04:18:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 187 training takes 0:06:05 [2022-10-01 04:18:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.943 (2.943) Loss 0.8606 (0.8606) Acc@1 79.883 (79.883) Acc@5 95.410 (95.410) [2022-10-01 04:18:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.554 Acc@5 94.152 [2022-10-01 04:18:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-01 04:18:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.69% [2022-10-01 04:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][0/1251] eta 1:06:33 lr 0.000313 time 3.1920 (3.1920) loss 2.8734 (2.8734) grad_norm 1.5585 (1.5585) [2022-10-01 04:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][100/1251] eta 0:06:09 lr 0.000313 time 0.3875 (0.3211) loss 3.4026 (3.3609) grad_norm 2.1148 (1.7714) [2022-10-01 04:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][200/1251] eta 0:05:21 lr 0.000312 time 0.2882 (0.3059) loss 3.1826 (3.3966) grad_norm 1.8114 (1.7944) [2022-10-01 04:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][300/1251] eta 0:04:46 lr 0.000312 time 0.2899 (0.3009) loss 3.2330 (3.3887) grad_norm 1.7432 (1.8022) [2022-10-01 04:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][400/1251] eta 0:04:13 lr 0.000312 time 0.2932 (0.2984) loss 3.9573 (3.4046) grad_norm 1.9917 (1.8005) [2022-10-01 04:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][500/1251] eta 0:03:42 lr 0.000311 time 0.2918 (0.2968) loss 3.0565 (3.4011) grad_norm 1.9095 (1.8013) [2022-10-01 04:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][600/1251] eta 0:03:12 lr 0.000311 time 0.3895 (0.2959) loss 3.4350 (3.3824) grad_norm 1.8726 (1.8015) [2022-10-01 04:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][700/1251] eta 0:02:42 lr 0.000311 time 0.2963 (0.2951) loss 2.7581 (3.3938) grad_norm 2.0357 (1.8015) [2022-10-01 04:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][800/1251] eta 0:02:12 lr 0.000310 time 0.2895 (0.2946) loss 4.0091 (3.3969) grad_norm 1.8296 (1.8006) [2022-10-01 04:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][900/1251] eta 0:01:43 lr 0.000310 time 0.2868 (0.2941) loss 3.9279 (3.3953) grad_norm 1.6198 (1.8000) [2022-10-01 04:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1000/1251] eta 0:01:13 lr 0.000309 time 0.2913 (0.2937) loss 2.6759 (3.3876) grad_norm 1.8616 (1.8020) [2022-10-01 04:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1100/1251] eta 0:00:44 lr 0.000309 time 0.3826 (0.2934) loss 2.6909 (3.3804) grad_norm 1.9609 (1.8075) [2022-10-01 04:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1200/1251] eta 0:00:14 lr 0.000309 time 0.2876 (0.2931) loss 3.9362 (3.3825) grad_norm 1.6617 (1.8045) [2022-10-01 04:24:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 188 training takes 0:06:06 [2022-10-01 04:24:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.148 (3.148) Loss 0.8967 (0.8967) Acc@1 79.297 (79.297) Acc@5 94.141 (94.141) [2022-10-01 04:24:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.758 Acc@5 94.178 [2022-10-01 04:24:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-10-01 04:24:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.76% [2022-10-01 04:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][0/1251] eta 1:06:41 lr 0.000308 time 3.1990 (3.1990) loss 3.4104 (3.4104) grad_norm 1.6531 (1.6531) [2022-10-01 04:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][100/1251] eta 0:06:07 lr 0.000308 time 0.2884 (0.3195) loss 3.5795 (3.3478) grad_norm 1.8210 (1.8338) [2022-10-01 04:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][200/1251] eta 0:05:19 lr 0.000308 time 0.2950 (0.3044) loss 2.8717 (3.3946) grad_norm 1.8268 (1.8344) [2022-10-01 04:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][300/1251] eta 0:04:45 lr 0.000307 time 0.3839 (0.3001) loss 4.0208 (3.3849) grad_norm 2.0926 (1.8306) [2022-10-01 04:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][400/1251] eta 0:04:13 lr 0.000307 time 0.2953 (0.2974) loss 4.1105 (3.3705) grad_norm 1.5838 (1.8237) [2022-10-01 04:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][500/1251] eta 0:03:42 lr 0.000307 time 0.2970 (0.2959) loss 3.7566 (3.3598) grad_norm 1.7053 (1.8339) [2022-10-01 04:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][600/1251] eta 0:03:11 lr 0.000306 time 0.2985 (0.2948) loss 3.7612 (3.3467) grad_norm 1.9714 (1.8344) [2022-10-01 04:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][700/1251] eta 0:02:42 lr 0.000306 time 0.2856 (0.2940) loss 3.7253 (3.3416) grad_norm 1.7328 (1.8309) [2022-10-01 04:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][800/1251] eta 0:02:12 lr 0.000305 time 0.3872 (0.2935) loss 3.5403 (3.3503) grad_norm 1.8620 (1.8269) [2022-10-01 04:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][900/1251] eta 0:01:42 lr 0.000305 time 0.2860 (0.2930) loss 3.6262 (3.3470) grad_norm 1.8301 (1.8235) [2022-10-01 04:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1000/1251] eta 0:01:13 lr 0.000305 time 0.2874 (0.2926) loss 3.3626 (3.3528) grad_norm 1.7751 (1.8256) [2022-10-01 04:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1100/1251] eta 0:00:44 lr 0.000304 time 0.2843 (0.2922) loss 3.4903 (3.3478) grad_norm 2.0988 (1.8242) [2022-10-01 04:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1200/1251] eta 0:00:14 lr 0.000304 time 0.2862 (0.2918) loss 3.7560 (3.3535) grad_norm 2.0966 (1.8235) [2022-10-01 04:30:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 189 training takes 0:06:05 [2022-10-01 04:30:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.749 (2.749) Loss 0.8183 (0.8183) Acc@1 79.980 (79.980) Acc@5 95.996 (95.996) [2022-10-01 04:30:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.772 Acc@5 94.200 [2022-10-01 04:30:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-10-01 04:30:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.77% [2022-10-01 04:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][0/1251] eta 1:03:44 lr 0.000304 time 3.0571 (3.0571) loss 3.3952 (3.3952) grad_norm 1.5878 (1.5878) [2022-10-01 04:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][100/1251] eta 0:06:05 lr 0.000303 time 0.2875 (0.3172) loss 3.6494 (3.3247) grad_norm 1.6489 (1.7868) [2022-10-01 04:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][200/1251] eta 0:05:18 lr 0.000303 time 0.2884 (0.3029) loss 2.6337 (3.3717) grad_norm 1.7116 (1.7930) [2022-10-01 04:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][300/1251] eta 0:04:43 lr 0.000303 time 0.2912 (0.2981) loss 3.5318 (3.3436) grad_norm 1.8698 (1.8193) [2022-10-01 04:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][400/1251] eta 0:04:11 lr 0.000302 time 0.2864 (0.2955) loss 2.8932 (3.3495) grad_norm 1.7635 (1.8202) [2022-10-01 04:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][500/1251] eta 0:03:41 lr 0.000302 time 0.3894 (0.2943) loss 4.0475 (3.3632) grad_norm 1.8932 (1.8317) [2022-10-01 04:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][600/1251] eta 0:03:10 lr 0.000301 time 0.2863 (0.2933) loss 3.4623 (3.3569) grad_norm 1.9577 (1.8354) [2022-10-01 04:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][700/1251] eta 0:02:41 lr 0.000301 time 0.2872 (0.2925) loss 3.7439 (3.3657) grad_norm 1.5768 (1.8397) [2022-10-01 04:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][800/1251] eta 0:02:11 lr 0.000301 time 0.2888 (0.2920) loss 3.5399 (3.3622) grad_norm 2.3093 (1.8416) [2022-10-01 04:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][900/1251] eta 0:01:42 lr 0.000300 time 0.2903 (0.2916) loss 2.5624 (3.3652) grad_norm 1.8506 (1.8432) [2022-10-01 04:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1000/1251] eta 0:01:13 lr 0.000300 time 0.3846 (0.2913) loss 3.4942 (3.3640) grad_norm 2.3605 (1.8432) [2022-10-01 04:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1100/1251] eta 0:00:43 lr 0.000300 time 0.2897 (0.2911) loss 3.7686 (3.3684) grad_norm 1.8481 (1.8438) [2022-10-01 04:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1200/1251] eta 0:00:14 lr 0.000299 time 0.2877 (0.2908) loss 2.0716 (3.3628) grad_norm 1.6459 (1.8417) [2022-10-01 04:37:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 190 training takes 0:06:04 [2022-10-01 04:37:02 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saving...... [2022-10-01 04:37:03 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saved !!! [2022-10-01 04:37:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.347 (2.347) Loss 1.0167 (1.0167) Acc@1 76.758 (76.758) Acc@5 93.750 (93.750) [2022-10-01 04:37:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.524 Acc@5 94.176 [2022-10-01 04:37:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-10-01 04:37:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.77% [2022-10-01 04:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][0/1251] eta 1:12:20 lr 0.000299 time 3.4698 (3.4698) loss 3.5719 (3.5719) grad_norm 1.6395 (1.6395) [2022-10-01 04:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][100/1251] eta 0:06:11 lr 0.000299 time 0.2901 (0.3231) loss 3.5129 (3.3424) grad_norm 1.7465 (1.9097) [2022-10-01 04:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][200/1251] eta 0:05:23 lr 0.000298 time 0.3850 (0.3076) loss 3.2612 (3.3605) grad_norm 2.0573 (1.8866) [2022-10-01 04:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][300/1251] eta 0:04:47 lr 0.000298 time 0.2943 (0.3023) loss 3.1270 (3.3723) grad_norm 1.8047 (1.8699) [2022-10-01 04:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][400/1251] eta 0:04:14 lr 0.000297 time 0.2917 (0.2996) loss 3.8353 (3.3963) grad_norm 1.8430 (1.8680) [2022-10-01 04:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][500/1251] eta 0:03:43 lr 0.000297 time 0.2922 (0.2979) loss 3.8354 (3.3831) grad_norm 1.8995 (1.8614) [2022-10-01 04:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][600/1251] eta 0:03:13 lr 0.000297 time 0.2899 (0.2970) loss 3.7148 (3.3904) grad_norm 2.0771 (1.8594) [2022-10-01 04:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][700/1251] eta 0:02:43 lr 0.000296 time 0.3905 (0.2963) loss 2.8499 (3.3879) grad_norm 1.5706 (1.8630) [2022-10-01 04:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][800/1251] eta 0:02:13 lr 0.000296 time 0.2896 (0.2958) loss 3.3461 (3.3914) grad_norm 1.8076 (1.8641) [2022-10-01 04:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][900/1251] eta 0:01:43 lr 0.000296 time 0.2904 (0.2953) loss 3.4340 (3.3849) grad_norm 1.5915 (1.8564) [2022-10-01 04:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1000/1251] eta 0:01:14 lr 0.000295 time 0.2867 (0.2949) loss 3.8716 (3.3697) grad_norm 1.9201 (1.8521) [2022-10-01 04:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1100/1251] eta 0:00:44 lr 0.000295 time 0.2870 (0.2945) loss 3.0338 (3.3712) grad_norm 2.0707 (1.8506) [2022-10-01 04:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1200/1251] eta 0:00:15 lr 0.000294 time 0.3833 (0.2942) loss 2.3776 (3.3717) grad_norm 1.8538 (1.8487) [2022-10-01 04:43:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 191 training takes 0:06:08 [2022-10-01 04:43:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.227 (2.227) Loss 0.8434 (0.8434) Acc@1 78.711 (78.711) Acc@5 95.117 (95.117) [2022-10-01 04:43:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.772 Acc@5 94.210 [2022-10-01 04:43:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-10-01 04:43:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.77% [2022-10-01 04:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][0/1251] eta 1:00:03 lr 0.000294 time 2.8804 (2.8804) loss 2.1841 (2.1841) grad_norm 2.0226 (2.0226) [2022-10-01 04:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][100/1251] eta 0:06:01 lr 0.000294 time 0.2889 (0.3141) loss 4.0303 (3.3796) grad_norm 1.9844 (1.8317) [2022-10-01 04:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][200/1251] eta 0:05:17 lr 0.000293 time 0.2865 (0.3017) loss 3.7698 (3.3460) grad_norm 1.5673 (1.8376) [2022-10-01 04:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][300/1251] eta 0:04:42 lr 0.000293 time 0.2900 (0.2974) loss 2.3777 (3.3420) grad_norm 2.1626 (1.8570) [2022-10-01 04:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][400/1251] eta 0:04:11 lr 0.000293 time 0.3878 (0.2954) loss 3.8286 (3.3476) grad_norm 1.8045 (1.8660) [2022-10-01 04:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][500/1251] eta 0:03:40 lr 0.000292 time 0.2899 (0.2941) loss 3.6851 (3.3443) grad_norm 2.2307 (1.8588) [2022-10-01 04:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][600/1251] eta 0:03:10 lr 0.000292 time 0.2861 (0.2931) loss 3.9882 (3.3480) grad_norm 1.8992 (1.8606) [2022-10-01 04:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][700/1251] eta 0:02:41 lr 0.000292 time 0.2891 (0.2925) loss 2.7998 (3.3493) grad_norm 1.9296 (1.8605) [2022-10-01 04:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][800/1251] eta 0:02:11 lr 0.000291 time 0.2879 (0.2919) loss 2.8521 (3.3531) grad_norm 1.6653 (1.8656) [2022-10-01 04:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][900/1251] eta 0:01:42 lr 0.000291 time 0.3894 (0.2917) loss 4.0921 (3.3699) grad_norm 1.7268 (1.8682) [2022-10-01 04:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1000/1251] eta 0:01:13 lr 0.000290 time 0.2893 (0.2914) loss 3.2545 (3.3688) grad_norm 1.9181 (1.8676) [2022-10-01 04:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1100/1251] eta 0:00:43 lr 0.000290 time 0.2880 (0.2911) loss 3.3767 (3.3683) grad_norm 1.9271 (1.8643) [2022-10-01 04:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1200/1251] eta 0:00:14 lr 0.000290 time 0.2862 (0.2908) loss 2.2624 (3.3703) grad_norm 1.7999 (1.8628) [2022-10-01 04:49:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 192 training takes 0:06:04 [2022-10-01 04:49:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.021 (3.021) Loss 0.9353 (0.9353) Acc@1 77.832 (77.832) Acc@5 94.043 (94.043) [2022-10-01 04:49:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.706 Acc@5 94.230 [2022-10-01 04:49:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-01 04:49:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.77% [2022-10-01 04:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][0/1251] eta 0:48:29 lr 0.000290 time 2.3260 (2.3260) loss 3.8812 (3.8812) grad_norm 1.5814 (1.5814) [2022-10-01 04:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][100/1251] eta 0:06:04 lr 0.000289 time 0.3869 (0.3170) loss 2.9259 (3.3886) grad_norm 1.7117 (1.9100) [2022-10-01 04:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][200/1251] eta 0:05:18 lr 0.000289 time 0.2873 (0.3026) loss 2.7221 (3.3718) grad_norm 1.8009 (1.8920) [2022-10-01 04:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][300/1251] eta 0:04:43 lr 0.000288 time 0.2905 (0.2979) loss 2.2098 (3.3534) grad_norm 1.8648 (1.8927) [2022-10-01 04:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][400/1251] eta 0:04:11 lr 0.000288 time 0.2886 (0.2956) loss 3.7178 (3.3300) grad_norm 1.7717 (1.8923) [2022-10-01 04:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][500/1251] eta 0:03:40 lr 0.000288 time 0.2958 (0.2942) loss 3.1885 (3.3239) grad_norm 1.6853 (1.8833) [2022-10-01 04:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][600/1251] eta 0:03:11 lr 0.000287 time 0.3813 (0.2935) loss 3.8178 (3.3355) grad_norm 1.8462 (1.8793) [2022-10-01 04:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][700/1251] eta 0:02:41 lr 0.000287 time 0.2906 (0.2927) loss 3.6081 (3.3394) grad_norm 1.9562 (1.8783) [2022-10-01 04:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][800/1251] eta 0:02:11 lr 0.000287 time 0.2877 (0.2922) loss 3.6017 (3.3439) grad_norm 1.6736 (1.8805) [2022-10-01 04:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][900/1251] eta 0:01:42 lr 0.000286 time 0.2918 (0.2918) loss 3.7712 (3.3401) grad_norm 2.3309 (1.8798) [2022-10-01 04:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1000/1251] eta 0:01:13 lr 0.000286 time 0.2919 (0.2915) loss 3.4268 (3.3401) grad_norm 1.7548 (1.8801) [2022-10-01 04:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1100/1251] eta 0:00:43 lr 0.000285 time 0.3892 (0.2913) loss 2.7920 (3.3379) grad_norm 1.9694 (1.8829) [2022-10-01 04:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1200/1251] eta 0:00:14 lr 0.000285 time 0.2875 (0.2911) loss 2.9546 (3.3309) grad_norm 1.8683 (1.8797) [2022-10-01 04:55:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 193 training takes 0:06:04 [2022-10-01 04:56:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.583 (2.583) Loss 0.9625 (0.9625) Acc@1 76.855 (76.855) Acc@5 94.434 (94.434) [2022-10-01 04:56:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.982 Acc@5 94.298 [2022-10-01 04:56:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-01 04:56:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.98% [2022-10-01 04:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][0/1251] eta 1:06:43 lr 0.000285 time 3.2000 (3.2000) loss 3.1065 (3.1065) grad_norm 2.2571 (2.2571) [2022-10-01 04:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][100/1251] eta 0:06:06 lr 0.000285 time 0.2873 (0.3185) loss 3.8131 (3.3909) grad_norm 2.0158 (1.9304) [2022-10-01 04:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][200/1251] eta 0:05:19 lr 0.000284 time 0.2851 (0.3038) loss 3.5718 (3.3415) grad_norm 1.8803 (1.8811) [2022-10-01 04:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][300/1251] eta 0:04:44 lr 0.000284 time 0.3822 (0.2989) loss 3.3068 (3.3404) grad_norm 1.8089 (1.8991) [2022-10-01 04:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][400/1251] eta 0:04:12 lr 0.000283 time 0.2906 (0.2962) loss 3.6567 (3.3373) grad_norm 1.5745 (1.9014) [2022-10-01 04:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][500/1251] eta 0:03:41 lr 0.000283 time 0.2862 (0.2945) loss 3.1325 (3.3347) grad_norm 1.7451 (1.8989) [2022-10-01 04:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][600/1251] eta 0:03:11 lr 0.000283 time 0.2834 (0.2935) loss 3.5171 (3.3488) grad_norm 1.7762 (1.8928) [2022-10-01 04:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][700/1251] eta 0:02:41 lr 0.000282 time 0.2862 (0.2928) loss 3.6858 (3.3345) grad_norm 2.0398 (1.8844) [2022-10-01 05:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][800/1251] eta 0:02:11 lr 0.000282 time 0.3788 (0.2923) loss 3.8602 (3.3441) grad_norm 1.7527 (1.8866) [2022-10-01 05:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][900/1251] eta 0:01:42 lr 0.000282 time 0.2844 (0.2918) loss 3.8206 (3.3326) grad_norm 1.6977 (1.8848) [2022-10-01 05:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1000/1251] eta 0:01:13 lr 0.000281 time 0.2860 (0.2913) loss 2.4275 (3.3389) grad_norm 1.8623 (1.8800) [2022-10-01 05:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1100/1251] eta 0:00:43 lr 0.000281 time 0.2873 (0.2909) loss 3.5478 (3.3328) grad_norm 2.1233 (1.8799) [2022-10-01 05:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1200/1251] eta 0:00:14 lr 0.000280 time 0.2866 (0.2907) loss 2.3606 (3.3348) grad_norm 1.7571 (1.8808) [2022-10-01 05:02:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 194 training takes 0:06:03 [2022-10-01 05:02:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.687 (2.687) Loss 0.8295 (0.8295) Acc@1 81.445 (81.445) Acc@5 95.703 (95.703) [2022-10-01 05:02:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.004 Acc@5 94.396 [2022-10-01 05:02:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-01 05:02:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.00% [2022-10-01 05:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][0/1251] eta 1:03:37 lr 0.000280 time 3.0515 (3.0515) loss 3.6123 (3.6123) grad_norm 1.6755 (1.6755) [2022-10-01 05:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][100/1251] eta 0:06:06 lr 0.000280 time 0.2873 (0.3187) loss 4.2615 (3.3016) grad_norm 2.2756 (1.8629) [2022-10-01 05:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][200/1251] eta 0:05:19 lr 0.000280 time 0.2882 (0.3038) loss 3.4064 (3.3436) grad_norm 1.7036 (1.8676) [2022-10-01 05:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][300/1251] eta 0:04:44 lr 0.000279 time 0.2886 (0.2989) loss 2.5387 (3.3568) grad_norm 2.2867 (1.8599) [2022-10-01 05:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][400/1251] eta 0:04:12 lr 0.000279 time 0.2900 (0.2965) loss 3.6753 (3.3552) grad_norm 1.8244 (1.8704) [2022-10-01 05:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][500/1251] eta 0:03:41 lr 0.000278 time 0.3914 (0.2953) loss 3.3071 (3.3500) grad_norm 1.5460 (1.8675) [2022-10-01 05:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][600/1251] eta 0:03:11 lr 0.000278 time 0.2896 (0.2944) loss 3.6325 (3.3438) grad_norm 1.7463 (1.8662) [2022-10-01 05:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][700/1251] eta 0:02:41 lr 0.000278 time 0.2895 (0.2937) loss 2.4558 (3.3490) grad_norm 1.9336 (1.8710) [2022-10-01 05:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][800/1251] eta 0:02:12 lr 0.000277 time 0.2897 (0.2931) loss 3.0446 (3.3560) grad_norm 1.8727 (1.8737) [2022-10-01 05:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][900/1251] eta 0:01:42 lr 0.000277 time 0.2929 (0.2927) loss 2.6591 (3.3460) grad_norm 1.7692 (1.8706) [2022-10-01 05:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1000/1251] eta 0:01:13 lr 0.000277 time 0.3851 (0.2924) loss 3.9341 (3.3570) grad_norm 2.0012 (1.8726) [2022-10-01 05:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1100/1251] eta 0:00:44 lr 0.000276 time 0.2872 (0.2920) loss 2.6726 (3.3561) grad_norm 1.7014 (1.8728) [2022-10-01 05:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1200/1251] eta 0:00:14 lr 0.000276 time 0.2928 (0.2918) loss 3.3004 (3.3515) grad_norm 1.9137 (1.8780) [2022-10-01 05:08:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 195 training takes 0:06:05 [2022-10-01 05:08:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.962 (2.962) Loss 0.8086 (0.8086) Acc@1 81.055 (81.055) Acc@5 95.117 (95.117) [2022-10-01 05:08:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.952 Acc@5 94.352 [2022-10-01 05:08:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-01 05:08:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.00% [2022-10-01 05:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][0/1251] eta 0:49:35 lr 0.000276 time 2.3784 (2.3784) loss 3.7656 (3.7656) grad_norm 1.7121 (1.7121) [2022-10-01 05:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][100/1251] eta 0:06:03 lr 0.000275 time 0.2878 (0.3154) loss 2.5479 (3.3059) grad_norm 1.7111 (1.9222) [2022-10-01 05:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][200/1251] eta 0:05:18 lr 0.000275 time 0.3809 (0.3030) loss 2.3375 (3.2868) grad_norm 2.0084 (1.9198) [2022-10-01 05:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][300/1251] eta 0:04:43 lr 0.000275 time 0.2902 (0.2986) loss 3.6886 (3.2592) grad_norm 2.0226 (1.9076) [2022-10-01 05:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][400/1251] eta 0:04:12 lr 0.000274 time 0.2871 (0.2965) loss 3.4807 (3.2730) grad_norm 2.0328 (1.9133) [2022-10-01 05:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][500/1251] eta 0:03:41 lr 0.000274 time 0.2893 (0.2952) loss 3.9004 (3.3004) grad_norm 1.8099 (1.9123) [2022-10-01 05:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][600/1251] eta 0:03:11 lr 0.000273 time 0.2869 (0.2943) loss 3.6646 (3.3121) grad_norm 1.9258 (1.8981) [2022-10-01 05:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][700/1251] eta 0:02:41 lr 0.000273 time 0.3838 (0.2938) loss 3.6180 (3.3291) grad_norm 1.8338 (1.8946) [2022-10-01 05:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][800/1251] eta 0:02:12 lr 0.000273 time 0.2908 (0.2933) loss 3.8776 (3.3311) grad_norm 1.5822 (1.8934) [2022-10-01 05:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][900/1251] eta 0:01:42 lr 0.000272 time 0.2932 (0.2928) loss 2.5193 (3.3211) grad_norm 1.7800 (1.8935) [2022-10-01 05:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1000/1251] eta 0:01:13 lr 0.000272 time 0.2887 (0.2924) loss 2.8078 (3.3262) grad_norm 1.8044 (1.8976) [2022-10-01 05:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1100/1251] eta 0:00:44 lr 0.000272 time 0.2915 (0.2922) loss 2.7477 (3.3314) grad_norm 1.7320 (1.8966) [2022-10-01 05:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1200/1251] eta 0:00:14 lr 0.000271 time 0.3793 (0.2920) loss 3.7835 (3.3371) grad_norm 1.9173 (1.9001) [2022-10-01 05:14:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 196 training takes 0:06:05 [2022-10-01 05:14:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.540 (2.540) Loss 0.9065 (0.9065) Acc@1 78.125 (78.125) Acc@5 95.703 (95.703) [2022-10-01 05:15:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.918 Acc@5 94.272 [2022-10-01 05:15:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-10-01 05:15:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.00% [2022-10-01 05:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][0/1251] eta 0:49:11 lr 0.000271 time 2.3589 (2.3589) loss 3.4783 (3.4783) grad_norm 1.8403 (1.8403) [2022-10-01 05:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][100/1251] eta 0:06:05 lr 0.000271 time 0.2893 (0.3172) loss 3.6219 (3.3193) grad_norm 2.1063 (1.9039) [2022-10-01 05:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][200/1251] eta 0:05:18 lr 0.000270 time 0.2885 (0.3033) loss 3.4182 (3.3370) grad_norm 1.8454 (1.8866) [2022-10-01 05:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][300/1251] eta 0:04:43 lr 0.000270 time 0.2870 (0.2986) loss 3.6285 (3.3492) grad_norm 2.1633 (1.8980) [2022-10-01 05:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][400/1251] eta 0:04:12 lr 0.000270 time 0.3814 (0.2965) loss 2.7737 (3.3308) grad_norm 1.5207 (1.9282) [2022-10-01 05:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][500/1251] eta 0:03:41 lr 0.000269 time 0.2917 (0.2951) loss 3.4036 (3.3300) grad_norm 2.0191 (1.9282) [2022-10-01 05:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][600/1251] eta 0:03:11 lr 0.000269 time 0.2912 (0.2941) loss 3.7685 (3.3249) grad_norm 1.7578 (1.9181) [2022-10-01 05:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][700/1251] eta 0:02:41 lr 0.000269 time 0.2910 (0.2935) loss 3.4545 (3.3319) grad_norm 2.1584 (1.9171) [2022-10-01 05:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][800/1251] eta 0:02:12 lr 0.000268 time 0.2917 (0.2930) loss 3.0483 (3.3394) grad_norm 1.9257 (1.9155) [2022-10-01 05:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][900/1251] eta 0:01:42 lr 0.000268 time 0.3846 (0.2927) loss 3.6635 (3.3324) grad_norm 1.8289 (1.9086) [2022-10-01 05:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1000/1251] eta 0:01:13 lr 0.000267 time 0.2920 (0.2923) loss 3.5806 (3.3365) grad_norm 1.6876 (1.9084) [2022-10-01 05:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1100/1251] eta 0:00:44 lr 0.000267 time 0.2891 (0.2919) loss 3.7741 (3.3386) grad_norm 1.9075 (1.9093) [2022-10-01 05:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1200/1251] eta 0:00:14 lr 0.000267 time 0.2892 (0.2918) loss 3.6133 (3.3428) grad_norm 1.6398 (1.9073) [2022-10-01 05:21:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 197 training takes 0:06:05 [2022-10-01 05:21:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.723 (2.723) Loss 0.9243 (0.9243) Acc@1 77.246 (77.246) Acc@5 95.215 (95.215) [2022-10-01 05:21:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.952 Acc@5 94.338 [2022-10-01 05:21:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-01 05:21:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.00% [2022-10-01 05:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][0/1251] eta 0:44:34 lr 0.000267 time 2.1381 (2.1381) loss 3.4253 (3.4253) grad_norm 1.7265 (1.7265) [2022-10-01 05:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][100/1251] eta 0:06:07 lr 0.000266 time 0.3834 (0.3190) loss 3.2440 (3.3693) grad_norm 1.7865 (1.8647) [2022-10-01 05:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][200/1251] eta 0:05:21 lr 0.000266 time 0.2963 (0.3059) loss 3.9630 (3.3358) grad_norm 1.9164 (1.8858) [2022-10-01 05:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][300/1251] eta 0:04:46 lr 0.000265 time 0.2929 (0.3016) loss 2.9438 (3.3149) grad_norm 1.8306 (1.8892) [2022-10-01 05:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][400/1251] eta 0:04:14 lr 0.000265 time 0.2954 (0.2993) loss 3.3485 (3.3197) grad_norm 1.6064 (1.9030) [2022-10-01 05:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][500/1251] eta 0:03:43 lr 0.000265 time 0.2891 (0.2979) loss 3.2024 (3.3171) grad_norm 1.8735 (1.9192) [2022-10-01 05:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][600/1251] eta 0:03:13 lr 0.000264 time 0.3872 (0.2972) loss 2.9708 (3.3310) grad_norm 1.7307 (1.9247) [2022-10-01 05:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][700/1251] eta 0:02:43 lr 0.000264 time 0.2906 (0.2963) loss 3.3032 (3.3236) grad_norm 1.7773 (1.9225) [2022-10-01 05:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][800/1251] eta 0:02:13 lr 0.000264 time 0.2890 (0.2957) loss 2.6423 (3.3238) grad_norm 1.6705 (1.9220) [2022-10-01 05:25:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][900/1251] eta 0:01:43 lr 0.000263 time 0.2928 (0.2952) loss 2.8471 (3.3188) grad_norm 2.1637 (1.9315) [2022-10-01 05:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1000/1251] eta 0:01:14 lr 0.000263 time 0.2946 (0.2949) loss 2.9438 (3.3153) grad_norm 2.1156 (1.9320) [2022-10-01 05:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1100/1251] eta 0:00:44 lr 0.000263 time 0.3859 (0.2947) loss 3.7047 (3.3144) grad_norm 2.0294 (1.9347) [2022-10-01 05:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1200/1251] eta 0:00:15 lr 0.000262 time 0.2896 (0.2945) loss 3.3034 (3.3223) grad_norm 1.9046 (1.9374) [2022-10-01 05:27:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 198 training takes 0:06:08 [2022-10-01 05:27:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.646 (2.646) Loss 0.9318 (0.9318) Acc@1 77.051 (77.051) Acc@5 94.434 (94.434) [2022-10-01 05:27:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.116 Acc@5 94.458 [2022-10-01 05:27:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-01 05:27:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.12% [2022-10-01 05:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][0/1251] eta 1:06:06 lr 0.000262 time 3.1708 (3.1708) loss 2.4385 (2.4385) grad_norm 1.7013 (1.7013) [2022-10-01 05:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][100/1251] eta 0:06:09 lr 0.000262 time 0.2903 (0.3211) loss 3.9200 (3.2700) grad_norm 1.7609 (1.9478) [2022-10-01 05:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][200/1251] eta 0:05:22 lr 0.000261 time 0.2903 (0.3068) loss 2.2565 (3.3031) grad_norm 2.3633 (1.9439) [2022-10-01 05:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][300/1251] eta 0:04:47 lr 0.000261 time 0.3816 (0.3020) loss 2.8277 (3.3150) grad_norm 1.6377 (1.9600) [2022-10-01 05:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][400/1251] eta 0:04:14 lr 0.000261 time 0.2902 (0.2995) loss 3.2809 (3.3101) grad_norm 1.7857 (1.9440) [2022-10-01 05:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][500/1251] eta 0:03:43 lr 0.000260 time 0.2879 (0.2981) loss 2.9042 (3.3056) grad_norm 1.9997 (1.9421) [2022-10-01 05:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][600/1251] eta 0:03:13 lr 0.000260 time 0.2885 (0.2971) loss 3.6131 (3.3070) grad_norm 2.2018 (1.9391) [2022-10-01 05:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][700/1251] eta 0:02:43 lr 0.000259 time 0.2901 (0.2963) loss 3.2401 (3.3114) grad_norm 1.8683 (1.9386) [2022-10-01 05:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][800/1251] eta 0:02:13 lr 0.000259 time 0.3814 (0.2959) loss 4.0894 (3.3165) grad_norm 1.8059 (1.9299) [2022-10-01 05:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][900/1251] eta 0:01:43 lr 0.000259 time 0.2892 (0.2954) loss 2.6600 (3.3254) grad_norm 1.7791 (1.9295) [2022-10-01 05:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1000/1251] eta 0:01:14 lr 0.000258 time 0.2886 (0.2949) loss 3.7862 (3.3241) grad_norm 1.7728 (1.9293) [2022-10-01 05:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1100/1251] eta 0:00:44 lr 0.000258 time 0.2883 (0.2946) loss 3.7415 (3.3174) grad_norm 2.2212 (1.9308) [2022-10-01 05:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1200/1251] eta 0:00:15 lr 0.000258 time 0.2916 (0.2943) loss 3.4478 (3.3067) grad_norm 1.8047 (1.9302) [2022-10-01 05:33:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 199 training takes 0:06:08 [2022-10-01 05:33:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.355 (2.355) Loss 0.8884 (0.8884) Acc@1 78.320 (78.320) Acc@5 94.629 (94.629) [2022-10-01 05:34:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.172 Acc@5 94.396 [2022-10-01 05:34:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-01 05:34:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.17% [2022-10-01 05:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][0/1251] eta 1:01:46 lr 0.000258 time 2.9629 (2.9629) loss 3.2431 (3.2431) grad_norm 1.7827 (1.7827) [2022-10-01 05:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][100/1251] eta 0:06:06 lr 0.000257 time 0.2882 (0.3182) loss 3.6315 (3.2977) grad_norm 2.1919 (1.9267) [2022-10-01 05:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][200/1251] eta 0:05:19 lr 0.000257 time 0.2881 (0.3040) loss 2.9902 (3.2936) grad_norm 2.1250 (1.9229) [2022-10-01 05:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][300/1251] eta 0:04:44 lr 0.000256 time 0.2917 (0.2993) loss 3.2129 (3.2922) grad_norm 1.8111 (1.9471) [2022-10-01 05:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][400/1251] eta 0:04:12 lr 0.000256 time 0.2907 (0.2968) loss 3.3322 (3.2975) grad_norm 1.9789 (1.9360) [2022-10-01 05:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][500/1251] eta 0:03:41 lr 0.000256 time 0.3767 (0.2955) loss 3.5769 (3.2925) grad_norm 1.7096 (1.9462) [2022-10-01 05:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][600/1251] eta 0:03:11 lr 0.000255 time 0.2877 (0.2944) loss 3.3780 (3.2979) grad_norm 1.7692 (1.9529) [2022-10-01 05:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][700/1251] eta 0:02:41 lr 0.000255 time 0.2903 (0.2937) loss 2.9594 (3.2994) grad_norm 1.8855 (1.9485) [2022-10-01 05:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][800/1251] eta 0:02:12 lr 0.000255 time 0.2898 (0.2931) loss 3.4779 (3.3067) grad_norm 1.9563 (1.9576) [2022-10-01 05:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][900/1251] eta 0:01:42 lr 0.000254 time 0.2873 (0.2926) loss 3.6870 (3.3098) grad_norm 1.8300 (1.9565) [2022-10-01 05:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1000/1251] eta 0:01:13 lr 0.000254 time 0.3819 (0.2923) loss 2.6470 (3.3123) grad_norm 1.9982 (1.9575) [2022-10-01 05:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1100/1251] eta 0:00:44 lr 0.000254 time 0.2858 (0.2921) loss 3.8875 (3.3112) grad_norm 1.9200 (1.9584) [2022-10-01 05:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1200/1251] eta 0:00:14 lr 0.000253 time 0.2918 (0.2918) loss 3.7434 (3.3188) grad_norm 1.9615 (1.9591) [2022-10-01 05:40:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 200 training takes 0:06:05 [2022-10-01 05:40:09 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saving...... [2022-10-01 05:40:09 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saved !!! [2022-10-01 05:40:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.607 (2.607) Loss 0.9239 (0.9239) Acc@1 79.199 (79.199) Acc@5 93.945 (93.945) [2022-10-01 05:40:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.242 Acc@5 94.388 [2022-10-01 05:40:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-01 05:40:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.24% [2022-10-01 05:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][0/1251] eta 0:47:49 lr 0.000253 time 2.2938 (2.2938) loss 2.5460 (2.5460) grad_norm 1.9294 (1.9294) [2022-10-01 05:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][100/1251] eta 0:06:06 lr 0.000253 time 0.2975 (0.3185) loss 3.2158 (3.3255) grad_norm 1.9012 (1.9329) [2022-10-01 05:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][200/1251] eta 0:05:20 lr 0.000252 time 0.3930 (0.3052) loss 4.0901 (3.3729) grad_norm 1.6441 (1.9424) [2022-10-01 05:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][300/1251] eta 0:04:45 lr 0.000252 time 0.2927 (0.3006) loss 3.7468 (3.3258) grad_norm 2.1578 (1.9382) [2022-10-01 05:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][400/1251] eta 0:04:13 lr 0.000252 time 0.2920 (0.2983) loss 3.5438 (3.3281) grad_norm 2.0133 (1.9377) [2022-10-01 05:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][500/1251] eta 0:03:42 lr 0.000251 time 0.2952 (0.2969) loss 3.5238 (3.3414) grad_norm 1.9524 (1.9529) [2022-10-01 05:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][600/1251] eta 0:03:12 lr 0.000251 time 0.2906 (0.2959) loss 3.5508 (3.3331) grad_norm 2.1127 (1.9561) [2022-10-01 05:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][700/1251] eta 0:02:42 lr 0.000251 time 0.3894 (0.2953) loss 3.5305 (3.3263) grad_norm 2.2697 (1.9521) [2022-10-01 05:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][800/1251] eta 0:02:12 lr 0.000250 time 0.2937 (0.2948) loss 3.9719 (3.3322) grad_norm 1.9577 (1.9511) [2022-10-01 05:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][900/1251] eta 0:01:43 lr 0.000250 time 0.2941 (0.2945) loss 3.9858 (3.3344) grad_norm 1.8667 (1.9517) [2022-10-01 05:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1000/1251] eta 0:01:13 lr 0.000249 time 0.2940 (0.2941) loss 3.0830 (3.3344) grad_norm 1.8478 (1.9480) [2022-10-01 05:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1100/1251] eta 0:00:44 lr 0.000249 time 0.2958 (0.2938) loss 3.8728 (3.3298) grad_norm 1.7701 (1.9493) [2022-10-01 05:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1200/1251] eta 0:00:14 lr 0.000249 time 0.3870 (0.2937) loss 3.2964 (3.3291) grad_norm 1.6638 (1.9506) [2022-10-01 05:46:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 201 training takes 0:06:07 [2022-10-01 05:46:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.788 (2.788) Loss 0.8450 (0.8450) Acc@1 80.762 (80.762) Acc@5 94.824 (94.824) [2022-10-01 05:46:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.188 Acc@5 94.330 [2022-10-01 05:46:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-01 05:46:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.24% [2022-10-01 05:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][0/1251] eta 0:47:22 lr 0.000249 time 2.2726 (2.2726) loss 3.7572 (3.7572) grad_norm 1.7741 (1.7741) [2022-10-01 05:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][100/1251] eta 0:06:04 lr 0.000248 time 0.2909 (0.3163) loss 3.8592 (3.3984) grad_norm 1.8077 (2.0101) [2022-10-01 05:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][200/1251] eta 0:05:18 lr 0.000248 time 0.2911 (0.3026) loss 3.4713 (3.3479) grad_norm 2.1547 (1.9794) [2022-10-01 05:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][300/1251] eta 0:04:43 lr 0.000248 time 0.2865 (0.2981) loss 3.5279 (3.3303) grad_norm 2.2212 (1.9821) [2022-10-01 05:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][400/1251] eta 0:04:11 lr 0.000247 time 0.3836 (0.2960) loss 4.0007 (3.3202) grad_norm 1.7892 (1.9712) [2022-10-01 05:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][500/1251] eta 0:03:41 lr 0.000247 time 0.2879 (0.2946) loss 3.8358 (3.3291) grad_norm 1.9257 (1.9590) [2022-10-01 05:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][600/1251] eta 0:03:11 lr 0.000246 time 0.2893 (0.2937) loss 3.9192 (3.3367) grad_norm 1.7843 (1.9590) [2022-10-01 05:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][700/1251] eta 0:02:41 lr 0.000246 time 0.2868 (0.2930) loss 3.8754 (3.3334) grad_norm 1.9597 (1.9555) [2022-10-01 05:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][800/1251] eta 0:02:11 lr 0.000246 time 0.2881 (0.2925) loss 4.1803 (3.3254) grad_norm 1.9166 (1.9636) [2022-10-01 05:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][900/1251] eta 0:01:42 lr 0.000245 time 0.3789 (0.2921) loss 3.1814 (3.3131) grad_norm 1.9017 (1.9640) [2022-10-01 05:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1000/1251] eta 0:01:13 lr 0.000245 time 0.2893 (0.2918) loss 3.6718 (3.3095) grad_norm 2.0610 (1.9648) [2022-10-01 05:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1100/1251] eta 0:00:44 lr 0.000245 time 0.2892 (0.2915) loss 3.2576 (3.3106) grad_norm 1.8990 (1.9625) [2022-10-01 05:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1200/1251] eta 0:00:14 lr 0.000244 time 0.2923 (0.2912) loss 3.2821 (3.3118) grad_norm 2.0629 (1.9628) [2022-10-01 05:52:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 202 training takes 0:06:04 [2022-10-01 05:52:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.547 (2.547) Loss 0.8225 (0.8225) Acc@1 80.859 (80.859) Acc@5 94.824 (94.824) [2022-10-01 05:52:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.476 Acc@5 94.458 [2022-10-01 05:52:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-01 05:52:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.48% [2022-10-01 05:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][0/1251] eta 0:56:57 lr 0.000244 time 2.7318 (2.7318) loss 2.8719 (2.8719) grad_norm 1.8294 (1.8294) [2022-10-01 05:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][100/1251] eta 0:06:06 lr 0.000244 time 0.3758 (0.3187) loss 3.7466 (3.3293) grad_norm 1.9242 (1.9416) [2022-10-01 05:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][200/1251] eta 0:05:19 lr 0.000243 time 0.2910 (0.3044) loss 3.9698 (3.3530) grad_norm 1.9820 (1.9523) [2022-10-01 05:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][300/1251] eta 0:04:44 lr 0.000243 time 0.2894 (0.2996) loss 3.2980 (3.3236) grad_norm 1.7596 (1.9689) [2022-10-01 05:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][400/1251] eta 0:04:12 lr 0.000243 time 0.2910 (0.2971) loss 3.1260 (3.3089) grad_norm 2.2039 (1.9759) [2022-10-01 05:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][500/1251] eta 0:03:41 lr 0.000242 time 0.2981 (0.2956) loss 2.2300 (3.3011) grad_norm 2.2282 (1.9904) [2022-10-01 05:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][600/1251] eta 0:03:11 lr 0.000242 time 0.3855 (0.2947) loss 3.9712 (3.3119) grad_norm 2.0168 (1.9979) [2022-10-01 05:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][700/1251] eta 0:02:41 lr 0.000242 time 0.2884 (0.2939) loss 3.0572 (3.3107) grad_norm 1.8512 (1.9984) [2022-10-01 05:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][800/1251] eta 0:02:12 lr 0.000241 time 0.2907 (0.2932) loss 2.9951 (3.3243) grad_norm 1.7595 (2.0089) [2022-10-01 05:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][900/1251] eta 0:01:42 lr 0.000241 time 0.2864 (0.2927) loss 3.0041 (3.3253) grad_norm 1.9632 (2.0058) [2022-10-01 05:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1000/1251] eta 0:01:13 lr 0.000241 time 0.2884 (0.2924) loss 2.4394 (3.3299) grad_norm 1.7580 (2.0042) [2022-10-01 05:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1100/1251] eta 0:00:44 lr 0.000240 time 0.3794 (0.2922) loss 2.5275 (3.3274) grad_norm 2.0286 (2.0038) [2022-10-01 05:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1200/1251] eta 0:00:14 lr 0.000240 time 0.2928 (0.2919) loss 3.7440 (3.3307) grad_norm 1.8648 (2.0030) [2022-10-01 05:59:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 203 training takes 0:06:05 [2022-10-01 05:59:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.727 (2.727) Loss 0.8938 (0.8938) Acc@1 79.199 (79.199) Acc@5 94.141 (94.141) [2022-10-01 05:59:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.470 Acc@5 94.436 [2022-10-01 05:59:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-01 05:59:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.48% [2022-10-01 05:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][0/1251] eta 1:13:09 lr 0.000240 time 3.5086 (3.5086) loss 2.4884 (2.4884) grad_norm 2.0173 (2.0173) [2022-10-01 05:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][100/1251] eta 0:06:07 lr 0.000239 time 0.2855 (0.3195) loss 3.2952 (3.3610) grad_norm 2.4428 (2.0405) [2022-10-01 06:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][200/1251] eta 0:05:19 lr 0.000239 time 0.2880 (0.3038) loss 3.5309 (3.3087) grad_norm 1.9012 (2.0047) [2022-10-01 06:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][300/1251] eta 0:04:44 lr 0.000239 time 0.3822 (0.2989) loss 2.2541 (3.3256) grad_norm 1.7901 (2.0064) [2022-10-01 06:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][400/1251] eta 0:04:12 lr 0.000238 time 0.2873 (0.2961) loss 3.1029 (3.3470) grad_norm 2.8332 (2.0119) [2022-10-01 06:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][500/1251] eta 0:03:41 lr 0.000238 time 0.2871 (0.2945) loss 3.5224 (3.3391) grad_norm 2.4786 (2.0124) [2022-10-01 06:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][600/1251] eta 0:03:10 lr 0.000238 time 0.2893 (0.2934) loss 2.4820 (3.3522) grad_norm 2.1818 (2.0060) [2022-10-01 06:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][700/1251] eta 0:02:41 lr 0.000237 time 0.2895 (0.2926) loss 3.4699 (3.3477) grad_norm 1.8759 (2.0022) [2022-10-01 06:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][800/1251] eta 0:02:11 lr 0.000237 time 0.3756 (0.2921) loss 3.3058 (3.3497) grad_norm 2.5024 (2.0031) [2022-10-01 06:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][900/1251] eta 0:01:42 lr 0.000237 time 0.2869 (0.2916) loss 2.6968 (3.3386) grad_norm 1.9866 (1.9990) [2022-10-01 06:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1000/1251] eta 0:01:13 lr 0.000236 time 0.2864 (0.2913) loss 2.8879 (3.3319) grad_norm 2.1774 (1.9931) [2022-10-01 06:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1100/1251] eta 0:00:43 lr 0.000236 time 0.2884 (0.2910) loss 2.7095 (3.3385) grad_norm 1.9482 (1.9919) [2022-10-01 06:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1200/1251] eta 0:00:14 lr 0.000236 time 0.2871 (0.2907) loss 3.0073 (3.3377) grad_norm 1.7588 (1.9954) [2022-10-01 06:05:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 204 training takes 0:06:03 [2022-10-01 06:05:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.361 (2.361) Loss 1.0029 (1.0029) Acc@1 77.441 (77.441) Acc@5 93.359 (93.359) [2022-10-01 06:05:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.414 Acc@5 94.454 [2022-10-01 06:05:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-01 06:05:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.48% [2022-10-01 06:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][0/1251] eta 1:01:35 lr 0.000235 time 2.9538 (2.9538) loss 3.4290 (3.4290) grad_norm 2.0889 (2.0889) [2022-10-01 06:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][100/1251] eta 0:06:04 lr 0.000235 time 0.2879 (0.3170) loss 3.2431 (3.4002) grad_norm 1.9612 (1.9808) [2022-10-01 06:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][200/1251] eta 0:05:18 lr 0.000235 time 0.2882 (0.3030) loss 3.9308 (3.3410) grad_norm 2.1671 (1.9888) [2022-10-01 06:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][300/1251] eta 0:04:43 lr 0.000234 time 0.2860 (0.2983) loss 3.4977 (3.2968) grad_norm 1.8596 (1.9874) [2022-10-01 06:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][400/1251] eta 0:04:11 lr 0.000234 time 0.2887 (0.2958) loss 3.5511 (3.2842) grad_norm 1.9823 (1.9913) [2022-10-01 06:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][500/1251] eta 0:03:41 lr 0.000234 time 0.3749 (0.2948) loss 4.0080 (3.3077) grad_norm 1.9836 (2.0031) [2022-10-01 06:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][600/1251] eta 0:03:11 lr 0.000233 time 0.2864 (0.2938) loss 3.3671 (3.3105) grad_norm 2.3352 (2.0000) [2022-10-01 06:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][700/1251] eta 0:02:41 lr 0.000233 time 0.2850 (0.2931) loss 3.2251 (3.2971) grad_norm 1.9429 (2.0033) [2022-10-01 06:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][800/1251] eta 0:02:11 lr 0.000233 time 0.2870 (0.2926) loss 3.5142 (3.2949) grad_norm 1.9358 (2.0041) [2022-10-01 06:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][900/1251] eta 0:01:42 lr 0.000232 time 0.2896 (0.2922) loss 3.6808 (3.2928) grad_norm 1.8329 (2.0011) [2022-10-01 06:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1000/1251] eta 0:01:13 lr 0.000232 time 0.3731 (0.2919) loss 2.5486 (3.2942) grad_norm 1.7885 (1.9969) [2022-10-01 06:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1100/1251] eta 0:00:44 lr 0.000232 time 0.2895 (0.2916) loss 2.1803 (3.2950) grad_norm 2.2658 (2.0015) [2022-10-01 06:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1200/1251] eta 0:00:14 lr 0.000231 time 0.2855 (0.2913) loss 3.7125 (3.2959) grad_norm 2.5032 (2.0036) [2022-10-01 06:11:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 205 training takes 0:06:04 [2022-10-01 06:11:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.286 (3.286) Loss 0.9294 (0.9294) Acc@1 76.562 (76.562) Acc@5 94.727 (94.727) [2022-10-01 06:11:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.510 Acc@5 94.508 [2022-10-01 06:11:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-01 06:11:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.51% [2022-10-01 06:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][0/1251] eta 0:49:07 lr 0.000231 time 2.3565 (2.3565) loss 3.7666 (3.7666) grad_norm 1.8785 (1.8785) [2022-10-01 06:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][100/1251] eta 0:05:57 lr 0.000231 time 0.2873 (0.3102) loss 2.5736 (3.2842) grad_norm 1.8974 (2.0157) [2022-10-01 06:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][200/1251] eta 0:05:14 lr 0.000230 time 0.3797 (0.2996) loss 3.9004 (3.3153) grad_norm 2.1084 (2.0329) [2022-10-01 06:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][300/1251] eta 0:04:41 lr 0.000230 time 0.2858 (0.2956) loss 2.2838 (3.3146) grad_norm 2.0443 (2.0487) [2022-10-01 06:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][400/1251] eta 0:04:09 lr 0.000230 time 0.2943 (0.2935) loss 3.3874 (3.3001) grad_norm 2.0887 (2.0351) [2022-10-01 06:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][500/1251] eta 0:03:39 lr 0.000229 time 0.2878 (0.2923) loss 1.9224 (3.2740) grad_norm 1.7504 (2.0344) [2022-10-01 06:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][600/1251] eta 0:03:09 lr 0.000229 time 0.2876 (0.2915) loss 3.4046 (3.2761) grad_norm 2.0017 (2.0337) [2022-10-01 06:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][700/1251] eta 0:02:40 lr 0.000229 time 0.3774 (0.2911) loss 3.3634 (3.2852) grad_norm 1.9051 (2.0289) [2022-10-01 06:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][800/1251] eta 0:02:11 lr 0.000228 time 0.2885 (0.2909) loss 3.5341 (3.2756) grad_norm 1.9997 (2.0264) [2022-10-01 06:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][900/1251] eta 0:01:42 lr 0.000228 time 0.2887 (0.2906) loss 3.3651 (3.2874) grad_norm 2.3161 (2.0304) [2022-10-01 06:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1000/1251] eta 0:01:12 lr 0.000228 time 0.2896 (0.2904) loss 3.3434 (3.2855) grad_norm 2.1507 (2.0315) [2022-10-01 06:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1100/1251] eta 0:00:43 lr 0.000227 time 0.2905 (0.2903) loss 3.1768 (3.2963) grad_norm 2.0410 (2.0366) [2022-10-01 06:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1200/1251] eta 0:00:14 lr 0.000227 time 0.3820 (0.2902) loss 3.6344 (3.2982) grad_norm 1.7943 (2.0412) [2022-10-01 06:17:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 206 training takes 0:06:03 [2022-10-01 06:17:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.884 (2.884) Loss 0.8875 (0.8875) Acc@1 78.418 (78.418) Acc@5 95.020 (95.020) [2022-10-01 06:18:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.438 Acc@5 94.564 [2022-10-01 06:18:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-01 06:18:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.51% [2022-10-01 06:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][0/1251] eta 0:58:01 lr 0.000227 time 2.7830 (2.7830) loss 3.3820 (3.3820) grad_norm 2.2254 (2.2254) [2022-10-01 06:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][100/1251] eta 0:06:05 lr 0.000226 time 0.2887 (0.3180) loss 3.4003 (3.2412) grad_norm 1.8419 (2.0048) [2022-10-01 06:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][200/1251] eta 0:05:20 lr 0.000226 time 0.2903 (0.3049) loss 3.4994 (3.2882) grad_norm 1.9058 (2.0163) [2022-10-01 06:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][300/1251] eta 0:04:45 lr 0.000226 time 0.2919 (0.3004) loss 3.6783 (3.2973) grad_norm 2.1114 (2.0372) [2022-10-01 06:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][400/1251] eta 0:04:13 lr 0.000225 time 0.3852 (0.2981) loss 2.8336 (3.2944) grad_norm 2.0476 (2.0330) [2022-10-01 06:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][500/1251] eta 0:03:42 lr 0.000225 time 0.2909 (0.2966) loss 3.5410 (3.3028) grad_norm 1.8246 (2.0311) [2022-10-01 06:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][600/1251] eta 0:03:12 lr 0.000225 time 0.2924 (0.2956) loss 3.4254 (3.2888) grad_norm 2.2128 (2.0277) [2022-10-01 06:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][700/1251] eta 0:02:42 lr 0.000224 time 0.2928 (0.2950) loss 3.7967 (3.2810) grad_norm 2.1743 (2.0285) [2022-10-01 06:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][800/1251] eta 0:02:12 lr 0.000224 time 0.2905 (0.2944) loss 3.7502 (3.2808) grad_norm 1.8487 (2.0297) [2022-10-01 06:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][900/1251] eta 0:01:43 lr 0.000224 time 0.3826 (0.2941) loss 3.2668 (3.2778) grad_norm 2.0002 (2.0285) [2022-10-01 06:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1000/1251] eta 0:01:13 lr 0.000223 time 0.2894 (0.2937) loss 3.8894 (3.2851) grad_norm 2.1146 (2.0256) [2022-10-01 06:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1100/1251] eta 0:00:44 lr 0.000223 time 0.2947 (0.2934) loss 4.0981 (3.2918) grad_norm 1.9569 (2.0287) [2022-10-01 06:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1200/1251] eta 0:00:14 lr 0.000223 time 0.2961 (0.2932) loss 2.6769 (3.2859) grad_norm 2.0740 (2.0245) [2022-10-01 06:24:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 207 training takes 0:06:06 [2022-10-01 06:24:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.448 (2.448) Loss 0.9063 (0.9063) Acc@1 79.199 (79.199) Acc@5 94.336 (94.336) [2022-10-01 06:24:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.546 Acc@5 94.516 [2022-10-01 06:24:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-01 06:24:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.55% [2022-10-01 06:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][0/1251] eta 0:46:57 lr 0.000222 time 2.2519 (2.2519) loss 4.1694 (4.1694) grad_norm 2.4992 (2.4992) [2022-10-01 06:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][100/1251] eta 0:06:03 lr 0.000222 time 0.3791 (0.3155) loss 3.2604 (3.2599) grad_norm 2.2575 (2.0288) [2022-10-01 06:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][200/1251] eta 0:05:18 lr 0.000222 time 0.2960 (0.3030) loss 2.4095 (3.2703) grad_norm 1.8823 (2.0111) [2022-10-01 06:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][300/1251] eta 0:04:44 lr 0.000221 time 0.2921 (0.2988) loss 2.4849 (3.2596) grad_norm 1.7185 (2.0118) [2022-10-01 06:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][400/1251] eta 0:04:12 lr 0.000221 time 0.2856 (0.2966) loss 2.9758 (3.2612) grad_norm 1.9315 (2.0142) [2022-10-01 06:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][500/1251] eta 0:03:41 lr 0.000221 time 0.2846 (0.2952) loss 3.8608 (3.2745) grad_norm 1.6550 (2.0120) [2022-10-01 06:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][600/1251] eta 0:03:11 lr 0.000220 time 0.3821 (0.2944) loss 3.5972 (3.2764) grad_norm 1.9142 (2.0281) [2022-10-01 06:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][700/1251] eta 0:02:41 lr 0.000220 time 0.2874 (0.2937) loss 3.9538 (3.2774) grad_norm 2.3486 (2.0274) [2022-10-01 06:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][800/1251] eta 0:02:12 lr 0.000220 time 0.2903 (0.2932) loss 2.6092 (3.2697) grad_norm 2.0383 (2.0372) [2022-10-01 06:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][900/1251] eta 0:01:42 lr 0.000219 time 0.2901 (0.2928) loss 3.8200 (3.2655) grad_norm 2.1083 (2.0350) [2022-10-01 06:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1000/1251] eta 0:01:13 lr 0.000219 time 0.2888 (0.2924) loss 3.7543 (3.2672) grad_norm 2.0209 (2.0341) [2022-10-01 06:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1100/1251] eta 0:00:44 lr 0.000219 time 0.3815 (0.2922) loss 2.2003 (3.2681) grad_norm 2.0057 (2.0353) [2022-10-01 06:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1200/1251] eta 0:00:14 lr 0.000218 time 0.2874 (0.2920) loss 3.2486 (3.2698) grad_norm 1.8302 (2.0377) [2022-10-01 06:30:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 208 training takes 0:06:05 [2022-10-01 06:30:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.905 (2.905) Loss 0.8628 (0.8628) Acc@1 78.320 (78.320) Acc@5 96.289 (96.289) [2022-10-01 06:30:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.640 Acc@5 94.600 [2022-10-01 06:30:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-01 06:30:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.64% [2022-10-01 06:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][0/1251] eta 1:08:49 lr 0.000218 time 3.3012 (3.3012) loss 3.1204 (3.1204) grad_norm 2.1798 (2.1798) [2022-10-01 06:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][100/1251] eta 0:06:07 lr 0.000218 time 0.2893 (0.3195) loss 2.5499 (3.3443) grad_norm 1.9916 (2.0558) [2022-10-01 06:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][200/1251] eta 0:05:20 lr 0.000218 time 0.2881 (0.3046) loss 2.5858 (3.2951) grad_norm 1.7461 (2.0872) [2022-10-01 06:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][300/1251] eta 0:04:45 lr 0.000217 time 0.3854 (0.2999) loss 3.2621 (3.2817) grad_norm 2.1168 (2.0815) [2022-10-01 06:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][400/1251] eta 0:04:13 lr 0.000217 time 0.2867 (0.2973) loss 3.6526 (3.2740) grad_norm 2.1983 (2.0783) [2022-10-01 06:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][500/1251] eta 0:03:43 lr 0.000217 time 0.2913 (0.2974) loss 3.6669 (3.2745) grad_norm 1.8857 (2.0713) [2022-10-01 06:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][600/1251] eta 0:03:12 lr 0.000216 time 0.2886 (0.2962) loss 2.3399 (3.2609) grad_norm 1.8643 (2.0655) [2022-10-01 06:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][700/1251] eta 0:02:42 lr 0.000216 time 0.2921 (0.2958) loss 2.6975 (3.2661) grad_norm 2.0478 (2.0686) [2022-10-01 06:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][800/1251] eta 0:02:13 lr 0.000216 time 0.3787 (0.2952) loss 3.2505 (3.2709) grad_norm 2.0363 (2.0774) [2022-10-01 06:35:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][900/1251] eta 0:01:43 lr 0.000215 time 0.2934 (0.2947) loss 4.0220 (3.2702) grad_norm 2.2081 (2.0818) [2022-10-01 06:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1000/1251] eta 0:01:13 lr 0.000215 time 0.2884 (0.2944) loss 3.7657 (3.2807) grad_norm 4.8054 (2.0851) [2022-10-01 06:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1100/1251] eta 0:00:44 lr 0.000215 time 0.2898 (0.2941) loss 3.1000 (3.2829) grad_norm 1.8998 (2.0888) [2022-10-01 06:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1200/1251] eta 0:00:14 lr 0.000214 time 0.2922 (0.2938) loss 3.1988 (3.2827) grad_norm 1.9836 (2.0929) [2022-10-01 06:36:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 209 training takes 0:06:07 [2022-10-01 06:36:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.897 (2.897) Loss 0.8828 (0.8828) Acc@1 79.199 (79.199) Acc@5 94.531 (94.531) [2022-10-01 06:37:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.514 Acc@5 94.560 [2022-10-01 06:37:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-01 06:37:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.64% [2022-10-01 06:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][0/1251] eta 1:04:31 lr 0.000214 time 3.0950 (3.0950) loss 3.9400 (3.9400) grad_norm 2.1681 (2.1681) [2022-10-01 06:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][100/1251] eta 0:06:09 lr 0.000214 time 0.2891 (0.3209) loss 2.4642 (3.1996) grad_norm 2.2277 (2.1071) [2022-10-01 06:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][200/1251] eta 0:05:22 lr 0.000213 time 0.2886 (0.3070) loss 3.3803 (3.2294) grad_norm 2.5435 (2.1153) [2022-10-01 06:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][300/1251] eta 0:04:46 lr 0.000213 time 0.2901 (0.3017) loss 4.0560 (3.2709) grad_norm 2.2368 (2.1067) [2022-10-01 06:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][400/1251] eta 0:04:14 lr 0.000213 time 0.2903 (0.2990) loss 3.6752 (3.2830) grad_norm 2.2564 (2.1097) [2022-10-01 06:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][500/1251] eta 0:03:43 lr 0.000212 time 0.3870 (0.2976) loss 3.4439 (3.2718) grad_norm 2.0230 (2.1212) [2022-10-01 06:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][600/1251] eta 0:03:12 lr 0.000212 time 0.2913 (0.2964) loss 3.6336 (3.2727) grad_norm 1.7368 (2.1098) [2022-10-01 06:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][700/1251] eta 0:02:42 lr 0.000212 time 0.2893 (0.2956) loss 3.7216 (3.2843) grad_norm 1.9704 (2.1060) [2022-10-01 06:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][800/1251] eta 0:02:13 lr 0.000211 time 0.2982 (0.2949) loss 2.6289 (3.2809) grad_norm 1.9066 (2.1014) [2022-10-01 06:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][900/1251] eta 0:01:43 lr 0.000211 time 0.2917 (0.2944) loss 3.3276 (3.2897) grad_norm 1.7517 (2.0968) [2022-10-01 06:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1000/1251] eta 0:01:13 lr 0.000211 time 0.3808 (0.2941) loss 3.4754 (3.2883) grad_norm 2.0924 (2.1036) [2022-10-01 06:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1100/1251] eta 0:00:44 lr 0.000210 time 0.2910 (0.2938) loss 3.5898 (3.2852) grad_norm 2.9491 (2.1041) [2022-10-01 06:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1200/1251] eta 0:00:14 lr 0.000210 time 0.2864 (0.2936) loss 3.2631 (3.2834) grad_norm 2.0904 (2.1030) [2022-10-01 06:43:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 210 training takes 0:06:07 [2022-10-01 06:43:14 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saving...... [2022-10-01 06:43:14 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saved !!! [2022-10-01 06:43:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.894 (2.894) Loss 0.8601 (0.8601) Acc@1 79.004 (79.004) Acc@5 95.117 (95.117) [2022-10-01 06:43:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.672 Acc@5 94.716 [2022-10-01 06:43:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-01 06:43:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.67% [2022-10-01 06:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][0/1251] eta 1:08:33 lr 0.000210 time 3.2881 (3.2881) loss 3.0565 (3.0565) grad_norm 1.9570 (1.9570) [2022-10-01 06:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][100/1251] eta 0:06:10 lr 0.000210 time 0.2882 (0.3217) loss 2.3640 (3.1803) grad_norm 1.9776 (2.1575) [2022-10-01 06:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][200/1251] eta 0:05:22 lr 0.000209 time 0.4098 (0.3067) loss 2.8122 (3.1787) grad_norm 2.1052 (2.1221) [2022-10-01 06:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][300/1251] eta 0:04:46 lr 0.000209 time 0.2891 (0.3012) loss 3.9702 (3.2257) grad_norm 1.9049 (2.1142) [2022-10-01 06:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][400/1251] eta 0:04:14 lr 0.000209 time 0.2917 (0.2985) loss 3.1467 (3.2261) grad_norm 2.0716 (2.1095) [2022-10-01 06:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][500/1251] eta 0:03:42 lr 0.000208 time 0.2904 (0.2968) loss 3.8750 (3.2361) grad_norm 2.3123 (2.1108) [2022-10-01 06:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][600/1251] eta 0:03:12 lr 0.000208 time 0.2924 (0.2957) loss 2.9212 (3.2449) grad_norm 1.9122 (2.1054) [2022-10-01 06:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][700/1251] eta 0:02:42 lr 0.000208 time 0.3844 (0.2949) loss 3.0496 (3.2541) grad_norm 2.2493 (2.1073) [2022-10-01 06:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][800/1251] eta 0:02:12 lr 0.000207 time 0.2941 (0.2943) loss 2.7537 (3.2582) grad_norm 1.8203 (2.1080) [2022-10-01 06:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][900/1251] eta 0:01:43 lr 0.000207 time 0.2908 (0.2938) loss 2.5064 (3.2649) grad_norm 2.0601 (2.1004) [2022-10-01 06:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1000/1251] eta 0:01:13 lr 0.000207 time 0.2940 (0.2934) loss 3.3311 (3.2624) grad_norm 2.1824 (2.1018) [2022-10-01 06:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1100/1251] eta 0:00:44 lr 0.000206 time 0.2911 (0.2931) loss 3.4162 (3.2582) grad_norm 2.0595 (2.1088) [2022-10-01 06:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1200/1251] eta 0:00:14 lr 0.000206 time 0.3850 (0.2929) loss 2.1338 (3.2529) grad_norm 2.0726 (2.1084) [2022-10-01 06:49:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 211 training takes 0:06:06 [2022-10-01 06:49:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.963 (2.963) Loss 0.8566 (0.8566) Acc@1 79.785 (79.785) Acc@5 94.531 (94.531) [2022-10-01 06:49:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.728 Acc@5 94.680 [2022-10-01 06:49:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-01 06:49:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.73% [2022-10-01 06:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][0/1251] eta 0:58:23 lr 0.000206 time 2.8005 (2.8005) loss 2.4807 (2.4807) grad_norm 2.0092 (2.0092) [2022-10-01 06:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][100/1251] eta 0:06:06 lr 0.000205 time 0.2884 (0.3183) loss 2.2861 (3.2049) grad_norm 2.5691 (2.1339) [2022-10-01 06:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][200/1251] eta 0:05:19 lr 0.000205 time 0.2901 (0.3040) loss 3.4356 (3.2812) grad_norm 2.2447 (2.1040) [2022-10-01 06:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][300/1251] eta 0:04:44 lr 0.000205 time 0.2892 (0.2990) loss 2.3431 (3.2414) grad_norm 1.8978 (2.1002) [2022-10-01 06:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][400/1251] eta 0:04:12 lr 0.000204 time 0.3832 (0.2967) loss 3.7460 (3.2321) grad_norm 2.2150 (2.1159) [2022-10-01 06:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][500/1251] eta 0:03:41 lr 0.000204 time 0.2945 (0.2951) loss 3.7087 (3.2306) grad_norm 2.1932 (2.1166) [2022-10-01 06:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][600/1251] eta 0:03:11 lr 0.000204 time 0.2909 (0.2943) loss 2.9326 (3.2390) grad_norm 2.0351 (2.1210) [2022-10-01 06:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][700/1251] eta 0:02:41 lr 0.000203 time 0.2914 (0.2935) loss 3.3198 (3.2446) grad_norm 2.0733 (2.1159) [2022-10-01 06:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][800/1251] eta 0:02:12 lr 0.000203 time 0.2866 (0.2929) loss 3.2620 (3.2461) grad_norm 2.1890 (2.1131) [2022-10-01 06:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][900/1251] eta 0:01:42 lr 0.000203 time 0.3837 (0.2925) loss 3.7884 (3.2468) grad_norm 1.8687 (2.1183) [2022-10-01 06:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1000/1251] eta 0:01:13 lr 0.000202 time 0.2870 (0.2921) loss 3.8387 (3.2460) grad_norm 2.1257 (2.1127) [2022-10-01 06:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1100/1251] eta 0:00:44 lr 0.000202 time 0.2922 (0.2918) loss 2.6680 (3.2441) grad_norm 2.2331 (2.1161) [2022-10-01 06:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1200/1251] eta 0:00:14 lr 0.000202 time 0.2878 (0.2916) loss 3.2041 (3.2456) grad_norm 1.8066 (2.1146) [2022-10-01 06:55:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 212 training takes 0:06:05 [2022-10-01 06:55:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.100 (3.100) Loss 0.8277 (0.8277) Acc@1 80.469 (80.469) Acc@5 95.410 (95.410) [2022-10-01 06:56:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.708 Acc@5 94.612 [2022-10-01 06:56:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-01 06:56:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.73% [2022-10-01 06:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][0/1251] eta 1:01:15 lr 0.000202 time 2.9383 (2.9383) loss 3.8353 (3.8353) grad_norm 2.0240 (2.0240) [2022-10-01 06:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][100/1251] eta 0:06:06 lr 0.000201 time 0.3866 (0.3185) loss 3.7616 (3.2312) grad_norm 2.0039 (2.0916) [2022-10-01 06:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][200/1251] eta 0:05:20 lr 0.000201 time 0.2902 (0.3046) loss 3.7276 (3.2543) grad_norm 2.0129 (2.0905) [2022-10-01 06:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][300/1251] eta 0:04:45 lr 0.000201 time 0.2909 (0.3000) loss 3.6901 (3.2708) grad_norm 1.7202 (2.1048) [2022-10-01 06:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][400/1251] eta 0:04:13 lr 0.000200 time 0.2900 (0.2976) loss 2.5560 (3.2597) grad_norm 1.9802 (2.0981) [2022-10-01 06:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][500/1251] eta 0:03:42 lr 0.000200 time 0.2928 (0.2961) loss 2.3728 (3.2405) grad_norm 1.9579 (2.0987) [2022-10-01 06:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][600/1251] eta 0:03:12 lr 0.000200 time 0.3813 (0.2953) loss 3.5325 (3.2514) grad_norm 2.2073 (2.0926) [2022-10-01 06:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][700/1251] eta 0:02:42 lr 0.000199 time 0.2911 (0.2945) loss 2.3168 (3.2574) grad_norm 2.1246 (2.0901) [2022-10-01 06:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][800/1251] eta 0:02:12 lr 0.000199 time 0.2855 (0.2940) loss 3.9282 (3.2529) grad_norm 1.8157 (2.0848) [2022-10-01 07:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][900/1251] eta 0:01:43 lr 0.000199 time 0.2958 (0.2936) loss 3.9348 (3.2533) grad_norm 2.2250 (2.0878) [2022-10-01 07:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1000/1251] eta 0:01:13 lr 0.000198 time 0.2868 (0.2933) loss 3.0492 (3.2503) grad_norm 2.5158 (2.0882) [2022-10-01 07:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1100/1251] eta 0:00:44 lr 0.000198 time 0.4031 (0.2931) loss 3.5927 (3.2593) grad_norm 2.3023 (2.0890) [2022-10-01 07:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1200/1251] eta 0:00:14 lr 0.000198 time 0.2900 (0.2928) loss 4.0606 (3.2572) grad_norm 2.3030 (2.0932) [2022-10-01 07:02:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 213 training takes 0:06:06 [2022-10-01 07:02:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.776 (2.776) Loss 0.9070 (0.9070) Acc@1 80.469 (80.469) Acc@5 94.043 (94.043) [2022-10-01 07:02:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.686 Acc@5 94.690 [2022-10-01 07:02:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-01 07:02:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.73% [2022-10-01 07:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][0/1251] eta 0:45:47 lr 0.000198 time 2.1962 (2.1962) loss 3.3910 (3.3910) grad_norm 1.8767 (1.8767) [2022-10-01 07:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][100/1251] eta 0:06:05 lr 0.000197 time 0.2891 (0.3173) loss 3.9286 (3.2693) grad_norm 2.1140 (2.0986) [2022-10-01 07:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][200/1251] eta 0:05:18 lr 0.000197 time 0.2902 (0.3032) loss 3.3968 (3.2570) grad_norm 2.0068 (2.1077) [2022-10-01 07:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][300/1251] eta 0:04:44 lr 0.000197 time 0.3795 (0.2989) loss 3.4201 (3.2782) grad_norm 2.1403 (2.1109) [2022-10-01 07:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][400/1251] eta 0:04:12 lr 0.000196 time 0.2901 (0.2965) loss 3.7479 (3.2579) grad_norm 1.8991 (2.1184) [2022-10-01 07:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][500/1251] eta 0:03:41 lr 0.000196 time 0.2894 (0.2951) loss 2.8995 (3.2611) grad_norm 2.0107 (2.1123) [2022-10-01 07:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][600/1251] eta 0:03:11 lr 0.000196 time 0.2882 (0.2941) loss 3.3907 (3.2754) grad_norm 2.0472 (2.1198) [2022-10-01 07:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][700/1251] eta 0:02:41 lr 0.000195 time 0.2864 (0.2934) loss 4.0784 (3.2658) grad_norm 2.1009 (2.1170) [2022-10-01 07:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][800/1251] eta 0:02:12 lr 0.000195 time 0.3915 (0.2930) loss 3.1917 (3.2690) grad_norm 2.0797 (2.1234) [2022-10-01 07:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][900/1251] eta 0:01:42 lr 0.000195 time 0.2879 (0.2926) loss 3.8272 (3.2712) grad_norm 1.9667 (2.1232) [2022-10-01 07:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1000/1251] eta 0:01:13 lr 0.000194 time 0.2880 (0.2922) loss 3.2399 (3.2716) grad_norm 2.0485 (2.1226) [2022-10-01 07:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1100/1251] eta 0:00:44 lr 0.000194 time 0.2883 (0.2919) loss 2.4113 (3.2691) grad_norm 2.2656 (2.1197) [2022-10-01 07:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1200/1251] eta 0:00:14 lr 0.000194 time 0.2885 (0.2917) loss 3.5142 (3.2684) grad_norm 2.5231 (2.1240) [2022-10-01 07:08:28 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 214 training takes 0:06:05 [2022-10-01 07:08:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.156 (2.156) Loss 0.8897 (0.8897) Acc@1 79.785 (79.785) Acc@5 95.312 (95.312) [2022-10-01 07:08:41 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.852 Acc@5 94.756 [2022-10-01 07:08:41 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-01 07:08:41 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.85% [2022-10-01 07:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][0/1251] eta 1:04:49 lr 0.000193 time 3.1088 (3.1088) loss 2.6050 (2.6050) grad_norm 2.1669 (2.1669) [2022-10-01 07:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][100/1251] eta 0:06:08 lr 0.000193 time 0.2953 (0.3199) loss 3.7081 (3.2742) grad_norm 2.3892 (2.1284) [2022-10-01 07:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][200/1251] eta 0:05:20 lr 0.000193 time 0.2881 (0.3050) loss 3.8757 (3.2617) grad_norm 2.0446 (2.1433) [2022-10-01 07:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][300/1251] eta 0:04:45 lr 0.000193 time 0.2889 (0.2998) loss 3.6894 (3.2559) grad_norm 2.5983 (2.1541) [2022-10-01 07:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][400/1251] eta 0:04:12 lr 0.000192 time 0.2894 (0.2973) loss 3.2102 (3.2649) grad_norm 2.0816 (2.1625) [2022-10-01 07:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][500/1251] eta 0:03:42 lr 0.000192 time 0.3842 (0.2961) loss 3.4201 (3.2683) grad_norm 1.9946 (2.1570) [2022-10-01 07:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][600/1251] eta 0:03:12 lr 0.000192 time 0.2900 (0.2951) loss 2.3854 (3.2644) grad_norm 2.0083 (2.1630) [2022-10-01 07:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][700/1251] eta 0:02:42 lr 0.000191 time 0.2887 (0.2944) loss 2.3317 (3.2612) grad_norm 1.9161 (2.1547) [2022-10-01 07:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][800/1251] eta 0:02:12 lr 0.000191 time 0.2917 (0.2938) loss 3.4410 (3.2613) grad_norm 1.8825 (2.1462) [2022-10-01 07:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][900/1251] eta 0:01:42 lr 0.000191 time 0.2927 (0.2934) loss 3.4697 (3.2657) grad_norm 1.6533 (2.1438) [2022-10-01 07:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1000/1251] eta 0:01:13 lr 0.000190 time 0.3840 (0.2931) loss 3.4807 (3.2607) grad_norm 2.5016 (2.1433) [2022-10-01 07:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1100/1251] eta 0:00:44 lr 0.000190 time 0.2906 (0.2928) loss 2.2745 (3.2635) grad_norm 2.2278 (2.1457) [2022-10-01 07:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1200/1251] eta 0:00:14 lr 0.000190 time 0.2883 (0.2925) loss 3.6471 (3.2585) grad_norm 1.9311 (2.1453) [2022-10-01 07:14:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 215 training takes 0:06:06 [2022-10-01 07:14:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.021 (3.021) Loss 0.8440 (0.8440) Acc@1 79.883 (79.883) Acc@5 95.508 (95.508) [2022-10-01 07:15:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.922 Acc@5 94.650 [2022-10-01 07:15:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-01 07:15:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-01 07:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][0/1251] eta 1:09:47 lr 0.000189 time 3.3469 (3.3469) loss 3.2451 (3.2451) grad_norm 2.3986 (2.3986) [2022-10-01 07:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][100/1251] eta 0:06:09 lr 0.000189 time 0.2935 (0.3214) loss 3.8113 (3.2401) grad_norm 2.3573 (2.1923) [2022-10-01 07:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][200/1251] eta 0:05:21 lr 0.000189 time 0.3808 (0.3063) loss 3.3440 (3.2170) grad_norm 2.1238 (2.1620) [2022-10-01 07:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][300/1251] eta 0:04:46 lr 0.000189 time 0.2922 (0.3010) loss 3.4317 (3.2104) grad_norm 1.8878 (2.1708) [2022-10-01 07:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][400/1251] eta 0:04:13 lr 0.000188 time 0.2895 (0.2982) loss 3.4886 (3.2227) grad_norm 2.0215 (2.1673) [2022-10-01 07:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][500/1251] eta 0:03:42 lr 0.000188 time 0.2912 (0.2966) loss 3.0441 (3.2247) grad_norm 1.8720 (2.1659) [2022-10-01 07:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][600/1251] eta 0:03:12 lr 0.000188 time 0.2867 (0.2955) loss 3.3457 (3.2048) grad_norm 2.0537 (2.1646) [2022-10-01 07:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][700/1251] eta 0:02:42 lr 0.000187 time 0.3795 (0.2949) loss 3.4769 (3.2051) grad_norm 2.4320 (2.1684) [2022-10-01 07:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][800/1251] eta 0:02:12 lr 0.000187 time 0.2948 (0.2943) loss 2.9096 (3.2031) grad_norm 2.1376 (2.1721) [2022-10-01 07:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][900/1251] eta 0:01:43 lr 0.000187 time 0.2975 (0.2938) loss 3.1690 (3.2084) grad_norm 2.2808 (2.1699) [2022-10-01 07:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1000/1251] eta 0:01:13 lr 0.000186 time 0.2921 (0.2934) loss 2.9552 (3.2110) grad_norm 2.1791 (2.1704) [2022-10-01 07:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1100/1251] eta 0:00:44 lr 0.000186 time 0.2916 (0.2930) loss 3.3764 (3.2267) grad_norm 2.3103 (2.1695) [2022-10-01 07:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1200/1251] eta 0:00:14 lr 0.000186 time 0.3787 (0.2928) loss 3.5459 (3.2285) grad_norm 2.4992 (2.1693) [2022-10-01 07:21:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 216 training takes 0:06:06 [2022-10-01 07:21:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.190 (2.190) Loss 0.9739 (0.9739) Acc@1 76.367 (76.367) Acc@5 94.336 (94.336) [2022-10-01 07:21:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.864 Acc@5 94.726 [2022-10-01 07:21:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-01 07:21:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-01 07:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][0/1251] eta 1:09:14 lr 0.000185 time 3.3212 (3.3212) loss 3.1203 (3.1203) grad_norm 2.2334 (2.2334) [2022-10-01 07:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][100/1251] eta 0:06:11 lr 0.000185 time 0.2905 (0.3228) loss 3.4709 (3.2190) grad_norm 1.9197 (2.1679) [2022-10-01 07:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][200/1251] eta 0:05:23 lr 0.000185 time 0.2934 (0.3074) loss 3.4402 (3.2313) grad_norm 2.3514 (2.1842) [2022-10-01 07:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][300/1251] eta 0:04:47 lr 0.000185 time 0.2898 (0.3020) loss 3.2158 (3.2155) grad_norm 1.7852 (2.1999) [2022-10-01 07:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][400/1251] eta 0:04:14 lr 0.000184 time 0.3907 (0.2994) loss 3.5093 (3.2259) grad_norm 2.0062 (2.1895) [2022-10-01 07:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][500/1251] eta 0:03:43 lr 0.000184 time 0.2900 (0.2978) loss 2.7894 (3.2338) grad_norm 2.3268 (2.1821) [2022-10-01 07:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][600/1251] eta 0:03:13 lr 0.000184 time 0.2927 (0.2965) loss 3.0361 (3.2440) grad_norm 2.2485 (2.1830) [2022-10-01 07:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][700/1251] eta 0:02:42 lr 0.000183 time 0.2927 (0.2956) loss 3.3046 (3.2433) grad_norm 2.0546 (2.1827) [2022-10-01 07:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][800/1251] eta 0:02:13 lr 0.000183 time 0.2934 (0.2950) loss 2.6617 (3.2453) grad_norm 2.2499 (2.1848) [2022-10-01 07:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][900/1251] eta 0:01:43 lr 0.000183 time 0.3890 (0.2945) loss 3.3689 (3.2364) grad_norm 2.1136 (2.1811) [2022-10-01 07:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1000/1251] eta 0:01:13 lr 0.000182 time 0.2878 (0.2940) loss 3.6427 (3.2349) grad_norm 1.8691 (2.1794) [2022-10-01 07:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1100/1251] eta 0:00:44 lr 0.000182 time 0.2910 (0.2935) loss 3.2880 (3.2215) grad_norm 2.0204 (2.1821) [2022-10-01 07:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1200/1251] eta 0:00:14 lr 0.000182 time 0.2876 (0.2932) loss 3.1924 (3.2183) grad_norm 2.1495 (2.1821) [2022-10-01 07:27:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 217 training takes 0:06:07 [2022-10-01 07:27:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.208 (3.208) Loss 0.8533 (0.8533) Acc@1 79.688 (79.688) Acc@5 95.703 (95.703) [2022-10-01 07:27:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.074 Acc@5 94.760 [2022-10-01 07:27:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-01 07:27:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.07% [2022-10-01 07:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][0/1251] eta 0:57:33 lr 0.000182 time 2.7602 (2.7602) loss 3.3091 (3.3091) grad_norm 1.9198 (1.9198) [2022-10-01 07:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][100/1251] eta 0:06:05 lr 0.000181 time 0.3830 (0.3179) loss 2.5908 (3.2287) grad_norm 2.2567 (2.1568) [2022-10-01 07:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][200/1251] eta 0:05:19 lr 0.000181 time 0.2873 (0.3042) loss 3.3375 (3.2529) grad_norm 2.1140 (2.1710) [2022-10-01 07:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][300/1251] eta 0:04:44 lr 0.000181 time 0.2837 (0.2995) loss 3.5949 (3.2376) grad_norm 2.2135 (2.1679) [2022-10-01 07:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][400/1251] eta 0:04:13 lr 0.000180 time 0.2915 (0.2974) loss 3.5138 (3.2355) grad_norm 2.2149 (2.1759) [2022-10-01 07:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][500/1251] eta 0:03:42 lr 0.000180 time 0.2901 (0.2961) loss 3.2084 (3.2360) grad_norm 1.8919 (2.1706) [2022-10-01 07:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][600/1251] eta 0:03:12 lr 0.000180 time 0.3846 (0.2953) loss 3.8965 (3.2403) grad_norm 2.5906 (2.1654) [2022-10-01 07:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][700/1251] eta 0:02:42 lr 0.000179 time 0.2860 (0.2946) loss 3.6489 (3.2430) grad_norm 2.0458 (2.1617) [2022-10-01 07:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][800/1251] eta 0:02:12 lr 0.000179 time 0.2909 (0.2941) loss 3.4234 (3.2446) grad_norm 2.2539 (2.1624) [2022-10-01 07:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][900/1251] eta 0:01:43 lr 0.000179 time 0.2921 (0.2938) loss 3.5037 (3.2457) grad_norm 2.2888 (2.1623) [2022-10-01 07:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1000/1251] eta 0:01:13 lr 0.000178 time 0.2928 (0.2935) loss 2.8420 (3.2416) grad_norm 2.3376 (2.1621) [2022-10-01 07:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1100/1251] eta 0:00:44 lr 0.000178 time 0.3858 (0.2933) loss 2.4643 (3.2418) grad_norm 2.0370 (2.1625) [2022-10-01 07:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1200/1251] eta 0:00:14 lr 0.000178 time 0.2881 (0.2931) loss 3.4606 (3.2423) grad_norm 1.9566 (2.1633) [2022-10-01 07:33:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 218 training takes 0:06:06 [2022-10-01 07:33:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.522 (2.522) Loss 0.9519 (0.9519) Acc@1 77.344 (77.344) Acc@5 93.555 (93.555) [2022-10-01 07:33:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.176 Acc@5 94.776 [2022-10-01 07:33:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-01 07:33:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.18% [2022-10-01 07:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][0/1251] eta 1:01:22 lr 0.000178 time 2.9434 (2.9434) loss 3.4559 (3.4559) grad_norm 2.0393 (2.0393) [2022-10-01 07:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][100/1251] eta 0:06:05 lr 0.000177 time 0.2865 (0.3172) loss 2.6434 (3.2069) grad_norm 2.1487 (2.1423) [2022-10-01 07:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][200/1251] eta 0:05:18 lr 0.000177 time 0.2889 (0.3029) loss 3.5619 (3.2391) grad_norm 2.4008 (2.1451) [2022-10-01 07:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][300/1251] eta 0:04:43 lr 0.000177 time 0.3801 (0.2985) loss 2.9105 (3.2438) grad_norm 2.3085 (2.1792) [2022-10-01 07:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][400/1251] eta 0:04:12 lr 0.000176 time 0.2906 (0.2962) loss 3.8213 (3.2376) grad_norm 2.2425 (2.1881) [2022-10-01 07:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][500/1251] eta 0:03:41 lr 0.000176 time 0.2918 (0.2948) loss 2.2621 (3.2268) grad_norm 2.0800 (2.1890) [2022-10-01 07:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][600/1251] eta 0:03:11 lr 0.000176 time 0.2911 (0.2938) loss 3.3844 (3.2333) grad_norm 2.1258 (2.1968) [2022-10-01 07:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][700/1251] eta 0:02:41 lr 0.000175 time 0.2874 (0.2931) loss 3.0675 (3.2331) grad_norm 2.0970 (2.2041) [2022-10-01 07:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][800/1251] eta 0:02:12 lr 0.000175 time 0.3899 (0.2927) loss 3.2818 (3.2292) grad_norm 3.5722 (2.2101) [2022-10-01 07:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][900/1251] eta 0:01:42 lr 0.000175 time 0.2877 (0.2924) loss 3.4387 (3.2329) grad_norm 1.8931 (2.2103) [2022-10-01 07:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1000/1251] eta 0:01:13 lr 0.000175 time 0.2911 (0.2921) loss 2.7406 (3.2338) grad_norm 2.3324 (2.2075) [2022-10-01 07:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1100/1251] eta 0:00:44 lr 0.000174 time 0.2904 (0.2918) loss 2.6147 (3.2338) grad_norm 2.2726 (2.2080) [2022-10-01 07:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1200/1251] eta 0:00:14 lr 0.000174 time 0.2905 (0.2916) loss 2.8318 (3.2267) grad_norm 2.3314 (2.2079) [2022-10-01 07:40:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 219 training takes 0:06:05 [2022-10-01 07:40:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.522 (2.522) Loss 0.9588 (0.9588) Acc@1 77.734 (77.734) Acc@5 94.922 (94.922) [2022-10-01 07:40:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.086 Acc@5 94.838 [2022-10-01 07:40:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-01 07:40:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.18% [2022-10-01 07:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][0/1251] eta 1:04:10 lr 0.000174 time 3.0778 (3.0778) loss 3.4887 (3.4887) grad_norm 2.2391 (2.2391) [2022-10-01 07:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][100/1251] eta 0:06:08 lr 0.000173 time 0.2869 (0.3201) loss 3.3517 (3.2005) grad_norm 2.1245 (2.1909) [2022-10-01 07:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][200/1251] eta 0:05:20 lr 0.000173 time 0.2926 (0.3052) loss 3.3794 (3.2434) grad_norm 1.9678 (2.1834) [2022-10-01 07:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][300/1251] eta 0:04:45 lr 0.000173 time 0.2875 (0.3002) loss 2.9494 (3.2331) grad_norm 2.1457 (2.1930) [2022-10-01 07:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][400/1251] eta 0:04:13 lr 0.000173 time 0.2934 (0.2979) loss 2.2481 (3.2280) grad_norm 2.2191 (2.1943) [2022-10-01 07:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][500/1251] eta 0:03:42 lr 0.000172 time 0.3796 (0.2966) loss 3.3549 (3.2188) grad_norm 2.4014 (2.1961) [2022-10-01 07:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][600/1251] eta 0:03:12 lr 0.000172 time 0.2895 (0.2956) loss 3.1850 (3.2144) grad_norm 2.0491 (2.1918) [2022-10-01 07:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][700/1251] eta 0:02:42 lr 0.000172 time 0.2930 (0.2949) loss 3.7163 (3.2264) grad_norm 2.1328 (2.1871) [2022-10-01 07:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][800/1251] eta 0:02:12 lr 0.000171 time 0.2899 (0.2944) loss 3.0016 (3.2274) grad_norm 2.3514 (2.1904) [2022-10-01 07:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][900/1251] eta 0:01:43 lr 0.000171 time 0.2860 (0.2940) loss 2.7737 (3.2311) grad_norm 2.4449 (2.1985) [2022-10-01 07:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1000/1251] eta 0:01:13 lr 0.000171 time 0.3927 (0.2937) loss 2.5266 (3.2373) grad_norm 2.1308 (2.2041) [2022-10-01 07:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1100/1251] eta 0:00:44 lr 0.000170 time 0.2872 (0.2934) loss 2.6723 (3.2260) grad_norm 2.2976 (2.2045) [2022-10-01 07:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1200/1251] eta 0:00:14 lr 0.000170 time 0.2912 (0.2932) loss 3.7822 (3.2281) grad_norm 2.3545 (2.2061) [2022-10-01 07:46:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 220 training takes 0:06:07 [2022-10-01 07:46:24 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saving...... [2022-10-01 07:46:25 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saved !!! [2022-10-01 07:46:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.078 (3.078) Loss 0.9417 (0.9417) Acc@1 78.125 (78.125) Acc@5 94.531 (94.531) [2022-10-01 07:46:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.910 Acc@5 94.774 [2022-10-01 07:46:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-01 07:46:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.18% [2022-10-01 07:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][0/1251] eta 0:55:41 lr 0.000170 time 2.6715 (2.6715) loss 2.6086 (2.6086) grad_norm 2.2291 (2.2291) [2022-10-01 07:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][100/1251] eta 0:06:07 lr 0.000170 time 0.2910 (0.3190) loss 2.1945 (3.2163) grad_norm 2.5698 (2.2324) [2022-10-01 07:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][200/1251] eta 0:05:20 lr 0.000169 time 0.3875 (0.3053) loss 2.8730 (3.2286) grad_norm 3.5658 (2.2311) [2022-10-01 07:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][300/1251] eta 0:04:46 lr 0.000169 time 0.2932 (0.3009) loss 3.5320 (3.2198) grad_norm 2.3514 (2.2344) [2022-10-01 07:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][400/1251] eta 0:04:13 lr 0.000169 time 0.2880 (0.2984) loss 3.7477 (3.2162) grad_norm 1.9855 (2.2319) [2022-10-01 07:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][500/1251] eta 0:03:42 lr 0.000168 time 0.2982 (0.2969) loss 3.7491 (3.2075) grad_norm 2.5599 (2.2254) [2022-10-01 07:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][600/1251] eta 0:03:12 lr 0.000168 time 0.2886 (0.2960) loss 3.5985 (3.2198) grad_norm 2.2680 (2.2224) [2022-10-01 07:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][700/1251] eta 0:02:42 lr 0.000168 time 0.3953 (0.2954) loss 3.6083 (3.2276) grad_norm 2.1106 (2.2242) [2022-10-01 07:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][800/1251] eta 0:02:12 lr 0.000168 time 0.2876 (0.2948) loss 3.6711 (3.2235) grad_norm 2.3064 (2.2273) [2022-10-01 07:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][900/1251] eta 0:01:43 lr 0.000167 time 0.2963 (0.2943) loss 2.9408 (3.2138) grad_norm 2.2003 (2.2313) [2022-10-01 07:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1000/1251] eta 0:01:13 lr 0.000167 time 0.2864 (0.2939) loss 3.5125 (3.2168) grad_norm 2.0119 (2.2350) [2022-10-01 07:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1100/1251] eta 0:00:44 lr 0.000167 time 0.2904 (0.2936) loss 2.2625 (3.2148) grad_norm 2.0680 (2.2381) [2022-10-01 07:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1200/1251] eta 0:00:14 lr 0.000166 time 0.3811 (0.2934) loss 3.5911 (3.2130) grad_norm 1.9586 (2.2384) [2022-10-01 07:52:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 221 training takes 0:06:07 [2022-10-01 07:52:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.766 (2.766) Loss 0.8427 (0.8427) Acc@1 80.566 (80.566) Acc@5 94.531 (94.531) [2022-10-01 07:52:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.100 Acc@5 94.912 [2022-10-01 07:52:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-01 07:52:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.18% [2022-10-01 07:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][0/1251] eta 0:59:11 lr 0.000166 time 2.8386 (2.8386) loss 3.6407 (3.6407) grad_norm 2.8130 (2.8130) [2022-10-01 07:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][100/1251] eta 0:06:03 lr 0.000166 time 0.2912 (0.3155) loss 3.4626 (3.1369) grad_norm 2.6819 (2.2928) [2022-10-01 07:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][200/1251] eta 0:05:17 lr 0.000166 time 0.2866 (0.3023) loss 3.5944 (3.1430) grad_norm 2.0441 (2.2631) [2022-10-01 07:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][300/1251] eta 0:04:43 lr 0.000165 time 0.2894 (0.2980) loss 2.2578 (3.1780) grad_norm 2.0762 (2.2736) [2022-10-01 07:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][400/1251] eta 0:04:11 lr 0.000165 time 0.3817 (0.2961) loss 3.0290 (3.1764) grad_norm 2.2172 (2.2623) [2022-10-01 07:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][500/1251] eta 0:03:41 lr 0.000165 time 0.2900 (0.2948) loss 2.1003 (3.1724) grad_norm 2.3534 (2.2576) [2022-10-01 07:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][600/1251] eta 0:03:11 lr 0.000164 time 0.2881 (0.2940) loss 3.5280 (3.1783) grad_norm 2.3077 (2.2480) [2022-10-01 07:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][700/1251] eta 0:02:41 lr 0.000164 time 0.2912 (0.2934) loss 3.4719 (3.1717) grad_norm 2.0312 (2.2448) [2022-10-01 07:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][800/1251] eta 0:02:12 lr 0.000164 time 0.2894 (0.2930) loss 3.9227 (3.1700) grad_norm 2.3242 (2.2484) [2022-10-01 07:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][900/1251] eta 0:01:42 lr 0.000163 time 0.3872 (0.2926) loss 2.2336 (3.1806) grad_norm 2.6203 (2.2440) [2022-10-01 07:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1000/1251] eta 0:01:13 lr 0.000163 time 0.2927 (0.2923) loss 3.7789 (3.1865) grad_norm 1.9207 (2.2461) [2022-10-01 07:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1100/1251] eta 0:00:44 lr 0.000163 time 0.2913 (0.2920) loss 2.8840 (3.1839) grad_norm 2.2606 (2.2459) [2022-10-01 07:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1200/1251] eta 0:00:14 lr 0.000163 time 0.2871 (0.2918) loss 3.2957 (3.1876) grad_norm 2.0157 (2.2479) [2022-10-01 07:59:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 222 training takes 0:06:05 [2022-10-01 07:59:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.468 (3.468) Loss 0.8959 (0.8959) Acc@1 80.078 (80.078) Acc@5 93.750 (93.750) [2022-10-01 07:59:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.206 Acc@5 94.888 [2022-10-01 07:59:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-01 07:59:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.21% [2022-10-01 07:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][0/1251] eta 0:58:26 lr 0.000162 time 2.8029 (2.8029) loss 3.4982 (3.4982) grad_norm 2.5909 (2.5909) [2022-10-01 07:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][100/1251] eta 0:06:06 lr 0.000162 time 0.3853 (0.3183) loss 2.6374 (3.2100) grad_norm 2.0906 (2.2261) [2022-10-01 08:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][200/1251] eta 0:05:19 lr 0.000162 time 0.2890 (0.3038) loss 3.4745 (3.2093) grad_norm 2.2386 (2.2172) [2022-10-01 08:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][300/1251] eta 0:04:44 lr 0.000161 time 0.2889 (0.2990) loss 3.4978 (3.1901) grad_norm 2.0791 (2.2221) [2022-10-01 08:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][400/1251] eta 0:04:12 lr 0.000161 time 0.2927 (0.2965) loss 3.5672 (3.2093) grad_norm 2.1947 (2.2276) [2022-10-01 08:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][500/1251] eta 0:03:41 lr 0.000161 time 0.2866 (0.2951) loss 3.6040 (3.2220) grad_norm 1.9569 (2.2334) [2022-10-01 08:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][600/1251] eta 0:03:11 lr 0.000161 time 0.3861 (0.2942) loss 3.2776 (3.2317) grad_norm 1.8685 (2.2330) [2022-10-01 08:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][700/1251] eta 0:02:41 lr 0.000160 time 0.2885 (0.2934) loss 3.7745 (3.2416) grad_norm 1.9081 (2.2393) [2022-10-01 08:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][800/1251] eta 0:02:12 lr 0.000160 time 0.2884 (0.2928) loss 3.1582 (3.2285) grad_norm 2.1794 (2.2448) [2022-10-01 08:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][900/1251] eta 0:01:42 lr 0.000160 time 0.2901 (0.2924) loss 3.2529 (3.2213) grad_norm 2.3471 (2.2419) [2022-10-01 08:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1000/1251] eta 0:01:13 lr 0.000159 time 0.2898 (0.2921) loss 3.8340 (3.2201) grad_norm 2.0338 (2.2397) [2022-10-01 08:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1100/1251] eta 0:00:44 lr 0.000159 time 0.3850 (0.2918) loss 1.8705 (3.2223) grad_norm 2.1592 (2.2378) [2022-10-01 08:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1200/1251] eta 0:00:14 lr 0.000159 time 0.2859 (0.2915) loss 2.0756 (3.2166) grad_norm 2.3701 (2.2415) [2022-10-01 08:05:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 223 training takes 0:06:04 [2022-10-01 08:05:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.721 (2.721) Loss 0.8911 (0.8911) Acc@1 79.395 (79.395) Acc@5 94.336 (94.336) [2022-10-01 08:05:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.096 Acc@5 94.878 [2022-10-01 08:05:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-01 08:05:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.21% [2022-10-01 08:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][0/1251] eta 0:46:39 lr 0.000159 time 2.2379 (2.2379) loss 3.1415 (3.1415) grad_norm 1.9085 (1.9085) [2022-10-01 08:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][100/1251] eta 0:06:02 lr 0.000158 time 0.2902 (0.3146) loss 3.8206 (3.2730) grad_norm 2.0135 (2.2452) [2022-10-01 08:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][200/1251] eta 0:05:17 lr 0.000158 time 0.2858 (0.3018) loss 3.0945 (3.1758) grad_norm 2.3560 (2.2497) [2022-10-01 08:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][300/1251] eta 0:04:43 lr 0.000158 time 0.3905 (0.2977) loss 3.7093 (3.2026) grad_norm 2.5156 (2.2469) [2022-10-01 08:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][400/1251] eta 0:04:11 lr 0.000157 time 0.2882 (0.2955) loss 3.0192 (3.2069) grad_norm 2.0856 (2.2555) [2022-10-01 08:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][500/1251] eta 0:03:40 lr 0.000157 time 0.2908 (0.2942) loss 3.0589 (3.2202) grad_norm 2.2592 (2.2620) [2022-10-01 08:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][600/1251] eta 0:03:10 lr 0.000157 time 0.2985 (0.2934) loss 3.8182 (3.2214) grad_norm 1.9820 (2.2627) [2022-10-01 08:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][700/1251] eta 0:02:41 lr 0.000157 time 0.2975 (0.2928) loss 3.2877 (3.2140) grad_norm 2.4290 (2.2654) [2022-10-01 08:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][800/1251] eta 0:02:11 lr 0.000156 time 0.3795 (0.2923) loss 2.4512 (3.2010) grad_norm 2.0649 (2.2730) [2022-10-01 08:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][900/1251] eta 0:01:42 lr 0.000156 time 0.2883 (0.2918) loss 3.5077 (3.2105) grad_norm 2.3564 (2.2747) [2022-10-01 08:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1000/1251] eta 0:01:13 lr 0.000156 time 0.2859 (0.2914) loss 3.5799 (3.2120) grad_norm 2.2470 (2.2802) [2022-10-01 08:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1100/1251] eta 0:00:43 lr 0.000155 time 0.2874 (0.2911) loss 3.6183 (3.2094) grad_norm 2.3700 (2.2830) [2022-10-01 08:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1200/1251] eta 0:00:14 lr 0.000155 time 0.2860 (0.2908) loss 3.6770 (3.2072) grad_norm 2.1418 (2.2826) [2022-10-01 08:11:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 224 training takes 0:06:04 [2022-10-01 08:11:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.460 (2.460) Loss 0.8618 (0.8618) Acc@1 80.078 (80.078) Acc@5 95.312 (95.312) [2022-10-01 08:11:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.200 Acc@5 94.860 [2022-10-01 08:11:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-01 08:11:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.21% [2022-10-01 08:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][0/1251] eta 1:01:19 lr 0.000155 time 2.9414 (2.9414) loss 2.3957 (2.3957) grad_norm 1.8954 (1.8954) [2022-10-01 08:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][100/1251] eta 0:06:07 lr 0.000155 time 0.2958 (0.3196) loss 3.3838 (3.0933) grad_norm 2.2045 (2.2816) [2022-10-01 08:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][200/1251] eta 0:05:21 lr 0.000154 time 0.2941 (0.3061) loss 3.0551 (3.1163) grad_norm 1.8857 (2.2559) [2022-10-01 08:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][300/1251] eta 0:04:46 lr 0.000154 time 0.3022 (0.3011) loss 2.1661 (3.1271) grad_norm 2.3197 (2.2804) [2022-10-01 08:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][400/1251] eta 0:04:13 lr 0.000154 time 0.2893 (0.2985) loss 3.4076 (3.1068) grad_norm 2.4819 (2.2763) [2022-10-01 08:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][500/1251] eta 0:03:43 lr 0.000154 time 0.3832 (0.2973) loss 3.2985 (3.1264) grad_norm 2.0116 (2.2746) [2022-10-01 08:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][600/1251] eta 0:03:12 lr 0.000153 time 0.2908 (0.2963) loss 2.2782 (3.1338) grad_norm 2.6496 (2.2841) [2022-10-01 08:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][700/1251] eta 0:02:42 lr 0.000153 time 0.3041 (0.2955) loss 1.9386 (3.1437) grad_norm 2.6415 (2.2886) [2022-10-01 08:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][800/1251] eta 0:02:13 lr 0.000153 time 0.2907 (0.2950) loss 3.2049 (3.1481) grad_norm 2.2931 (2.2932) [2022-10-01 08:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][900/1251] eta 0:01:43 lr 0.000152 time 0.2956 (0.2945) loss 2.4606 (3.1552) grad_norm 2.0666 (2.2928) [2022-10-01 08:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1000/1251] eta 0:01:13 lr 0.000152 time 0.3855 (0.2943) loss 2.7806 (3.1624) grad_norm 2.3472 (2.2959) [2022-10-01 08:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1100/1251] eta 0:00:44 lr 0.000152 time 0.2948 (0.2940) loss 3.1713 (3.1744) grad_norm 2.5963 (2.2868) [2022-10-01 08:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1200/1251] eta 0:00:14 lr 0.000151 time 0.2882 (0.2938) loss 3.5061 (3.1749) grad_norm 1.9541 (2.2877) [2022-10-01 08:17:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 225 training takes 0:06:07 [2022-10-01 08:18:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.672 (2.672) Loss 0.8834 (0.8834) Acc@1 79.199 (79.199) Acc@5 94.629 (94.629) [2022-10-01 08:18:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.186 Acc@5 94.878 [2022-10-01 08:18:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-01 08:18:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.21% [2022-10-01 08:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][0/1251] eta 0:56:43 lr 0.000151 time 2.7205 (2.7205) loss 3.6728 (3.6728) grad_norm 2.0665 (2.0665) [2022-10-01 08:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][100/1251] eta 0:06:03 lr 0.000151 time 0.2897 (0.3159) loss 2.3961 (3.1743) grad_norm 2.2699 (2.2675) [2022-10-01 08:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][200/1251] eta 0:05:18 lr 0.000151 time 0.3873 (0.3034) loss 3.6980 (3.1904) grad_norm 2.4226 (2.2598) [2022-10-01 08:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][300/1251] eta 0:04:44 lr 0.000150 time 0.2887 (0.2991) loss 3.2819 (3.1807) grad_norm 2.1144 (2.2550) [2022-10-01 08:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][400/1251] eta 0:04:12 lr 0.000150 time 0.2870 (0.2968) loss 3.2554 (3.1866) grad_norm 2.4697 (2.2685) [2022-10-01 08:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][500/1251] eta 0:03:41 lr 0.000150 time 0.2870 (0.2955) loss 3.3480 (3.1865) grad_norm 2.2095 (2.2737) [2022-10-01 08:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][600/1251] eta 0:03:11 lr 0.000150 time 0.2889 (0.2946) loss 3.2442 (3.1736) grad_norm 2.0553 (2.2814) [2022-10-01 08:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][700/1251] eta 0:02:42 lr 0.000149 time 0.3826 (0.2941) loss 3.5421 (3.1853) grad_norm 2.0239 (2.2802) [2022-10-01 08:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][800/1251] eta 0:02:12 lr 0.000149 time 0.2879 (0.2934) loss 3.7593 (3.1960) grad_norm 2.1892 (2.2829) [2022-10-01 08:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][900/1251] eta 0:01:42 lr 0.000149 time 0.2893 (0.2930) loss 3.1340 (3.2095) grad_norm 2.0392 (2.2778) [2022-10-01 08:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1000/1251] eta 0:01:13 lr 0.000148 time 0.2878 (0.2926) loss 2.8114 (3.2066) grad_norm 2.1597 (2.2770) [2022-10-01 08:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1100/1251] eta 0:00:44 lr 0.000148 time 0.2901 (0.2923) loss 3.6237 (3.2040) grad_norm 2.0641 (2.2764) [2022-10-01 08:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1200/1251] eta 0:00:14 lr 0.000148 time 0.3798 (0.2920) loss 3.5689 (3.2036) grad_norm 2.3408 (2.2755) [2022-10-01 08:24:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 226 training takes 0:06:05 [2022-10-01 08:24:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.067 (3.067) Loss 0.8802 (0.8802) Acc@1 79.004 (79.004) Acc@5 95.605 (95.605) [2022-10-01 08:24:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.422 Acc@5 94.972 [2022-10-01 08:24:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-01 08:24:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.42% [2022-10-01 08:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][0/1251] eta 0:45:38 lr 0.000148 time 2.1892 (2.1892) loss 2.4191 (2.4191) grad_norm 2.1865 (2.1865) [2022-10-01 08:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][100/1251] eta 0:06:01 lr 0.000147 time 0.2861 (0.3144) loss 2.8020 (3.2050) grad_norm 2.3290 (2.3234) [2022-10-01 08:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][200/1251] eta 0:05:16 lr 0.000147 time 0.2859 (0.3014) loss 2.9605 (3.2138) grad_norm 2.0990 (2.3388) [2022-10-01 08:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][300/1251] eta 0:04:42 lr 0.000147 time 0.2895 (0.2971) loss 3.2517 (3.1952) grad_norm 2.4054 (2.3149) [2022-10-01 08:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][400/1251] eta 0:04:11 lr 0.000147 time 0.3800 (0.2952) loss 2.7473 (3.1824) grad_norm 2.4005 (2.3198) [2022-10-01 08:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][500/1251] eta 0:03:40 lr 0.000146 time 0.2862 (0.2937) loss 3.7160 (3.1885) grad_norm 2.0330 (2.3190) [2022-10-01 08:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][600/1251] eta 0:03:10 lr 0.000146 time 0.2885 (0.2928) loss 3.7846 (3.1833) grad_norm 2.0987 (2.3055) [2022-10-01 08:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][700/1251] eta 0:02:41 lr 0.000146 time 0.2875 (0.2922) loss 3.7649 (3.1890) grad_norm 2.4490 (2.2988) [2022-10-01 08:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][800/1251] eta 0:02:11 lr 0.000145 time 0.2888 (0.2918) loss 3.2186 (3.1944) grad_norm 2.3057 (2.2971) [2022-10-01 08:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][900/1251] eta 0:01:42 lr 0.000145 time 0.3817 (0.2914) loss 3.2048 (3.1981) grad_norm 2.1708 (2.2953) [2022-10-01 08:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1000/1251] eta 0:01:13 lr 0.000145 time 0.2847 (0.2911) loss 3.6072 (3.1870) grad_norm 2.3748 (2.2973) [2022-10-01 08:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1100/1251] eta 0:00:43 lr 0.000145 time 0.2875 (0.2909) loss 2.6535 (3.1897) grad_norm 2.2414 (2.2971) [2022-10-01 08:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1200/1251] eta 0:00:14 lr 0.000144 time 0.2896 (0.2907) loss 2.2454 (3.1894) grad_norm 2.3755 (2.2977) [2022-10-01 08:30:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 227 training takes 0:06:03 [2022-10-01 08:30:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.926 (2.926) Loss 0.8672 (0.8672) Acc@1 79.395 (79.395) Acc@5 94.922 (94.922) [2022-10-01 08:30:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.460 Acc@5 94.980 [2022-10-01 08:30:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-01 08:30:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.46% [2022-10-01 08:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][0/1251] eta 0:58:58 lr 0.000144 time 2.8288 (2.8288) loss 3.8397 (3.8397) grad_norm 2.5354 (2.5354) [2022-10-01 08:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][100/1251] eta 0:06:05 lr 0.000144 time 0.3821 (0.3177) loss 3.5379 (3.2194) grad_norm 2.0627 (2.2786) [2022-10-01 08:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][200/1251] eta 0:05:19 lr 0.000144 time 0.2889 (0.3036) loss 2.4088 (3.1798) grad_norm 2.0340 (2.2892) [2022-10-01 08:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][300/1251] eta 0:04:44 lr 0.000143 time 0.2868 (0.2987) loss 4.0078 (3.2171) grad_norm 2.0605 (2.3102) [2022-10-01 08:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][400/1251] eta 0:04:12 lr 0.000143 time 0.2893 (0.2964) loss 3.7053 (3.2008) grad_norm 2.3304 (2.3065) [2022-10-01 08:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][500/1251] eta 0:03:41 lr 0.000143 time 0.2856 (0.2948) loss 3.1301 (3.1981) grad_norm 2.1835 (2.3224) [2022-10-01 08:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][600/1251] eta 0:03:11 lr 0.000142 time 0.3830 (0.2941) loss 3.4367 (3.1989) grad_norm 2.3296 (2.3203) [2022-10-01 08:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][700/1251] eta 0:02:41 lr 0.000142 time 0.2872 (0.2934) loss 3.4925 (3.2062) grad_norm 2.0211 (2.3146) [2022-10-01 08:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][800/1251] eta 0:02:12 lr 0.000142 time 0.2912 (0.2929) loss 3.6229 (3.1920) grad_norm 2.3421 (2.3175) [2022-10-01 08:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][900/1251] eta 0:01:42 lr 0.000142 time 0.2854 (0.2924) loss 2.0370 (3.1911) grad_norm 2.2471 (2.3136) [2022-10-01 08:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1000/1251] eta 0:01:13 lr 0.000141 time 0.2875 (0.2921) loss 2.1916 (3.1958) grad_norm 2.2546 (2.3152) [2022-10-01 08:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1100/1251] eta 0:00:44 lr 0.000141 time 0.3827 (0.2918) loss 3.5209 (3.2012) grad_norm 1.9535 (2.3159) [2022-10-01 08:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1200/1251] eta 0:00:14 lr 0.000141 time 0.2864 (0.2915) loss 3.7189 (3.2016) grad_norm 2.2989 (2.3139) [2022-10-01 08:36:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 228 training takes 0:06:04 [2022-10-01 08:36:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.163 (3.163) Loss 0.8600 (0.8600) Acc@1 78.906 (78.906) Acc@5 95.605 (95.605) [2022-10-01 08:37:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.496 Acc@5 95.012 [2022-10-01 08:37:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-01 08:37:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.50% [2022-10-01 08:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][0/1251] eta 1:07:59 lr 0.000141 time 3.2606 (3.2606) loss 2.4561 (2.4561) grad_norm 2.0467 (2.0467) [2022-10-01 08:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][100/1251] eta 0:06:10 lr 0.000140 time 0.2894 (0.3215) loss 3.4724 (3.1372) grad_norm 2.7516 (2.3431) [2022-10-01 08:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][200/1251] eta 0:05:22 lr 0.000140 time 0.2907 (0.3065) loss 3.0049 (3.1511) grad_norm 2.4736 (2.3157) [2022-10-01 08:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][300/1251] eta 0:04:46 lr 0.000140 time 0.3824 (0.3014) loss 3.9466 (3.1634) grad_norm 2.6188 (2.3375) [2022-10-01 08:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][400/1251] eta 0:04:14 lr 0.000140 time 0.2906 (0.2985) loss 3.1057 (3.1492) grad_norm 2.7116 (2.3347) [2022-10-01 08:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][500/1251] eta 0:03:42 lr 0.000139 time 0.2850 (0.2967) loss 2.9728 (3.1449) grad_norm 2.0995 (2.3270) [2022-10-01 08:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][600/1251] eta 0:03:12 lr 0.000139 time 0.2911 (0.2955) loss 3.5317 (3.1544) grad_norm 2.3178 (2.3278) [2022-10-01 08:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][700/1251] eta 0:02:42 lr 0.000139 time 0.2885 (0.2946) loss 3.4665 (3.1638) grad_norm 2.2493 (2.3181) [2022-10-01 08:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][800/1251] eta 0:02:12 lr 0.000138 time 0.3828 (0.2940) loss 2.8952 (3.1490) grad_norm 2.4458 (2.3157) [2022-10-01 08:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][900/1251] eta 0:01:43 lr 0.000138 time 0.2859 (0.2935) loss 3.1866 (3.1507) grad_norm 3.1341 (2.3242) [2022-10-01 08:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1000/1251] eta 0:01:13 lr 0.000138 time 0.2871 (0.2931) loss 2.7056 (3.1511) grad_norm 2.5868 (2.3245) [2022-10-01 08:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1100/1251] eta 0:00:44 lr 0.000138 time 0.2886 (0.2927) loss 3.1970 (3.1602) grad_norm 2.5272 (2.3227) [2022-10-01 08:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1200/1251] eta 0:00:14 lr 0.000137 time 0.2878 (0.2925) loss 3.4348 (3.1675) grad_norm 2.2128 (2.3204) [2022-10-01 08:43:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 229 training takes 0:06:06 [2022-10-01 08:43:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.925 (2.925) Loss 0.8420 (0.8420) Acc@1 79.199 (79.199) Acc@5 95.312 (95.312) [2022-10-01 08:43:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.518 Acc@5 94.980 [2022-10-01 08:43:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-01 08:43:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.52% [2022-10-01 08:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][0/1251] eta 1:02:58 lr 0.000137 time 3.0204 (3.0204) loss 3.4575 (3.4575) grad_norm 2.0172 (2.0172) [2022-10-01 08:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][100/1251] eta 0:06:07 lr 0.000137 time 0.2882 (0.3191) loss 2.6429 (3.1173) grad_norm 2.1952 (2.3181) [2022-10-01 08:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][200/1251] eta 0:05:20 lr 0.000137 time 0.2895 (0.3047) loss 3.6821 (3.1226) grad_norm 2.3421 (2.3035) [2022-10-01 08:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][300/1251] eta 0:04:45 lr 0.000136 time 0.2896 (0.2999) loss 3.4070 (3.1107) grad_norm 2.2998 (2.3004) [2022-10-01 08:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][400/1251] eta 0:04:13 lr 0.000136 time 0.2900 (0.2975) loss 3.6030 (3.1185) grad_norm 2.1494 (2.3124) [2022-10-01 08:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][500/1251] eta 0:03:42 lr 0.000136 time 0.3840 (0.2964) loss 3.5732 (3.1176) grad_norm 2.1762 (2.3111) [2022-10-01 08:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][600/1251] eta 0:03:12 lr 0.000135 time 0.2897 (0.2955) loss 3.3390 (3.1328) grad_norm 2.2327 (2.3194) [2022-10-01 08:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][700/1251] eta 0:02:42 lr 0.000135 time 0.2896 (0.2949) loss 3.3360 (3.1278) grad_norm 2.2008 (2.3331) [2022-10-01 08:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][800/1251] eta 0:02:12 lr 0.000135 time 0.2923 (0.2944) loss 2.8801 (3.1343) grad_norm 2.4518 (2.3313) [2022-10-01 08:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][900/1251] eta 0:01:43 lr 0.000135 time 0.2903 (0.2940) loss 3.7561 (3.1290) grad_norm 2.3669 (2.3367) [2022-10-01 08:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1000/1251] eta 0:01:13 lr 0.000134 time 0.3798 (0.2938) loss 3.5472 (3.1386) grad_norm 2.5150 (2.3358) [2022-10-01 08:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1100/1251] eta 0:00:44 lr 0.000134 time 0.2911 (0.2936) loss 3.0228 (3.1413) grad_norm 2.1055 (2.3325) [2022-10-01 08:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1200/1251] eta 0:00:14 lr 0.000134 time 0.2895 (0.2933) loss 3.6264 (3.1434) grad_norm 2.4118 (2.3358) [2022-10-01 08:49:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 230 training takes 0:06:07 [2022-10-01 08:49:30 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saving...... [2022-10-01 08:49:30 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saved !!! [2022-10-01 08:49:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.686 (2.686) Loss 0.8963 (0.8963) Acc@1 79.492 (79.492) Acc@5 93.555 (93.555) [2022-10-01 08:49:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.544 Acc@5 94.936 [2022-10-01 08:49:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-01 08:49:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.54% [2022-10-01 08:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][0/1251] eta 0:50:15 lr 0.000134 time 2.4102 (2.4102) loss 2.9860 (2.9860) grad_norm 2.2327 (2.2327) [2022-10-01 08:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][100/1251] eta 0:06:03 lr 0.000133 time 0.2940 (0.3155) loss 3.4527 (3.2548) grad_norm 2.1680 (2.3356) [2022-10-01 08:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][200/1251] eta 0:05:18 lr 0.000133 time 0.3871 (0.3032) loss 3.0842 (3.2427) grad_norm 2.4113 (2.3364) [2022-10-01 08:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][300/1251] eta 0:04:44 lr 0.000133 time 0.2899 (0.2989) loss 3.6327 (3.1976) grad_norm 2.5231 (2.3826) [2022-10-01 08:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][400/1251] eta 0:04:12 lr 0.000133 time 0.2935 (0.2966) loss 3.5231 (3.1913) grad_norm 2.0191 (2.3656) [2022-10-01 08:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][500/1251] eta 0:03:41 lr 0.000132 time 0.2911 (0.2952) loss 3.8474 (3.1891) grad_norm 2.7037 (2.3661) [2022-10-01 08:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][600/1251] eta 0:03:11 lr 0.000132 time 0.2869 (0.2941) loss 3.4117 (3.1837) grad_norm 1.9425 (2.3716) [2022-10-01 08:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][700/1251] eta 0:02:41 lr 0.000132 time 0.3823 (0.2934) loss 2.4808 (3.1797) grad_norm 2.5152 (2.3689) [2022-10-01 08:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][800/1251] eta 0:02:12 lr 0.000132 time 0.2868 (0.2928) loss 3.2782 (3.1857) grad_norm 2.1796 (2.3690) [2022-10-01 08:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][900/1251] eta 0:01:42 lr 0.000131 time 0.2884 (0.2922) loss 3.3450 (3.1748) grad_norm 2.1396 (2.3694) [2022-10-01 08:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1000/1251] eta 0:01:13 lr 0.000131 time 0.2859 (0.2918) loss 3.3672 (3.1705) grad_norm 2.3489 (2.3771) [2022-10-01 08:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1100/1251] eta 0:00:44 lr 0.000131 time 0.2900 (0.2915) loss 2.3315 (3.1643) grad_norm 2.1852 (2.3705) [2022-10-01 08:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1200/1251] eta 0:00:14 lr 0.000130 time 0.3835 (0.2913) loss 3.3188 (3.1692) grad_norm 2.0069 (2.3707) [2022-10-01 08:55:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 231 training takes 0:06:04 [2022-10-01 08:55:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.738 (2.738) Loss 0.9365 (0.9365) Acc@1 77.051 (77.051) Acc@5 95.215 (95.215) [2022-10-01 08:56:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.590 Acc@5 94.918 [2022-10-01 08:56:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-01 08:56:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.59% [2022-10-01 08:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][0/1251] eta 0:51:23 lr 0.000130 time 2.4652 (2.4652) loss 3.5203 (3.5203) grad_norm 2.1865 (2.1865) [2022-10-01 08:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][100/1251] eta 0:06:04 lr 0.000130 time 0.2881 (0.3168) loss 3.0355 (3.2178) grad_norm 2.4453 (2.3335) [2022-10-01 08:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][200/1251] eta 0:05:18 lr 0.000130 time 0.2942 (0.3028) loss 3.5574 (3.1553) grad_norm 2.1814 (2.3177) [2022-10-01 08:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][300/1251] eta 0:04:43 lr 0.000129 time 0.2860 (0.2983) loss 2.2536 (3.1629) grad_norm 2.6826 (2.3381) [2022-10-01 08:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][400/1251] eta 0:04:12 lr 0.000129 time 0.3890 (0.2963) loss 2.9670 (3.1606) grad_norm 2.5655 (2.3448) [2022-10-01 08:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][500/1251] eta 0:03:41 lr 0.000129 time 0.2884 (0.2949) loss 3.2453 (3.1619) grad_norm 2.4242 (2.3416) [2022-10-01 08:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][600/1251] eta 0:03:11 lr 0.000129 time 0.2998 (0.2939) loss 3.6969 (3.1640) grad_norm 2.3790 (2.3466) [2022-10-01 08:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][700/1251] eta 0:02:41 lr 0.000128 time 0.2867 (0.2933) loss 3.3622 (3.1713) grad_norm 2.1347 (2.3438) [2022-10-01 08:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][800/1251] eta 0:02:12 lr 0.000128 time 0.2957 (0.2928) loss 3.4535 (3.1625) grad_norm 2.0643 (2.3387) [2022-10-01 09:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][900/1251] eta 0:01:42 lr 0.000128 time 0.3828 (0.2924) loss 2.3968 (3.1494) grad_norm 2.1507 (2.3376) [2022-10-01 09:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1000/1251] eta 0:01:13 lr 0.000128 time 0.2921 (0.2921) loss 3.6903 (3.1540) grad_norm 2.6395 (2.3387) [2022-10-01 09:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1100/1251] eta 0:00:44 lr 0.000127 time 0.2916 (0.2919) loss 3.0059 (3.1547) grad_norm 2.4200 (2.3425) [2022-10-01 09:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1200/1251] eta 0:00:14 lr 0.000127 time 0.2985 (0.2917) loss 3.5194 (3.1493) grad_norm 2.5703 (2.3500) [2022-10-01 09:02:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 232 training takes 0:06:05 [2022-10-01 09:02:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.853 (2.853) Loss 0.8443 (0.8443) Acc@1 79.492 (79.492) Acc@5 95.703 (95.703) [2022-10-01 09:02:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.766 Acc@5 95.072 [2022-10-01 09:02:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-01 09:02:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-01 09:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][0/1251] eta 1:07:10 lr 0.000127 time 3.2217 (3.2217) loss 3.2914 (3.2914) grad_norm 2.1197 (2.1197) [2022-10-01 09:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][100/1251] eta 0:06:11 lr 0.000127 time 0.3842 (0.3227) loss 2.2317 (3.2307) grad_norm 2.3066 (2.4054) [2022-10-01 09:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][200/1251] eta 0:05:23 lr 0.000126 time 0.2909 (0.3074) loss 3.7617 (3.2204) grad_norm 2.2756 (2.3918) [2022-10-01 09:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][300/1251] eta 0:04:47 lr 0.000126 time 0.2901 (0.3024) loss 3.5287 (3.1926) grad_norm 2.7655 (2.4083) [2022-10-01 09:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][400/1251] eta 0:04:15 lr 0.000126 time 0.2923 (0.2997) loss 3.5798 (3.1964) grad_norm 2.7231 (2.4012) [2022-10-01 09:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][500/1251] eta 0:03:44 lr 0.000126 time 0.2950 (0.2983) loss 3.1895 (3.2047) grad_norm 2.2395 (2.4115) [2022-10-01 09:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][600/1251] eta 0:03:13 lr 0.000125 time 0.3927 (0.2975) loss 2.9709 (3.1995) grad_norm 2.4816 (2.4019) [2022-10-01 09:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][700/1251] eta 0:02:43 lr 0.000125 time 0.2948 (0.2966) loss 2.9610 (3.1918) grad_norm 2.2408 (2.3961) [2022-10-01 09:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][800/1251] eta 0:02:13 lr 0.000125 time 0.2931 (0.2959) loss 3.4587 (3.1844) grad_norm 2.2381 (2.3929) [2022-10-01 09:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][900/1251] eta 0:01:43 lr 0.000125 time 0.2903 (0.2952) loss 3.4796 (3.1810) grad_norm 2.8621 (2.4056) [2022-10-01 09:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1000/1251] eta 0:01:13 lr 0.000124 time 0.2934 (0.2947) loss 2.5944 (3.1886) grad_norm 2.4169 (2.4136) [2022-10-01 09:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1100/1251] eta 0:00:44 lr 0.000124 time 0.3893 (0.2944) loss 3.4789 (3.1948) grad_norm 2.2180 (2.4092) [2022-10-01 09:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1200/1251] eta 0:00:14 lr 0.000124 time 0.2930 (0.2940) loss 3.6137 (3.1967) grad_norm 2.1613 (2.4089) [2022-10-01 09:08:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 233 training takes 0:06:07 [2022-10-01 09:08:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.000 (3.000) Loss 0.9755 (0.9755) Acc@1 76.660 (76.660) Acc@5 93.457 (93.457) [2022-10-01 09:08:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.654 Acc@5 95.044 [2022-10-01 09:08:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-01 09:08:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-01 09:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][0/1251] eta 1:02:10 lr 0.000124 time 2.9821 (2.9821) loss 3.6483 (3.6483) grad_norm 2.9193 (2.9193) [2022-10-01 09:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][100/1251] eta 0:06:07 lr 0.000123 time 0.2917 (0.3194) loss 3.8086 (3.2911) grad_norm 2.2628 (2.3802) [2022-10-01 09:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][200/1251] eta 0:05:20 lr 0.000123 time 0.2882 (0.3054) loss 2.5580 (3.1936) grad_norm 2.2120 (2.3969) [2022-10-01 09:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][300/1251] eta 0:04:46 lr 0.000123 time 0.3890 (0.3010) loss 3.5620 (3.1648) grad_norm 2.2315 (2.3982) [2022-10-01 09:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][400/1251] eta 0:04:14 lr 0.000123 time 0.2882 (0.2989) loss 2.7331 (3.1712) grad_norm 2.4957 (2.3991) [2022-10-01 09:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][500/1251] eta 0:03:43 lr 0.000122 time 0.2907 (0.2974) loss 3.4808 (3.1618) grad_norm 2.3458 (2.4015) [2022-10-01 09:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][600/1251] eta 0:03:12 lr 0.000122 time 0.2872 (0.2964) loss 3.7593 (3.1663) grad_norm 2.5408 (2.4088) [2022-10-01 09:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][700/1251] eta 0:02:43 lr 0.000122 time 0.2941 (0.2959) loss 3.3815 (3.1629) grad_norm 2.2463 (2.4065) [2022-10-01 09:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][800/1251] eta 0:02:13 lr 0.000121 time 0.3818 (0.2955) loss 2.5403 (3.1659) grad_norm 2.1955 (2.4031) [2022-10-01 09:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][900/1251] eta 0:01:43 lr 0.000121 time 0.2931 (0.2951) loss 3.6019 (3.1690) grad_norm 2.1133 (2.4072) [2022-10-01 09:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1000/1251] eta 0:01:13 lr 0.000121 time 0.2861 (0.2948) loss 3.4666 (3.1676) grad_norm 2.4114 (2.4130) [2022-10-01 09:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1100/1251] eta 0:00:44 lr 0.000121 time 0.2912 (0.2945) loss 3.4363 (3.1692) grad_norm 2.6267 (2.4206) [2022-10-01 09:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1200/1251] eta 0:00:15 lr 0.000120 time 0.2860 (0.2942) loss 3.3954 (3.1656) grad_norm 2.5581 (2.4232) [2022-10-01 09:14:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 234 training takes 0:06:08 [2022-10-01 09:14:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.280 (3.280) Loss 0.8198 (0.8198) Acc@1 81.152 (81.152) Acc@5 94.824 (94.824) [2022-10-01 09:15:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.648 Acc@5 95.058 [2022-10-01 09:15:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-01 09:15:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-01 09:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][0/1251] eta 1:04:12 lr 0.000120 time 3.0799 (3.0799) loss 3.4315 (3.4315) grad_norm 2.5050 (2.5050) [2022-10-01 09:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][100/1251] eta 0:06:06 lr 0.000120 time 0.2874 (0.3183) loss 3.5555 (3.1346) grad_norm 2.2587 (2.3717) [2022-10-01 09:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][200/1251] eta 0:05:19 lr 0.000120 time 0.2888 (0.3037) loss 2.7327 (3.1490) grad_norm 2.1072 (2.4000) [2022-10-01 09:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][300/1251] eta 0:04:44 lr 0.000120 time 0.2909 (0.2987) loss 3.5757 (3.1592) grad_norm 2.2484 (2.3942) [2022-10-01 09:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][400/1251] eta 0:04:12 lr 0.000119 time 0.2848 (0.2963) loss 2.8845 (3.1411) grad_norm 2.4344 (2.4010) [2022-10-01 09:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][500/1251] eta 0:03:41 lr 0.000119 time 0.3802 (0.2951) loss 2.6746 (3.1328) grad_norm 2.3460 (2.4019) [2022-10-01 09:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][600/1251] eta 0:03:11 lr 0.000119 time 0.2882 (0.2943) loss 3.1704 (3.1435) grad_norm 2.6249 (2.4062) [2022-10-01 09:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][700/1251] eta 0:02:41 lr 0.000118 time 0.2881 (0.2936) loss 3.3761 (3.1453) grad_norm 2.7689 (2.4052) [2022-10-01 09:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][800/1251] eta 0:02:12 lr 0.000118 time 0.2888 (0.2931) loss 3.4344 (3.1467) grad_norm 2.1895 (2.4063) [2022-10-01 09:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][900/1251] eta 0:01:42 lr 0.000118 time 0.2870 (0.2927) loss 3.7981 (3.1541) grad_norm 3.2411 (2.4162) [2022-10-01 09:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1000/1251] eta 0:01:13 lr 0.000118 time 0.3831 (0.2926) loss 3.0224 (3.1505) grad_norm 2.6222 (2.4151) [2022-10-01 09:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1100/1251] eta 0:00:44 lr 0.000117 time 0.2863 (0.2923) loss 3.0993 (3.1551) grad_norm 2.3610 (2.4218) [2022-10-01 09:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1200/1251] eta 0:00:14 lr 0.000117 time 0.2896 (0.2922) loss 2.1821 (3.1550) grad_norm 3.0651 (2.4307) [2022-10-01 09:21:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 235 training takes 0:06:05 [2022-10-01 09:21:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.786 (2.786) Loss 0.9112 (0.9112) Acc@1 78.125 (78.125) Acc@5 94.336 (94.336) [2022-10-01 09:21:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.754 Acc@5 94.974 [2022-10-01 09:21:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-01 09:21:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-01 09:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][0/1251] eta 0:52:09 lr 0.000117 time 2.5017 (2.5017) loss 3.8840 (3.8840) grad_norm 2.6586 (2.6586) [2022-10-01 09:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][100/1251] eta 0:05:59 lr 0.000117 time 0.2903 (0.3123) loss 2.7202 (3.1840) grad_norm 2.1024 (2.4300) [2022-10-01 09:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][200/1251] eta 0:05:15 lr 0.000117 time 0.3791 (0.3006) loss 3.5555 (3.1542) grad_norm 2.7167 (2.4493) [2022-10-01 09:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][300/1251] eta 0:04:41 lr 0.000116 time 0.2889 (0.2962) loss 3.5379 (3.1472) grad_norm 2.5929 (2.4732) [2022-10-01 09:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][400/1251] eta 0:04:10 lr 0.000116 time 0.2902 (0.2940) loss 3.4004 (3.1477) grad_norm 2.4078 (2.4631) [2022-10-01 09:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][500/1251] eta 0:03:39 lr 0.000116 time 0.2899 (0.2928) loss 2.5501 (3.1420) grad_norm 2.1091 (2.4636) [2022-10-01 09:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][600/1251] eta 0:03:10 lr 0.000116 time 0.2875 (0.2919) loss 3.3446 (3.1384) grad_norm 2.5268 (2.4730) [2022-10-01 09:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][700/1251] eta 0:02:40 lr 0.000115 time 0.3838 (0.2915) loss 3.4764 (3.1331) grad_norm 2.5158 (2.4665) [2022-10-01 09:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][800/1251] eta 0:02:11 lr 0.000115 time 0.2847 (0.2910) loss 3.2550 (3.1338) grad_norm 2.4636 (2.4622) [2022-10-01 09:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][900/1251] eta 0:01:41 lr 0.000115 time 0.2871 (0.2905) loss 2.8313 (3.1299) grad_norm 2.7521 (2.4625) [2022-10-01 09:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1000/1251] eta 0:01:12 lr 0.000115 time 0.2863 (0.2902) loss 3.7045 (3.1357) grad_norm 2.3771 (2.4600) [2022-10-01 09:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1100/1251] eta 0:00:43 lr 0.000114 time 0.2896 (0.2899) loss 3.0704 (3.1389) grad_norm 2.4379 (2.4582) [2022-10-01 09:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1200/1251] eta 0:00:14 lr 0.000114 time 0.3775 (0.2898) loss 3.6440 (3.1450) grad_norm 2.1714 (2.4552) [2022-10-01 09:27:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 236 training takes 0:06:02 [2022-10-01 09:27:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.105 (3.105) Loss 0.9071 (0.9071) Acc@1 81.055 (81.055) Acc@5 94.141 (94.141) [2022-10-01 09:27:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.700 Acc@5 95.026 [2022-10-01 09:27:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-01 09:27:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-01 09:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][0/1251] eta 1:11:54 lr 0.000114 time 3.4492 (3.4492) loss 3.1007 (3.1007) grad_norm 3.2106 (3.2106) [2022-10-01 09:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][100/1251] eta 0:06:10 lr 0.000114 time 0.2887 (0.3217) loss 2.6740 (3.0791) grad_norm 2.3145 (2.4518) [2022-10-01 09:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][200/1251] eta 0:05:21 lr 0.000113 time 0.2871 (0.3059) loss 3.0936 (3.0911) grad_norm 2.2817 (2.4747) [2022-10-01 09:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][300/1251] eta 0:04:45 lr 0.000113 time 0.2884 (0.3006) loss 3.6577 (3.1115) grad_norm 3.0009 (2.4839) [2022-10-01 09:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][400/1251] eta 0:04:13 lr 0.000113 time 0.3811 (0.2983) loss 2.9769 (3.1260) grad_norm 2.8926 (2.4904) [2022-10-01 09:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][500/1251] eta 0:03:42 lr 0.000113 time 0.2891 (0.2966) loss 3.1227 (3.1280) grad_norm 2.1886 (2.4903) [2022-10-01 09:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][600/1251] eta 0:03:12 lr 0.000112 time 0.2907 (0.2953) loss 3.5080 (3.1374) grad_norm 2.3123 (2.4848) [2022-10-01 09:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][700/1251] eta 0:02:42 lr 0.000112 time 0.2938 (0.2945) loss 3.6126 (3.1351) grad_norm 2.4496 (2.4775) [2022-10-01 09:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][800/1251] eta 0:02:12 lr 0.000112 time 0.2874 (0.2939) loss 3.7984 (3.1334) grad_norm 2.3712 (2.4748) [2022-10-01 09:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][900/1251] eta 0:01:43 lr 0.000112 time 0.3790 (0.2935) loss 2.1627 (3.1340) grad_norm 2.4543 (2.4671) [2022-10-01 09:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1000/1251] eta 0:01:13 lr 0.000111 time 0.2874 (0.2931) loss 2.9582 (3.1361) grad_norm 2.8533 (2.4675) [2022-10-01 09:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1100/1251] eta 0:00:44 lr 0.000111 time 0.2873 (0.2927) loss 2.9430 (3.1326) grad_norm 2.2031 (2.4576) [2022-10-01 09:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1200/1251] eta 0:00:14 lr 0.000111 time 0.2864 (0.2924) loss 3.4431 (3.1339) grad_norm 2.2730 (2.4555) [2022-10-01 09:33:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 237 training takes 0:06:06 [2022-10-01 09:33:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.946 (2.946) Loss 0.8531 (0.8531) Acc@1 80.957 (80.957) Acc@5 94.141 (94.141) [2022-10-01 09:33:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.836 Acc@5 95.052 [2022-10-01 09:33:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-01 09:33:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.84% [2022-10-01 09:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][0/1251] eta 1:08:17 lr 0.000111 time 3.2753 (3.2753) loss 3.3982 (3.3982) grad_norm 2.3088 (2.3088) [2022-10-01 09:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][100/1251] eta 0:06:09 lr 0.000110 time 0.3843 (0.3210) loss 3.4593 (3.2074) grad_norm 2.5003 (2.4353) [2022-10-01 09:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][200/1251] eta 0:05:21 lr 0.000110 time 0.2892 (0.3057) loss 3.1850 (3.1648) grad_norm 2.2391 (2.4533) [2022-10-01 09:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][300/1251] eta 0:04:45 lr 0.000110 time 0.2877 (0.3003) loss 3.5005 (3.1536) grad_norm 2.3102 (2.4520) [2022-10-01 09:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][400/1251] eta 0:04:13 lr 0.000110 time 0.2893 (0.2977) loss 3.4106 (3.1746) grad_norm 2.2857 (2.4433) [2022-10-01 09:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][500/1251] eta 0:03:42 lr 0.000109 time 0.2896 (0.2963) loss 3.3279 (3.1410) grad_norm 2.4866 (2.4545) [2022-10-01 09:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][600/1251] eta 0:03:12 lr 0.000109 time 0.3779 (0.2955) loss 2.4084 (3.1428) grad_norm 2.5260 (2.4493) [2022-10-01 09:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][700/1251] eta 0:02:42 lr 0.000109 time 0.2878 (0.2947) loss 3.3477 (3.1403) grad_norm 2.2546 (2.4604) [2022-10-01 09:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][800/1251] eta 0:02:12 lr 0.000109 time 0.2920 (0.2941) loss 3.5516 (3.1478) grad_norm 2.5384 (2.4674) [2022-10-01 09:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][900/1251] eta 0:01:43 lr 0.000108 time 0.2930 (0.2937) loss 3.2213 (3.1461) grad_norm 3.0242 (2.4713) [2022-10-01 09:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1000/1251] eta 0:01:13 lr 0.000108 time 0.2897 (0.2934) loss 3.4461 (3.1520) grad_norm 2.3506 (2.4734) [2022-10-01 09:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1100/1251] eta 0:00:44 lr 0.000108 time 0.3896 (0.2933) loss 3.4183 (3.1536) grad_norm 2.5307 (2.4741) [2022-10-01 09:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1200/1251] eta 0:00:14 lr 0.000108 time 0.2902 (0.2930) loss 3.2781 (3.1528) grad_norm 2.3608 (2.4781) [2022-10-01 09:40:00 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 238 training takes 0:06:06 [2022-10-01 09:40:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.737 (2.737) Loss 0.8579 (0.8579) Acc@1 80.566 (80.566) Acc@5 95.117 (95.117) [2022-10-01 09:40:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.846 Acc@5 95.122 [2022-10-01 09:40:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-01 09:40:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.85% [2022-10-01 09:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][0/1251] eta 1:08:36 lr 0.000108 time 3.2902 (3.2902) loss 3.4626 (3.4626) grad_norm 2.7156 (2.7156) [2022-10-01 09:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][100/1251] eta 0:06:11 lr 0.000107 time 0.2955 (0.3225) loss 2.4828 (3.0580) grad_norm 2.3529 (2.4808) [2022-10-01 09:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][200/1251] eta 0:05:22 lr 0.000107 time 0.2911 (0.3072) loss 2.5336 (3.1427) grad_norm 2.2520 (2.4763) [2022-10-01 09:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][300/1251] eta 0:04:47 lr 0.000107 time 0.3859 (0.3020) loss 3.3515 (3.1580) grad_norm 2.5411 (2.4662) [2022-10-01 09:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][400/1251] eta 0:04:14 lr 0.000107 time 0.2901 (0.2993) loss 3.3153 (3.1618) grad_norm 2.3860 (2.4775) [2022-10-01 09:42:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][500/1251] eta 0:03:43 lr 0.000106 time 0.3013 (0.2977) loss 2.8715 (3.1569) grad_norm 2.5111 (2.4758) [2022-10-01 09:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][600/1251] eta 0:03:13 lr 0.000106 time 0.2936 (0.2966) loss 3.5426 (3.1623) grad_norm 2.4701 (2.4826) [2022-10-01 09:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][700/1251] eta 0:02:43 lr 0.000106 time 0.2927 (0.2958) loss 3.6378 (3.1577) grad_norm 2.5582 (2.4790) [2022-10-01 09:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][800/1251] eta 0:02:13 lr 0.000106 time 0.3896 (0.2953) loss 3.8602 (3.1559) grad_norm 2.6560 (2.4779) [2022-10-01 09:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][900/1251] eta 0:01:43 lr 0.000105 time 0.2891 (0.2948) loss 2.1645 (3.1509) grad_norm 2.2644 (2.4733) [2022-10-01 09:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1000/1251] eta 0:01:13 lr 0.000105 time 0.2932 (0.2945) loss 3.1199 (3.1354) grad_norm 2.3727 (2.4759) [2022-10-01 09:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1100/1251] eta 0:00:44 lr 0.000105 time 0.2870 (0.2941) loss 3.4154 (3.1364) grad_norm 2.1571 (2.4752) [2022-10-01 09:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1200/1251] eta 0:00:14 lr 0.000105 time 0.2924 (0.2938) loss 3.7329 (3.1397) grad_norm 2.3368 (2.4759) [2022-10-01 09:46:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 239 training takes 0:06:07 [2022-10-01 09:46:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.044 (3.044) Loss 0.9032 (0.9032) Acc@1 79.297 (79.297) Acc@5 94.922 (94.922) [2022-10-01 09:46:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.982 Acc@5 95.130 [2022-10-01 09:46:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-01 09:46:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.98% [2022-10-01 09:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][0/1251] eta 1:04:45 lr 0.000105 time 3.1058 (3.1058) loss 2.7118 (2.7118) grad_norm 2.3177 (2.3177) [2022-10-01 09:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][100/1251] eta 0:06:08 lr 0.000104 time 0.2883 (0.3204) loss 3.4708 (3.1899) grad_norm 2.5350 (2.4807) [2022-10-01 09:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][200/1251] eta 0:05:21 lr 0.000104 time 0.2892 (0.3060) loss 3.3393 (3.1368) grad_norm 2.4136 (2.4861) [2022-10-01 09:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][300/1251] eta 0:04:46 lr 0.000104 time 0.2883 (0.3007) loss 2.9247 (3.1470) grad_norm 2.5289 (2.5025) [2022-10-01 09:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][400/1251] eta 0:04:13 lr 0.000104 time 0.2882 (0.2980) loss 3.7246 (3.1433) grad_norm 2.3199 (2.5008) [2022-10-01 09:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][500/1251] eta 0:03:42 lr 0.000103 time 0.3811 (0.2966) loss 2.5825 (3.1562) grad_norm 2.2616 (2.5230) [2022-10-01 09:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][600/1251] eta 0:03:12 lr 0.000103 time 0.2907 (0.2955) loss 2.9727 (3.1410) grad_norm 2.3548 (2.5251) [2022-10-01 09:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][700/1251] eta 0:02:42 lr 0.000103 time 0.2882 (0.2946) loss 2.5827 (3.1445) grad_norm 2.4169 (2.5202) [2022-10-01 09:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][800/1251] eta 0:02:12 lr 0.000103 time 0.2941 (0.2940) loss 2.1578 (3.1413) grad_norm 2.2127 (2.5155) [2022-10-01 09:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][900/1251] eta 0:01:43 lr 0.000102 time 0.2893 (0.2935) loss 3.7897 (3.1495) grad_norm 2.2372 (2.5188) [2022-10-01 09:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1000/1251] eta 0:01:13 lr 0.000102 time 0.3808 (0.2932) loss 3.4234 (3.1462) grad_norm 2.5265 (2.5241) [2022-10-01 09:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1100/1251] eta 0:00:44 lr 0.000102 time 0.2865 (0.2929) loss 3.4163 (3.1422) grad_norm 2.2767 (2.5201) [2022-10-01 09:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1200/1251] eta 0:00:14 lr 0.000102 time 0.2905 (0.2926) loss 2.7033 (3.1380) grad_norm 2.2326 (2.5218) [2022-10-01 09:52:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 240 training takes 0:06:06 [2022-10-01 09:52:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saving...... [2022-10-01 09:52:40 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saved !!! [2022-10-01 09:52:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.507 (2.507) Loss 0.8026 (0.8026) Acc@1 80.957 (80.957) Acc@5 95.410 (95.410) [2022-10-01 09:52:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.926 Acc@5 95.120 [2022-10-01 09:52:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-01 09:52:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.98% [2022-10-01 09:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][0/1251] eta 1:06:12 lr 0.000102 time 3.1752 (3.1752) loss 3.0517 (3.0517) grad_norm 2.2599 (2.2599) [2022-10-01 09:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][100/1251] eta 0:06:09 lr 0.000101 time 0.2954 (0.3209) loss 3.5649 (3.1309) grad_norm 2.6453 (2.4907) [2022-10-01 09:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][200/1251] eta 0:05:22 lr 0.000101 time 0.3942 (0.3068) loss 3.4593 (3.1441) grad_norm 2.8354 (2.5523) [2022-10-01 09:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][300/1251] eta 0:04:46 lr 0.000101 time 0.2853 (0.3016) loss 3.1621 (3.1563) grad_norm 2.4152 (2.5460) [2022-10-01 09:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][400/1251] eta 0:04:14 lr 0.000101 time 0.2970 (0.2987) loss 3.6550 (3.1481) grad_norm 2.1071 (2.5482) [2022-10-01 09:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][500/1251] eta 0:03:42 lr 0.000100 time 0.2899 (0.2968) loss 2.4405 (3.1598) grad_norm 2.8872 (2.5470) [2022-10-01 09:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][600/1251] eta 0:03:12 lr 0.000100 time 0.2900 (0.2955) loss 3.4754 (3.1469) grad_norm 2.2043 (2.5425) [2022-10-01 09:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][700/1251] eta 0:02:42 lr 0.000100 time 0.3837 (0.2947) loss 2.7127 (3.1352) grad_norm 2.4264 (2.5389) [2022-10-01 09:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][800/1251] eta 0:02:12 lr 0.000100 time 0.2892 (0.2942) loss 3.3730 (3.1412) grad_norm 2.1874 (2.5308) [2022-10-01 09:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][900/1251] eta 0:01:43 lr 0.000099 time 0.2893 (0.2937) loss 3.1303 (3.1443) grad_norm 2.4019 (2.5216) [2022-10-01 09:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1000/1251] eta 0:01:13 lr 0.000099 time 0.2862 (0.2932) loss 2.4445 (3.1435) grad_norm 2.5220 (2.5286) [2022-10-01 09:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1100/1251] eta 0:00:44 lr 0.000099 time 0.2899 (0.2929) loss 3.3254 (3.1406) grad_norm 2.3518 (2.5266) [2022-10-01 09:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1200/1251] eta 0:00:14 lr 0.000099 time 0.3870 (0.2927) loss 3.1125 (3.1411) grad_norm 2.4055 (2.5286) [2022-10-01 09:58:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 241 training takes 0:06:06 [2022-10-01 09:59:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.593 (2.593) Loss 0.8215 (0.8215) Acc@1 79.395 (79.395) Acc@5 95.898 (95.898) [2022-10-01 09:59:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.976 Acc@5 95.122 [2022-10-01 09:59:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-01 09:59:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.98% [2022-10-01 09:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][0/1251] eta 0:50:44 lr 0.000099 time 2.4340 (2.4340) loss 3.4913 (3.4913) grad_norm 2.9281 (2.9281) [2022-10-01 09:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][100/1251] eta 0:06:04 lr 0.000098 time 0.2864 (0.3164) loss 3.4956 (3.1018) grad_norm 2.7058 (2.5170) [2022-10-01 10:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][200/1251] eta 0:05:17 lr 0.000098 time 0.2885 (0.3024) loss 3.3375 (3.1079) grad_norm 2.6414 (2.5225) [2022-10-01 10:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][300/1251] eta 0:04:43 lr 0.000098 time 0.2888 (0.2978) loss 3.4016 (3.1074) grad_norm 2.3284 (2.5168) [2022-10-01 10:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][400/1251] eta 0:04:11 lr 0.000098 time 0.3780 (0.2956) loss 3.5646 (3.1023) grad_norm 2.8355 (2.5239) [2022-10-01 10:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][500/1251] eta 0:03:40 lr 0.000097 time 0.2840 (0.2942) loss 3.4964 (3.1181) grad_norm 2.8590 (2.5358) [2022-10-01 10:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][600/1251] eta 0:03:10 lr 0.000097 time 0.2864 (0.2931) loss 3.5364 (3.1283) grad_norm 2.6202 (2.5371) [2022-10-01 10:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][700/1251] eta 0:02:41 lr 0.000097 time 0.2878 (0.2924) loss 3.5313 (3.1294) grad_norm 2.2879 (2.5361) [2022-10-01 10:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][800/1251] eta 0:02:11 lr 0.000097 time 0.2875 (0.2918) loss 3.6035 (3.1294) grad_norm 2.6322 (2.5418) [2022-10-01 10:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][900/1251] eta 0:01:42 lr 0.000096 time 0.3798 (0.2915) loss 2.4640 (3.1242) grad_norm 2.4619 (2.5458) [2022-10-01 10:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1000/1251] eta 0:01:13 lr 0.000096 time 0.2874 (0.2912) loss 3.2332 (3.1249) grad_norm 2.3864 (2.5452) [2022-10-01 10:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1100/1251] eta 0:00:43 lr 0.000096 time 0.2875 (0.2909) loss 1.8155 (3.1164) grad_norm 2.2876 (2.5446) [2022-10-01 10:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1200/1251] eta 0:00:14 lr 0.000096 time 0.2864 (0.2907) loss 3.2701 (3.1074) grad_norm 2.5577 (2.5479) [2022-10-01 10:05:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 242 training takes 0:06:03 [2022-10-01 10:05:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.879 (2.879) Loss 0.8839 (0.8839) Acc@1 79.395 (79.395) Acc@5 95.410 (95.410) [2022-10-01 10:05:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.010 Acc@5 95.168 [2022-10-01 10:05:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-01 10:05:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.01% [2022-10-01 10:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][0/1251] eta 1:08:55 lr 0.000096 time 3.3061 (3.3061) loss 3.0820 (3.0820) grad_norm 2.5255 (2.5255) [2022-10-01 10:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][100/1251] eta 0:06:08 lr 0.000095 time 0.3783 (0.3204) loss 3.1842 (3.2151) grad_norm 2.5701 (2.5464) [2022-10-01 10:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][200/1251] eta 0:05:20 lr 0.000095 time 0.2867 (0.3049) loss 2.3459 (3.1545) grad_norm 2.5826 (2.5621) [2022-10-01 10:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][300/1251] eta 0:04:45 lr 0.000095 time 0.2887 (0.2998) loss 3.6537 (3.1475) grad_norm 3.0116 (2.5627) [2022-10-01 10:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][400/1251] eta 0:04:12 lr 0.000095 time 0.2886 (0.2972) loss 3.1591 (3.1556) grad_norm 2.4739 (2.5869) [2022-10-01 10:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][500/1251] eta 0:03:42 lr 0.000094 time 0.2888 (0.2956) loss 3.4091 (3.1383) grad_norm 3.0252 (2.5856) [2022-10-01 10:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][600/1251] eta 0:03:11 lr 0.000094 time 0.3824 (0.2947) loss 2.9403 (3.1382) grad_norm 2.3554 (2.5783) [2022-10-01 10:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][700/1251] eta 0:02:41 lr 0.000094 time 0.2872 (0.2939) loss 1.9315 (3.1335) grad_norm 2.5434 (2.5669) [2022-10-01 10:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][800/1251] eta 0:02:12 lr 0.000094 time 0.2858 (0.2934) loss 3.4574 (3.1246) grad_norm 2.7672 (2.5648) [2022-10-01 10:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][900/1251] eta 0:01:42 lr 0.000094 time 0.2891 (0.2929) loss 2.9915 (3.1197) grad_norm 2.6623 (2.5526) [2022-10-01 10:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1000/1251] eta 0:01:13 lr 0.000093 time 0.2864 (0.2925) loss 3.2635 (3.1301) grad_norm 2.4321 (2.5537) [2022-10-01 10:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1100/1251] eta 0:00:44 lr 0.000093 time 0.3850 (0.2921) loss 3.1713 (3.1329) grad_norm 2.5142 (2.5569) [2022-10-01 10:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1200/1251] eta 0:00:14 lr 0.000093 time 0.2887 (0.2918) loss 2.9477 (3.1183) grad_norm 2.6047 (2.5544) [2022-10-01 10:11:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 243 training takes 0:06:05 [2022-10-01 10:11:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.833 (2.833) Loss 0.8253 (0.8253) Acc@1 81.543 (81.543) Acc@5 94.824 (94.824) [2022-10-01 10:11:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.090 Acc@5 95.150 [2022-10-01 10:11:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-01 10:11:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.09% [2022-10-01 10:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][0/1251] eta 1:02:09 lr 0.000093 time 2.9815 (2.9815) loss 2.6208 (2.6208) grad_norm 2.5817 (2.5817) [2022-10-01 10:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][100/1251] eta 0:06:03 lr 0.000092 time 0.2866 (0.3159) loss 2.7800 (3.1704) grad_norm 2.5141 (2.5553) [2022-10-01 10:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][200/1251] eta 0:05:17 lr 0.000092 time 0.2885 (0.3023) loss 2.0975 (3.1369) grad_norm 2.5735 (2.5487) [2022-10-01 10:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][300/1251] eta 0:04:43 lr 0.000092 time 0.3775 (0.2980) loss 3.1729 (3.1284) grad_norm 2.5826 (2.5636) [2022-10-01 10:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][400/1251] eta 0:04:11 lr 0.000092 time 0.2869 (0.2956) loss 2.9748 (3.1244) grad_norm 3.1072 (2.5727) [2022-10-01 10:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][500/1251] eta 0:03:40 lr 0.000092 time 0.2959 (0.2941) loss 2.7856 (3.1253) grad_norm 2.4078 (2.5739) [2022-10-01 10:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][600/1251] eta 0:03:10 lr 0.000091 time 0.2863 (0.2931) loss 3.5093 (3.1195) grad_norm 2.2408 (2.5699) [2022-10-01 10:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][700/1251] eta 0:02:41 lr 0.000091 time 0.2899 (0.2926) loss 3.1350 (3.1119) grad_norm 2.2416 (2.5661) [2022-10-01 10:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][800/1251] eta 0:02:11 lr 0.000091 time 0.3821 (0.2922) loss 2.6360 (3.1060) grad_norm 2.1641 (2.5533) [2022-10-01 10:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][900/1251] eta 0:01:42 lr 0.000091 time 0.2876 (0.2918) loss 3.4342 (3.1072) grad_norm 2.7453 (2.5538) [2022-10-01 10:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1000/1251] eta 0:01:13 lr 0.000090 time 0.2865 (0.2914) loss 3.3352 (3.1140) grad_norm 2.4079 (2.5536) [2022-10-01 10:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1100/1251] eta 0:00:43 lr 0.000090 time 0.2854 (0.2911) loss 3.6791 (3.1114) grad_norm 2.4904 (2.5571) [2022-10-01 10:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1200/1251] eta 0:00:14 lr 0.000090 time 0.2875 (0.2909) loss 3.2092 (3.1159) grad_norm 2.5686 (2.5621) [2022-10-01 10:17:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 244 training takes 0:06:04 [2022-10-01 10:17:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.629 (2.629) Loss 0.7839 (0.7839) Acc@1 80.469 (80.469) Acc@5 96.094 (96.094) [2022-10-01 10:18:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.092 Acc@5 95.128 [2022-10-01 10:18:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-01 10:18:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.09% [2022-10-01 10:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][0/1251] eta 1:05:43 lr 0.000090 time 3.1521 (3.1521) loss 2.6662 (2.6662) grad_norm 2.5462 (2.5462) [2022-10-01 10:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][100/1251] eta 0:06:09 lr 0.000090 time 0.2871 (0.3211) loss 2.9321 (3.1552) grad_norm 2.7283 (2.5765) [2022-10-01 10:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][200/1251] eta 0:05:21 lr 0.000089 time 0.2890 (0.3060) loss 3.8430 (3.1068) grad_norm 2.9514 (2.5867) [2022-10-01 10:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][300/1251] eta 0:04:45 lr 0.000089 time 0.2882 (0.3005) loss 3.5564 (3.0996) grad_norm 2.4038 (2.5800) [2022-10-01 10:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][400/1251] eta 0:04:13 lr 0.000089 time 0.2866 (0.2978) loss 3.6363 (3.0994) grad_norm 2.5729 (2.5858) [2022-10-01 10:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][500/1251] eta 0:03:42 lr 0.000089 time 0.3874 (0.2963) loss 3.2986 (3.0963) grad_norm 2.6060 (2.5871) [2022-10-01 10:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][600/1251] eta 0:03:12 lr 0.000089 time 0.2883 (0.2952) loss 2.6080 (3.0921) grad_norm 2.7514 (2.5763) [2022-10-01 10:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][700/1251] eta 0:02:42 lr 0.000088 time 0.2902 (0.2944) loss 3.4863 (3.1104) grad_norm 2.7491 (2.5766) [2022-10-01 10:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][800/1251] eta 0:02:12 lr 0.000088 time 0.2846 (0.2937) loss 2.6001 (3.0988) grad_norm 2.5553 (2.5700) [2022-10-01 10:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][900/1251] eta 0:01:42 lr 0.000088 time 0.2868 (0.2931) loss 2.6511 (3.1025) grad_norm 2.3530 (2.5736) [2022-10-01 10:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1000/1251] eta 0:01:13 lr 0.000088 time 0.3781 (0.2928) loss 3.3158 (3.0980) grad_norm 2.9189 (2.5789) [2022-10-01 10:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1100/1251] eta 0:00:44 lr 0.000087 time 0.2876 (0.2923) loss 3.6115 (3.0999) grad_norm 2.4731 (2.5814) [2022-10-01 10:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1200/1251] eta 0:00:14 lr 0.000087 time 0.2851 (0.2919) loss 3.3770 (3.1020) grad_norm 2.6636 (2.5842) [2022-10-01 10:24:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 245 training takes 0:06:05 [2022-10-01 10:24:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.876 (2.876) Loss 0.8550 (0.8550) Acc@1 79.688 (79.688) Acc@5 95.117 (95.117) [2022-10-01 10:24:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.944 Acc@5 95.226 [2022-10-01 10:24:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-01 10:24:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.09% [2022-10-01 10:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][0/1251] eta 1:10:29 lr 0.000087 time 3.3808 (3.3808) loss 3.3760 (3.3760) grad_norm 2.3842 (2.3842) [2022-10-01 10:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][100/1251] eta 0:06:11 lr 0.000087 time 0.2929 (0.3224) loss 3.6605 (3.0908) grad_norm 2.6505 (2.5538) [2022-10-01 10:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][200/1251] eta 0:05:23 lr 0.000087 time 0.3808 (0.3075) loss 3.4989 (3.0726) grad_norm 2.5202 (2.6075) [2022-10-01 10:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][300/1251] eta 0:04:47 lr 0.000086 time 0.2905 (0.3019) loss 2.8448 (3.0893) grad_norm 2.2702 (2.6012) [2022-10-01 10:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][400/1251] eta 0:04:14 lr 0.000086 time 0.2889 (0.2990) loss 2.2325 (3.1071) grad_norm 2.4624 (2.5962) [2022-10-01 10:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][500/1251] eta 0:03:43 lr 0.000086 time 0.2907 (0.2973) loss 2.2930 (3.1290) grad_norm 2.4291 (2.6081) [2022-10-01 10:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][600/1251] eta 0:03:12 lr 0.000086 time 0.2907 (0.2961) loss 3.9103 (3.1062) grad_norm 2.3943 (2.5951) [2022-10-01 10:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][700/1251] eta 0:02:42 lr 0.000086 time 0.3861 (0.2955) loss 2.3889 (3.0917) grad_norm 2.5537 (2.5950) [2022-10-01 10:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][800/1251] eta 0:02:12 lr 0.000085 time 0.2899 (0.2949) loss 3.5262 (3.0922) grad_norm 2.6372 (2.6019) [2022-10-01 10:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][900/1251] eta 0:01:43 lr 0.000085 time 0.2892 (0.2944) loss 3.4321 (3.0929) grad_norm 2.6489 (2.6002) [2022-10-01 10:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1000/1251] eta 0:01:13 lr 0.000085 time 0.2926 (0.2939) loss 3.3634 (3.0998) grad_norm 2.1664 (2.6043) [2022-10-01 10:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1100/1251] eta 0:00:44 lr 0.000085 time 0.2908 (0.2936) loss 2.3877 (3.0950) grad_norm 2.7052 (2.6006) [2022-10-01 10:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1200/1251] eta 0:00:14 lr 0.000084 time 0.3858 (0.2933) loss 3.3220 (3.0979) grad_norm 2.8639 (2.6017) [2022-10-01 10:30:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 246 training takes 0:06:07 [2022-10-01 10:30:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.147 (3.147) Loss 0.8362 (0.8362) Acc@1 78.809 (78.809) Acc@5 95.898 (95.898) [2022-10-01 10:30:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.226 Acc@5 95.226 [2022-10-01 10:30:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-01 10:30:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.23% [2022-10-01 10:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][0/1251] eta 1:09:27 lr 0.000084 time 3.3311 (3.3311) loss 2.4021 (2.4021) grad_norm 2.5444 (2.5444) [2022-10-01 10:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][100/1251] eta 0:06:06 lr 0.000084 time 0.2858 (0.3184) loss 3.5700 (3.0709) grad_norm 2.9342 (2.6857) [2022-10-01 10:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][200/1251] eta 0:05:19 lr 0.000084 time 0.2904 (0.3043) loss 2.9696 (3.0677) grad_norm 2.7388 (2.6569) [2022-10-01 10:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][300/1251] eta 0:04:44 lr 0.000084 time 0.2887 (0.2996) loss 2.8163 (3.0806) grad_norm 2.9880 (2.6542) [2022-10-01 10:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][400/1251] eta 0:04:13 lr 0.000083 time 0.3884 (0.2974) loss 3.3119 (3.0965) grad_norm 2.2573 (2.6537) [2022-10-01 10:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][500/1251] eta 0:03:42 lr 0.000083 time 0.2880 (0.2958) loss 3.2471 (3.0851) grad_norm 2.6604 (2.6548) [2022-10-01 10:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][600/1251] eta 0:03:11 lr 0.000083 time 0.2916 (0.2947) loss 2.5941 (3.0959) grad_norm 2.4592 (2.6426) [2022-10-01 10:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][700/1251] eta 0:02:41 lr 0.000083 time 0.2866 (0.2940) loss 1.9974 (3.0929) grad_norm 2.9741 (2.6368) [2022-10-01 10:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][800/1251] eta 0:02:12 lr 0.000083 time 0.2849 (0.2937) loss 2.5391 (3.1063) grad_norm 2.6323 (2.6391) [2022-10-01 10:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][900/1251] eta 0:01:42 lr 0.000082 time 0.3877 (0.2931) loss 2.3094 (3.1133) grad_norm 2.5956 (2.6355) [2022-10-01 10:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1000/1251] eta 0:01:13 lr 0.000082 time 0.2870 (0.2925) loss 3.6885 (3.1110) grad_norm 2.3610 (2.6441) [2022-10-01 10:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1100/1251] eta 0:00:44 lr 0.000082 time 0.2868 (0.2921) loss 3.2635 (3.1102) grad_norm 2.5648 (2.6435) [2022-10-01 10:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1200/1251] eta 0:00:14 lr 0.000082 time 0.2828 (0.2919) loss 3.7576 (3.1063) grad_norm 2.6249 (2.6464) [2022-10-01 10:36:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 247 training takes 0:06:05 [2022-10-01 10:36:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.247 (3.247) Loss 0.8471 (0.8471) Acc@1 79.883 (79.883) Acc@5 95.117 (95.117) [2022-10-01 10:37:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.140 Acc@5 95.192 [2022-10-01 10:37:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-01 10:37:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.23% [2022-10-01 10:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][0/1251] eta 1:07:20 lr 0.000082 time 3.2297 (3.2297) loss 2.0652 (2.0652) grad_norm 2.7028 (2.7028) [2022-10-01 10:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][100/1251] eta 0:06:10 lr 0.000081 time 0.3785 (0.3215) loss 2.9434 (3.0738) grad_norm 2.4282 (2.5833) [2022-10-01 10:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][200/1251] eta 0:05:21 lr 0.000081 time 0.2893 (0.3061) loss 3.7407 (3.0963) grad_norm 2.5192 (2.5986) [2022-10-01 10:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][300/1251] eta 0:04:46 lr 0.000081 time 0.2856 (0.3010) loss 2.4949 (3.0929) grad_norm 2.6147 (2.6160) [2022-10-01 10:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][400/1251] eta 0:04:14 lr 0.000081 time 0.2960 (0.2986) loss 3.3136 (3.1015) grad_norm 2.4701 (2.6337) [2022-10-01 10:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][500/1251] eta 0:03:43 lr 0.000081 time 0.2881 (0.2970) loss 3.5884 (3.1079) grad_norm 2.4553 (2.6497) [2022-10-01 10:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][600/1251] eta 0:03:12 lr 0.000080 time 0.2895 (0.2962) loss 3.4670 (3.1092) grad_norm 2.4480 (2.6569) [2022-10-01 10:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][700/1251] eta 0:02:42 lr 0.000080 time 0.2881 (0.2953) loss 3.4283 (3.1013) grad_norm 2.3508 (2.6654) [2022-10-01 10:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][800/1251] eta 0:02:12 lr 0.000080 time 0.2877 (0.2947) loss 3.0740 (3.0947) grad_norm 2.5624 (2.6689) [2022-10-01 10:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][900/1251] eta 0:01:43 lr 0.000080 time 0.2903 (0.2942) loss 3.5956 (3.0939) grad_norm 2.5872 (2.6608) [2022-10-01 10:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1000/1251] eta 0:01:13 lr 0.000079 time 0.2881 (0.2937) loss 3.6994 (3.0989) grad_norm 2.8526 (2.6636) [2022-10-01 10:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1100/1251] eta 0:00:44 lr 0.000079 time 0.2853 (0.2933) loss 3.4320 (3.1124) grad_norm 3.1657 (2.6637) [2022-10-01 10:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1200/1251] eta 0:00:14 lr 0.000079 time 0.2903 (0.2930) loss 3.7683 (3.1064) grad_norm 3.0892 (2.6641) [2022-10-01 10:43:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 248 training takes 0:06:06 [2022-10-01 10:43:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.141 (3.141) Loss 0.8035 (0.8035) Acc@1 81.543 (81.543) Acc@5 95.312 (95.312) [2022-10-01 10:43:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.148 Acc@5 95.192 [2022-10-01 10:43:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-01 10:43:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.23% [2022-10-01 10:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][0/1251] eta 1:08:35 lr 0.000079 time 3.2898 (3.2898) loss 2.9618 (2.9618) grad_norm 2.5803 (2.5803) [2022-10-01 10:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][100/1251] eta 0:06:06 lr 0.000079 time 0.2879 (0.3184) loss 3.2559 (2.9721) grad_norm 2.4937 (2.6041) [2022-10-01 10:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][200/1251] eta 0:05:18 lr 0.000079 time 0.2885 (0.3034) loss 3.3901 (3.0041) grad_norm 2.4331 (2.6148) [2022-10-01 10:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][300/1251] eta 0:04:43 lr 0.000078 time 0.2902 (0.2985) loss 3.3342 (3.0486) grad_norm 2.3764 (2.6593) [2022-10-01 10:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][400/1251] eta 0:04:11 lr 0.000078 time 0.2884 (0.2958) loss 2.4452 (3.0398) grad_norm 2.3711 (2.6476) [2022-10-01 10:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][500/1251] eta 0:03:40 lr 0.000078 time 0.2892 (0.2942) loss 2.0833 (3.0475) grad_norm 2.3893 (2.6580) [2022-10-01 10:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][600/1251] eta 0:03:10 lr 0.000078 time 0.2855 (0.2931) loss 3.5327 (3.0709) grad_norm 2.6775 (2.6599) [2022-10-01 10:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][700/1251] eta 0:02:41 lr 0.000077 time 0.2859 (0.2924) loss 3.5551 (3.0760) grad_norm 2.5866 (2.6684) [2022-10-01 10:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][800/1251] eta 0:02:11 lr 0.000077 time 0.2866 (0.2918) loss 2.0815 (3.0792) grad_norm 2.5981 (2.6621) [2022-10-01 10:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][900/1251] eta 0:01:42 lr 0.000077 time 0.2868 (0.2914) loss 3.1860 (3.0853) grad_norm 2.5293 (2.6596) [2022-10-01 10:48:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1000/1251] eta 0:01:13 lr 0.000077 time 0.2865 (0.2910) loss 2.4764 (3.0827) grad_norm 3.0683 (2.6575) [2022-10-01 10:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1100/1251] eta 0:00:43 lr 0.000077 time 0.2879 (0.2908) loss 3.7319 (3.0854) grad_norm 2.7846 (2.6561) [2022-10-01 10:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1200/1251] eta 0:00:14 lr 0.000076 time 0.2853 (0.2905) loss 1.9822 (3.0828) grad_norm 2.5964 (2.6539) [2022-10-01 10:49:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 249 training takes 0:06:03 [2022-10-01 10:49:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.439 (2.439) Loss 0.7767 (0.7767) Acc@1 80.859 (80.859) Acc@5 95.801 (95.801) [2022-10-01 10:49:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.296 Acc@5 95.316 [2022-10-01 10:49:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 10:49:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.30% [2022-10-01 10:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][0/1251] eta 1:11:56 lr 0.000076 time 3.4501 (3.4501) loss 2.1238 (2.1238) grad_norm 2.5810 (2.5810) [2022-10-01 10:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][100/1251] eta 0:06:10 lr 0.000076 time 0.2904 (0.3215) loss 3.6330 (3.0313) grad_norm 2.6829 (2.6683) [2022-10-01 10:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][200/1251] eta 0:05:21 lr 0.000076 time 0.2848 (0.3057) loss 3.3619 (3.0772) grad_norm 2.2798 (2.6915) [2022-10-01 10:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][300/1251] eta 0:04:45 lr 0.000076 time 0.2888 (0.3006) loss 3.4785 (3.0715) grad_norm 2.8503 (2.6729) [2022-10-01 10:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][400/1251] eta 0:04:13 lr 0.000075 time 0.2911 (0.2982) loss 3.2317 (3.0748) grad_norm 2.6168 (2.6739) [2022-10-01 10:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][500/1251] eta 0:03:42 lr 0.000075 time 0.2887 (0.2966) loss 3.4792 (3.0832) grad_norm 2.6305 (2.6815) [2022-10-01 10:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][600/1251] eta 0:03:12 lr 0.000075 time 0.2854 (0.2955) loss 2.9661 (3.0843) grad_norm 2.4927 (2.6848) [2022-10-01 10:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][700/1251] eta 0:02:42 lr 0.000075 time 0.2892 (0.2946) loss 2.0668 (3.0922) grad_norm 2.5867 (2.6829) [2022-10-01 10:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][800/1251] eta 0:02:12 lr 0.000075 time 0.2865 (0.2940) loss 3.2395 (3.0935) grad_norm 2.4839 (2.6811) [2022-10-01 10:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][900/1251] eta 0:01:43 lr 0.000074 time 0.2909 (0.2936) loss 3.2771 (3.0947) grad_norm 2.5036 (2.6782) [2022-10-01 10:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1000/1251] eta 0:01:13 lr 0.000074 time 0.2861 (0.2931) loss 3.1859 (3.0893) grad_norm 2.7530 (2.6670) [2022-10-01 10:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1100/1251] eta 0:00:44 lr 0.000074 time 0.2880 (0.2929) loss 3.6327 (3.0868) grad_norm 2.6236 (2.6638) [2022-10-01 10:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1200/1251] eta 0:00:14 lr 0.000074 time 0.2871 (0.2926) loss 2.8942 (3.0861) grad_norm 2.7797 (2.6646) [2022-10-01 10:55:42 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 250 training takes 0:06:06 [2022-10-01 10:55:42 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saving...... [2022-10-01 10:55:43 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saved !!! [2022-10-01 10:55:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.844 (2.844) Loss 0.8290 (0.8290) Acc@1 80.078 (80.078) Acc@5 95.605 (95.605) [2022-10-01 10:55:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.300 Acc@5 95.292 [2022-10-01 10:55:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 10:55:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.30% [2022-10-01 10:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][0/1251] eta 0:57:49 lr 0.000074 time 2.7731 (2.7731) loss 3.1395 (3.1395) grad_norm 2.9544 (2.9544) [2022-10-01 10:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][100/1251] eta 0:06:05 lr 0.000074 time 0.2890 (0.3176) loss 3.0534 (3.0388) grad_norm 2.7660 (2.6557) [2022-10-01 10:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][200/1251] eta 0:05:18 lr 0.000073 time 0.2877 (0.3028) loss 2.6614 (3.0788) grad_norm 2.5454 (2.7009) [2022-10-01 10:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][300/1251] eta 0:04:43 lr 0.000073 time 0.2890 (0.2979) loss 3.1755 (3.0670) grad_norm 3.2878 (2.6920) [2022-10-01 10:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][400/1251] eta 0:04:11 lr 0.000073 time 0.2885 (0.2955) loss 3.4272 (3.0660) grad_norm 2.4709 (2.6883) [2022-10-01 10:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][500/1251] eta 0:03:40 lr 0.000073 time 0.2853 (0.2940) loss 3.2894 (3.0662) grad_norm 2.6999 (2.6796) [2022-10-01 10:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][600/1251] eta 0:03:10 lr 0.000073 time 0.2876 (0.2929) loss 3.3052 (3.0670) grad_norm 2.4827 (2.6778) [2022-10-01 10:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][700/1251] eta 0:02:41 lr 0.000072 time 0.2870 (0.2922) loss 3.6884 (3.0632) grad_norm 2.8248 (2.6790) [2022-10-01 10:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][800/1251] eta 0:02:11 lr 0.000072 time 0.2953 (0.2917) loss 2.3277 (3.0665) grad_norm 2.2976 (2.6834) [2022-10-01 11:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][900/1251] eta 0:01:42 lr 0.000072 time 0.2903 (0.2913) loss 2.2715 (3.0698) grad_norm 2.9929 (2.6917) [2022-10-01 11:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1000/1251] eta 0:01:13 lr 0.000072 time 0.2898 (0.2909) loss 2.6207 (3.0662) grad_norm 2.6395 (2.6876) [2022-10-01 11:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1100/1251] eta 0:00:43 lr 0.000072 time 0.2879 (0.2905) loss 2.5941 (3.0686) grad_norm 2.5930 (2.6864) [2022-10-01 11:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1200/1251] eta 0:00:14 lr 0.000071 time 0.2866 (0.2902) loss 3.1189 (3.0655) grad_norm 2.7223 (2.6842) [2022-10-01 11:01:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 251 training takes 0:06:03 [2022-10-01 11:02:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.821 (2.821) Loss 0.8269 (0.8269) Acc@1 80.273 (80.273) Acc@5 95.312 (95.312) [2022-10-01 11:02:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.248 Acc@5 95.284 [2022-10-01 11:02:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-01 11:02:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.30% [2022-10-01 11:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][0/1251] eta 0:48:44 lr 0.000071 time 2.3375 (2.3375) loss 3.3053 (3.3053) grad_norm 2.6544 (2.6544) [2022-10-01 11:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][100/1251] eta 0:06:06 lr 0.000071 time 0.2885 (0.3185) loss 3.2811 (3.0292) grad_norm 3.2134 (2.7457) [2022-10-01 11:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][200/1251] eta 0:05:20 lr 0.000071 time 0.2936 (0.3047) loss 3.7277 (3.0637) grad_norm 2.5409 (2.7100) [2022-10-01 11:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][300/1251] eta 0:04:45 lr 0.000071 time 0.2868 (0.3001) loss 2.6774 (3.0648) grad_norm 2.8922 (2.7117) [2022-10-01 11:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][400/1251] eta 0:04:13 lr 0.000070 time 0.2900 (0.2977) loss 3.2777 (3.0718) grad_norm 2.7315 (2.7234) [2022-10-01 11:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][500/1251] eta 0:03:42 lr 0.000070 time 0.2859 (0.2963) loss 2.7088 (3.0644) grad_norm 3.1609 (2.7196) [2022-10-01 11:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][600/1251] eta 0:03:12 lr 0.000070 time 0.2912 (0.2953) loss 3.0684 (3.0665) grad_norm 2.7872 (2.7180) [2022-10-01 11:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][700/1251] eta 0:02:42 lr 0.000070 time 0.2875 (0.2946) loss 3.1754 (3.0620) grad_norm 2.4570 (2.7152) [2022-10-01 11:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][800/1251] eta 0:02:12 lr 0.000070 time 0.2889 (0.2940) loss 2.6530 (3.0735) grad_norm 3.1395 (2.7201) [2022-10-01 11:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][900/1251] eta 0:01:43 lr 0.000069 time 0.2865 (0.2936) loss 3.2109 (3.0781) grad_norm 2.4443 (2.7220) [2022-10-01 11:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1000/1251] eta 0:01:13 lr 0.000069 time 0.2908 (0.2931) loss 2.6594 (3.0794) grad_norm 2.2666 (2.7225) [2022-10-01 11:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1100/1251] eta 0:00:44 lr 0.000069 time 0.2864 (0.2928) loss 2.2281 (3.0814) grad_norm 2.5568 (2.7214) [2022-10-01 11:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1200/1251] eta 0:00:14 lr 0.000069 time 0.2920 (0.2925) loss 2.7205 (3.0792) grad_norm 2.5529 (2.7218) [2022-10-01 11:08:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 252 training takes 0:06:06 [2022-10-01 11:08:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.657 (2.657) Loss 0.8754 (0.8754) Acc@1 80.566 (80.566) Acc@5 94.336 (94.336) [2022-10-01 11:08:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.398 Acc@5 95.288 [2022-10-01 11:08:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 11:08:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][0/1251] eta 1:04:52 lr 0.000069 time 3.1117 (3.1117) loss 3.6731 (3.6731) grad_norm 2.7926 (2.7926) [2022-10-01 11:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][100/1251] eta 0:06:08 lr 0.000069 time 0.2906 (0.3199) loss 3.3086 (3.0265) grad_norm 2.7243 (2.7026) [2022-10-01 11:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][200/1251] eta 0:05:20 lr 0.000068 time 0.2886 (0.3051) loss 2.2856 (3.0419) grad_norm 2.5554 (2.7047) [2022-10-01 11:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][300/1251] eta 0:04:45 lr 0.000068 time 0.2911 (0.3006) loss 3.7084 (3.0645) grad_norm 2.7959 (2.7048) [2022-10-01 11:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][400/1251] eta 0:04:13 lr 0.000068 time 0.2914 (0.2981) loss 3.2469 (3.0610) grad_norm 2.7212 (2.7019) [2022-10-01 11:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][500/1251] eta 0:03:42 lr 0.000068 time 0.2935 (0.2966) loss 3.7676 (3.0725) grad_norm 2.5690 (2.7059) [2022-10-01 11:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][600/1251] eta 0:03:12 lr 0.000068 time 0.2906 (0.2955) loss 3.3354 (3.0802) grad_norm 2.6768 (2.7008) [2022-10-01 11:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][700/1251] eta 0:02:42 lr 0.000067 time 0.2885 (0.2948) loss 3.1214 (3.0867) grad_norm 3.0722 (2.7037) [2022-10-01 11:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][800/1251] eta 0:02:12 lr 0.000067 time 0.2887 (0.2942) loss 2.4624 (3.0762) grad_norm 2.6216 (2.7121) [2022-10-01 11:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][900/1251] eta 0:01:43 lr 0.000067 time 0.2902 (0.2937) loss 2.2249 (3.0781) grad_norm 2.4989 (2.7233) [2022-10-01 11:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1000/1251] eta 0:01:13 lr 0.000067 time 0.2926 (0.2933) loss 1.9146 (3.0722) grad_norm 3.8041 (2.7273) [2022-10-01 11:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1100/1251] eta 0:00:44 lr 0.000067 time 0.2925 (0.2930) loss 3.3117 (3.0730) grad_norm 2.8550 (2.7266) [2022-10-01 11:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1200/1251] eta 0:00:14 lr 0.000066 time 0.2877 (0.2927) loss 3.0435 (3.0658) grad_norm 2.7790 (2.7284) [2022-10-01 11:14:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 253 training takes 0:06:06 [2022-10-01 11:14:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.192 (2.192) Loss 0.8879 (0.8879) Acc@1 79.590 (79.590) Acc@5 94.629 (94.629) [2022-10-01 11:14:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.310 Acc@5 95.264 [2022-10-01 11:14:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 11:14:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][0/1251] eta 1:08:44 lr 0.000066 time 3.2973 (3.2973) loss 3.4165 (3.4165) grad_norm 2.5427 (2.5427) [2022-10-01 11:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][100/1251] eta 0:06:08 lr 0.000066 time 0.2874 (0.3197) loss 3.3252 (3.0904) grad_norm 2.6461 (2.7780) [2022-10-01 11:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][200/1251] eta 0:05:20 lr 0.000066 time 0.2891 (0.3047) loss 3.5413 (3.0409) grad_norm 3.0282 (2.7614) [2022-10-01 11:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][300/1251] eta 0:04:44 lr 0.000066 time 0.2875 (0.2994) loss 2.2193 (3.0288) grad_norm 2.6609 (2.7525) [2022-10-01 11:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][400/1251] eta 0:04:12 lr 0.000066 time 0.2900 (0.2966) loss 2.7635 (3.0280) grad_norm 2.5491 (2.7641) [2022-10-01 11:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][500/1251] eta 0:03:41 lr 0.000065 time 0.2881 (0.2950) loss 3.0059 (3.0278) grad_norm 2.3464 (2.7510) [2022-10-01 11:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][600/1251] eta 0:03:11 lr 0.000065 time 0.2899 (0.2938) loss 3.6044 (3.0245) grad_norm 2.7791 (2.7549) [2022-10-01 11:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][700/1251] eta 0:02:41 lr 0.000065 time 0.2864 (0.2931) loss 3.2374 (3.0318) grad_norm 2.5248 (2.7604) [2022-10-01 11:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][800/1251] eta 0:02:11 lr 0.000065 time 0.2900 (0.2925) loss 3.3372 (3.0263) grad_norm 3.6511 (2.7568) [2022-10-01 11:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][900/1251] eta 0:01:42 lr 0.000065 time 0.2863 (0.2920) loss 3.1536 (3.0382) grad_norm 2.5072 (2.7593) [2022-10-01 11:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1000/1251] eta 0:01:13 lr 0.000064 time 0.2887 (0.2917) loss 3.4712 (3.0513) grad_norm 2.9764 (2.7594) [2022-10-01 11:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1100/1251] eta 0:00:44 lr 0.000064 time 0.2866 (0.2914) loss 2.3172 (3.0492) grad_norm 2.6171 (2.7579) [2022-10-01 11:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1200/1251] eta 0:00:14 lr 0.000064 time 0.2895 (0.2911) loss 2.8557 (3.0452) grad_norm 2.3512 (2.7567) [2022-10-01 11:20:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 254 training takes 0:06:04 [2022-10-01 11:20:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.144 (3.144) Loss 0.8164 (0.8164) Acc@1 80.566 (80.566) Acc@5 95.703 (95.703) [2022-10-01 11:21:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.346 Acc@5 95.288 [2022-10-01 11:21:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 11:21:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][0/1251] eta 1:06:46 lr 0.000064 time 3.2023 (3.2023) loss 3.3579 (3.3579) grad_norm 2.4657 (2.4657) [2022-10-01 11:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][100/1251] eta 0:06:06 lr 0.000064 time 0.2869 (0.3185) loss 3.4976 (3.0942) grad_norm 2.6289 (2.7565) [2022-10-01 11:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][200/1251] eta 0:05:19 lr 0.000064 time 0.2900 (0.3038) loss 2.2012 (3.0665) grad_norm 2.7798 (2.7307) [2022-10-01 11:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][300/1251] eta 0:04:44 lr 0.000063 time 0.2859 (0.2989) loss 2.6547 (3.0511) grad_norm 2.9530 (2.7222) [2022-10-01 11:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][400/1251] eta 0:04:12 lr 0.000063 time 0.2892 (0.2964) loss 3.3885 (3.0614) grad_norm 2.9070 (2.7212) [2022-10-01 11:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][500/1251] eta 0:03:41 lr 0.000063 time 0.2849 (0.2948) loss 3.1458 (3.0391) grad_norm 2.7506 (2.7322) [2022-10-01 11:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][600/1251] eta 0:03:11 lr 0.000063 time 0.2878 (0.2939) loss 3.1210 (3.0546) grad_norm 2.6269 (2.7338) [2022-10-01 11:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][700/1251] eta 0:02:41 lr 0.000063 time 0.2880 (0.2931) loss 3.3645 (3.0520) grad_norm 2.6494 (2.7374) [2022-10-01 11:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][800/1251] eta 0:02:11 lr 0.000062 time 0.2872 (0.2925) loss 3.5565 (3.0556) grad_norm 2.8164 (2.7406) [2022-10-01 11:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][900/1251] eta 0:01:42 lr 0.000062 time 0.2863 (0.2920) loss 3.0005 (3.0544) grad_norm 2.9883 (2.7508) [2022-10-01 11:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1000/1251] eta 0:01:13 lr 0.000062 time 0.2865 (0.2916) loss 2.2689 (3.0637) grad_norm 3.1799 (2.7515) [2022-10-01 11:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1100/1251] eta 0:00:43 lr 0.000062 time 0.2869 (0.2913) loss 3.4865 (3.0663) grad_norm 2.5531 (2.7535) [2022-10-01 11:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1200/1251] eta 0:00:14 lr 0.000062 time 0.2856 (0.2911) loss 3.1660 (3.0722) grad_norm 2.7764 (2.7571) [2022-10-01 11:27:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 255 training takes 0:06:04 [2022-10-01 11:27:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.343 (3.343) Loss 0.8431 (0.8431) Acc@1 78.809 (78.809) Acc@5 95.215 (95.215) [2022-10-01 11:27:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.386 Acc@5 95.278 [2022-10-01 11:27:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 11:27:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][0/1251] eta 1:07:56 lr 0.000062 time 3.2583 (3.2583) loss 3.3930 (3.3930) grad_norm 2.9992 (2.9992) [2022-10-01 11:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][100/1251] eta 0:06:10 lr 0.000061 time 0.2924 (0.3222) loss 3.3388 (3.0008) grad_norm 2.5219 (2.7629) [2022-10-01 11:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][200/1251] eta 0:05:23 lr 0.000061 time 0.2900 (0.3079) loss 2.2939 (3.0266) grad_norm 2.9128 (2.7479) [2022-10-01 11:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][300/1251] eta 0:04:47 lr 0.000061 time 0.2935 (0.3027) loss 3.0899 (3.0300) grad_norm 2.6804 (2.7433) [2022-10-01 11:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][400/1251] eta 0:04:15 lr 0.000061 time 0.2914 (0.3002) loss 2.4140 (3.0561) grad_norm 2.1697 (2.7564) [2022-10-01 11:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][500/1251] eta 0:03:44 lr 0.000061 time 0.2945 (0.2986) loss 3.6165 (3.0612) grad_norm 2.7233 (2.7491) [2022-10-01 11:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][600/1251] eta 0:03:13 lr 0.000061 time 0.2923 (0.2976) loss 3.7967 (3.0487) grad_norm 2.9086 (2.7869) [2022-10-01 11:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][700/1251] eta 0:02:43 lr 0.000060 time 0.2923 (0.2969) loss 3.5808 (3.0499) grad_norm 2.9779 (2.7864) [2022-10-01 11:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][800/1251] eta 0:02:13 lr 0.000060 time 0.2893 (0.2962) loss 2.8447 (3.0506) grad_norm 2.8137 (2.7851) [2022-10-01 11:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][900/1251] eta 0:01:43 lr 0.000060 time 0.2961 (0.2958) loss 3.6058 (3.0506) grad_norm 2.9356 (2.7859) [2022-10-01 11:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1000/1251] eta 0:01:14 lr 0.000060 time 0.2886 (0.2953) loss 3.3068 (3.0517) grad_norm 2.5017 (2.7822) [2022-10-01 11:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1100/1251] eta 0:00:44 lr 0.000060 time 0.2900 (0.2949) loss 3.2717 (3.0599) grad_norm 2.5792 (2.7816) [2022-10-01 11:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1200/1251] eta 0:00:15 lr 0.000059 time 0.2938 (0.2945) loss 3.5153 (3.0589) grad_norm 2.8364 (2.7879) [2022-10-01 11:33:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 256 training takes 0:06:08 [2022-10-01 11:33:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.984 (2.984) Loss 0.8378 (0.8378) Acc@1 78.613 (78.613) Acc@5 95.898 (95.898) [2022-10-01 11:33:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.300 Acc@5 95.360 [2022-10-01 11:33:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 11:33:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][0/1251] eta 1:11:19 lr 0.000059 time 3.4207 (3.4207) loss 3.4378 (3.4378) grad_norm 3.2460 (3.2460) [2022-10-01 11:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][100/1251] eta 0:06:07 lr 0.000059 time 0.2848 (0.3194) loss 3.4581 (3.0647) grad_norm 2.7412 (2.7897) [2022-10-01 11:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][200/1251] eta 0:05:19 lr 0.000059 time 0.2887 (0.3038) loss 2.3014 (3.0366) grad_norm 2.7012 (2.7735) [2022-10-01 11:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][300/1251] eta 0:04:44 lr 0.000059 time 0.2891 (0.2987) loss 3.3284 (3.0485) grad_norm 2.6954 (2.8041) [2022-10-01 11:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][400/1251] eta 0:04:11 lr 0.000059 time 0.2906 (0.2961) loss 3.3186 (3.0354) grad_norm 2.5974 (2.8078) [2022-10-01 11:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][500/1251] eta 0:03:41 lr 0.000058 time 0.2887 (0.2945) loss 2.8324 (3.0475) grad_norm 2.6701 (2.7948) [2022-10-01 11:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][600/1251] eta 0:03:11 lr 0.000058 time 0.2889 (0.2936) loss 3.2321 (3.0549) grad_norm 2.6413 (2.7983) [2022-10-01 11:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][700/1251] eta 0:02:41 lr 0.000058 time 0.2898 (0.2928) loss 3.7493 (3.0597) grad_norm 3.3283 (2.8035) [2022-10-01 11:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][800/1251] eta 0:02:11 lr 0.000058 time 0.2881 (0.2922) loss 3.3691 (3.0633) grad_norm 2.6061 (2.8169) [2022-10-01 11:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][900/1251] eta 0:01:42 lr 0.000058 time 0.2861 (0.2918) loss 3.3081 (3.0659) grad_norm 2.4475 (2.8174) [2022-10-01 11:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1000/1251] eta 0:01:13 lr 0.000058 time 0.2885 (0.2914) loss 3.2865 (3.0644) grad_norm 2.8938 (2.8185) [2022-10-01 11:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1100/1251] eta 0:00:43 lr 0.000057 time 0.2879 (0.2912) loss 3.2054 (3.0619) grad_norm 2.6658 (2.8174) [2022-10-01 11:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1200/1251] eta 0:00:14 lr 0.000057 time 0.2863 (0.2909) loss 3.6376 (3.0644) grad_norm 2.6843 (2.8183) [2022-10-01 11:39:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 257 training takes 0:06:04 [2022-10-01 11:39:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.871 (2.871) Loss 0.8675 (0.8675) Acc@1 79.980 (79.980) Acc@5 94.824 (94.824) [2022-10-01 11:40:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.292 Acc@5 95.386 [2022-10-01 11:40:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-01 11:40:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-01 11:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][0/1251] eta 0:56:12 lr 0.000057 time 2.6956 (2.6956) loss 3.3650 (3.3650) grad_norm 2.7520 (2.7520) [2022-10-01 11:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][100/1251] eta 0:06:06 lr 0.000057 time 0.2868 (0.3188) loss 3.3858 (3.0206) grad_norm 2.4689 (2.8586) [2022-10-01 11:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][200/1251] eta 0:05:19 lr 0.000057 time 0.2905 (0.3042) loss 3.4398 (3.0429) grad_norm 2.7942 (2.8185) [2022-10-01 11:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][300/1251] eta 0:04:44 lr 0.000057 time 0.2899 (0.2993) loss 3.1051 (3.0336) grad_norm 2.8678 (2.8245) [2022-10-01 11:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][400/1251] eta 0:04:12 lr 0.000056 time 0.2905 (0.2969) loss 3.2664 (3.0194) grad_norm 3.1367 (2.8158) [2022-10-01 11:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][500/1251] eta 0:03:41 lr 0.000056 time 0.2879 (0.2955) loss 2.6003 (3.0261) grad_norm 2.7730 (2.8028) [2022-10-01 11:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][600/1251] eta 0:03:11 lr 0.000056 time 0.2921 (0.2944) loss 3.5721 (3.0317) grad_norm 2.6675 (2.7961) [2022-10-01 11:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][700/1251] eta 0:02:41 lr 0.000056 time 0.2907 (0.2937) loss 3.6584 (3.0313) grad_norm 3.2149 (2.7923) [2022-10-01 11:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][800/1251] eta 0:02:12 lr 0.000056 time 0.2915 (0.2932) loss 3.0222 (3.0304) grad_norm 2.9315 (2.7958) [2022-10-01 11:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][900/1251] eta 0:01:42 lr 0.000056 time 0.2956 (0.2928) loss 3.1830 (3.0333) grad_norm 2.7599 (2.7929) [2022-10-01 11:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1000/1251] eta 0:01:13 lr 0.000055 time 0.2916 (0.2924) loss 3.2208 (3.0373) grad_norm 2.6609 (2.7910) [2022-10-01 11:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1100/1251] eta 0:00:44 lr 0.000055 time 0.2930 (0.2922) loss 3.5733 (3.0395) grad_norm 2.6314 (2.7922) [2022-10-01 11:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1200/1251] eta 0:00:14 lr 0.000055 time 0.2890 (0.2919) loss 3.7902 (3.0369) grad_norm 2.7682 (2.7896) [2022-10-01 11:46:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 258 training takes 0:06:05 [2022-10-01 11:46:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.950 (2.950) Loss 0.8538 (0.8538) Acc@1 79.688 (79.688) Acc@5 95.410 (95.410) [2022-10-01 11:46:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.414 Acc@5 95.366 [2022-10-01 11:46:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 11:46:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.41% [2022-10-01 11:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][0/1251] eta 1:07:23 lr 0.000055 time 3.2323 (3.2323) loss 2.4667 (2.4667) grad_norm 2.5807 (2.5807) [2022-10-01 11:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][100/1251] eta 0:06:06 lr 0.000055 time 0.2909 (0.3183) loss 3.3103 (3.0315) grad_norm 2.6969 (2.8040) [2022-10-01 11:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][200/1251] eta 0:05:18 lr 0.000055 time 0.2926 (0.3032) loss 2.0084 (3.0537) grad_norm 2.6960 (2.7800) [2022-10-01 11:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][300/1251] eta 0:04:43 lr 0.000054 time 0.2868 (0.2982) loss 3.3031 (3.0252) grad_norm 2.7885 (2.7772) [2022-10-01 11:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][400/1251] eta 0:04:11 lr 0.000054 time 0.2882 (0.2956) loss 3.0616 (3.0307) grad_norm 2.6121 (2.7935) [2022-10-01 11:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][500/1251] eta 0:03:40 lr 0.000054 time 0.2873 (0.2941) loss 3.2449 (3.0330) grad_norm 2.7392 (2.7977) [2022-10-01 11:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][600/1251] eta 0:03:10 lr 0.000054 time 0.2859 (0.2931) loss 3.4314 (3.0245) grad_norm 3.9902 (2.8010) [2022-10-01 11:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][700/1251] eta 0:02:41 lr 0.000054 time 0.2892 (0.2923) loss 3.3226 (3.0131) grad_norm 3.3284 (2.7948) [2022-10-01 11:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][800/1251] eta 0:02:11 lr 0.000054 time 0.2900 (0.2918) loss 2.2297 (3.0140) grad_norm 2.8847 (2.8035) [2022-10-01 11:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][900/1251] eta 0:01:42 lr 0.000053 time 0.2867 (0.2913) loss 2.0352 (3.0110) grad_norm 3.3986 (2.7978) [2022-10-01 11:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1000/1251] eta 0:01:13 lr 0.000053 time 0.2923 (0.2909) loss 2.7224 (3.0182) grad_norm 2.8155 (2.8072) [2022-10-01 11:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1100/1251] eta 0:00:43 lr 0.000053 time 0.2872 (0.2906) loss 2.8138 (3.0198) grad_norm 2.7818 (2.8064) [2022-10-01 11:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1200/1251] eta 0:00:14 lr 0.000053 time 0.2877 (0.2904) loss 2.2539 (3.0265) grad_norm 2.4703 (2.8075) [2022-10-01 11:52:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 259 training takes 0:06:03 [2022-10-01 11:52:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.949 (2.949) Loss 0.8805 (0.8805) Acc@1 79.883 (79.883) Acc@5 94.824 (94.824) [2022-10-01 11:52:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.396 Acc@5 95.376 [2022-10-01 11:52:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 11:52:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.41% [2022-10-01 11:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][0/1251] eta 0:46:54 lr 0.000053 time 2.2496 (2.2496) loss 3.3496 (3.3496) grad_norm 2.8788 (2.8788) [2022-10-01 11:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][100/1251] eta 0:06:01 lr 0.000053 time 0.2899 (0.3144) loss 3.3154 (3.0721) grad_norm 2.4012 (2.8242) [2022-10-01 11:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][200/1251] eta 0:05:16 lr 0.000052 time 0.2872 (0.3011) loss 3.5004 (3.0456) grad_norm 2.5324 (2.8176) [2022-10-01 11:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][300/1251] eta 0:04:42 lr 0.000052 time 0.2889 (0.2967) loss 2.1689 (3.0521) grad_norm 2.9171 (2.8300) [2022-10-01 11:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][400/1251] eta 0:04:10 lr 0.000052 time 0.2893 (0.2946) loss 3.2650 (3.0557) grad_norm 3.1988 (2.8337) [2022-10-01 11:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][500/1251] eta 0:03:40 lr 0.000052 time 0.2874 (0.2931) loss 2.4435 (3.0508) grad_norm 2.8498 (2.8206) [2022-10-01 11:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][600/1251] eta 0:03:10 lr 0.000052 time 0.2846 (0.2922) loss 3.1857 (3.0517) grad_norm 2.7824 (2.8074) [2022-10-01 11:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][700/1251] eta 0:02:40 lr 0.000052 time 0.2886 (0.2915) loss 3.3157 (3.0557) grad_norm 2.7561 (2.8057) [2022-10-01 11:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][800/1251] eta 0:02:11 lr 0.000051 time 0.2873 (0.2909) loss 2.9793 (3.0598) grad_norm 2.6435 (2.8061) [2022-10-01 11:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][900/1251] eta 0:01:41 lr 0.000051 time 0.2859 (0.2905) loss 2.2722 (3.0512) grad_norm 2.8191 (2.8070) [2022-10-01 11:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1000/1251] eta 0:01:12 lr 0.000051 time 0.2882 (0.2901) loss 2.6184 (3.0464) grad_norm 2.4816 (2.8029) [2022-10-01 11:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1100/1251] eta 0:00:43 lr 0.000051 time 0.2861 (0.2898) loss 2.0813 (3.0480) grad_norm 2.7912 (2.8114) [2022-10-01 11:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1200/1251] eta 0:00:14 lr 0.000051 time 0.2848 (0.2896) loss 2.9187 (3.0479) grad_norm 2.9743 (2.8137) [2022-10-01 11:58:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 260 training takes 0:06:02 [2022-10-01 11:58:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saving...... [2022-10-01 11:58:40 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saved !!! [2022-10-01 11:58:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.709 (2.709) Loss 0.8237 (0.8237) Acc@1 81.250 (81.250) Acc@5 95.215 (95.215) [2022-10-01 11:58:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.590 Acc@5 95.448 [2022-10-01 11:58:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 11:58:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.59% [2022-10-01 11:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][0/1251] eta 1:07:09 lr 0.000051 time 3.2210 (3.2210) loss 3.2208 (3.2208) grad_norm 2.7071 (2.7071) [2022-10-01 11:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][100/1251] eta 0:06:07 lr 0.000051 time 0.2871 (0.3193) loss 3.3626 (2.9518) grad_norm 3.0197 (2.8429) [2022-10-01 11:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][200/1251] eta 0:05:19 lr 0.000050 time 0.2884 (0.3045) loss 3.2670 (2.9984) grad_norm 2.7487 (2.8708) [2022-10-01 12:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][300/1251] eta 0:04:45 lr 0.000050 time 0.2893 (0.2997) loss 2.8472 (2.9964) grad_norm 2.6929 (2.8699) [2022-10-01 12:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][400/1251] eta 0:04:12 lr 0.000050 time 0.2914 (0.2972) loss 2.8220 (3.0009) grad_norm 2.8949 (2.8852) [2022-10-01 12:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][500/1251] eta 0:03:42 lr 0.000050 time 0.2862 (0.2958) loss 3.3034 (3.0000) grad_norm 3.1459 (2.8787) [2022-10-01 12:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][600/1251] eta 0:03:11 lr 0.000050 time 0.2915 (0.2948) loss 2.6104 (3.0105) grad_norm 2.4434 (2.8954) [2022-10-01 12:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][700/1251] eta 0:02:42 lr 0.000050 time 0.2886 (0.2941) loss 3.3628 (3.0187) grad_norm 2.6447 (2.8870) [2022-10-01 12:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][800/1251] eta 0:02:12 lr 0.000049 time 0.2978 (0.2935) loss 3.3748 (3.0216) grad_norm 3.7867 (2.8781) [2022-10-01 12:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][900/1251] eta 0:01:42 lr 0.000049 time 0.2866 (0.2932) loss 3.3877 (3.0283) grad_norm 2.9970 (2.8837) [2022-10-01 12:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1000/1251] eta 0:01:13 lr 0.000049 time 0.2934 (0.2928) loss 3.1807 (3.0362) grad_norm 2.6259 (2.8777) [2022-10-01 12:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1100/1251] eta 0:00:44 lr 0.000049 time 0.2847 (0.2926) loss 2.9490 (3.0384) grad_norm 2.9721 (2.8729) [2022-10-01 12:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1200/1251] eta 0:00:14 lr 0.000049 time 0.2930 (0.2924) loss 3.2601 (3.0405) grad_norm 3.1892 (2.8817) [2022-10-01 12:04:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 261 training takes 0:06:05 [2022-10-01 12:05:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.464 (2.464) Loss 0.8375 (0.8375) Acc@1 79.297 (79.297) Acc@5 95.410 (95.410) [2022-10-01 12:05:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.370 Acc@5 95.378 [2022-10-01 12:05:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 12:05:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.59% [2022-10-01 12:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][0/1251] eta 0:46:26 lr 0.000049 time 2.2272 (2.2272) loss 3.1268 (3.1268) grad_norm 2.7971 (2.7971) [2022-10-01 12:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][100/1251] eta 0:06:06 lr 0.000049 time 0.2884 (0.3182) loss 3.2272 (2.9828) grad_norm 2.5061 (2.8692) [2022-10-01 12:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][200/1251] eta 0:05:19 lr 0.000048 time 0.2905 (0.3038) loss 3.5983 (2.9993) grad_norm 2.5984 (2.8744) [2022-10-01 12:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][300/1251] eta 0:04:44 lr 0.000048 time 0.2868 (0.2991) loss 2.0790 (2.9903) grad_norm 2.6130 (2.8634) [2022-10-01 12:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][400/1251] eta 0:04:12 lr 0.000048 time 0.2935 (0.2968) loss 1.8216 (3.0035) grad_norm 2.5754 (2.8717) [2022-10-01 12:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][500/1251] eta 0:03:41 lr 0.000048 time 0.2860 (0.2953) loss 3.2394 (3.0165) grad_norm 2.9841 (2.8619) [2022-10-01 12:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][600/1251] eta 0:03:11 lr 0.000048 time 0.2870 (0.2942) loss 2.9574 (3.0287) grad_norm 2.8166 (2.8591) [2022-10-01 12:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][700/1251] eta 0:02:41 lr 0.000048 time 0.2864 (0.2934) loss 2.8558 (3.0243) grad_norm 2.4477 (2.8643) [2022-10-01 12:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][800/1251] eta 0:02:11 lr 0.000047 time 0.2849 (0.2927) loss 3.7463 (3.0243) grad_norm 2.5359 (2.8649) [2022-10-01 12:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][900/1251] eta 0:01:42 lr 0.000047 time 0.2844 (0.2922) loss 3.4303 (3.0318) grad_norm 2.4121 (2.8655) [2022-10-01 12:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1000/1251] eta 0:01:13 lr 0.000047 time 0.2881 (0.2918) loss 3.0076 (3.0396) grad_norm 2.8520 (2.8610) [2022-10-01 12:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1100/1251] eta 0:00:44 lr 0.000047 time 0.2868 (0.2915) loss 2.4252 (3.0370) grad_norm 2.7540 (2.8575) [2022-10-01 12:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1200/1251] eta 0:00:14 lr 0.000047 time 0.2925 (0.2912) loss 2.4469 (3.0386) grad_norm 2.5426 (2.8577) [2022-10-01 12:11:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 262 training takes 0:06:04 [2022-10-01 12:11:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.497 (2.497) Loss 0.8802 (0.8802) Acc@1 78.516 (78.516) Acc@5 95.020 (95.020) [2022-10-01 12:11:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.422 Acc@5 95.414 [2022-10-01 12:11:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-01 12:11:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.59% [2022-10-01 12:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][0/1251] eta 0:47:50 lr 0.000047 time 2.2944 (2.2944) loss 3.3298 (3.3298) grad_norm 3.1411 (3.1411) [2022-10-01 12:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][100/1251] eta 0:06:06 lr 0.000047 time 0.2880 (0.3181) loss 3.0862 (3.0308) grad_norm 3.0075 (2.9230) [2022-10-01 12:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][200/1251] eta 0:05:19 lr 0.000046 time 0.2933 (0.3042) loss 3.2015 (3.0211) grad_norm 2.3026 (2.8727) [2022-10-01 12:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][300/1251] eta 0:04:44 lr 0.000046 time 0.2859 (0.2995) loss 2.9323 (3.0076) grad_norm 2.6667 (2.8846) [2022-10-01 12:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][400/1251] eta 0:04:12 lr 0.000046 time 0.2876 (0.2970) loss 3.4709 (3.0238) grad_norm 2.7734 (2.8911) [2022-10-01 12:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][500/1251] eta 0:03:41 lr 0.000046 time 0.2882 (0.2954) loss 3.0058 (3.0241) grad_norm 2.9113 (2.8755) [2022-10-01 12:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][600/1251] eta 0:03:11 lr 0.000046 time 0.2953 (0.2944) loss 2.4942 (3.0271) grad_norm 2.4964 (2.8790) [2022-10-01 12:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][700/1251] eta 0:02:41 lr 0.000046 time 0.2867 (0.2936) loss 2.4042 (3.0379) grad_norm 2.7102 (2.8751) [2022-10-01 12:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][800/1251] eta 0:02:12 lr 0.000045 time 0.2926 (0.2931) loss 3.3007 (3.0374) grad_norm 3.6610 (2.8764) [2022-10-01 12:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][900/1251] eta 0:01:42 lr 0.000045 time 0.2867 (0.2927) loss 2.8355 (3.0269) grad_norm 2.4903 (2.8932) [2022-10-01 12:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1000/1251] eta 0:01:13 lr 0.000045 time 0.2908 (0.2924) loss 3.3066 (3.0291) grad_norm 2.7327 (2.8923) [2022-10-01 12:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1100/1251] eta 0:00:44 lr 0.000045 time 0.2918 (0.2921) loss 2.8059 (3.0256) grad_norm 2.8447 (2.8924) [2022-10-01 12:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1200/1251] eta 0:00:14 lr 0.000045 time 0.2946 (0.2919) loss 3.5283 (3.0255) grad_norm 3.0828 (2.8996) [2022-10-01 12:17:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 263 training takes 0:06:05 [2022-10-01 12:17:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.222 (2.222) Loss 0.8451 (0.8451) Acc@1 80.176 (80.176) Acc@5 94.727 (94.727) [2022-10-01 12:17:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.682 Acc@5 95.434 [2022-10-01 12:17:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-01 12:17:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][0/1251] eta 0:57:55 lr 0.000045 time 2.7784 (2.7784) loss 3.5039 (3.5039) grad_norm 2.6394 (2.6394) [2022-10-01 12:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][100/1251] eta 0:06:01 lr 0.000045 time 0.2865 (0.3139) loss 2.6729 (3.0233) grad_norm 2.9278 (2.9165) [2022-10-01 12:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][200/1251] eta 0:05:16 lr 0.000044 time 0.2862 (0.3014) loss 3.0917 (3.0313) grad_norm 2.8066 (2.8814) [2022-10-01 12:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][300/1251] eta 0:04:42 lr 0.000044 time 0.2883 (0.2969) loss 2.7903 (3.0223) grad_norm 2.7403 (2.8854) [2022-10-01 12:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][400/1251] eta 0:04:10 lr 0.000044 time 0.2848 (0.2947) loss 3.1025 (3.0285) grad_norm 2.6737 (2.8852) [2022-10-01 12:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][500/1251] eta 0:03:40 lr 0.000044 time 0.2860 (0.2932) loss 3.2323 (3.0198) grad_norm 2.9384 (2.8935) [2022-10-01 12:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][600/1251] eta 0:03:10 lr 0.000044 time 0.2856 (0.2923) loss 2.3514 (3.0228) grad_norm 2.7104 (2.8896) [2022-10-01 12:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][700/1251] eta 0:02:40 lr 0.000044 time 0.2890 (0.2917) loss 3.2765 (3.0240) grad_norm 3.1522 (2.8913) [2022-10-01 12:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][800/1251] eta 0:02:11 lr 0.000044 time 0.2854 (0.2912) loss 2.8213 (3.0144) grad_norm 3.6489 (2.8896) [2022-10-01 12:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][900/1251] eta 0:01:42 lr 0.000043 time 0.2857 (0.2908) loss 3.5128 (3.0168) grad_norm 3.2592 (2.8848) [2022-10-01 12:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1000/1251] eta 0:01:12 lr 0.000043 time 0.2893 (0.2905) loss 3.2168 (3.0201) grad_norm 3.2281 (2.8896) [2022-10-01 12:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1100/1251] eta 0:00:43 lr 0.000043 time 0.2868 (0.2903) loss 3.5744 (3.0230) grad_norm 2.9435 (2.8936) [2022-10-01 12:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1200/1251] eta 0:00:14 lr 0.000043 time 0.2905 (0.2902) loss 3.3996 (3.0240) grad_norm 3.0339 (2.9040) [2022-10-01 12:23:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 264 training takes 0:06:03 [2022-10-01 12:23:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.186 (3.186) Loss 0.7781 (0.7781) Acc@1 81.445 (81.445) Acc@5 95.410 (95.410) [2022-10-01 12:24:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.468 Acc@5 95.440 [2022-10-01 12:24:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-01 12:24:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][0/1251] eta 1:11:11 lr 0.000043 time 3.4143 (3.4143) loss 3.5223 (3.5223) grad_norm 3.9049 (3.9049) [2022-10-01 12:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][100/1251] eta 0:06:11 lr 0.000043 time 0.2942 (0.3228) loss 3.3254 (3.0883) grad_norm 3.1785 (3.0001) [2022-10-01 12:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][200/1251] eta 0:05:23 lr 0.000043 time 0.2939 (0.3075) loss 3.1331 (3.0338) grad_norm 2.8451 (2.9956) [2022-10-01 12:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][300/1251] eta 0:04:47 lr 0.000042 time 0.2929 (0.3022) loss 2.8581 (3.0247) grad_norm 2.4763 (2.9805) [2022-10-01 12:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][400/1251] eta 0:04:14 lr 0.000042 time 0.2965 (0.2995) loss 3.3129 (3.0388) grad_norm 3.0677 (2.9887) [2022-10-01 12:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][500/1251] eta 0:03:43 lr 0.000042 time 0.2957 (0.2979) loss 2.5055 (3.0259) grad_norm 2.8376 (2.9719) [2022-10-01 12:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][600/1251] eta 0:03:13 lr 0.000042 time 0.2921 (0.2967) loss 3.5338 (3.0162) grad_norm 2.9430 (2.9609) [2022-10-01 12:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][700/1251] eta 0:02:43 lr 0.000042 time 0.2918 (0.2959) loss 3.3112 (3.0148) grad_norm 2.7890 (2.9464) [2022-10-01 12:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][800/1251] eta 0:02:13 lr 0.000042 time 0.2910 (0.2953) loss 3.0593 (3.0076) grad_norm 2.6506 (2.9498) [2022-10-01 12:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][900/1251] eta 0:01:43 lr 0.000042 time 0.2897 (0.2948) loss 2.8939 (3.0082) grad_norm 2.8210 (2.9480) [2022-10-01 12:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1000/1251] eta 0:01:13 lr 0.000041 time 0.2983 (0.2945) loss 3.0598 (3.0092) grad_norm 2.8726 (2.9446) [2022-10-01 12:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1100/1251] eta 0:00:44 lr 0.000041 time 0.2853 (0.2941) loss 3.1597 (3.0106) grad_norm 2.8578 (2.9393) [2022-10-01 12:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1200/1251] eta 0:00:14 lr 0.000041 time 0.2922 (0.2938) loss 3.7213 (3.0129) grad_norm 3.4871 (2.9501) [2022-10-01 12:30:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 265 training takes 0:06:07 [2022-10-01 12:30:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.041 (3.041) Loss 0.9031 (0.9031) Acc@1 79.102 (79.102) Acc@5 94.434 (94.434) [2022-10-01 12:30:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.504 Acc@5 95.506 [2022-10-01 12:30:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-01 12:30:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][0/1251] eta 0:59:05 lr 0.000041 time 2.8339 (2.8339) loss 3.5160 (3.5160) grad_norm 3.0027 (3.0027) [2022-10-01 12:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][100/1251] eta 0:06:05 lr 0.000041 time 0.2897 (0.3179) loss 3.1830 (3.0209) grad_norm 2.2339 (2.8772) [2022-10-01 12:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][200/1251] eta 0:05:19 lr 0.000041 time 0.2893 (0.3039) loss 2.7435 (3.0180) grad_norm 3.2622 (2.8806) [2022-10-01 12:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][300/1251] eta 0:04:44 lr 0.000041 time 0.2961 (0.2991) loss 3.0098 (3.0283) grad_norm 2.9645 (2.8884) [2022-10-01 12:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][400/1251] eta 0:04:13 lr 0.000040 time 0.2868 (0.2978) loss 3.2041 (3.0325) grad_norm 2.6768 (2.9020) [2022-10-01 12:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][500/1251] eta 0:03:42 lr 0.000040 time 0.2870 (0.2961) loss 3.1547 (3.0159) grad_norm 2.8172 (2.9034) [2022-10-01 12:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][600/1251] eta 0:03:12 lr 0.000040 time 0.2879 (0.2950) loss 3.2107 (2.9996) grad_norm 3.2853 (2.8976) [2022-10-01 12:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][700/1251] eta 0:02:42 lr 0.000040 time 0.2890 (0.2943) loss 3.6555 (2.9958) grad_norm 3.1457 (2.9115) [2022-10-01 12:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][800/1251] eta 0:02:12 lr 0.000040 time 0.2869 (0.2936) loss 3.1349 (2.9890) grad_norm 3.3725 (2.9079) [2022-10-01 12:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][900/1251] eta 0:01:42 lr 0.000040 time 0.2856 (0.2932) loss 3.2090 (3.0019) grad_norm 2.9660 (2.9104) [2022-10-01 12:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1000/1251] eta 0:01:13 lr 0.000040 time 0.2860 (0.2929) loss 2.0885 (3.0033) grad_norm 2.6331 (2.9180) [2022-10-01 12:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1100/1251] eta 0:00:44 lr 0.000039 time 0.2967 (0.2925) loss 3.0293 (3.0092) grad_norm 3.0849 (2.9152) [2022-10-01 12:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1200/1251] eta 0:00:14 lr 0.000039 time 0.2845 (0.2923) loss 3.2028 (3.0145) grad_norm 2.6018 (2.9114) [2022-10-01 12:36:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 266 training takes 0:06:05 [2022-10-01 12:36:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.429 (3.429) Loss 0.8314 (0.8314) Acc@1 80.371 (80.371) Acc@5 95.703 (95.703) [2022-10-01 12:36:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.560 Acc@5 95.386 [2022-10-01 12:36:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 12:36:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][0/1251] eta 1:19:20 lr 0.000039 time 3.8056 (3.8056) loss 3.0977 (3.0977) grad_norm 2.9624 (2.9624) [2022-10-01 12:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][100/1251] eta 0:06:11 lr 0.000039 time 0.2857 (0.3226) loss 3.4362 (3.0601) grad_norm 2.7362 (2.9372) [2022-10-01 12:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][200/1251] eta 0:05:21 lr 0.000039 time 0.2910 (0.3063) loss 3.5133 (3.0599) grad_norm 2.5314 (2.9241) [2022-10-01 12:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][300/1251] eta 0:04:45 lr 0.000039 time 0.2866 (0.3007) loss 3.0424 (3.0561) grad_norm 2.7161 (2.9176) [2022-10-01 12:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][400/1251] eta 0:04:13 lr 0.000039 time 0.2854 (0.2980) loss 2.0428 (3.0288) grad_norm 3.4501 (2.9331) [2022-10-01 12:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][500/1251] eta 0:03:42 lr 0.000039 time 0.2823 (0.2958) loss 2.5450 (3.0198) grad_norm 2.8431 (2.9259) [2022-10-01 12:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][600/1251] eta 0:03:11 lr 0.000038 time 0.2858 (0.2943) loss 3.5835 (3.0432) grad_norm 2.6300 (2.9360) [2022-10-01 12:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][700/1251] eta 0:02:41 lr 0.000038 time 0.2863 (0.2937) loss 3.1961 (3.0514) grad_norm 2.6665 (2.9392) [2022-10-01 12:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][800/1251] eta 0:02:12 lr 0.000038 time 0.2878 (0.2931) loss 3.3777 (3.0468) grad_norm 2.9054 (2.9446) [2022-10-01 12:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][900/1251] eta 0:01:42 lr 0.000038 time 0.2876 (0.2924) loss 3.3117 (3.0412) grad_norm 2.7734 (2.9528) [2022-10-01 12:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1000/1251] eta 0:01:13 lr 0.000038 time 0.2871 (0.2920) loss 3.2557 (3.0425) grad_norm 2.9735 (2.9588) [2022-10-01 12:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1100/1251] eta 0:00:44 lr 0.000038 time 0.2862 (0.2915) loss 2.9191 (3.0465) grad_norm 2.9836 (2.9600) [2022-10-01 12:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1200/1251] eta 0:00:14 lr 0.000038 time 0.2859 (0.2912) loss 3.1364 (3.0450) grad_norm 2.6258 (2.9568) [2022-10-01 12:42:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 267 training takes 0:06:04 [2022-10-01 12:42:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.212 (2.212) Loss 0.7674 (0.7674) Acc@1 80.762 (80.762) Acc@5 95.801 (95.801) [2022-10-01 12:43:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.556 Acc@5 95.416 [2022-10-01 12:43:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 12:43:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][0/1251] eta 1:03:08 lr 0.000038 time 3.0282 (3.0282) loss 3.4232 (3.4232) grad_norm 2.9481 (2.9481) [2022-10-01 12:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][100/1251] eta 0:06:18 lr 0.000037 time 0.2895 (0.3289) loss 2.3486 (3.0111) grad_norm 2.6006 (3.0079) [2022-10-01 12:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][200/1251] eta 0:05:26 lr 0.000037 time 0.2927 (0.3108) loss 2.4886 (3.0149) grad_norm 2.7991 (2.9592) [2022-10-01 12:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][300/1251] eta 0:04:49 lr 0.000037 time 0.2905 (0.3045) loss 2.2235 (3.0214) grad_norm 2.6907 (2.9768) [2022-10-01 12:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][400/1251] eta 0:04:16 lr 0.000037 time 0.2934 (0.3013) loss 3.6704 (3.0351) grad_norm 2.9852 (2.9747) [2022-10-01 12:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][500/1251] eta 0:03:44 lr 0.000037 time 0.2906 (0.2993) loss 3.1752 (3.0231) grad_norm 3.5366 (2.9662) [2022-10-01 12:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][600/1251] eta 0:03:13 lr 0.000037 time 0.2938 (0.2979) loss 3.6387 (3.0195) grad_norm 3.3385 (3.0010) [2022-10-01 12:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][700/1251] eta 0:02:43 lr 0.000037 time 0.2884 (0.2967) loss 3.3929 (3.0264) grad_norm 3.3883 (3.0062) [2022-10-01 12:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][800/1251] eta 0:02:13 lr 0.000036 time 0.2888 (0.2958) loss 3.3815 (3.0283) grad_norm 2.6807 (2.9991) [2022-10-01 12:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][900/1251] eta 0:01:43 lr 0.000036 time 0.2878 (0.2950) loss 3.2122 (3.0260) grad_norm 3.7324 (3.0054) [2022-10-01 12:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1000/1251] eta 0:01:13 lr 0.000036 time 0.2910 (0.2943) loss 2.5586 (3.0173) grad_norm 2.8259 (3.0024) [2022-10-01 12:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1100/1251] eta 0:00:44 lr 0.000036 time 0.2871 (0.2938) loss 3.2773 (3.0175) grad_norm 3.2458 (3.0016) [2022-10-01 12:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1200/1251] eta 0:00:14 lr 0.000036 time 0.2845 (0.2934) loss 3.0802 (3.0174) grad_norm 2.8311 (2.9958) [2022-10-01 12:49:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 268 training takes 0:06:07 [2022-10-01 12:49:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.501 (2.501) Loss 0.8598 (0.8598) Acc@1 79.297 (79.297) Acc@5 95.215 (95.215) [2022-10-01 12:49:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.630 Acc@5 95.468 [2022-10-01 12:49:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 12:49:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][0/1251] eta 1:08:01 lr 0.000036 time 3.2628 (3.2628) loss 3.3916 (3.3916) grad_norm 3.5232 (3.5232) [2022-10-01 12:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][100/1251] eta 0:06:08 lr 0.000036 time 0.2915 (0.3200) loss 3.3026 (3.0013) grad_norm 2.9897 (3.0763) [2022-10-01 12:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][200/1251] eta 0:05:20 lr 0.000036 time 0.2908 (0.3048) loss 2.8450 (3.0097) grad_norm 2.8788 (3.0451) [2022-10-01 12:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][300/1251] eta 0:04:44 lr 0.000035 time 0.2885 (0.2997) loss 3.2337 (2.9889) grad_norm 2.8896 (3.0161) [2022-10-01 12:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][400/1251] eta 0:04:12 lr 0.000035 time 0.2926 (0.2970) loss 1.9745 (2.9689) grad_norm 4.1007 (2.9866) [2022-10-01 12:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][500/1251] eta 0:03:41 lr 0.000035 time 0.2916 (0.2955) loss 2.8781 (2.9840) grad_norm 2.8111 (2.9903) [2022-10-01 12:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][600/1251] eta 0:03:11 lr 0.000035 time 0.2865 (0.2944) loss 3.1246 (2.9891) grad_norm 2.7433 (2.9959) [2022-10-01 12:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][700/1251] eta 0:02:41 lr 0.000035 time 0.2909 (0.2936) loss 3.0721 (3.0034) grad_norm 3.1548 (3.0023) [2022-10-01 12:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][800/1251] eta 0:02:12 lr 0.000035 time 0.2914 (0.2931) loss 2.5752 (3.0069) grad_norm 2.9177 (2.9938) [2022-10-01 12:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][900/1251] eta 0:01:42 lr 0.000035 time 0.2893 (0.2926) loss 2.2266 (3.0109) grad_norm 2.9788 (2.9911) [2022-10-01 12:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1000/1251] eta 0:01:13 lr 0.000035 time 0.2900 (0.2923) loss 2.5568 (3.0066) grad_norm 2.7507 (2.9879) [2022-10-01 12:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1100/1251] eta 0:00:44 lr 0.000034 time 0.2897 (0.2920) loss 2.7225 (3.0092) grad_norm 2.7629 (2.9971) [2022-10-01 12:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1200/1251] eta 0:00:14 lr 0.000034 time 0.2938 (0.2917) loss 3.3458 (3.0061) grad_norm 2.7199 (2.9945) [2022-10-01 12:55:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 269 training takes 0:06:05 [2022-10-01 12:55:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.154 (3.154) Loss 0.7675 (0.7675) Acc@1 83.105 (83.105) Acc@5 95.508 (95.508) [2022-10-01 12:55:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.592 Acc@5 95.474 [2022-10-01 12:55:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 12:55:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 12:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][0/1251] eta 1:10:00 lr 0.000034 time 3.3579 (3.3579) loss 3.4775 (3.4775) grad_norm 3.0001 (3.0001) [2022-10-01 12:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][100/1251] eta 0:06:09 lr 0.000034 time 0.2862 (0.3208) loss 3.1143 (3.0510) grad_norm 2.8392 (2.9974) [2022-10-01 12:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][200/1251] eta 0:05:20 lr 0.000034 time 0.2958 (0.3052) loss 2.8796 (3.0227) grad_norm 3.1024 (3.0421) [2022-10-01 12:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][300/1251] eta 0:04:45 lr 0.000034 time 0.2922 (0.3000) loss 2.0381 (3.0042) grad_norm 2.9205 (3.0762) [2022-10-01 12:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][400/1251] eta 0:04:13 lr 0.000034 time 0.2955 (0.2973) loss 3.2291 (3.0094) grad_norm 3.2117 (3.0685) [2022-10-01 12:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][500/1251] eta 0:03:42 lr 0.000034 time 0.2926 (0.2958) loss 2.8925 (3.0055) grad_norm 3.0738 (3.0588) [2022-10-01 12:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][600/1251] eta 0:03:11 lr 0.000033 time 0.2905 (0.2947) loss 3.1521 (3.0030) grad_norm 3.0356 (3.0458) [2022-10-01 12:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][700/1251] eta 0:02:41 lr 0.000033 time 0.2903 (0.2939) loss 2.5236 (3.0094) grad_norm 3.0478 (3.0337) [2022-10-01 12:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][800/1251] eta 0:02:12 lr 0.000033 time 0.2940 (0.2933) loss 2.9610 (3.0173) grad_norm 3.1292 (3.0353) [2022-10-01 13:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][900/1251] eta 0:01:42 lr 0.000033 time 0.2923 (0.2930) loss 3.0685 (3.0188) grad_norm 2.8861 (3.0389) [2022-10-01 13:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1000/1251] eta 0:01:13 lr 0.000033 time 0.2900 (0.2926) loss 2.1146 (3.0203) grad_norm 3.0069 (3.0368) [2022-10-01 13:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1100/1251] eta 0:00:44 lr 0.000033 time 0.2901 (0.2924) loss 2.8424 (3.0212) grad_norm 2.9500 (3.0272) [2022-10-01 13:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1200/1251] eta 0:00:14 lr 0.000033 time 0.2907 (0.2920) loss 2.8298 (3.0225) grad_norm 3.1362 (3.0207) [2022-10-01 13:01:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 270 training takes 0:06:05 [2022-10-01 13:01:44 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saving...... [2022-10-01 13:01:44 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saved !!! [2022-10-01 13:01:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.955 (2.955) Loss 0.7907 (0.7907) Acc@1 81.738 (81.738) Acc@5 96.191 (96.191) [2022-10-01 13:01:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.638 Acc@5 95.464 [2022-10-01 13:01:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-01 13:01:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.68% [2022-10-01 13:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][0/1251] eta 0:45:28 lr 0.000033 time 2.1811 (2.1811) loss 3.4422 (3.4422) grad_norm 2.8346 (2.8346) [2022-10-01 13:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][100/1251] eta 0:06:01 lr 0.000033 time 0.2864 (0.3136) loss 3.2700 (3.0009) grad_norm 2.7189 (3.0491) [2022-10-01 13:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][200/1251] eta 0:05:16 lr 0.000032 time 0.2892 (0.3012) loss 3.1966 (2.9942) grad_norm 2.7880 (2.9939) [2022-10-01 13:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][300/1251] eta 0:04:42 lr 0.000032 time 0.2861 (0.2971) loss 3.4468 (2.9827) grad_norm 3.3567 (3.0282) [2022-10-01 13:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][400/1251] eta 0:04:11 lr 0.000032 time 0.2890 (0.2950) loss 2.2791 (2.9942) grad_norm 2.8459 (3.0510) [2022-10-01 13:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][500/1251] eta 0:03:40 lr 0.000032 time 0.2866 (0.2937) loss 3.3361 (2.9847) grad_norm 3.6238 (3.0409) [2022-10-01 13:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][600/1251] eta 0:03:10 lr 0.000032 time 0.2907 (0.2929) loss 3.1751 (3.0005) grad_norm 2.9386 (3.0338) [2022-10-01 13:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][700/1251] eta 0:02:41 lr 0.000032 time 0.2862 (0.2922) loss 2.0239 (3.0103) grad_norm 2.8131 (3.0200) [2022-10-01 13:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][800/1251] eta 0:02:11 lr 0.000032 time 0.2867 (0.2917) loss 3.0800 (3.0088) grad_norm 2.8765 (3.0421) [2022-10-01 13:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][900/1251] eta 0:01:42 lr 0.000032 time 0.2864 (0.2913) loss 3.3777 (3.0063) grad_norm 2.8725 (3.0413) [2022-10-01 13:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1000/1251] eta 0:01:13 lr 0.000031 time 0.2943 (0.2910) loss 2.4377 (2.9974) grad_norm 2.9177 (3.0386) [2022-10-01 13:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1100/1251] eta 0:00:43 lr 0.000031 time 0.2858 (0.2906) loss 3.9174 (2.9991) grad_norm 2.7464 (3.0356) [2022-10-01 13:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1200/1251] eta 0:00:14 lr 0.000031 time 0.2886 (0.2904) loss 3.4729 (2.9953) grad_norm 2.8825 (3.0374) [2022-10-01 13:08:00 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 271 training takes 0:06:03 [2022-10-01 13:08:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.424 (2.424) Loss 0.8103 (0.8103) Acc@1 80.957 (80.957) Acc@5 95.215 (95.215) [2022-10-01 13:08:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.792 Acc@5 95.466 [2022-10-01 13:08:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-01 13:08:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.79% [2022-10-01 13:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][0/1251] eta 0:56:37 lr 0.000031 time 2.7159 (2.7159) loss 3.2305 (3.2305) grad_norm 2.7137 (2.7137) [2022-10-01 13:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][100/1251] eta 0:06:04 lr 0.000031 time 0.2897 (0.3164) loss 3.7501 (3.0219) grad_norm 2.9139 (3.0925) [2022-10-01 13:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][200/1251] eta 0:05:18 lr 0.000031 time 0.2905 (0.3032) loss 2.7801 (3.0256) grad_norm 3.1224 (3.0735) [2022-10-01 13:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][300/1251] eta 0:04:44 lr 0.000031 time 0.2898 (0.2988) loss 3.2922 (3.0296) grad_norm 3.6209 (3.0698) [2022-10-01 13:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][400/1251] eta 0:04:12 lr 0.000031 time 0.2915 (0.2966) loss 3.0961 (3.0337) grad_norm 3.2045 (3.0802) [2022-10-01 13:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][500/1251] eta 0:03:41 lr 0.000031 time 0.2908 (0.2954) loss 2.8640 (3.0351) grad_norm 3.0503 (3.0768) [2022-10-01 13:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][600/1251] eta 0:03:11 lr 0.000030 time 0.2915 (0.2945) loss 1.8593 (3.0293) grad_norm 2.9904 (3.0594) [2022-10-01 13:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][700/1251] eta 0:02:41 lr 0.000030 time 0.2930 (0.2939) loss 2.6054 (3.0215) grad_norm 2.5882 (3.0647) [2022-10-01 13:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][800/1251] eta 0:02:12 lr 0.000030 time 0.2890 (0.2934) loss 3.4765 (3.0100) grad_norm 3.0149 (3.0747) [2022-10-01 13:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][900/1251] eta 0:01:42 lr 0.000030 time 0.2959 (0.2931) loss 3.1435 (3.0043) grad_norm 2.8224 (3.0737) [2022-10-01 13:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1000/1251] eta 0:01:13 lr 0.000030 time 0.2904 (0.2928) loss 1.8300 (3.0048) grad_norm 2.9531 (3.0732) [2022-10-01 13:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1100/1251] eta 0:00:44 lr 0.000030 time 0.2882 (0.2925) loss 1.8895 (3.0024) grad_norm 2.8353 (3.0749) [2022-10-01 13:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1200/1251] eta 0:00:14 lr 0.000030 time 0.2870 (0.2923) loss 3.1569 (2.9971) grad_norm 2.7921 (3.0754) [2022-10-01 13:14:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 272 training takes 0:06:05 [2022-10-01 13:14:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.371 (3.371) Loss 0.8748 (0.8748) Acc@1 79.199 (79.199) Acc@5 94.922 (94.922) [2022-10-01 13:14:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.700 Acc@5 95.472 [2022-10-01 13:14:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-01 13:14:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.79% [2022-10-01 13:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][0/1251] eta 0:46:45 lr 0.000030 time 2.2428 (2.2428) loss 3.3184 (3.3184) grad_norm 3.3178 (3.3178) [2022-10-01 13:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][100/1251] eta 0:06:03 lr 0.000030 time 0.2894 (0.3160) loss 2.8075 (2.9818) grad_norm 3.5886 (3.0641) [2022-10-01 13:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][200/1251] eta 0:05:17 lr 0.000029 time 0.2888 (0.3025) loss 3.7227 (2.9617) grad_norm 2.7029 (3.0865) [2022-10-01 13:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][300/1251] eta 0:04:43 lr 0.000029 time 0.2931 (0.2980) loss 3.4253 (2.9745) grad_norm 3.2330 (3.0926) [2022-10-01 13:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][400/1251] eta 0:04:11 lr 0.000029 time 0.2922 (0.2958) loss 3.1169 (2.9781) grad_norm 2.8380 (3.0938) [2022-10-01 13:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][500/1251] eta 0:03:41 lr 0.000029 time 0.2924 (0.2945) loss 2.7845 (2.9735) grad_norm 3.3461 (3.0825) [2022-10-01 13:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][600/1251] eta 0:03:11 lr 0.000029 time 0.2949 (0.2936) loss 3.0540 (2.9709) grad_norm 3.2149 (3.0798) [2022-10-01 13:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][700/1251] eta 0:02:41 lr 0.000029 time 0.2909 (0.2931) loss 2.7336 (2.9741) grad_norm 3.1262 (3.0784) [2022-10-01 13:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][800/1251] eta 0:02:12 lr 0.000029 time 0.2936 (0.2927) loss 3.2384 (2.9717) grad_norm 2.9375 (3.0894) [2022-10-01 13:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][900/1251] eta 0:01:42 lr 0.000029 time 0.2904 (0.2924) loss 2.5705 (2.9805) grad_norm 3.0106 (3.0881) [2022-10-01 13:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1000/1251] eta 0:01:13 lr 0.000029 time 0.2920 (0.2921) loss 2.9519 (2.9829) grad_norm 3.0250 (3.0954) [2022-10-01 13:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1100/1251] eta 0:00:44 lr 0.000028 time 0.2952 (0.2919) loss 3.1445 (2.9837) grad_norm 2.9529 (3.0932) [2022-10-01 13:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1200/1251] eta 0:00:14 lr 0.000028 time 0.2943 (0.2917) loss 3.6840 (2.9845) grad_norm 3.2044 (3.0902) [2022-10-01 13:20:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 273 training takes 0:06:05 [2022-10-01 13:20:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.315 (3.315) Loss 0.8543 (0.8543) Acc@1 79.199 (79.199) Acc@5 95.312 (95.312) [2022-10-01 13:20:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.688 Acc@5 95.400 [2022-10-01 13:20:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-01 13:20:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.79% [2022-10-01 13:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][0/1251] eta 0:55:44 lr 0.000028 time 2.6732 (2.6732) loss 3.6227 (3.6227) grad_norm 3.6077 (3.6077) [2022-10-01 13:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][100/1251] eta 0:06:04 lr 0.000028 time 0.2912 (0.3163) loss 3.3960 (2.9332) grad_norm 3.1293 (3.0608) [2022-10-01 13:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][200/1251] eta 0:05:18 lr 0.000028 time 0.2889 (0.3035) loss 3.0206 (2.8990) grad_norm 2.2537 (3.0587) [2022-10-01 13:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][300/1251] eta 0:04:44 lr 0.000028 time 0.2852 (0.2991) loss 3.4469 (2.9189) grad_norm 2.9433 (3.0611) [2022-10-01 13:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][400/1251] eta 0:04:12 lr 0.000028 time 0.2921 (0.2967) loss 2.9138 (2.9581) grad_norm 3.1086 (3.0832) [2022-10-01 13:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][500/1251] eta 0:03:41 lr 0.000028 time 0.2876 (0.2952) loss 2.9572 (2.9755) grad_norm 3.0303 (3.0849) [2022-10-01 13:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][600/1251] eta 0:03:11 lr 0.000028 time 0.2866 (0.2942) loss 2.8408 (2.9731) grad_norm 3.1729 (3.0836) [2022-10-01 13:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][700/1251] eta 0:02:41 lr 0.000027 time 0.2906 (0.2935) loss 3.3290 (2.9769) grad_norm 3.1460 (3.0780) [2022-10-01 13:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][800/1251] eta 0:02:12 lr 0.000027 time 0.2911 (0.2929) loss 3.1606 (2.9903) grad_norm 3.2991 (3.0821) [2022-10-01 13:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][900/1251] eta 0:01:42 lr 0.000027 time 0.2869 (0.2924) loss 2.6883 (2.9935) grad_norm 2.6198 (3.0822) [2022-10-01 13:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1000/1251] eta 0:01:13 lr 0.000027 time 0.2898 (0.2920) loss 3.0496 (2.9942) grad_norm 3.0448 (3.0821) [2022-10-01 13:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1100/1251] eta 0:00:44 lr 0.000027 time 0.2876 (0.2917) loss 2.0134 (2.9907) grad_norm 3.1861 (3.0858) [2022-10-01 13:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1200/1251] eta 0:00:14 lr 0.000027 time 0.2954 (0.2915) loss 3.2461 (2.9876) grad_norm 3.5887 (3.0826) [2022-10-01 13:26:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 274 training takes 0:06:04 [2022-10-01 13:26:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.793 (2.793) Loss 0.8193 (0.8193) Acc@1 81.055 (81.055) Acc@5 95.117 (95.117) [2022-10-01 13:27:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.750 Acc@5 95.446 [2022-10-01 13:27:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-01 13:27:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.79% [2022-10-01 13:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][0/1251] eta 0:48:57 lr 0.000027 time 2.3480 (2.3480) loss 3.3561 (3.3561) grad_norm 3.0826 (3.0826) [2022-10-01 13:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][100/1251] eta 0:06:10 lr 0.000027 time 0.2919 (0.3220) loss 3.1243 (3.0384) grad_norm 3.4287 (3.0239) [2022-10-01 13:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][200/1251] eta 0:05:22 lr 0.000027 time 0.2920 (0.3067) loss 3.2575 (2.9972) grad_norm 2.9561 (3.0700) [2022-10-01 13:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][300/1251] eta 0:04:46 lr 0.000027 time 0.2902 (0.3016) loss 2.8529 (2.9878) grad_norm 2.9806 (3.0771) [2022-10-01 13:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][400/1251] eta 0:04:14 lr 0.000026 time 0.2872 (0.2990) loss 3.2650 (2.9870) grad_norm 2.7318 (3.0858) [2022-10-01 13:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][500/1251] eta 0:03:43 lr 0.000026 time 0.2896 (0.2974) loss 2.3336 (2.9859) grad_norm 3.6512 (3.0854) [2022-10-01 13:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][600/1251] eta 0:03:12 lr 0.000026 time 0.2879 (0.2963) loss 2.8834 (2.9894) grad_norm 3.1692 (3.0917) [2022-10-01 13:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][700/1251] eta 0:02:42 lr 0.000026 time 0.2939 (0.2955) loss 1.8218 (2.9796) grad_norm 4.1839 (3.0927) [2022-10-01 13:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][800/1251] eta 0:02:13 lr 0.000026 time 0.2883 (0.2949) loss 2.2442 (2.9911) grad_norm 2.6745 (3.1068) [2022-10-01 13:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][900/1251] eta 0:01:43 lr 0.000026 time 0.2873 (0.2944) loss 3.3422 (2.9876) grad_norm 3.0134 (3.0987) [2022-10-01 13:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1000/1251] eta 0:01:13 lr 0.000026 time 0.2894 (0.2940) loss 3.0128 (2.9892) grad_norm 3.1656 (3.1052) [2022-10-01 13:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1100/1251] eta 0:00:44 lr 0.000026 time 0.2927 (0.2936) loss 2.9940 (2.9935) grad_norm 3.2423 (3.1101) [2022-10-01 13:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1200/1251] eta 0:00:14 lr 0.000026 time 0.2885 (0.2933) loss 2.7040 (2.9971) grad_norm 3.1911 (3.1141) [2022-10-01 13:33:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 275 training takes 0:06:07 [2022-10-01 13:33:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.780 (2.780) Loss 0.8047 (0.8047) Acc@1 80.957 (80.957) Acc@5 95.703 (95.703) [2022-10-01 13:33:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.824 Acc@5 95.458 [2022-10-01 13:33:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-01 13:33:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.82% [2022-10-01 13:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][0/1251] eta 1:06:49 lr 0.000026 time 3.2051 (3.2051) loss 3.9418 (3.9418) grad_norm 3.2033 (3.2033) [2022-10-01 13:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][100/1251] eta 0:06:04 lr 0.000025 time 0.2868 (0.3170) loss 3.5514 (3.0063) grad_norm 2.9779 (3.0404) [2022-10-01 13:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][200/1251] eta 0:05:17 lr 0.000025 time 0.2871 (0.3023) loss 3.6938 (3.0115) grad_norm 2.9522 (3.0937) [2022-10-01 13:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][300/1251] eta 0:04:42 lr 0.000025 time 0.2871 (0.2974) loss 3.4405 (3.0051) grad_norm 3.2767 (3.1106) [2022-10-01 13:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][400/1251] eta 0:04:11 lr 0.000025 time 0.2920 (0.2950) loss 3.2929 (3.0208) grad_norm 3.1695 (3.1099) [2022-10-01 13:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][500/1251] eta 0:03:40 lr 0.000025 time 0.2880 (0.2935) loss 3.5945 (3.0159) grad_norm 2.8739 (3.1243) [2022-10-01 13:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][600/1251] eta 0:03:10 lr 0.000025 time 0.2907 (0.2925) loss 2.9935 (3.0221) grad_norm 3.2450 (3.1193) [2022-10-01 13:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][700/1251] eta 0:02:40 lr 0.000025 time 0.2846 (0.2920) loss 2.5550 (3.0099) grad_norm 3.3959 (3.1168) [2022-10-01 13:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][800/1251] eta 0:02:11 lr 0.000025 time 0.2873 (0.2914) loss 1.8653 (3.0150) grad_norm 2.8730 (3.1182) [2022-10-01 13:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][900/1251] eta 0:01:42 lr 0.000025 time 0.2877 (0.2911) loss 3.4005 (3.0168) grad_norm 2.8307 (3.1129) [2022-10-01 13:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1000/1251] eta 0:01:12 lr 0.000025 time 0.2905 (0.2908) loss 2.0837 (3.0120) grad_norm 2.9333 (3.1189) [2022-10-01 13:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1100/1251] eta 0:00:43 lr 0.000024 time 0.2861 (0.2905) loss 3.2285 (3.0110) grad_norm 3.0559 (3.1171) [2022-10-01 13:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1200/1251] eta 0:00:14 lr 0.000024 time 0.2904 (0.2903) loss 3.3724 (3.0090) grad_norm 3.5799 (3.1194) [2022-10-01 13:39:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 276 training takes 0:06:03 [2022-10-01 13:39:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.752 (2.752) Loss 0.8074 (0.8074) Acc@1 80.664 (80.664) Acc@5 95.996 (95.996) [2022-10-01 13:39:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.882 Acc@5 95.526 [2022-10-01 13:39:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 13:39:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.88% [2022-10-01 13:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][0/1251] eta 0:49:29 lr 0.000024 time 2.3737 (2.3737) loss 2.5146 (2.5146) grad_norm 2.6827 (2.6827) [2022-10-01 13:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][100/1251] eta 0:06:02 lr 0.000024 time 0.2911 (0.3146) loss 2.9866 (2.9741) grad_norm 2.8660 (3.1649) [2022-10-01 13:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][200/1251] eta 0:05:16 lr 0.000024 time 0.2911 (0.3016) loss 2.4689 (2.9653) grad_norm 4.9393 (3.1493) [2022-10-01 13:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][300/1251] eta 0:04:42 lr 0.000024 time 0.2863 (0.2972) loss 2.7571 (2.9829) grad_norm 3.0408 (3.1357) [2022-10-01 13:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][400/1251] eta 0:04:11 lr 0.000024 time 0.2908 (0.2950) loss 3.4862 (2.9976) grad_norm 3.1203 (3.1215) [2022-10-01 13:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][500/1251] eta 0:03:40 lr 0.000024 time 0.2870 (0.2936) loss 3.0739 (3.0088) grad_norm 3.4070 (3.1152) [2022-10-01 13:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][600/1251] eta 0:03:10 lr 0.000024 time 0.2906 (0.2927) loss 2.9219 (3.0081) grad_norm 3.1270 (3.1061) [2022-10-01 13:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][700/1251] eta 0:02:40 lr 0.000024 time 0.2883 (0.2921) loss 2.6786 (3.0065) grad_norm 2.7714 (3.1078) [2022-10-01 13:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][800/1251] eta 0:02:11 lr 0.000024 time 0.2854 (0.2915) loss 3.4625 (3.0074) grad_norm 3.2686 (3.1039) [2022-10-01 13:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][900/1251] eta 0:01:42 lr 0.000023 time 0.2855 (0.2912) loss 2.5809 (3.0113) grad_norm 2.6454 (3.1119) [2022-10-01 13:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1000/1251] eta 0:01:12 lr 0.000023 time 0.2864 (0.2908) loss 2.8620 (3.0035) grad_norm 3.1444 (3.1160) [2022-10-01 13:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1100/1251] eta 0:00:43 lr 0.000023 time 0.2864 (0.2906) loss 2.6868 (3.0015) grad_norm 2.7887 (3.1157) [2022-10-01 13:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1200/1251] eta 0:00:14 lr 0.000023 time 0.2860 (0.2903) loss 2.5472 (2.9995) grad_norm 3.2498 (3.1153) [2022-10-01 13:45:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 277 training takes 0:06:03 [2022-10-01 13:45:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.646 (2.646) Loss 0.7676 (0.7676) Acc@1 81.348 (81.348) Acc@5 96.777 (96.777) [2022-10-01 13:46:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.726 Acc@5 95.534 [2022-10-01 13:46:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-01 13:46:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.88% [2022-10-01 13:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][0/1251] eta 1:10:21 lr 0.000023 time 3.3744 (3.3744) loss 3.1941 (3.1941) grad_norm 2.8773 (2.8773) [2022-10-01 13:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][100/1251] eta 0:06:11 lr 0.000023 time 0.2942 (0.3227) loss 3.0781 (3.0024) grad_norm 3.1763 (3.1912) [2022-10-01 13:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][200/1251] eta 0:05:22 lr 0.000023 time 0.2900 (0.3070) loss 3.5551 (2.9942) grad_norm 3.0194 (3.1886) [2022-10-01 13:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][300/1251] eta 0:04:46 lr 0.000023 time 0.2904 (0.3018) loss 3.6814 (2.9933) grad_norm 3.3394 (3.1501) [2022-10-01 13:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][400/1251] eta 0:04:14 lr 0.000023 time 0.2879 (0.2991) loss 3.0552 (3.0094) grad_norm 2.8132 (3.1648) [2022-10-01 13:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][500/1251] eta 0:03:43 lr 0.000023 time 0.2911 (0.2976) loss 3.1847 (3.0151) grad_norm 3.5048 (3.1570) [2022-10-01 13:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][600/1251] eta 0:03:13 lr 0.000023 time 0.2896 (0.2965) loss 3.3858 (3.0131) grad_norm 2.6611 (3.1519) [2022-10-01 13:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][700/1251] eta 0:02:42 lr 0.000022 time 0.2968 (0.2957) loss 3.2214 (3.0105) grad_norm 3.2459 (3.1367) [2022-10-01 13:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][800/1251] eta 0:02:13 lr 0.000022 time 0.2956 (0.2952) loss 3.3471 (3.0076) grad_norm 4.0012 (3.1442) [2022-10-01 13:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][900/1251] eta 0:01:43 lr 0.000022 time 0.2916 (0.2946) loss 2.7721 (3.0144) grad_norm 2.7262 (3.1491) [2022-10-01 13:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1000/1251] eta 0:01:13 lr 0.000022 time 0.2881 (0.2942) loss 3.0767 (3.0100) grad_norm 2.8028 (3.1383) [2022-10-01 13:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1100/1251] eta 0:00:44 lr 0.000022 time 0.2936 (0.2938) loss 2.8862 (3.0066) grad_norm 3.7885 (3.1360) [2022-10-01 13:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1200/1251] eta 0:00:14 lr 0.000022 time 0.2854 (0.2935) loss 2.3946 (3.0041) grad_norm 2.9962 (3.1388) [2022-10-01 13:52:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 278 training takes 0:06:07 [2022-10-01 13:52:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.380 (2.380) Loss 0.7496 (0.7496) Acc@1 83.594 (83.594) Acc@5 96.289 (96.289) [2022-10-01 13:52:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.746 Acc@5 95.516 [2022-10-01 13:52:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-01 13:52:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.88% [2022-10-01 13:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][0/1251] eta 0:56:43 lr 0.000022 time 2.7208 (2.7208) loss 2.9225 (2.9225) grad_norm 2.7176 (2.7176) [2022-10-01 13:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][100/1251] eta 0:06:06 lr 0.000022 time 0.2896 (0.3180) loss 1.8772 (3.0185) grad_norm 2.8131 (3.1252) [2022-10-01 13:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][200/1251] eta 0:05:19 lr 0.000022 time 0.2900 (0.3041) loss 3.3920 (2.9821) grad_norm 2.9364 (3.1337) [2022-10-01 13:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][300/1251] eta 0:04:44 lr 0.000022 time 0.2892 (0.2996) loss 3.2877 (2.9817) grad_norm 2.7988 (3.1475) [2022-10-01 13:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][400/1251] eta 0:04:12 lr 0.000022 time 0.2899 (0.2973) loss 3.2970 (2.9772) grad_norm 3.3540 (3.1558) [2022-10-01 13:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][500/1251] eta 0:03:42 lr 0.000021 time 0.2920 (0.2961) loss 3.3524 (2.9664) grad_norm 3.4651 (3.1488) [2022-10-01 13:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][600/1251] eta 0:03:12 lr 0.000021 time 0.2901 (0.2951) loss 2.5737 (2.9801) grad_norm 2.9617 (3.1509) [2022-10-01 13:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][700/1251] eta 0:02:42 lr 0.000021 time 0.2899 (0.2945) loss 3.0364 (2.9982) grad_norm 3.1592 (3.1436) [2022-10-01 13:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][800/1251] eta 0:02:12 lr 0.000021 time 0.2903 (0.2941) loss 3.2211 (2.9929) grad_norm 2.7488 (3.1409) [2022-10-01 13:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][900/1251] eta 0:01:43 lr 0.000021 time 0.2941 (0.2937) loss 2.7179 (2.9975) grad_norm 3.1405 (3.1388) [2022-10-01 13:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1000/1251] eta 0:01:13 lr 0.000021 time 0.2878 (0.2934) loss 2.9291 (2.9931) grad_norm 3.3560 (3.1446) [2022-10-01 13:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1100/1251] eta 0:00:44 lr 0.000021 time 0.2893 (0.2931) loss 2.6425 (2.9919) grad_norm 3.8182 (3.1363) [2022-10-01 13:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1200/1251] eta 0:00:14 lr 0.000021 time 0.2878 (0.2928) loss 3.3737 (2.9911) grad_norm 2.5041 (3.1294) [2022-10-01 13:58:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 279 training takes 0:06:06 [2022-10-01 13:58:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.314 (3.314) Loss 0.8758 (0.8758) Acc@1 79.297 (79.297) Acc@5 95.508 (95.508) [2022-10-01 13:58:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.918 Acc@5 95.514 [2022-10-01 13:58:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 13:58:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.92% [2022-10-01 13:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][0/1251] eta 1:08:56 lr 0.000021 time 3.3063 (3.3063) loss 2.8166 (2.8166) grad_norm 2.9930 (2.9930) [2022-10-01 13:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][100/1251] eta 0:06:09 lr 0.000021 time 0.2870 (0.3209) loss 3.2262 (2.9548) grad_norm 3.4157 (3.1113) [2022-10-01 13:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][200/1251] eta 0:05:21 lr 0.000021 time 0.2892 (0.3055) loss 3.1640 (2.9585) grad_norm 3.0108 (3.1180) [2022-10-01 14:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][300/1251] eta 0:04:45 lr 0.000021 time 0.2877 (0.3005) loss 2.6945 (2.9717) grad_norm 3.1865 (3.1493) [2022-10-01 14:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][400/1251] eta 0:04:13 lr 0.000020 time 0.2898 (0.2977) loss 3.5871 (2.9822) grad_norm 2.6494 (3.1484) [2022-10-01 14:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][500/1251] eta 0:03:42 lr 0.000020 time 0.2870 (0.2961) loss 2.9735 (2.9675) grad_norm 2.6892 (3.1570) [2022-10-01 14:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][600/1251] eta 0:03:12 lr 0.000020 time 0.2901 (0.2950) loss 3.2644 (2.9778) grad_norm 3.2982 (3.1704) [2022-10-01 14:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][700/1251] eta 0:02:42 lr 0.000020 time 0.2913 (0.2942) loss 3.1027 (2.9696) grad_norm 2.9712 (3.1609) [2022-10-01 14:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][800/1251] eta 0:02:12 lr 0.000020 time 0.2879 (0.2936) loss 2.7938 (2.9675) grad_norm 3.0834 (3.1584) [2022-10-01 14:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][900/1251] eta 0:01:42 lr 0.000020 time 0.2875 (0.2932) loss 2.3837 (2.9755) grad_norm 3.0381 (3.1522) [2022-10-01 14:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1000/1251] eta 0:01:13 lr 0.000020 time 0.2931 (0.2928) loss 2.8200 (2.9761) grad_norm 2.7237 (3.1469) [2022-10-01 14:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1100/1251] eta 0:00:44 lr 0.000020 time 0.2884 (0.2925) loss 3.0567 (2.9781) grad_norm 2.6753 (3.1463) [2022-10-01 14:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1200/1251] eta 0:00:14 lr 0.000020 time 0.2893 (0.2923) loss 3.0639 (2.9787) grad_norm 3.0674 (3.1511) [2022-10-01 14:04:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 280 training takes 0:06:05 [2022-10-01 14:04:46 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saving...... [2022-10-01 14:04:47 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saved !!! [2022-10-01 14:04:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.589 (2.589) Loss 0.8932 (0.8932) Acc@1 79.688 (79.688) Acc@5 94.922 (94.922) [2022-10-01 14:04:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.918 Acc@5 95.528 [2022-10-01 14:04:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 14:04:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.92% [2022-10-01 14:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][0/1251] eta 1:08:36 lr 0.000020 time 3.2910 (3.2910) loss 3.0940 (3.0940) grad_norm 2.7590 (2.7590) [2022-10-01 14:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][100/1251] eta 0:06:07 lr 0.000020 time 0.2858 (0.3195) loss 3.0961 (3.0287) grad_norm 3.3081 (3.2313) [2022-10-01 14:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][200/1251] eta 0:05:19 lr 0.000020 time 0.2884 (0.3043) loss 3.1408 (3.0054) grad_norm 2.8539 (3.1845) [2022-10-01 14:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][300/1251] eta 0:04:44 lr 0.000020 time 0.2871 (0.2992) loss 3.0904 (2.9875) grad_norm 3.0221 (3.1834) [2022-10-01 14:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][400/1251] eta 0:04:12 lr 0.000019 time 0.2896 (0.2966) loss 3.1640 (2.9855) grad_norm 2.6385 (3.1592) [2022-10-01 14:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][500/1251] eta 0:03:41 lr 0.000019 time 0.2884 (0.2950) loss 2.9196 (2.9738) grad_norm 3.7424 (3.1634) [2022-10-01 14:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][600/1251] eta 0:03:11 lr 0.000019 time 0.2862 (0.2939) loss 3.5322 (2.9703) grad_norm 3.1476 (3.1683) [2022-10-01 14:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][700/1251] eta 0:02:41 lr 0.000019 time 0.2885 (0.2931) loss 3.3045 (2.9694) grad_norm 2.9951 (3.1615) [2022-10-01 14:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][800/1251] eta 0:02:11 lr 0.000019 time 0.2866 (0.2925) loss 2.9446 (2.9767) grad_norm 2.9082 (3.1550) [2022-10-01 14:09:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][900/1251] eta 0:01:42 lr 0.000019 time 0.2898 (0.2920) loss 1.9365 (2.9753) grad_norm 2.9139 (3.1556) [2022-10-01 14:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1000/1251] eta 0:01:13 lr 0.000019 time 0.2882 (0.2916) loss 2.9484 (2.9789) grad_norm 2.9315 (3.1487) [2022-10-01 14:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1100/1251] eta 0:00:43 lr 0.000019 time 0.2856 (0.2912) loss 2.4780 (2.9770) grad_norm 3.2409 (3.1497) [2022-10-01 14:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1200/1251] eta 0:00:14 lr 0.000019 time 0.2878 (0.2910) loss 2.4257 (2.9773) grad_norm 6.6292 (3.1560) [2022-10-01 14:11:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 281 training takes 0:06:04 [2022-10-01 14:11:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.235 (3.235) Loss 0.8927 (0.8927) Acc@1 79.688 (79.688) Acc@5 94.922 (94.922) [2022-10-01 14:11:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.908 Acc@5 95.570 [2022-10-01 14:11:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 14:11:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.92% [2022-10-01 14:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][0/1251] eta 0:58:45 lr 0.000019 time 2.8184 (2.8184) loss 3.2454 (3.2454) grad_norm 2.4870 (2.4870) [2022-10-01 14:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][100/1251] eta 0:06:05 lr 0.000019 time 0.2898 (0.3175) loss 3.0761 (2.9569) grad_norm 2.6266 (3.1287) [2022-10-01 14:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][200/1251] eta 0:05:19 lr 0.000019 time 0.2897 (0.3042) loss 2.5232 (2.9534) grad_norm 3.0850 (3.1620) [2022-10-01 14:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][300/1251] eta 0:04:45 lr 0.000019 time 0.2908 (0.2997) loss 3.0091 (2.9546) grad_norm 2.8772 (3.1761) [2022-10-01 14:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][400/1251] eta 0:04:13 lr 0.000018 time 0.2929 (0.2977) loss 3.5611 (2.9721) grad_norm 3.2323 (3.1754) [2022-10-01 14:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][500/1251] eta 0:03:42 lr 0.000018 time 0.2898 (0.2965) loss 2.7671 (2.9681) grad_norm 3.4543 (3.1958) [2022-10-01 14:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][600/1251] eta 0:03:12 lr 0.000018 time 0.2895 (0.2956) loss 3.3328 (2.9701) grad_norm 3.5707 (3.1825) [2022-10-01 14:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][700/1251] eta 0:02:42 lr 0.000018 time 0.2886 (0.2949) loss 3.5087 (2.9669) grad_norm 3.3148 (3.1724) [2022-10-01 14:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][800/1251] eta 0:02:12 lr 0.000018 time 0.2885 (0.2945) loss 3.3693 (2.9742) grad_norm 3.1625 (3.1637) [2022-10-01 14:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][900/1251] eta 0:01:43 lr 0.000018 time 0.2932 (0.2941) loss 3.4286 (2.9811) grad_norm 2.7828 (3.1651) [2022-10-01 14:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1000/1251] eta 0:01:13 lr 0.000018 time 0.2908 (0.2938) loss 2.5988 (2.9783) grad_norm 3.5136 (3.1627) [2022-10-01 14:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1100/1251] eta 0:00:44 lr 0.000018 time 0.2883 (0.2936) loss 2.9528 (2.9819) grad_norm 3.1532 (3.1721) [2022-10-01 14:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1200/1251] eta 0:00:14 lr 0.000018 time 0.2902 (0.2934) loss 2.7735 (2.9849) grad_norm 2.9970 (3.1907) [2022-10-01 14:17:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 282 training takes 0:06:07 [2022-10-01 14:17:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.308 (2.308) Loss 0.9117 (0.9117) Acc@1 79.199 (79.199) Acc@5 94.727 (94.727) [2022-10-01 14:17:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.900 Acc@5 95.496 [2022-10-01 14:17:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 14:17:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.92% [2022-10-01 14:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][0/1251] eta 1:09:20 lr 0.000018 time 3.3255 (3.3255) loss 1.9600 (1.9600) grad_norm 2.5466 (2.5466) [2022-10-01 14:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][100/1251] eta 0:06:07 lr 0.000018 time 0.2877 (0.3193) loss 1.9519 (2.9220) grad_norm 3.0576 (3.0660) [2022-10-01 14:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][200/1251] eta 0:05:19 lr 0.000018 time 0.2933 (0.3042) loss 3.4202 (2.9643) grad_norm 2.5489 (3.1187) [2022-10-01 14:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][300/1251] eta 0:04:44 lr 0.000018 time 0.2890 (0.2990) loss 2.8393 (2.9890) grad_norm 3.0182 (3.1200) [2022-10-01 14:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][400/1251] eta 0:04:12 lr 0.000018 time 0.2934 (0.2965) loss 3.3011 (2.9918) grad_norm 4.2268 (3.1361) [2022-10-01 14:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][500/1251] eta 0:03:41 lr 0.000017 time 0.2891 (0.2949) loss 3.4108 (2.9831) grad_norm 4.3080 (3.1458) [2022-10-01 14:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][600/1251] eta 0:03:11 lr 0.000017 time 0.2910 (0.2939) loss 3.0553 (2.9850) grad_norm 3.1171 (3.1477) [2022-10-01 14:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][700/1251] eta 0:02:41 lr 0.000017 time 0.2883 (0.2931) loss 2.5530 (2.9843) grad_norm 3.0210 (3.1461) [2022-10-01 14:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][800/1251] eta 0:02:11 lr 0.000017 time 0.2924 (0.2925) loss 3.1544 (2.9864) grad_norm 2.9191 (3.1449) [2022-10-01 14:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][900/1251] eta 0:01:42 lr 0.000017 time 0.2884 (0.2921) loss 3.4879 (2.9827) grad_norm 3.0747 (3.1526) [2022-10-01 14:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1000/1251] eta 0:01:13 lr 0.000017 time 0.2906 (0.2917) loss 2.8162 (2.9784) grad_norm 2.6669 (3.1685) [2022-10-01 14:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1100/1251] eta 0:00:44 lr 0.000017 time 0.2864 (0.2915) loss 2.0142 (2.9765) grad_norm 3.0239 (3.1673) [2022-10-01 14:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1200/1251] eta 0:00:14 lr 0.000017 time 0.2917 (0.2913) loss 3.4754 (2.9789) grad_norm 3.5169 (3.1703) [2022-10-01 14:23:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 283 training takes 0:06:04 [2022-10-01 14:23:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.271 (2.271) Loss 0.7725 (0.7725) Acc@1 81.836 (81.836) Acc@5 96.191 (96.191) [2022-10-01 14:23:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.962 Acc@5 95.504 [2022-10-01 14:23:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 14:23:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.96% [2022-10-01 14:23:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][0/1251] eta 1:11:21 lr 0.000017 time 3.4228 (3.4228) loss 3.6431 (3.6431) grad_norm 3.0118 (3.0118) [2022-10-01 14:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][100/1251] eta 0:06:10 lr 0.000017 time 0.2850 (0.3220) loss 2.8774 (2.9384) grad_norm 3.5241 (3.2014) [2022-10-01 14:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][200/1251] eta 0:05:21 lr 0.000017 time 0.2908 (0.3063) loss 2.0168 (2.9899) grad_norm 3.0974 (3.2099) [2022-10-01 14:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][300/1251] eta 0:04:46 lr 0.000017 time 0.2870 (0.3010) loss 2.7651 (2.9710) grad_norm 3.2868 (3.2132) [2022-10-01 14:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][400/1251] eta 0:04:13 lr 0.000017 time 0.2914 (0.2982) loss 2.1688 (2.9668) grad_norm 3.2129 (3.1968) [2022-10-01 14:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][500/1251] eta 0:03:42 lr 0.000017 time 0.2900 (0.2965) loss 2.0267 (2.9750) grad_norm 2.7445 (3.1944) [2022-10-01 14:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][600/1251] eta 0:03:12 lr 0.000017 time 0.2959 (0.2953) loss 3.1600 (2.9844) grad_norm 2.7295 (3.1996) [2022-10-01 14:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][700/1251] eta 0:02:42 lr 0.000016 time 0.2862 (0.2944) loss 2.9536 (2.9850) grad_norm 2.6404 (3.1858) [2022-10-01 14:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][800/1251] eta 0:02:12 lr 0.000016 time 0.2886 (0.2937) loss 2.3701 (2.9825) grad_norm 3.1264 (3.1759) [2022-10-01 14:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][900/1251] eta 0:01:42 lr 0.000016 time 0.2859 (0.2932) loss 3.1302 (2.9878) grad_norm 3.1385 (3.1731) [2022-10-01 14:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1000/1251] eta 0:01:13 lr 0.000016 time 0.2924 (0.2927) loss 3.2168 (2.9875) grad_norm 2.4907 (3.1760) [2022-10-01 14:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1100/1251] eta 0:00:44 lr 0.000016 time 0.2847 (0.2924) loss 3.3898 (2.9849) grad_norm 2.5534 (3.1872) [2022-10-01 14:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1200/1251] eta 0:00:14 lr 0.000016 time 0.2885 (0.2920) loss 3.2323 (2.9788) grad_norm 2.4134 (3.1807) [2022-10-01 14:30:00 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 284 training takes 0:06:05 [2022-10-01 14:30:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.275 (2.275) Loss 0.7551 (0.7551) Acc@1 82.617 (82.617) Acc@5 95.801 (95.801) [2022-10-01 14:30:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.892 Acc@5 95.502 [2022-10-01 14:30:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-01 14:30:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.96% [2022-10-01 14:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][0/1251] eta 0:49:02 lr 0.000016 time 2.3524 (2.3524) loss 3.0809 (3.0809) grad_norm 3.2632 (3.2632) [2022-10-01 14:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][100/1251] eta 0:06:05 lr 0.000016 time 0.2947 (0.3176) loss 3.4657 (2.9650) grad_norm 3.1113 (3.1509) [2022-10-01 14:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][200/1251] eta 0:05:19 lr 0.000016 time 0.2892 (0.3041) loss 2.6372 (2.9506) grad_norm 3.1778 (3.1809) [2022-10-01 14:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][300/1251] eta 0:04:45 lr 0.000016 time 0.2880 (0.2997) loss 2.3003 (2.9390) grad_norm 3.0565 (3.1804) [2022-10-01 14:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][400/1251] eta 0:04:13 lr 0.000016 time 0.2908 (0.2975) loss 2.9953 (2.9684) grad_norm 3.5813 (3.1926) [2022-10-01 14:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][500/1251] eta 0:03:42 lr 0.000016 time 0.2898 (0.2962) loss 2.6644 (2.9615) grad_norm 2.9805 (3.1825) [2022-10-01 14:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][600/1251] eta 0:03:12 lr 0.000016 time 0.2871 (0.2953) loss 3.5090 (2.9655) grad_norm 3.5662 (3.1806) [2022-10-01 14:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][700/1251] eta 0:02:42 lr 0.000016 time 0.2921 (0.2946) loss 3.4907 (2.9613) grad_norm 2.8605 (3.1836) [2022-10-01 14:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][800/1251] eta 0:02:12 lr 0.000016 time 0.2884 (0.2941) loss 3.2528 (2.9739) grad_norm 3.3472 (3.1888) [2022-10-01 14:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][900/1251] eta 0:01:43 lr 0.000016 time 0.2903 (0.2936) loss 3.2757 (2.9698) grad_norm 3.5400 (3.1981) [2022-10-01 14:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1000/1251] eta 0:01:13 lr 0.000015 time 0.2890 (0.2933) loss 2.7467 (2.9764) grad_norm 3.3682 (3.1975) [2022-10-01 14:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1100/1251] eta 0:00:44 lr 0.000015 time 0.2869 (0.2930) loss 2.0340 (2.9792) grad_norm 3.2886 (3.1929) [2022-10-01 14:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1200/1251] eta 0:00:14 lr 0.000015 time 0.2906 (0.2927) loss 3.3177 (2.9781) grad_norm 3.5572 (3.1920) [2022-10-01 14:36:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 285 training takes 0:06:06 [2022-10-01 14:36:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.267 (2.267) Loss 0.7947 (0.7947) Acc@1 79.883 (79.883) Acc@5 96.387 (96.387) [2022-10-01 14:36:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.014 Acc@5 95.470 [2022-10-01 14:36:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 14:36:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.01% [2022-10-01 14:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][0/1251] eta 1:04:52 lr 0.000015 time 3.1114 (3.1114) loss 3.3313 (3.3313) grad_norm 3.2104 (3.2104) [2022-10-01 14:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][100/1251] eta 0:06:07 lr 0.000015 time 0.2924 (0.3194) loss 2.1767 (3.0332) grad_norm 2.9262 (3.1670) [2022-10-01 14:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][200/1251] eta 0:05:20 lr 0.000015 time 0.2959 (0.3048) loss 3.3261 (3.0045) grad_norm 3.3838 (3.1747) [2022-10-01 14:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][300/1251] eta 0:04:45 lr 0.000015 time 0.2917 (0.2998) loss 3.3547 (3.0034) grad_norm 3.2015 (3.2026) [2022-10-01 14:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][400/1251] eta 0:04:13 lr 0.000015 time 0.2950 (0.2974) loss 3.2792 (2.9833) grad_norm 2.8147 (3.2045) [2022-10-01 14:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][500/1251] eta 0:03:42 lr 0.000015 time 0.3001 (0.2960) loss 3.2804 (2.9861) grad_norm 2.9327 (3.1907) [2022-10-01 14:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][600/1251] eta 0:03:12 lr 0.000015 time 0.2934 (0.2950) loss 3.0746 (2.9801) grad_norm 3.0201 (3.2033) [2022-10-01 14:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][700/1251] eta 0:02:42 lr 0.000015 time 0.2900 (0.2943) loss 3.3162 (2.9771) grad_norm 3.2770 (3.2119) [2022-10-01 14:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][800/1251] eta 0:02:12 lr 0.000015 time 0.2880 (0.2938) loss 2.3353 (2.9820) grad_norm 2.9929 (3.2126) [2022-10-01 14:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][900/1251] eta 0:01:42 lr 0.000015 time 0.2915 (0.2933) loss 3.2153 (2.9911) grad_norm 2.7217 (3.2178) [2022-10-01 14:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1000/1251] eta 0:01:13 lr 0.000015 time 0.2881 (0.2929) loss 2.7012 (2.9907) grad_norm 3.1535 (3.2201) [2022-10-01 14:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1100/1251] eta 0:00:44 lr 0.000015 time 0.2925 (0.2926) loss 2.2340 (2.9888) grad_norm 3.0569 (3.2174) [2022-10-01 14:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1200/1251] eta 0:00:14 lr 0.000015 time 0.2915 (0.2923) loss 3.1713 (2.9878) grad_norm 3.4234 (3.2087) [2022-10-01 14:42:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 286 training takes 0:06:05 [2022-10-01 14:42:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.203 (2.203) Loss 0.7665 (0.7665) Acc@1 82.227 (82.227) Acc@5 95.996 (95.996) [2022-10-01 14:42:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.984 Acc@5 95.520 [2022-10-01 14:42:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 14:42:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.01% [2022-10-01 14:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][0/1251] eta 1:09:07 lr 0.000015 time 3.3150 (3.3150) loss 2.6137 (2.6137) grad_norm 3.3613 (3.3613) [2022-10-01 14:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][100/1251] eta 0:06:10 lr 0.000015 time 0.2929 (0.3221) loss 3.5396 (2.9794) grad_norm 3.1648 (3.2577) [2022-10-01 14:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][200/1251] eta 0:05:22 lr 0.000014 time 0.2878 (0.3065) loss 3.2813 (2.9893) grad_norm 3.2791 (3.2535) [2022-10-01 14:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][300/1251] eta 0:04:46 lr 0.000014 time 0.2887 (0.3012) loss 3.1775 (2.9896) grad_norm 3.7712 (3.2896) [2022-10-01 14:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][400/1251] eta 0:04:13 lr 0.000014 time 0.2873 (0.2984) loss 3.5861 (3.0002) grad_norm 2.8217 (3.2592) [2022-10-01 14:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][500/1251] eta 0:03:42 lr 0.000014 time 0.2879 (0.2969) loss 3.2055 (3.0046) grad_norm 4.0630 (3.2507) [2022-10-01 14:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][600/1251] eta 0:03:12 lr 0.000014 time 0.2852 (0.2958) loss 3.1512 (3.0027) grad_norm 3.5074 (3.2381) [2022-10-01 14:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][700/1251] eta 0:02:42 lr 0.000014 time 0.2912 (0.2951) loss 3.3591 (3.0065) grad_norm 3.2100 (3.2303) [2022-10-01 14:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][800/1251] eta 0:02:12 lr 0.000014 time 0.2896 (0.2945) loss 3.0326 (3.0121) grad_norm 3.9686 (3.2224) [2022-10-01 14:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][900/1251] eta 0:01:43 lr 0.000014 time 0.2880 (0.2940) loss 2.8913 (3.0045) grad_norm 3.4011 (3.2228) [2022-10-01 14:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1000/1251] eta 0:01:13 lr 0.000014 time 0.2882 (0.2935) loss 3.3241 (2.9956) grad_norm 3.1804 (3.2194) [2022-10-01 14:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1100/1251] eta 0:00:44 lr 0.000014 time 0.2888 (0.2932) loss 3.2953 (2.9890) grad_norm 3.1044 (3.2192) [2022-10-01 14:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1200/1251] eta 0:00:14 lr 0.000014 time 0.2865 (0.2929) loss 2.9366 (2.9938) grad_norm 3.2154 (3.2173) [2022-10-01 14:48:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 287 training takes 0:06:06 [2022-10-01 14:49:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.718 (2.718) Loss 0.8362 (0.8362) Acc@1 81.543 (81.543) Acc@5 95.020 (95.020) [2022-10-01 14:49:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.026 Acc@5 95.534 [2022-10-01 14:49:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 14:49:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.03% [2022-10-01 14:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][0/1251] eta 1:01:41 lr 0.000014 time 2.9589 (2.9589) loss 2.9322 (2.9322) grad_norm 2.9569 (2.9569) [2022-10-01 14:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][100/1251] eta 0:06:08 lr 0.000014 time 0.2946 (0.3201) loss 2.6523 (2.9801) grad_norm 3.1780 (3.3671) [2022-10-01 14:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][200/1251] eta 0:05:22 lr 0.000014 time 0.2938 (0.3064) loss 2.9305 (2.9685) grad_norm 3.3683 (3.3135) [2022-10-01 14:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][300/1251] eta 0:04:46 lr 0.000014 time 0.2920 (0.3017) loss 2.7261 (2.9671) grad_norm 3.6559 (3.2769) [2022-10-01 14:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][400/1251] eta 0:04:14 lr 0.000014 time 0.2898 (0.2992) loss 3.3576 (2.9687) grad_norm 3.3814 (3.2727) [2022-10-01 14:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][500/1251] eta 0:03:43 lr 0.000014 time 0.2908 (0.2977) loss 2.2289 (2.9762) grad_norm 3.1555 (3.2792) [2022-10-01 14:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][600/1251] eta 0:03:13 lr 0.000014 time 0.2878 (0.2966) loss 3.1453 (2.9814) grad_norm 3.2543 (3.2697) [2022-10-01 14:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][700/1251] eta 0:02:42 lr 0.000014 time 0.2931 (0.2958) loss 3.4496 (2.9789) grad_norm 2.8657 (3.2588) [2022-10-01 14:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][800/1251] eta 0:02:13 lr 0.000013 time 0.2895 (0.2951) loss 2.0233 (2.9792) grad_norm 3.1503 (3.2612) [2022-10-01 14:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][900/1251] eta 0:01:43 lr 0.000013 time 0.2908 (0.2946) loss 3.2359 (2.9833) grad_norm 3.0626 (3.2615) [2022-10-01 14:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1000/1251] eta 0:01:13 lr 0.000013 time 0.2915 (0.2942) loss 2.6010 (2.9773) grad_norm 2.9448 (3.2570) [2022-10-01 14:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1100/1251] eta 0:00:44 lr 0.000013 time 0.2947 (0.2939) loss 3.2210 (2.9775) grad_norm 3.2602 (3.2652) [2022-10-01 14:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1200/1251] eta 0:00:14 lr 0.000013 time 0.2879 (0.2936) loss 2.8707 (2.9755) grad_norm 3.7545 (3.2686) [2022-10-01 14:55:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 288 training takes 0:06:07 [2022-10-01 14:55:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.141 (3.141) Loss 0.8005 (0.8005) Acc@1 82.520 (82.520) Acc@5 95.508 (95.508) [2022-10-01 14:55:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.034 Acc@5 95.562 [2022-10-01 14:55:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 14:55:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.03% [2022-10-01 14:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][0/1251] eta 1:03:48 lr 0.000013 time 3.0606 (3.0606) loss 2.8540 (2.8540) grad_norm 2.7264 (2.7264) [2022-10-01 14:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][100/1251] eta 0:06:06 lr 0.000013 time 0.2868 (0.3187) loss 2.4069 (2.8885) grad_norm 2.9248 (3.1859) [2022-10-01 14:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][200/1251] eta 0:05:19 lr 0.000013 time 0.2945 (0.3041) loss 1.9209 (2.9017) grad_norm 3.9890 (3.2508) [2022-10-01 14:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][300/1251] eta 0:04:44 lr 0.000013 time 0.2885 (0.2992) loss 3.2505 (2.9418) grad_norm 2.9125 (3.2628) [2022-10-01 14:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][400/1251] eta 0:04:12 lr 0.000013 time 0.2891 (0.2967) loss 3.3114 (2.9439) grad_norm 3.7622 (3.2716) [2022-10-01 14:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][500/1251] eta 0:03:41 lr 0.000013 time 0.2888 (0.2951) loss 3.3147 (2.9431) grad_norm 3.4304 (3.2649) [2022-10-01 14:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][600/1251] eta 0:03:11 lr 0.000013 time 0.2866 (0.2941) loss 3.0262 (2.9518) grad_norm 3.8362 (3.2566) [2022-10-01 14:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][700/1251] eta 0:02:41 lr 0.000013 time 0.2876 (0.2935) loss 3.7889 (2.9520) grad_norm 3.2576 (3.2532) [2022-10-01 14:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][800/1251] eta 0:02:12 lr 0.000013 time 0.2883 (0.2929) loss 3.2138 (2.9542) grad_norm 3.0245 (3.2555) [2022-10-01 14:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][900/1251] eta 0:01:42 lr 0.000013 time 0.2911 (0.2925) loss 2.3798 (2.9597) grad_norm 2.7894 (3.2518) [2022-10-01 15:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1000/1251] eta 0:01:13 lr 0.000013 time 0.2846 (0.2922) loss 2.9754 (2.9626) grad_norm 3.7392 (3.2544) [2022-10-01 15:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1100/1251] eta 0:00:44 lr 0.000013 time 0.2916 (0.2919) loss 2.8102 (2.9652) grad_norm 2.8956 (3.2601) [2022-10-01 15:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1200/1251] eta 0:00:14 lr 0.000013 time 0.2902 (0.2917) loss 3.5406 (2.9670) grad_norm 3.4136 (3.2574) [2022-10-01 15:01:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 289 training takes 0:06:05 [2022-10-01 15:01:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.268 (3.268) Loss 0.8444 (0.8444) Acc@1 80.762 (80.762) Acc@5 95.020 (95.020) [2022-10-01 15:01:49 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.022 Acc@5 95.546 [2022-10-01 15:01:49 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:01:49 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.03% [2022-10-01 15:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][0/1251] eta 0:53:21 lr 0.000013 time 2.5592 (2.5592) loss 2.8339 (2.8339) grad_norm 3.0667 (3.0667) [2022-10-01 15:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][100/1251] eta 0:06:07 lr 0.000013 time 0.2869 (0.3197) loss 2.2671 (2.9450) grad_norm 3.3133 (3.2027) [2022-10-01 15:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][200/1251] eta 0:05:20 lr 0.000013 time 0.2895 (0.3049) loss 1.8514 (2.9691) grad_norm 2.8139 (3.2477) [2022-10-01 15:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][300/1251] eta 0:04:45 lr 0.000013 time 0.2853 (0.2998) loss 2.6669 (2.9561) grad_norm 3.2460 (3.2503) [2022-10-01 15:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][400/1251] eta 0:04:13 lr 0.000013 time 0.2942 (0.2974) loss 2.0724 (2.9714) grad_norm 2.8077 (3.2665) [2022-10-01 15:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][500/1251] eta 0:03:42 lr 0.000012 time 0.2854 (0.2958) loss 3.0131 (2.9798) grad_norm 3.8792 (3.2623) [2022-10-01 15:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][600/1251] eta 0:03:11 lr 0.000012 time 0.2891 (0.2946) loss 3.4165 (2.9766) grad_norm 3.5703 (3.2549) [2022-10-01 15:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][700/1251] eta 0:02:41 lr 0.000012 time 0.2864 (0.2939) loss 2.2897 (2.9694) grad_norm 3.2038 (3.2545) [2022-10-01 15:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][800/1251] eta 0:02:12 lr 0.000012 time 0.2880 (0.2933) loss 3.1261 (2.9634) grad_norm 2.6931 (3.2485) [2022-10-01 15:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][900/1251] eta 0:01:42 lr 0.000012 time 0.2843 (0.2929) loss 3.6064 (2.9654) grad_norm 3.0458 (3.2569) [2022-10-01 15:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1000/1251] eta 0:01:13 lr 0.000012 time 0.2889 (0.2926) loss 3.1945 (2.9685) grad_norm 3.1511 (3.2489) [2022-10-01 15:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1100/1251] eta 0:00:44 lr 0.000012 time 0.2911 (0.2924) loss 2.3078 (2.9637) grad_norm 2.9674 (3.2460) [2022-10-01 15:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1200/1251] eta 0:00:14 lr 0.000012 time 0.2898 (0.2921) loss 2.6604 (2.9610) grad_norm 3.2015 (3.2415) [2022-10-01 15:07:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 290 training takes 0:06:05 [2022-10-01 15:07:55 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saving...... [2022-10-01 15:07:55 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saved !!! [2022-10-01 15:07:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.893 (2.893) Loss 0.7734 (0.7734) Acc@1 83.691 (83.691) Acc@5 95.117 (95.117) [2022-10-01 15:08:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.992 Acc@5 95.550 [2022-10-01 15:08:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:08:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.03% [2022-10-01 15:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][0/1251] eta 1:08:28 lr 0.000012 time 3.2842 (3.2842) loss 3.0507 (3.0507) grad_norm 3.1906 (3.1906) [2022-10-01 15:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][100/1251] eta 0:06:12 lr 0.000012 time 0.2949 (0.3234) loss 2.6097 (2.9728) grad_norm 3.0840 (3.2415) [2022-10-01 15:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][200/1251] eta 0:05:23 lr 0.000012 time 0.2888 (0.3083) loss 2.9569 (2.9862) grad_norm 2.7810 (3.2749) [2022-10-01 15:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][300/1251] eta 0:04:48 lr 0.000012 time 0.2892 (0.3029) loss 2.2268 (3.0035) grad_norm 3.3258 (3.2711) [2022-10-01 15:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][400/1251] eta 0:04:15 lr 0.000012 time 0.2861 (0.3002) loss 3.0488 (2.9838) grad_norm 3.1736 (3.2778) [2022-10-01 15:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][500/1251] eta 0:03:44 lr 0.000012 time 0.2945 (0.2986) loss 3.3909 (2.9823) grad_norm 2.5875 (3.2675) [2022-10-01 15:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][600/1251] eta 0:03:13 lr 0.000012 time 0.2913 (0.2976) loss 2.2058 (2.9717) grad_norm 3.2418 (3.2701) [2022-10-01 15:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][700/1251] eta 0:02:43 lr 0.000012 time 0.2906 (0.2968) loss 3.7246 (2.9760) grad_norm 2.9321 (3.2634) [2022-10-01 15:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][800/1251] eta 0:02:13 lr 0.000012 time 0.2877 (0.2960) loss 2.5087 (2.9772) grad_norm 3.2894 (3.2583) [2022-10-01 15:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][900/1251] eta 0:01:43 lr 0.000012 time 0.2892 (0.2953) loss 3.0505 (2.9816) grad_norm 3.3449 (3.2586) [2022-10-01 15:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1000/1251] eta 0:01:13 lr 0.000012 time 0.2883 (0.2947) loss 3.2042 (2.9871) grad_norm 3.1031 (3.2551) [2022-10-01 15:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1100/1251] eta 0:00:44 lr 0.000012 time 0.2943 (0.2943) loss 2.4291 (2.9926) grad_norm 3.3476 (3.2572) [2022-10-01 15:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1200/1251] eta 0:00:14 lr 0.000012 time 0.2893 (0.2938) loss 3.3726 (2.9915) grad_norm 3.4517 (3.2549) [2022-10-01 15:14:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 291 training takes 0:06:07 [2022-10-01 15:14:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.182 (2.182) Loss 0.8219 (0.8219) Acc@1 81.641 (81.641) Acc@5 95.898 (95.898) [2022-10-01 15:14:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.062 Acc@5 95.552 [2022-10-01 15:14:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-01 15:14:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.06% [2022-10-01 15:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][0/1251] eta 1:00:40 lr 0.000012 time 2.9103 (2.9103) loss 3.2057 (3.2057) grad_norm 3.1351 (3.1351) [2022-10-01 15:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][100/1251] eta 0:06:08 lr 0.000012 time 0.2920 (0.3200) loss 3.2962 (3.0149) grad_norm 3.1193 (3.2585) [2022-10-01 15:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][200/1251] eta 0:05:20 lr 0.000012 time 0.2877 (0.3051) loss 3.1626 (3.0487) grad_norm 4.0944 (3.2497) [2022-10-01 15:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][300/1251] eta 0:04:45 lr 0.000012 time 0.2870 (0.3000) loss 3.0765 (3.0405) grad_norm 2.8843 (3.2431) [2022-10-01 15:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][400/1251] eta 0:04:12 lr 0.000012 time 0.2871 (0.2973) loss 3.3928 (3.0211) grad_norm 4.3340 (3.2537) [2022-10-01 15:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][500/1251] eta 0:03:42 lr 0.000012 time 0.2871 (0.2956) loss 3.1772 (3.0285) grad_norm 3.6187 (3.2582) [2022-10-01 15:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][600/1251] eta 0:03:11 lr 0.000012 time 0.2875 (0.2946) loss 3.0889 (3.0240) grad_norm 3.2745 (3.2585) [2022-10-01 15:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][700/1251] eta 0:02:41 lr 0.000012 time 0.2873 (0.2937) loss 3.2663 (3.0027) grad_norm 3.1276 (3.2624) [2022-10-01 15:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][800/1251] eta 0:02:12 lr 0.000011 time 0.2843 (0.2929) loss 2.6544 (2.9876) grad_norm 3.3329 (3.2618) [2022-10-01 15:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2846 (0.2923) loss 3.2698 (2.9849) grad_norm 3.2223 (3.2758) [2022-10-01 15:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2881 (0.2919) loss 3.5638 (2.9745) grad_norm 3.0492 (3.2756) [2022-10-01 15:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1100/1251] eta 0:00:44 lr 0.000011 time 0.2850 (0.2915) loss 3.1974 (2.9676) grad_norm 2.9703 (3.2770) [2022-10-01 15:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2899 (0.2912) loss 2.7341 (2.9654) grad_norm 3.2977 (3.2703) [2022-10-01 15:20:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 292 training takes 0:06:04 [2022-10-01 15:20:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.282 (2.282) Loss 0.7901 (0.7901) Acc@1 81.250 (81.250) Acc@5 95.410 (95.410) [2022-10-01 15:20:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.994 Acc@5 95.520 [2022-10-01 15:20:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:20:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.06% [2022-10-01 15:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][0/1251] eta 1:04:19 lr 0.000011 time 3.0848 (3.0848) loss 3.0510 (3.0510) grad_norm 3.1052 (3.1052) [2022-10-01 15:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][100/1251] eta 0:06:07 lr 0.000011 time 0.2939 (0.3189) loss 3.7569 (2.9850) grad_norm 3.5445 (3.3651) [2022-10-01 15:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][200/1251] eta 0:05:19 lr 0.000011 time 0.2868 (0.3044) loss 3.3467 (2.9457) grad_norm 2.8441 (3.2767) [2022-10-01 15:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][300/1251] eta 0:04:44 lr 0.000011 time 0.2922 (0.2995) loss 3.5939 (2.9527) grad_norm 3.8148 (3.2819) [2022-10-01 15:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][400/1251] eta 0:04:12 lr 0.000011 time 0.2846 (0.2970) loss 3.3149 (2.9739) grad_norm 6.9372 (3.3081) [2022-10-01 15:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][500/1251] eta 0:03:41 lr 0.000011 time 0.2944 (0.2955) loss 3.0572 (2.9780) grad_norm 2.6182 (3.3016) [2022-10-01 15:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][600/1251] eta 0:03:11 lr 0.000011 time 0.2877 (0.2945) loss 3.2751 (2.9744) grad_norm 3.6676 (3.3012) [2022-10-01 15:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][700/1251] eta 0:02:41 lr 0.000011 time 0.2920 (0.2937) loss 1.9173 (2.9635) grad_norm 3.4053 (3.2971) [2022-10-01 15:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][800/1251] eta 0:02:12 lr 0.000011 time 0.2873 (0.2932) loss 1.8269 (2.9585) grad_norm 3.9710 (3.2880) [2022-10-01 15:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2931 (0.2928) loss 3.3330 (2.9559) grad_norm 4.3847 (3.2863) [2022-10-01 15:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2824 (0.2923) loss 1.9910 (2.9578) grad_norm 3.2351 (3.2888) [2022-10-01 15:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1100/1251] eta 0:00:44 lr 0.000011 time 0.2949 (0.2920) loss 3.2045 (2.9512) grad_norm 3.0141 (3.2915) [2022-10-01 15:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2911 (0.2918) loss 3.3772 (2.9458) grad_norm 3.0228 (3.2885) [2022-10-01 15:26:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 293 training takes 0:06:05 [2022-10-01 15:26:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.387 (3.387) Loss 0.7268 (0.7268) Acc@1 83.984 (83.984) Acc@5 96.973 (96.973) [2022-10-01 15:27:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.048 Acc@5 95.522 [2022-10-01 15:27:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:27:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.06% [2022-10-01 15:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][0/1251] eta 1:08:55 lr 0.000011 time 3.3060 (3.3060) loss 3.2618 (3.2618) grad_norm 3.5657 (3.5657) [2022-10-01 15:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][100/1251] eta 0:06:07 lr 0.000011 time 0.2888 (0.3189) loss 2.6442 (2.9947) grad_norm 3.3159 (3.2535) [2022-10-01 15:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][200/1251] eta 0:05:18 lr 0.000011 time 0.2851 (0.3034) loss 2.6161 (2.9989) grad_norm 5.5096 (3.2786) [2022-10-01 15:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][300/1251] eta 0:04:43 lr 0.000011 time 0.2905 (0.2982) loss 1.8137 (2.9599) grad_norm 3.0679 (3.2867) [2022-10-01 15:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][400/1251] eta 0:04:11 lr 0.000011 time 0.2871 (0.2956) loss 3.1506 (2.9690) grad_norm 2.9440 (3.2901) [2022-10-01 15:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][500/1251] eta 0:03:40 lr 0.000011 time 0.2877 (0.2941) loss 2.3042 (2.9548) grad_norm 3.2636 (3.2955) [2022-10-01 15:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][600/1251] eta 0:03:10 lr 0.000011 time 0.2855 (0.2931) loss 3.2892 (2.9554) grad_norm 3.2791 (3.2896) [2022-10-01 15:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][700/1251] eta 0:02:41 lr 0.000011 time 0.2904 (0.2923) loss 3.3066 (2.9598) grad_norm 3.1690 (3.2815) [2022-10-01 15:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][800/1251] eta 0:02:11 lr 0.000011 time 0.2899 (0.2918) loss 2.4696 (2.9571) grad_norm 2.9688 (3.2839) [2022-10-01 15:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2867 (0.2913) loss 2.4927 (2.9601) grad_norm 2.6419 (3.2803) [2022-10-01 15:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2893 (0.2910) loss 1.8573 (2.9604) grad_norm 3.3833 (3.2820) [2022-10-01 15:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1100/1251] eta 0:00:43 lr 0.000011 time 0.2845 (0.2906) loss 3.1990 (2.9591) grad_norm 2.8731 (3.2826) [2022-10-01 15:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2885 (0.2904) loss 3.4715 (2.9607) grad_norm 3.2394 (3.2897) [2022-10-01 15:33:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 294 training takes 0:06:03 [2022-10-01 15:33:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.823 (2.823) Loss 0.8163 (0.8163) Acc@1 81.738 (81.738) Acc@5 94.922 (94.922) [2022-10-01 15:33:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.062 Acc@5 95.556 [2022-10-01 15:33:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-01 15:33:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.06% [2022-10-01 15:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][0/1251] eta 1:03:21 lr 0.000011 time 3.0388 (3.0388) loss 2.0255 (2.0255) grad_norm 3.2308 (3.2308) [2022-10-01 15:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][100/1251] eta 0:06:07 lr 0.000011 time 0.2904 (0.3189) loss 3.0940 (2.9618) grad_norm 3.5212 (3.3155) [2022-10-01 15:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][200/1251] eta 0:05:19 lr 0.000011 time 0.2923 (0.3044) loss 3.3744 (2.9819) grad_norm 3.8867 (3.2688) [2022-10-01 15:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][300/1251] eta 0:04:44 lr 0.000011 time 0.2865 (0.2992) loss 2.7273 (2.9677) grad_norm 2.8788 (3.2645) [2022-10-01 15:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][400/1251] eta 0:04:12 lr 0.000011 time 0.2874 (0.2967) loss 1.9366 (2.9587) grad_norm 3.8305 (3.2861) [2022-10-01 15:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][500/1251] eta 0:03:41 lr 0.000011 time 0.2916 (0.2950) loss 3.1842 (2.9596) grad_norm 2.7481 (3.2838) [2022-10-01 15:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][600/1251] eta 0:03:11 lr 0.000011 time 0.2855 (0.2939) loss 2.3867 (2.9668) grad_norm 3.4310 (3.2825) [2022-10-01 15:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][700/1251] eta 0:02:41 lr 0.000011 time 0.2866 (0.2931) loss 3.3423 (2.9646) grad_norm 3.3489 (3.2797) [2022-10-01 15:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][800/1251] eta 0:02:11 lr 0.000011 time 0.2864 (0.2925) loss 3.4034 (2.9595) grad_norm 3.1764 (3.2953) [2022-10-01 15:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2852 (0.2920) loss 3.1503 (2.9584) grad_norm 3.0437 (3.2962) [2022-10-01 15:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2864 (0.2916) loss 3.4242 (2.9619) grad_norm 3.1104 (3.3119) [2022-10-01 15:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1100/1251] eta 0:00:43 lr 0.000010 time 0.2889 (0.2913) loss 3.6492 (2.9646) grad_norm 3.1674 (3.3105) [2022-10-01 15:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2868 (0.2910) loss 2.5193 (2.9632) grad_norm 2.8630 (3.3077) [2022-10-01 15:39:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 295 training takes 0:06:04 [2022-10-01 15:39:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.366 (2.366) Loss 0.7879 (0.7879) Acc@1 82.031 (82.031) Acc@5 95.605 (95.605) [2022-10-01 15:39:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.068 Acc@5 95.486 [2022-10-01 15:39:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-01 15:39:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-01 15:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][0/1251] eta 1:09:39 lr 0.000010 time 3.3412 (3.3412) loss 2.8155 (2.8155) grad_norm 3.7560 (3.7560) [2022-10-01 15:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][100/1251] eta 0:06:08 lr 0.000010 time 0.2911 (0.3205) loss 2.3347 (2.9720) grad_norm 3.1021 (3.3492) [2022-10-01 15:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][200/1251] eta 0:05:20 lr 0.000010 time 0.2884 (0.3049) loss 2.3975 (2.9480) grad_norm 3.3181 (3.2904) [2022-10-01 15:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][300/1251] eta 0:04:45 lr 0.000010 time 0.2858 (0.2998) loss 3.3329 (2.9801) grad_norm 3.2790 (3.2626) [2022-10-01 15:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2877 (0.2973) loss 3.1167 (2.9666) grad_norm 3.0290 (3.2668) [2022-10-01 15:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][500/1251] eta 0:03:42 lr 0.000010 time 0.2909 (0.2957) loss 2.4840 (2.9738) grad_norm 2.9000 (3.2909) [2022-10-01 15:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2883 (0.2947) loss 2.6477 (2.9651) grad_norm 3.6827 (3.3005) [2022-10-01 15:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2913 (0.2939) loss 2.5330 (2.9666) grad_norm 3.2542 (3.3046) [2022-10-01 15:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2888 (0.2934) loss 3.0112 (2.9648) grad_norm 2.7627 (3.3082) [2022-10-01 15:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2895 (0.2929) loss 3.2787 (2.9558) grad_norm 3.2836 (3.3046) [2022-10-01 15:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2885 (0.2925) loss 3.1800 (2.9599) grad_norm 2.7993 (3.2965) [2022-10-01 15:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2893 (0.2923) loss 1.8626 (2.9604) grad_norm 2.9198 (3.2978) [2022-10-01 15:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2895 (0.2921) loss 2.3591 (2.9570) grad_norm 3.4527 (3.2959) [2022-10-01 15:45:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 296 training takes 0:06:05 [2022-10-01 15:45:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.385 (2.385) Loss 0.7956 (0.7956) Acc@1 81.055 (81.055) Acc@5 95.703 (95.703) [2022-10-01 15:45:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.996 Acc@5 95.546 [2022-10-01 15:45:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:45:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-01 15:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][0/1251] eta 0:58:34 lr 0.000010 time 2.8097 (2.8097) loss 2.9426 (2.9426) grad_norm 2.8656 (2.8656) [2022-10-01 15:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][100/1251] eta 0:06:07 lr 0.000010 time 0.2875 (0.3191) loss 2.9766 (3.0704) grad_norm 3.1293 (3.3278) [2022-10-01 15:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][200/1251] eta 0:05:20 lr 0.000010 time 0.2888 (0.3046) loss 3.0452 (3.0204) grad_norm 4.5486 (3.3064) [2022-10-01 15:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][300/1251] eta 0:04:44 lr 0.000010 time 0.2885 (0.2994) loss 3.0990 (2.9913) grad_norm 3.2037 (3.3354) [2022-10-01 15:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2955 (0.2968) loss 2.6301 (2.9853) grad_norm 2.9925 (3.3335) [2022-10-01 15:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2890 (0.2953) loss 2.9675 (2.9836) grad_norm 3.1452 (3.3339) [2022-10-01 15:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2856 (0.2942) loss 3.2271 (2.9899) grad_norm 3.4259 (3.3365) [2022-10-01 15:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2874 (0.2935) loss 2.2207 (2.9799) grad_norm 3.3352 (3.3351) [2022-10-01 15:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2865 (0.2929) loss 2.2276 (2.9765) grad_norm 4.0042 (3.3417) [2022-10-01 15:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2889 (0.2925) loss 2.6045 (2.9608) grad_norm 3.2197 (3.3565) [2022-10-01 15:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2853 (0.2921) loss 2.3060 (2.9636) grad_norm 2.7791 (3.3583) [2022-10-01 15:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2914 (0.2918) loss 3.0975 (2.9650) grad_norm 3.1171 (3.3594) [2022-10-01 15:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2886 (0.2916) loss 3.0134 (2.9612) grad_norm 3.4108 (3.3526) [2022-10-01 15:52:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 297 training takes 0:06:05 [2022-10-01 15:52:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.396 (2.396) Loss 0.8320 (0.8320) Acc@1 82.129 (82.129) Acc@5 95.117 (95.117) [2022-10-01 15:52:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.010 Acc@5 95.556 [2022-10-01 15:52:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:52:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-01 15:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][0/1251] eta 1:07:00 lr 0.000010 time 3.2137 (3.2137) loss 2.7996 (2.7996) grad_norm 3.1403 (3.1403) [2022-10-01 15:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][100/1251] eta 0:06:08 lr 0.000010 time 0.2876 (0.3200) loss 3.1574 (2.9575) grad_norm 2.9811 (3.2682) [2022-10-01 15:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][200/1251] eta 0:05:20 lr 0.000010 time 0.2873 (0.3050) loss 3.4129 (2.9823) grad_norm 3.5650 (3.3142) [2022-10-01 15:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][300/1251] eta 0:04:45 lr 0.000010 time 0.2919 (0.2998) loss 3.0127 (2.9804) grad_norm 3.6150 (3.3170) [2022-10-01 15:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2873 (0.2972) loss 3.3191 (2.9978) grad_norm 2.8171 (3.3491) [2022-10-01 15:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][500/1251] eta 0:03:42 lr 0.000010 time 0.2869 (0.2957) loss 2.8407 (2.9743) grad_norm 3.1971 (3.3433) [2022-10-01 15:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2873 (0.2947) loss 2.5046 (2.9678) grad_norm 3.2874 (3.3471) [2022-10-01 15:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2867 (0.2939) loss 2.5882 (2.9484) grad_norm 3.3704 (3.3408) [2022-10-01 15:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2914 (0.2933) loss 3.6013 (2.9568) grad_norm 3.4592 (3.3458) [2022-10-01 15:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2860 (0.2928) loss 3.6214 (2.9460) grad_norm 3.0041 (3.3333) [2022-10-01 15:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2882 (0.2924) loss 2.7166 (2.9425) grad_norm 3.5700 (3.3394) [2022-10-01 15:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2870 (0.2920) loss 3.4361 (2.9464) grad_norm 3.5161 (3.3385) [2022-10-01 15:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2846 (0.2917) loss 3.4858 (2.9478) grad_norm 3.2249 (3.3448) [2022-10-01 15:58:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 298 training takes 0:06:05 [2022-10-01 15:58:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.832 (2.832) Loss 0.7576 (0.7576) Acc@1 82.324 (82.324) Acc@5 95.996 (95.996) [2022-10-01 15:58:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.960 Acc@5 95.550 [2022-10-01 15:58:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-01 15:58:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-01 15:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][0/1251] eta 1:04:31 lr 0.000010 time 3.0950 (3.0950) loss 3.3257 (3.3257) grad_norm 3.4375 (3.4375) [2022-10-01 15:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][100/1251] eta 0:06:05 lr 0.000010 time 0.2906 (0.3176) loss 2.4520 (2.9940) grad_norm 3.2615 (3.3211) [2022-10-01 15:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][200/1251] eta 0:05:18 lr 0.000010 time 0.2947 (0.3033) loss 1.7595 (2.9882) grad_norm 2.7691 (3.3705) [2022-10-01 16:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][300/1251] eta 0:04:43 lr 0.000010 time 0.2905 (0.2984) loss 2.9106 (2.9545) grad_norm 3.0246 (3.3428) [2022-10-01 16:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][400/1251] eta 0:04:11 lr 0.000010 time 0.2924 (0.2959) loss 3.4004 (2.9671) grad_norm 3.0054 (3.3539) [2022-10-01 16:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2918 (0.2945) loss 2.6017 (2.9652) grad_norm 4.1258 (3.3648) [2022-10-01 16:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2884 (0.2935) loss 3.2410 (2.9728) grad_norm 4.3307 (3.3616) [2022-10-01 16:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2901 (0.2927) loss 3.3550 (2.9696) grad_norm 3.1547 (3.3840) [2022-10-01 16:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][800/1251] eta 0:02:11 lr 0.000010 time 0.2919 (0.2921) loss 3.4016 (2.9620) grad_norm 3.9053 (3.3724) [2022-10-01 16:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2872 (0.2917) loss 3.2202 (2.9641) grad_norm 2.8476 (3.3651) [2022-10-01 16:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2929 (0.2914) loss 2.8381 (2.9564) grad_norm 3.4671 (3.3581) [2022-10-01 16:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1100/1251] eta 0:00:43 lr 0.000010 time 0.2898 (0.2911) loss 3.3686 (2.9569) grad_norm 3.1269 (3.3571) [2022-10-01 16:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2867 (0.2909) loss 1.9605 (2.9550) grad_norm 3.0746 (3.3571) [2022-10-01 16:04:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 299 training takes 0:06:04 [2022-10-01 16:04:37 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saving...... [2022-10-01 16:04:37 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saved !!! [2022-10-01 16:04:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.308 (2.308) Loss 0.8178 (0.8178) Acc@1 81.348 (81.348) Acc@5 95.117 (95.117) [2022-10-01 16:04:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.056 Acc@5 95.584 [2022-10-01 16:04:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-01 16:04:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-01 16:04:50 swin_tiny_patch4_window7_224] (main.py 139): INFO Training time 2 days, 7:06:36