[2022-10-07 10:01:21 swin_tiny_patch4_window7_224] (main.py 318): INFO Full config saved to output/swin_tiny_patch4_window7_224/fix_eager_global/config.json
[2022-10-07 10:01:21 swin_tiny_patch4_window7_224] (main.py 321): INFO AMP_OPT_LEVEL: ''
AUG:
  AUTO_AUGMENT: rand-m9-mstd0.5-inc1
  COLOR_JITTER: 0.4
  CUTMIX: 1.0
  CUTMIX_MINMAX: null
  MIXUP: 0.8
  MIXUP_MODE: batch
  MIXUP_PROB: 1.0
  MIXUP_SWITCH_PROB: 0.5
  RECOUNT: 1
  REMODE: pixel
  REPROB: 0.25
BASE:
- ''
DATA:
  BATCH_SIZE: 128
  CACHE_MODE: part
  DATASET: imagenet
  DATA_PATH: /data/ImageNet/extract/
  IMG_SIZE: 224
  INTERPOLATION: bicubic
  NUM_WORKERS: 8
  PIN_MEMORY: true
  ZIP_MODE: false
EVAL_MODE: false
LOCAL_RANK: 0
MODEL:
  DROP_PATH_RATE: 0.2
  DROP_RATE: 0.0
  LABEL_SMOOTHING: 0.1
  NAME: swin_tiny_patch4_window7_224
  NUM_CLASSES: 1000
  PRETRAINED: ''
  RESUME: ''
  SWIN:
    APE: false
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    EMBED_DIM: 96
    IN_CHANS: 3
    MLP_RATIO: 4.0
    NUM_HEADS:
    - 3
    - 6
    - 12
    - 24
    PATCH_NORM: true
    PATCH_SIZE: 4
    QKV_BIAS: true
    QK_SCALE: null
    WINDOW_SIZE: 7
  SWIN_MLP:
    APE: false
    DEPTHS:
    - 2
    - 2
    - 6
    - 2
    EMBED_DIM: 96
    IN_CHANS: 3
    MLP_RATIO: 4.0
    NUM_HEADS:
    - 3
    - 6
    - 12
    - 24
    PATCH_NORM: true
    PATCH_SIZE: 4
    WINDOW_SIZE: 7
  TYPE: swin
OUTPUT: output/swin_tiny_patch4_window7_224/fix_eager_global
PRINT_FREQ: 100
SAVE_FREQ: 10
SEED: 0
TAG: fix_eager_global
TEST:
  CROP: true
  SEQUENTIAL: false
THROUGHPUT_MODE: false
TRAIN:
  ACCUMULATION_STEPS: 0
  AUTO_RESUME: false
  BASE_LR: 0.001
  CLIP_GRAD: 5.0
  EPOCHS: 300
  LR_SCHEDULER:
    DECAY_EPOCHS: 30
    DECAY_RATE: 0.1
    NAME: cosine
  MIN_LR: 1.0e-05
  OPTIMIZER:
    BETAS:
    - 0.9
    - 0.999
    EPS: 1.0e-08
    MOMENTUM: 0.9
    NAME: adamw
  START_EPOCH: 0
  USE_CHECKPOINT: false
  WARMUP_EPOCHS: 20
  WARMUP_LR: 1.0e-06
  WEIGHT_DECAY: 0.05

[2022-10-07 10:01:25 swin_tiny_patch4_window7_224] (main.py 70): INFO Creating model:swin/swin_tiny_patch4_window7_224
[2022-10-07 10:01:27 swin_tiny_patch4_window7_224] (main.py 73): INFO SwinTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((96,), eps=1e-05,
elementwise_affine=True) ) (pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=384, out_features=192, bias=False) (norm): LayerNorm((384,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, 
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=768, out_features=384, bias=False) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) 
(softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, 
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=1536, out_features=768, bias=False) (norm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) 
(softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d() (head): Linear(in_features=768, out_features=1000, bias=True) ) [2022-10-07 10:01:27 swin_tiny_patch4_window7_224] (main.py 84): INFO number of params: 28288354 [2022-10-07 10:01:27 swin_tiny_patch4_window7_224] (main.py 109): INFO Start training [2022-10-07 10:01:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][0/1251] eta 2:29:15 lr 0.000001 time 7.1586 (7.1586) loss 6.9832 (6.9832) grad_norm 1.3617 (1.3617) [2022-10-07 10:02:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][100/1251] eta 0:07:26 lr 0.000005 time 0.3224 (0.3876) loss 6.9400 (6.9522) grad_norm 1.2151 (1.3023) [2022-10-07 10:02:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][200/1251] eta 0:06:12 lr 0.000009 time 0.3204 (0.3545) loss 6.9226 (6.9362) grad_norm 1.1262 (1.2459) [2022-10-07 10:03:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][300/1251] eta 0:05:27 lr 0.000013 time 0.3253 (0.3440) loss 6.8952 (6.9243) grad_norm 1.0138 (1.1927) [2022-10-07 10:03:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][400/1251] eta 0:04:48 lr 0.000017 time 0.3215 (0.3389) loss 6.8698 (6.9142) grad_norm 0.9574 (1.1460) [2022-10-07 10:04:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][500/1251] eta 0:04:12 lr 0.000021 time 0.3283 (0.3360) loss 6.8582 (6.9053) grad_norm 1.0518 (1.1077) [2022-10-07 10:04:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][600/1251] eta 0:03:37 lr 0.000025 time 0.3223 (0.3340) loss 6.8548 (6.8970) grad_norm 0.9201 (1.0775) [2022-10-07 10:05:20 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][700/1251] eta 0:03:03 lr 0.000029 time 0.3278 (0.3327) loss 6.8351 (6.8894) grad_norm 0.9100 (1.0616) [2022-10-07 10:05:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][800/1251] eta 0:02:29 lr 0.000033 time 0.3218 (0.3316) loss 6.8414 (6.8819) grad_norm 1.3560 (1.0640) [2022-10-07 10:06:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][900/1251] eta 0:01:56 lr 0.000037 time 0.3229 (0.3308) loss 6.8010 (6.8737) grad_norm 1.3721 (1.0883) [2022-10-07 10:06:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][1000/1251] eta 0:01:22 lr 0.000041 time 0.3314 (0.3303) loss 6.8146 (6.8649) grad_norm 1.0293 (1.1276) [2022-10-07 10:07:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][1100/1251] eta 0:00:49 lr 0.000045 time 0.3231 (0.3298) loss 6.7800 (6.8556) grad_norm 1.6808 (1.1691) [2022-10-07 10:08:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [0/300][1200/1251] eta 0:00:16 lr 0.000049 time 0.3258 (0.3294) loss 6.7308 (6.8460) grad_norm 1.9332 (1.2164) [2022-10-07 10:08:19 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 0 training takes 0:06:52 [2022-10-07 10:08:19 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_0 saving...... [2022-10-07 10:08:20 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_0 saved !!! 
[2022-10-07 10:08:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.079 (3.079) Loss 6.3582 (6.3582) Acc@1 1.758 (1.758) Acc@5 5.957 (5.957) [2022-10-07 10:08:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 1.904 Acc@5 6.502 [2022-10-07 10:08:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 1.9% [2022-10-07 10:08:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 1.90% [2022-10-07 10:08:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][0/1251] eta 1:02:33 lr 0.000051 time 3.0004 (3.0004) loss 6.7168 (6.7168) grad_norm 1.4851 (1.4851) [2022-10-07 10:09:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][100/1251] eta 0:06:44 lr 0.000055 time 0.3254 (0.3515) loss 6.6744 (6.6904) grad_norm 2.6794 (2.1273) [2022-10-07 10:09:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][200/1251] eta 0:05:55 lr 0.000059 time 0.3221 (0.3382) loss 6.6421 (6.6775) grad_norm 1.8430 (2.1659) [2022-10-07 10:10:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][300/1251] eta 0:05:17 lr 0.000063 time 0.3276 (0.3337) loss 6.5515 (6.6625) grad_norm 1.9576 (2.1566) [2022-10-07 10:10:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][400/1251] eta 0:04:42 lr 0.000067 time 0.3215 (0.3316) loss 6.5776 (6.6470) grad_norm 2.8561 (2.1925) [2022-10-07 10:11:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][500/1251] eta 0:04:08 lr 0.000071 time 0.3335 (0.3302) loss 6.5389 (6.6337) grad_norm 2.6000 (2.2288) [2022-10-07 10:11:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][600/1251] eta 0:03:34 lr 0.000075 time 0.3229 (0.3293) loss 6.6080 (6.6214) grad_norm 2.7482 (2.2340) [2022-10-07 10:12:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][700/1251] eta 0:03:01 lr 0.000079 time 0.3300 (0.3286) loss 6.4938 (6.6072) grad_norm 2.3460 (2.2466) [2022-10-07 
10:12:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][800/1251] eta 0:02:27 lr 0.000083 time 0.3217 (0.3281) loss 6.5310 (6.5956) grad_norm 2.7637 (2.2457) [2022-10-07 10:13:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][900/1251] eta 0:01:55 lr 0.000087 time 0.3253 (0.3277) loss 6.4643 (6.5839) grad_norm 1.7503 (2.2508) [2022-10-07 10:14:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][1000/1251] eta 0:01:22 lr 0.000091 time 0.3281 (0.3274) loss 6.5278 (6.5732) grad_norm 2.3248 (2.2590) [2022-10-07 10:14:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][1100/1251] eta 0:00:49 lr 0.000095 time 0.3222 (0.3272) loss 6.4786 (6.5619) grad_norm 1.7483 (2.2542) [2022-10-07 10:15:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [1/300][1200/1251] eta 0:00:16 lr 0.000099 time 0.3250 (0.3271) loss 6.3956 (6.5521) grad_norm 1.9632 (2.2572) [2022-10-07 10:15:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 1 training takes 0:06:49 [2022-10-07 10:15:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.929 (2.929) Loss 5.5744 (5.5744) Acc@1 5.664 (5.664) Acc@5 19.336 (19.336) [2022-10-07 10:15:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 6.362 Acc@5 18.132 [2022-10-07 10:15:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 6.4% [2022-10-07 10:15:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 6.36% [2022-10-07 10:15:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][0/1251] eta 1:05:27 lr 0.000101 time 3.1394 (3.1394) loss 6.4262 (6.4262) grad_norm 1.7844 (1.7844) [2022-10-07 10:16:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][100/1251] eta 0:06:48 lr 0.000105 time 0.3233 (0.3550) loss 6.2551 (6.3930) grad_norm 1.8419 (2.2044) [2022-10-07 10:16:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][200/1251] eta 0:05:58 lr 
0.000109 time 0.3231 (0.3408) loss 6.4443 (6.3967) grad_norm 2.1858 (2.2410) [2022-10-07 10:17:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][300/1251] eta 0:05:19 lr 0.000113 time 0.3217 (0.3363) loss 6.3323 (6.3833) grad_norm 3.1756 (2.2151) [2022-10-07 10:17:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][400/1251] eta 0:04:44 lr 0.000117 time 0.3238 (0.3339) loss 6.3316 (6.3746) grad_norm 1.7199 (2.2357) [2022-10-07 10:18:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][500/1251] eta 0:04:09 lr 0.000121 time 0.3230 (0.3324) loss 6.3635 (6.3651) grad_norm 2.1889 (2.2480) [2022-10-07 10:18:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][600/1251] eta 0:03:35 lr 0.000125 time 0.3268 (0.3314) loss 6.2854 (6.3564) grad_norm 2.4838 (2.2471) [2022-10-07 10:19:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][700/1251] eta 0:03:02 lr 0.000129 time 0.3210 (0.3306) loss 6.3091 (6.3477) grad_norm 2.2912 (2.2653) [2022-10-07 10:20:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][800/1251] eta 0:02:28 lr 0.000133 time 0.3221 (0.3300) loss 6.2916 (6.3380) grad_norm 2.1223 (2.2572) [2022-10-07 10:20:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][900/1251] eta 0:01:55 lr 0.000137 time 0.3169 (0.3295) loss 6.2778 (6.3280) grad_norm 2.1426 (2.2543) [2022-10-07 10:21:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][1000/1251] eta 0:01:22 lr 0.000141 time 0.3219 (0.3290) loss 6.1563 (6.3174) grad_norm 2.0839 (2.2588) [2022-10-07 10:21:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][1100/1251] eta 0:00:49 lr 0.000145 time 0.3191 (0.3287) loss 6.2468 (6.3083) grad_norm 2.2672 (2.2627) [2022-10-07 10:22:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [2/300][1200/1251] eta 0:00:16 lr 0.000149 time 0.3217 (0.3285) loss 6.1109 (6.2987) grad_norm 1.9469 (2.2631) [2022-10-07 10:22:27 
swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 2 training takes 0:06:50 [2022-10-07 10:22:30 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.690 (2.690) Loss 4.8471 (4.8471) Acc@1 13.086 (13.086) Acc@5 31.445 (31.445) [2022-10-07 10:22:41 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 12.468 Acc@5 29.888 [2022-10-07 10:22:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 12.5% [2022-10-07 10:22:41 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 12.47% [2022-10-07 10:22:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][0/1251] eta 1:01:15 lr 0.000151 time 2.9384 (2.9384) loss 6.1837 (6.1837) grad_norm 2.0715 (2.0715) [2022-10-07 10:23:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][100/1251] eta 0:06:43 lr 0.000155 time 0.3230 (0.3502) loss 6.2320 (6.1854) grad_norm 1.9999 (2.2791) [2022-10-07 10:23:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][200/1251] eta 0:05:54 lr 0.000159 time 0.3194 (0.3376) loss 6.2833 (6.1599) grad_norm 1.9501 (2.2916) [2022-10-07 10:24:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][300/1251] eta 0:05:17 lr 0.000163 time 0.3220 (0.3334) loss 6.0785 (6.1470) grad_norm 1.8067 (2.2956) [2022-10-07 10:24:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][400/1251] eta 0:04:41 lr 0.000167 time 0.3196 (0.3313) loss 6.0997 (6.1408) grad_norm 2.1528 (2.3474) [2022-10-07 10:25:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][500/1251] eta 0:04:07 lr 0.000171 time 0.3252 (0.3301) loss 6.1141 (6.1318) grad_norm 3.8657 (2.3366) [2022-10-07 10:25:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][600/1251] eta 0:03:34 lr 0.000175 time 0.3271 (0.3292) loss 6.0421 (6.1231) grad_norm 2.0626 (2.3417) [2022-10-07 10:26:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][700/1251] eta 0:03:01 lr 
0.000179 time 0.3248 (0.3286) loss 6.0806 (6.1168) grad_norm 2.8202 (2.3370) [2022-10-07 10:27:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][800/1251] eta 0:02:27 lr 0.000183 time 0.3216 (0.3281) loss 6.0371 (6.1093) grad_norm 3.0617 (2.3445) [2022-10-07 10:27:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][900/1251] eta 0:01:55 lr 0.000187 time 0.3243 (0.3277) loss 6.0746 (6.1014) grad_norm 1.6441 (2.3436) [2022-10-07 10:28:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][1000/1251] eta 0:01:22 lr 0.000191 time 0.3235 (0.3274) loss 6.0238 (6.0910) grad_norm 2.5628 (2.3419) [2022-10-07 10:28:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][1100/1251] eta 0:00:49 lr 0.000195 time 0.3199 (0.3271) loss 6.1837 (6.0824) grad_norm 2.2536 (2.3442) [2022-10-07 10:29:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [3/300][1200/1251] eta 0:00:16 lr 0.000199 time 0.3243 (0.3268) loss 6.0085 (6.0733) grad_norm 3.7469 (2.3521) [2022-10-07 10:29:30 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 3 training takes 0:06:48 [2022-10-07 10:29:33 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.272 (3.272) Loss 4.3972 (4.3972) Acc@1 18.555 (18.555) Acc@5 38.477 (38.477) [2022-10-07 10:29:44 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 18.930 Acc@5 39.954 [2022-10-07 10:29:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 18.9% [2022-10-07 10:29:44 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 18.93% [2022-10-07 10:29:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][0/1251] eta 0:47:45 lr 0.000201 time 2.2902 (2.2902) loss 5.8277 (5.8277) grad_norm 2.3256 (2.3256) [2022-10-07 10:30:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][100/1251] eta 0:06:40 lr 0.000205 time 0.3273 (0.3478) loss 6.1056 (5.9493) grad_norm 1.8274 (2.2767) [2022-10-07 
10:30:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][200/1251] eta 0:05:54 lr 0.000209 time 0.3246 (0.3371) loss 6.1012 (5.9462) grad_norm 1.9963 (2.3582) [2022-10-07 10:31:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][300/1251] eta 0:05:17 lr 0.000213 time 0.3234 (0.3334) loss 5.8794 (5.9435) grad_norm 2.0309 (2.3698) [2022-10-07 10:31:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][400/1251] eta 0:04:41 lr 0.000217 time 0.3282 (0.3314) loss 5.8471 (5.9318) grad_norm 2.4115 (2.3712) [2022-10-07 10:32:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][500/1251] eta 0:04:07 lr 0.000221 time 0.3312 (0.3301) loss 5.9506 (5.9279) grad_norm 2.4916 (2.3927) [2022-10-07 10:33:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][600/1251] eta 0:03:34 lr 0.000225 time 0.3270 (0.3294) loss 5.6620 (5.9198) grad_norm 2.0167 (2.3893) [2022-10-07 10:33:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][700/1251] eta 0:03:01 lr 0.000229 time 0.3261 (0.3288) loss 5.8936 (5.9126) grad_norm 2.0469 (2.3876) [2022-10-07 10:34:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][800/1251] eta 0:02:28 lr 0.000233 time 0.3280 (0.3284) loss 5.7065 (5.9025) grad_norm 2.6662 (2.4133) [2022-10-07 10:34:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][900/1251] eta 0:01:55 lr 0.000237 time 0.3293 (0.3281) loss 5.8721 (5.8971) grad_norm 2.3170 (2.3966) [2022-10-07 10:35:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][1000/1251] eta 0:01:22 lr 0.000241 time 0.3237 (0.3278) loss 5.9763 (5.8900) grad_norm 2.1314 (2.3897) [2022-10-07 10:35:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][1100/1251] eta 0:00:49 lr 0.000245 time 0.3256 (0.3276) loss 5.5899 (5.8806) grad_norm 2.6283 (2.3891) [2022-10-07 10:36:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [4/300][1200/1251] eta 0:00:16 lr 0.000249 time 
0.3315 (0.3274) loss 5.7510 (5.8719) grad_norm 1.9455 (2.3931) [2022-10-07 10:36:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 4 training takes 0:06:49 [2022-10-07 10:36:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.879 (2.879) Loss 3.8595 (3.8595) Acc@1 25.195 (25.195) Acc@5 48.438 (48.438) [2022-10-07 10:36:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 24.494 Acc@5 47.770 [2022-10-07 10:36:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 24.5% [2022-10-07 10:36:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 24.49% [2022-10-07 10:36:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][0/1251] eta 1:03:54 lr 0.000251 time 3.0654 (3.0654) loss 5.9557 (5.9557) grad_norm 2.2363 (2.2363) [2022-10-07 10:37:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][100/1251] eta 0:06:44 lr 0.000255 time 0.3224 (0.3514) loss 5.6731 (5.7707) grad_norm 2.4791 (2.4138) [2022-10-07 10:37:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][200/1251] eta 0:05:54 lr 0.000259 time 0.3243 (0.3377) loss 5.6560 (5.7687) grad_norm 2.7045 (2.3841) [2022-10-07 10:38:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][300/1251] eta 0:05:16 lr 0.000263 time 0.3269 (0.3330) loss 5.8625 (5.7559) grad_norm 1.9861 (2.3732) [2022-10-07 10:39:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][400/1251] eta 0:04:41 lr 0.000267 time 0.3232 (0.3308) loss 5.9953 (5.7484) grad_norm 2.3954 (2.3744) [2022-10-07 10:39:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][500/1251] eta 0:04:07 lr 0.000271 time 0.3209 (0.3294) loss 5.8030 (5.7380) grad_norm 2.2737 (2.3814) [2022-10-07 10:40:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][600/1251] eta 0:03:33 lr 0.000275 time 0.3315 (0.3285) loss 5.5900 (5.7297) grad_norm 2.4594 (2.3828) [2022-10-07 10:40:37 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][700/1251] eta 0:03:00 lr 0.000279 time 0.3195 (0.3278) loss 5.5885 (5.7245) grad_norm 2.4247 (2.3868) [2022-10-07 10:41:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][800/1251] eta 0:02:27 lr 0.000283 time 0.3220 (0.3274) loss 5.8770 (5.7149) grad_norm 2.0817 (2.3770) [2022-10-07 10:41:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][900/1251] eta 0:01:54 lr 0.000287 time 0.3250 (0.3270) loss 5.5334 (5.7100) grad_norm 1.9992 (2.3597) [2022-10-07 10:42:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][1000/1251] eta 0:01:22 lr 0.000291 time 0.3256 (0.3267) loss 5.6661 (5.7055) grad_norm 1.6976 (2.3540) [2022-10-07 10:42:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][1100/1251] eta 0:00:49 lr 0.000295 time 0.3191 (0.3265) loss 5.5226 (5.6987) grad_norm 2.2845 (2.3557) [2022-10-07 10:43:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [5/300][1200/1251] eta 0:00:16 lr 0.000299 time 0.3280 (0.3263) loss 5.7323 (5.6929) grad_norm 2.6192 (2.3486) [2022-10-07 10:43:35 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 5 training takes 0:06:48 [2022-10-07 10:43:38 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.207 (3.207) Loss 3.5429 (3.5429) Acc@1 30.078 (30.078) Acc@5 53.809 (53.809) [2022-10-07 10:43:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 29.758 Acc@5 54.416 [2022-10-07 10:43:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 29.8% [2022-10-07 10:43:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 29.76% [2022-10-07 10:43:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][0/1251] eta 0:54:53 lr 0.000301 time 2.6325 (2.6325) loss 5.6057 (5.6057) grad_norm 2.6836 (2.6836) [2022-10-07 10:44:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][100/1251] eta 0:06:42 lr 
0.000305 time 0.3260 (0.3497) loss 5.6699 (5.5931) grad_norm 2.1797 (2.2976) [2022-10-07 10:44:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][200/1251] eta 0:05:54 lr 0.000309 time 0.3252 (0.3377) loss 5.6064 (5.5891) grad_norm 2.1315 (2.2663) [2022-10-07 10:45:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][300/1251] eta 0:05:17 lr 0.000313 time 0.3242 (0.3340) loss 5.6425 (5.5908) grad_norm 2.2688 (2.2753) [2022-10-07 10:46:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][400/1251] eta 0:04:42 lr 0.000317 time 0.3296 (0.3318) loss 5.5259 (5.5904) grad_norm 1.9396 (2.2896) [2022-10-07 10:46:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][500/1251] eta 0:04:08 lr 0.000321 time 0.3232 (0.3305) loss 5.6282 (5.5828) grad_norm 2.3707 (2.2813) [2022-10-07 10:47:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][600/1251] eta 0:03:34 lr 0.000325 time 0.3266 (0.3298) loss 5.7082 (5.5805) grad_norm 1.7369 (2.2833) [2022-10-07 10:47:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][700/1251] eta 0:03:01 lr 0.000329 time 0.3237 (0.3291) loss 5.4530 (5.5737) grad_norm 3.6842 (2.2932) [2022-10-07 10:48:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][800/1251] eta 0:02:28 lr 0.000333 time 0.3286 (0.3286) loss 5.2208 (5.5662) grad_norm 2.6189 (2.2882) [2022-10-07 10:48:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][900/1251] eta 0:01:55 lr 0.000337 time 0.3197 (0.3282) loss 5.5931 (5.5615) grad_norm 1.6285 (2.2801) [2022-10-07 10:49:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][1000/1251] eta 0:01:22 lr 0.000341 time 0.3297 (0.3279) loss 5.4715 (5.5558) grad_norm 1.6382 (2.2752) [2022-10-07 10:49:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][1100/1251] eta 0:00:49 lr 0.000345 time 0.3285 (0.3276) loss 5.7033 (5.5505) grad_norm 2.3275 (2.2698) [2022-10-07 10:50:22 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [6/300][1200/1251] eta 0:00:16 lr 0.000349 time 0.3280 (0.3274) loss 5.5932 (5.5451) grad_norm 1.9083 (2.2682) [2022-10-07 10:50:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 6 training takes 0:06:49 [2022-10-07 10:50:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.287 (2.287) Loss 3.3293 (3.3293) Acc@1 31.445 (31.445) Acc@5 57.129 (57.129) [2022-10-07 10:50:52 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 33.474 Acc@5 59.066 [2022-10-07 10:50:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 33.5% [2022-10-07 10:50:52 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 33.47% [2022-10-07 10:50:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][0/1251] eta 1:04:57 lr 0.000351 time 3.1155 (3.1155) loss 5.5344 (5.5344) grad_norm 2.0784 (2.0784) [2022-10-07 10:51:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][100/1251] eta 0:06:47 lr 0.000355 time 0.3220 (0.3542) loss 5.1586 (5.4583) grad_norm 2.5664 (2.1532) [2022-10-07 10:52:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][200/1251] eta 0:05:57 lr 0.000359 time 0.3342 (0.3404) loss 5.7738 (5.4478) grad_norm 2.0966 (2.2194) [2022-10-07 10:52:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][300/1251] eta 0:05:18 lr 0.000363 time 0.3241 (0.3354) loss 5.4189 (5.4353) grad_norm 1.9735 (2.1837) [2022-10-07 10:53:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][400/1251] eta 0:04:43 lr 0.000367 time 0.3272 (0.3329) loss 5.5665 (5.4347) grad_norm 2.0427 (2.1942) [2022-10-07 10:53:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][500/1251] eta 0:04:08 lr 0.000371 time 0.3289 (0.3315) loss 5.4417 (5.4289) grad_norm 1.9527 (2.2012) [2022-10-07 10:54:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][600/1251] eta 0:03:35 lr 
0.000375 time 0.3211 (0.3305) loss 5.5416 (5.4201) grad_norm 1.6426 (2.2015) [2022-10-07 10:54:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][700/1251] eta 0:03:01 lr 0.000379 time 0.3255 (0.3298) loss 5.4293 (5.4188) grad_norm 1.9373 (2.1926) [2022-10-07 10:55:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][800/1251] eta 0:02:28 lr 0.000383 time 0.3311 (0.3294) loss 5.2700 (5.4111) grad_norm 1.8086 (2.1921) [2022-10-07 10:55:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][900/1251] eta 0:01:55 lr 0.000387 time 0.3300 (0.3290) loss 5.3412 (5.4071) grad_norm 1.9075 (2.1844) [2022-10-07 10:56:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][1000/1251] eta 0:01:22 lr 0.000391 time 0.3298 (0.3287) loss 5.7298 (5.4058) grad_norm 2.8437 (2.1743) [2022-10-07 10:56:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][1100/1251] eta 0:00:49 lr 0.000395 time 0.3312 (0.3286) loss 5.3137 (5.4011) grad_norm 1.7963 (2.1708) [2022-10-07 10:57:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [7/300][1200/1251] eta 0:00:16 lr 0.000399 time 0.3275 (0.3285) loss 5.5669 (5.3946) grad_norm 2.0466 (2.1680) [2022-10-07 10:57:43 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 7 training takes 0:06:51 [2022-10-07 10:57:46 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.783 (2.783) Loss 2.9771 (2.9771) Acc@1 37.109 (37.109) Acc@5 63.574 (63.574) [2022-10-07 10:57:57 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 38.304 Acc@5 64.128 [2022-10-07 10:57:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 38.3% [2022-10-07 10:57:57 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 38.30% [2022-10-07 10:58:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][0/1251] eta 0:59:54 lr 0.000401 time 2.8732 (2.8732) loss 5.3072 (5.3072) grad_norm 2.8050 (2.8050) [2022-10-07 
10:58:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][100/1251] eta 0:06:44 lr 0.000405 time 0.3252 (0.3516) loss 5.2533 (5.3070) grad_norm 1.7571 (2.1100) [2022-10-07 10:59:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][200/1251] eta 0:05:56 lr 0.000409 time 0.3269 (0.3393) loss 5.4169 (5.3048) grad_norm 1.8535 (2.0593) [2022-10-07 10:59:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][300/1251] eta 0:05:18 lr 0.000413 time 0.3262 (0.3351) loss 5.0127 (5.3064) grad_norm 2.3930 (2.0840) [2022-10-07 11:00:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][400/1251] eta 0:04:43 lr 0.000417 time 0.3244 (0.3329) loss 5.4079 (5.3060) grad_norm 1.8312 (2.0676) [2022-10-07 11:00:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][500/1251] eta 0:04:08 lr 0.000421 time 0.3242 (0.3315) loss 5.3334 (5.3046) grad_norm 1.3783 (2.0750) [2022-10-07 11:01:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][600/1251] eta 0:03:35 lr 0.000425 time 0.3262 (0.3306) loss 5.4151 (5.2996) grad_norm 2.0465 (2.0682) [2022-10-07 11:01:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][700/1251] eta 0:03:01 lr 0.000429 time 0.3252 (0.3299) loss 4.9434 (5.2969) grad_norm 1.8600 (2.0689) [2022-10-07 11:02:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][800/1251] eta 0:02:28 lr 0.000433 time 0.3297 (0.3294) loss 5.4765 (5.2927) grad_norm 1.9389 (2.0614) [2022-10-07 11:02:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][900/1251] eta 0:01:55 lr 0.000437 time 0.3260 (0.3290) loss 5.0254 (5.2912) grad_norm 2.0946 (2.0586) [2022-10-07 11:03:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][1000/1251] eta 0:01:22 lr 0.000441 time 0.3259 (0.3287) loss 5.3484 (5.2875) grad_norm 2.1592 (2.0533) [2022-10-07 11:03:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][1100/1251] eta 0:00:49 lr 0.000445 time 
0.3255 (0.3284) loss 5.3954 (5.2814) grad_norm 2.1662 (2.0449) [2022-10-07 11:04:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [8/300][1200/1251] eta 0:00:16 lr 0.000449 time 0.3242 (0.3282) loss 5.1363 (5.2746) grad_norm 2.0960 (2.0454) [2022-10-07 11:04:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 8 training takes 0:06:50 [2022-10-07 11:04:50 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.755 (2.755) Loss 2.9353 (2.9353) Acc@1 39.551 (39.551) Acc@5 63.574 (63.574) [2022-10-07 11:05:01 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 40.622 Acc@5 66.278 [2022-10-07 11:05:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 40.6% [2022-10-07 11:05:01 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 40.62% [2022-10-07 11:05:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][0/1251] eta 1:10:45 lr 0.000451 time 3.3937 (3.3937) loss 5.2184 (5.2184) grad_norm 1.6567 (1.6567) [2022-10-07 11:05:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][100/1251] eta 0:06:50 lr 0.000455 time 0.3253 (0.3567) loss 5.3794 (5.2343) grad_norm 2.1105 (1.9263) [2022-10-07 11:06:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][200/1251] eta 0:05:59 lr 0.000459 time 0.3306 (0.3416) loss 4.9292 (5.2337) grad_norm 1.7423 (1.9402) [2022-10-07 11:06:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][300/1251] eta 0:05:19 lr 0.000463 time 0.3233 (0.3363) loss 5.2855 (5.2315) grad_norm 1.8845 (1.9303) [2022-10-07 11:07:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][400/1251] eta 0:04:44 lr 0.000467 time 0.3293 (0.3337) loss 5.3025 (5.2222) grad_norm 1.9903 (1.9435) [2022-10-07 11:07:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][500/1251] eta 0:04:09 lr 0.000471 time 0.3228 (0.3322) loss 5.1542 (5.2150) grad_norm 2.2647 (1.9440) [2022-10-07 11:08:20 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][600/1251] eta 0:03:35 lr 0.000475 time 0.3226 (0.3311) loss 5.1446 (5.2126) grad_norm 3.0745 (1.9476) [2022-10-07 11:08:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][700/1251] eta 0:03:01 lr 0.000478 time 0.3263 (0.3302) loss 5.4352 (5.2113) grad_norm 1.8544 (1.9422) [2022-10-07 11:09:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][800/1251] eta 0:02:28 lr 0.000482 time 0.3269 (0.3295) loss 4.9123 (5.2028) grad_norm 1.9230 (1.9385) [2022-10-07 11:09:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][900/1251] eta 0:01:55 lr 0.000486 time 0.3251 (0.3289) loss 5.1386 (5.1943) grad_norm 1.6676 (1.9319) [2022-10-07 11:10:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][1000/1251] eta 0:01:22 lr 0.000490 time 0.3292 (0.3286) loss 5.4682 (5.1891) grad_norm 2.0500 (1.9348) [2022-10-07 11:11:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][1100/1251] eta 0:00:49 lr 0.000494 time 0.3216 (0.3282) loss 5.3050 (5.1836) grad_norm 1.5950 (1.9329) [2022-10-07 11:11:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [9/300][1200/1251] eta 0:00:16 lr 0.000498 time 0.3211 (0.3279) loss 5.1777 (5.1788) grad_norm 1.8372 (1.9268) [2022-10-07 11:11:51 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 9 training takes 0:06:50 [2022-10-07 11:11:54 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.443 (2.443) Loss 2.5866 (2.5866) Acc@1 44.824 (44.824) Acc@5 68.555 (68.555) [2022-10-07 11:12:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 44.476 Acc@5 70.144 [2022-10-07 11:12:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 44.5% [2022-10-07 11:12:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 44.48% [2022-10-07 11:12:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][0/1251] eta 1:04:47 lr 
0.000501 time 3.1079 (3.1079) loss 4.8456 (4.8456) grad_norm 1.7109 (1.7109) [2022-10-07 11:12:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][100/1251] eta 0:06:49 lr 0.000504 time 0.3295 (0.3558) loss 5.1480 (5.1095) grad_norm 1.8254 (1.8890) [2022-10-07 11:13:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][200/1251] eta 0:05:59 lr 0.000508 time 0.3247 (0.3420) loss 5.1758 (5.1072) grad_norm 1.9663 (1.8522) [2022-10-07 11:13:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][300/1251] eta 0:05:20 lr 0.000512 time 0.3261 (0.3375) loss 5.3111 (5.1084) grad_norm 1.9718 (1.8407) [2022-10-07 11:14:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][400/1251] eta 0:04:45 lr 0.000516 time 0.3241 (0.3351) loss 5.2008 (5.1010) grad_norm 1.4428 (1.8226) [2022-10-07 11:14:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][500/1251] eta 0:04:10 lr 0.000520 time 0.3233 (0.3337) loss 5.3924 (5.0971) grad_norm 1.6686 (1.8255) [2022-10-07 11:15:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][600/1251] eta 0:03:36 lr 0.000524 time 0.3226 (0.3327) loss 4.9286 (5.0913) grad_norm 1.8259 (1.8224) [2022-10-07 11:15:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][700/1251] eta 0:03:02 lr 0.000528 time 0.3263 (0.3319) loss 5.2736 (5.0893) grad_norm 1.3759 (1.8189) [2022-10-07 11:16:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][800/1251] eta 0:02:29 lr 0.000532 time 0.3244 (0.3312) loss 4.5974 (5.0882) grad_norm 2.4044 (1.8114) [2022-10-07 11:17:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][900/1251] eta 0:01:56 lr 0.000536 time 0.3272 (0.3308) loss 5.1387 (5.0872) grad_norm 1.8108 (1.8104) [2022-10-07 11:17:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][1000/1251] eta 0:01:22 lr 0.000540 time 0.3260 (0.3304) loss 5.0411 (5.0838) grad_norm 2.0897 (1.8034) [2022-10-07 11:18:08 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][1100/1251] eta 0:00:49 lr 0.000544 time 0.3219 (0.3300) loss 5.2482 (5.0778) grad_norm 1.6296 (1.8055) [2022-10-07 11:18:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [10/300][1200/1251] eta 0:00:16 lr 0.000548 time 0.3252 (0.3297) loss 4.9739 (5.0744) grad_norm 2.4214 (1.8011) [2022-10-07 11:18:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 10 training takes 0:06:52 [2022-10-07 11:18:57 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_10 saving...... [2022-10-07 11:18:58 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_10 saved !!! [2022-10-07 11:19:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.843 (2.843) Loss 2.3781 (2.3781) Acc@1 48.047 (48.047) Acc@5 75.391 (75.391) [2022-10-07 11:19:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 46.740 Acc@5 72.374 [2022-10-07 11:19:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 46.7% [2022-10-07 11:19:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 46.74% [2022-10-07 11:19:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][0/1251] eta 0:48:03 lr 0.000550 time 2.3047 (2.3047) loss 5.1149 (5.1149) grad_norm 1.9206 (1.9206) [2022-10-07 11:19:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][100/1251] eta 0:06:40 lr 0.000554 time 0.3241 (0.3476) loss 5.1646 (4.9839) grad_norm 2.1570 (1.7805) [2022-10-07 11:20:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][200/1251] eta 0:05:53 lr 0.000558 time 0.3221 (0.3362) loss 4.8090 (4.9876) grad_norm 1.8814 (1.7705) [2022-10-07 11:20:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][300/1251] eta 0:05:16 lr 0.000562 time 0.3251 (0.3326) loss 5.1305 (4.9795) grad_norm 1.8153 (1.7745) 
[2022-10-07 11:21:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][400/1251] eta 0:04:41 lr 0.000566 time 0.3256 (0.3307) loss 5.4240 (4.9927) grad_norm 1.4912 (1.7506) [2022-10-07 11:21:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][500/1251] eta 0:04:07 lr 0.000570 time 0.3266 (0.3296) loss 4.9136 (4.9942) grad_norm 1.7184 (1.7503) [2022-10-07 11:22:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][600/1251] eta 0:03:33 lr 0.000574 time 0.3240 (0.3287) loss 5.0077 (4.9888) grad_norm 2.0491 (1.7437) [2022-10-07 11:23:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][700/1251] eta 0:03:00 lr 0.000578 time 0.3281 (0.3281) loss 5.0026 (4.9901) grad_norm 1.4610 (1.7403) [2022-10-07 11:23:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][800/1251] eta 0:02:27 lr 0.000582 time 0.3208 (0.3277) loss 4.8948 (4.9857) grad_norm 1.4524 (1.7412) [2022-10-07 11:24:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][900/1251] eta 0:01:54 lr 0.000586 time 0.3263 (0.3274) loss 4.9787 (4.9843) grad_norm 1.4916 (1.7370) [2022-10-07 11:24:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][1000/1251] eta 0:01:22 lr 0.000590 time 0.3245 (0.3272) loss 4.9556 (4.9817) grad_norm 1.6932 (1.7299) [2022-10-07 11:25:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][1100/1251] eta 0:00:49 lr 0.000594 time 0.3322 (0.3271) loss 4.7699 (4.9783) grad_norm 1.7239 (1.7273) [2022-10-07 11:25:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [11/300][1200/1251] eta 0:00:16 lr 0.000598 time 0.3256 (0.3270) loss 5.0762 (4.9740) grad_norm 1.6572 (1.7229) [2022-10-07 11:26:00 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 11 training takes 0:06:49 [2022-10-07 11:26:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.645 (2.645) Loss 2.3723 (2.3723) Acc@1 48.730 (48.730) Acc@5 73.535 (73.535) [2022-10-07 11:26:14 
swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 49.010 Acc@5 74.512 [2022-10-07 11:26:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 49.0% [2022-10-07 11:26:14 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 49.01% [2022-10-07 11:26:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][0/1251] eta 0:57:47 lr 0.000600 time 2.7719 (2.7719) loss 4.8439 (4.8439) grad_norm 1.5971 (1.5971) [2022-10-07 11:26:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][100/1251] eta 0:06:41 lr 0.000604 time 0.3287 (0.3489) loss 4.8139 (4.9292) grad_norm 1.4516 (1.7212) [2022-10-07 11:27:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][200/1251] eta 0:05:53 lr 0.000608 time 0.3240 (0.3365) loss 5.1895 (4.9339) grad_norm 1.5938 (1.7033) [2022-10-07 11:27:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][300/1251] eta 0:05:16 lr 0.000612 time 0.3223 (0.3325) loss 4.8959 (4.9348) grad_norm 1.7512 (1.6914) [2022-10-07 11:28:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][400/1251] eta 0:04:41 lr 0.000616 time 0.3209 (0.3303) loss 4.8140 (4.9301) grad_norm 1.7169 (1.6831) [2022-10-07 11:28:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][500/1251] eta 0:04:07 lr 0.000620 time 0.3246 (0.3289) loss 5.1225 (4.9210) grad_norm 1.6690 (1.6809) [2022-10-07 11:29:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][600/1251] eta 0:03:33 lr 0.000624 time 0.3239 (0.3280) loss 5.0543 (4.9233) grad_norm 1.7651 (1.6697) [2022-10-07 11:30:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][700/1251] eta 0:03:00 lr 0.000628 time 0.3247 (0.3273) loss 4.5907 (4.9213) grad_norm 2.2576 (1.6573) [2022-10-07 11:30:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][800/1251] eta 0:02:27 lr 0.000632 time 0.3251 (0.3268) loss 4.6311 (4.9205) grad_norm 1.7123 (1.6559) 
[2022-10-07 11:31:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][900/1251] eta 0:01:54 lr 0.000636 time 0.3231 (0.3265) loss 4.7944 (4.9173) grad_norm 1.5520 (1.6531) [2022-10-07 11:31:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][1000/1251] eta 0:01:21 lr 0.000640 time 0.3245 (0.3262) loss 5.0282 (4.9166) grad_norm 1.5888 (1.6571) [2022-10-07 11:32:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][1100/1251] eta 0:00:49 lr 0.000644 time 0.3206 (0.3261) loss 4.6768 (4.9139) grad_norm 1.6614 (1.6546) [2022-10-07 11:32:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [12/300][1200/1251] eta 0:00:16 lr 0.000648 time 0.3248 (0.3260) loss 4.9130 (4.9112) grad_norm 1.5346 (1.6496) [2022-10-07 11:33:01 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 12 training takes 0:06:47 [2022-10-07 11:33:04 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.696 (2.696) Loss 2.3001 (2.3001) Acc@1 49.609 (49.609) Acc@5 75.391 (75.391) [2022-10-07 11:33:15 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 50.422 Acc@5 75.914 [2022-10-07 11:33:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 50.4% [2022-10-07 11:33:15 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 50.42% [2022-10-07 11:33:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][0/1251] eta 1:00:37 lr 0.000650 time 2.9081 (2.9081) loss 4.7860 (4.7860) grad_norm 1.5179 (1.5179) [2022-10-07 11:33:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][100/1251] eta 0:06:43 lr 0.000654 time 0.3340 (0.3508) loss 4.8297 (4.8384) grad_norm 1.7557 (1.5863) [2022-10-07 11:34:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][200/1251] eta 0:05:55 lr 0.000658 time 0.3250 (0.3384) loss 4.5908 (4.8391) grad_norm 1.8436 (1.5890) [2022-10-07 11:34:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[13/300][300/1251] eta 0:05:17 lr 0.000662 time 0.3297 (0.3341) loss 4.6914 (4.8460) grad_norm 1.7919 (1.5968) [2022-10-07 11:35:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][400/1251] eta 0:04:42 lr 0.000666 time 0.3253 (0.3316) loss 4.8497 (4.8503) grad_norm 1.5685 (1.5883) [2022-10-07 11:36:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][500/1251] eta 0:04:07 lr 0.000670 time 0.3243 (0.3301) loss 5.1118 (4.8494) grad_norm 1.6700 (1.5891) [2022-10-07 11:36:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][600/1251] eta 0:03:34 lr 0.000674 time 0.3205 (0.3291) loss 4.9533 (4.8470) grad_norm 1.3801 (1.5841) [2022-10-07 11:37:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][700/1251] eta 0:03:00 lr 0.000678 time 0.3227 (0.3284) loss 4.6887 (4.8439) grad_norm 1.8625 (1.5818) [2022-10-07 11:37:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][800/1251] eta 0:02:27 lr 0.000682 time 0.3288 (0.3279) loss 4.6194 (4.8409) grad_norm 1.8558 (1.5773) [2022-10-07 11:38:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][900/1251] eta 0:01:54 lr 0.000686 time 0.3235 (0.3275) loss 4.9271 (4.8379) grad_norm 1.2339 (1.5732) [2022-10-07 11:38:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][1000/1251] eta 0:01:22 lr 0.000690 time 0.3222 (0.3272) loss 4.8256 (4.8324) grad_norm 1.7236 (1.5731) [2022-10-07 11:39:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][1100/1251] eta 0:00:49 lr 0.000694 time 0.3263 (0.3269) loss 4.7867 (4.8290) grad_norm 1.2779 (1.5684) [2022-10-07 11:39:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [13/300][1200/1251] eta 0:00:16 lr 0.000698 time 0.3235 (0.3266) loss 4.7772 (4.8230) grad_norm 1.4065 (1.5668) [2022-10-07 11:40:04 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 13 training takes 0:06:48 [2022-10-07 11:40:07 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: 
[0/49] Time 3.166 (3.166) Loss 2.1875 (2.1875) Acc@1 52.930 (52.930) Acc@5 78.320 (78.320) [2022-10-07 11:40:17 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 52.456 Acc@5 77.420 [2022-10-07 11:40:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 52.5% [2022-10-07 11:40:17 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 52.46% [2022-10-07 11:40:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][0/1251] eta 1:04:22 lr 0.000700 time 3.0874 (3.0874) loss 4.6685 (4.6685) grad_norm 1.5440 (1.5440) [2022-10-07 11:40:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][100/1251] eta 0:06:46 lr 0.000704 time 0.3279 (0.3529) loss 4.3313 (4.7548) grad_norm 1.7886 (1.5999) [2022-10-07 11:41:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][200/1251] eta 0:05:57 lr 0.000708 time 0.3268 (0.3397) loss 4.7128 (4.7757) grad_norm 1.5151 (1.5711) [2022-10-07 11:41:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][300/1251] eta 0:05:18 lr 0.000712 time 0.3235 (0.3351) loss 4.6777 (4.7760) grad_norm 1.5031 (1.5610) [2022-10-07 11:42:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][400/1251] eta 0:04:43 lr 0.000716 time 0.3230 (0.3327) loss 4.9115 (4.7789) grad_norm 1.3290 (1.5522) [2022-10-07 11:43:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][500/1251] eta 0:04:08 lr 0.000720 time 0.3214 (0.3312) loss 4.9751 (4.7746) grad_norm 1.4314 (1.5506) [2022-10-07 11:43:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][600/1251] eta 0:03:34 lr 0.000724 time 0.3184 (0.3301) loss 5.0903 (4.7779) grad_norm 1.2897 (1.5403) [2022-10-07 11:44:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][700/1251] eta 0:03:01 lr 0.000728 time 0.3217 (0.3294) loss 4.7477 (4.7769) grad_norm 1.4229 (1.5391) [2022-10-07 11:44:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[14/300][800/1251] eta 0:02:28 lr 0.000732 time 0.3299 (0.3287) loss 4.8971 (4.7757) grad_norm 1.5672 (1.5336) [2022-10-07 11:45:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][900/1251] eta 0:01:55 lr 0.000736 time 0.3266 (0.3282) loss 4.3963 (4.7746) grad_norm 1.6219 (1.5335) [2022-10-07 11:45:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][1000/1251] eta 0:01:22 lr 0.000740 time 0.3227 (0.3277) loss 4.6639 (4.7730) grad_norm 1.4369 (1.5231) [2022-10-07 11:46:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][1100/1251] eta 0:00:49 lr 0.000744 time 0.3226 (0.3274) loss 4.8771 (4.7707) grad_norm 1.6845 (1.5263) [2022-10-07 11:46:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [14/300][1200/1251] eta 0:00:16 lr 0.000748 time 0.3219 (0.3271) loss 4.6216 (4.7674) grad_norm 1.3828 (1.5201) [2022-10-07 11:47:07 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 14 training takes 0:06:49 [2022-10-07 11:47:10 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.342 (3.342) Loss 2.0983 (2.0983) Acc@1 54.102 (54.102) Acc@5 77.930 (77.930) [2022-10-07 11:47:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 53.820 Acc@5 78.630 [2022-10-07 11:47:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 53.8% [2022-10-07 11:47:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 53.82% [2022-10-07 11:47:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][0/1251] eta 1:00:11 lr 0.000750 time 2.8873 (2.8873) loss 4.9802 (4.9802) grad_norm 1.6916 (1.6916) [2022-10-07 11:47:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][100/1251] eta 0:06:42 lr 0.000754 time 0.3216 (0.3495) loss 5.1154 (4.7173) grad_norm 1.5899 (1.4989) [2022-10-07 11:48:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][200/1251] eta 0:05:53 lr 0.000758 time 0.3232 (0.3368) loss 4.8505 (4.7068) 
grad_norm 2.0159 (1.5062) [2022-10-07 11:49:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][300/1251] eta 0:05:16 lr 0.000762 time 0.3241 (0.3327) loss 4.6840 (4.7178) grad_norm 1.5658 (1.5187) [2022-10-07 11:49:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][400/1251] eta 0:04:41 lr 0.000766 time 0.3226 (0.3306) loss 4.4471 (4.7197) grad_norm 1.5321 (1.5106) [2022-10-07 11:50:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][500/1251] eta 0:04:07 lr 0.000770 time 0.3226 (0.3293) loss 4.7449 (4.7241) grad_norm 1.3280 (1.5087) [2022-10-07 11:50:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][600/1251] eta 0:03:33 lr 0.000774 time 0.3247 (0.3285) loss 4.6950 (4.7291) grad_norm 1.4156 (1.4980) [2022-10-07 11:51:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][700/1251] eta 0:03:00 lr 0.000778 time 0.3249 (0.3278) loss 5.0828 (4.7220) grad_norm 1.2719 (1.4931) [2022-10-07 11:51:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][800/1251] eta 0:02:27 lr 0.000782 time 0.3261 (0.3273) loss 4.5497 (4.7167) grad_norm 1.7199 (1.4900) [2022-10-07 11:52:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][900/1251] eta 0:01:54 lr 0.000786 time 0.3222 (0.3270) loss 4.4727 (4.7134) grad_norm 1.2605 (1.4842) [2022-10-07 11:52:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][1000/1251] eta 0:01:22 lr 0.000790 time 0.3218 (0.3267) loss 4.8393 (4.7137) grad_norm 1.3507 (1.4815) [2022-10-07 11:53:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][1100/1251] eta 0:00:49 lr 0.000794 time 0.3236 (0.3265) loss 4.5972 (4.7123) grad_norm 1.5797 (1.4842) [2022-10-07 11:53:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [15/300][1200/1251] eta 0:00:16 lr 0.000798 time 0.3240 (0.3263) loss 4.7329 (4.7097) grad_norm 1.3714 (1.4805) [2022-10-07 11:54:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 15 
training takes 0:06:48 [2022-10-07 11:54:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.598 (2.598) Loss 1.9660 (1.9660) Acc@1 56.934 (56.934) Acc@5 81.055 (81.055) [2022-10-07 11:54:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 55.296 Acc@5 79.746 [2022-10-07 11:54:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 55.3% [2022-10-07 11:54:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 55.30% [2022-10-07 11:54:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][0/1251] eta 1:07:49 lr 0.000800 time 3.2531 (3.2531) loss 4.5971 (4.5971) grad_norm 1.2928 (1.2928) [2022-10-07 11:54:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][100/1251] eta 0:06:48 lr 0.000804 time 0.3233 (0.3546) loss 4.8328 (4.6791) grad_norm 1.2633 (1.4530) [2022-10-07 11:55:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][200/1251] eta 0:05:57 lr 0.000808 time 0.3234 (0.3404) loss 4.9526 (4.6577) grad_norm 1.2900 (1.4454) [2022-10-07 11:56:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][300/1251] eta 0:05:19 lr 0.000812 time 0.3312 (0.3357) loss 4.9791 (4.6571) grad_norm 1.3870 (1.4371) [2022-10-07 11:56:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][400/1251] eta 0:04:43 lr 0.000816 time 0.3233 (0.3333) loss 4.2513 (4.6601) grad_norm 1.7880 (1.4393) [2022-10-07 11:57:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][500/1251] eta 0:04:09 lr 0.000820 time 0.3305 (0.3319) loss 4.7145 (4.6674) grad_norm 1.1526 (1.4301) [2022-10-07 11:57:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][600/1251] eta 0:03:35 lr 0.000824 time 0.3212 (0.3309) loss 4.3938 (4.6641) grad_norm 1.8561 (1.4301) [2022-10-07 11:58:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][700/1251] eta 0:03:01 lr 0.000828 time 0.3290 (0.3301) loss 4.9257 (4.6678) 
grad_norm 1.4915 (1.4283) [2022-10-07 11:58:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][800/1251] eta 0:02:28 lr 0.000832 time 0.3254 (0.3296) loss 4.4597 (4.6656) grad_norm 1.5797 (1.4261) [2022-10-07 11:59:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][900/1251] eta 0:01:55 lr 0.000836 time 0.3260 (0.3291) loss 4.9008 (4.6647) grad_norm 1.2717 (1.4221) [2022-10-07 11:59:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][1000/1251] eta 0:01:22 lr 0.000840 time 0.3241 (0.3286) loss 4.8672 (4.6614) grad_norm 1.6743 (1.4262) [2022-10-07 12:00:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][1100/1251] eta 0:00:49 lr 0.000844 time 0.3278 (0.3283) loss 4.5823 (4.6592) grad_norm 1.6749 (1.4212) [2022-10-07 12:00:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [16/300][1200/1251] eta 0:00:16 lr 0.000848 time 0.3202 (0.3281) loss 4.8646 (4.6609) grad_norm 1.1915 (1.4229) [2022-10-07 12:01:13 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 16 training takes 0:06:50 [2022-10-07 12:01:16 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.303 (3.303) Loss 2.0143 (2.0143) Acc@1 56.055 (56.055) Acc@5 79.102 (79.102) [2022-10-07 12:01:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 56.624 Acc@5 80.728 [2022-10-07 12:01:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 56.6% [2022-10-07 12:01:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 56.62% [2022-10-07 12:01:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][0/1251] eta 0:56:32 lr 0.000850 time 2.7116 (2.7116) loss 4.7480 (4.7480) grad_norm 1.5099 (1.5099) [2022-10-07 12:02:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][100/1251] eta 0:06:44 lr 0.000854 time 0.3225 (0.3510) loss 4.4614 (4.6496) grad_norm 1.5471 (1.4072) [2022-10-07 12:02:34 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [17/300][200/1251] eta 0:05:55 lr 0.000858 time 0.3224 (0.3382) loss 4.5779 (4.6322) grad_norm 1.2633 (1.3849) [2022-10-07 12:03:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][300/1251] eta 0:05:17 lr 0.000862 time 0.3216 (0.3335) loss 4.6411 (4.6342) grad_norm 1.5045 (1.3781) [2022-10-07 12:03:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][400/1251] eta 0:04:41 lr 0.000866 time 0.3265 (0.3313) loss 4.6069 (4.6294) grad_norm 1.4910 (1.3916) [2022-10-07 12:04:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][500/1251] eta 0:04:07 lr 0.000870 time 0.3244 (0.3299) loss 4.6686 (4.6295) grad_norm 1.2523 (1.3982) [2022-10-07 12:04:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][600/1251] eta 0:03:34 lr 0.000874 time 0.3235 (0.3291) loss 4.2918 (4.6258) grad_norm 1.4388 (1.3927) [2022-10-07 12:05:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][700/1251] eta 0:03:00 lr 0.000878 time 0.3182 (0.3283) loss 4.4410 (4.6278) grad_norm 1.4635 (1.3901) [2022-10-07 12:05:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][800/1251] eta 0:02:27 lr 0.000882 time 0.3250 (0.3277) loss 4.5343 (4.6255) grad_norm 1.1800 (1.3873) [2022-10-07 12:06:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][900/1251] eta 0:01:54 lr 0.000886 time 0.3282 (0.3272) loss 4.9361 (4.6250) grad_norm 1.2627 (1.3827) [2022-10-07 12:06:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][1000/1251] eta 0:01:22 lr 0.000890 time 0.3250 (0.3268) loss 4.4598 (4.6240) grad_norm 1.4939 (1.3828) [2022-10-07 12:07:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][1100/1251] eta 0:00:49 lr 0.000894 time 0.3245 (0.3265) loss 4.5433 (4.6244) grad_norm 1.1185 (1.3824) [2022-10-07 12:07:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [17/300][1200/1251] eta 0:00:16 lr 0.000898 time 0.3229 (0.3263) loss 4.3929 (4.6242) 
grad_norm 1.4836 (1.3828) [2022-10-07 12:08:15 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 17 training takes 0:06:48 [2022-10-07 12:08:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.422 (2.422) Loss 1.9978 (1.9978) Acc@1 55.664 (55.664) Acc@5 82.812 (82.812) [2022-10-07 12:08:28 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 56.858 Acc@5 81.348 [2022-10-07 12:08:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 56.9% [2022-10-07 12:08:28 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 56.86% [2022-10-07 12:08:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][0/1251] eta 1:07:04 lr 0.000900 time 3.2169 (3.2169) loss 4.9908 (4.9908) grad_norm 1.2656 (1.2656) [2022-10-07 12:09:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][100/1251] eta 0:06:47 lr 0.000904 time 0.3261 (0.3542) loss 4.1245 (4.5635) grad_norm 1.3558 (1.3648) [2022-10-07 12:09:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][200/1251] eta 0:05:56 lr 0.000908 time 0.3269 (0.3396) loss 4.6953 (4.5692) grad_norm 1.3409 (1.3651) [2022-10-07 12:10:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][300/1251] eta 0:05:18 lr 0.000912 time 0.3299 (0.3349) loss 4.6595 (4.5784) grad_norm 1.1876 (1.3779) [2022-10-07 12:10:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][400/1251] eta 0:04:43 lr 0.000916 time 0.3219 (0.3326) loss 4.7530 (4.5762) grad_norm 1.7343 (1.3760) [2022-10-07 12:11:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][500/1251] eta 0:04:08 lr 0.000920 time 0.3221 (0.3310) loss 4.7207 (4.5781) grad_norm 1.5935 (1.3697) [2022-10-07 12:11:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][600/1251] eta 0:03:34 lr 0.000924 time 0.3284 (0.3300) loss 4.9697 (4.5833) grad_norm 1.3652 (1.3616) [2022-10-07 12:12:19 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [18/300][700/1251] eta 0:03:01 lr 0.000928 time 0.3223 (0.3291) loss 4.4894 (4.5822) grad_norm 1.8022 (1.3603) [2022-10-07 12:12:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][800/1251] eta 0:02:28 lr 0.000932 time 0.3191 (0.3285) loss 4.8544 (4.5851) grad_norm 1.2998 (1.3587) [2022-10-07 12:13:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][900/1251] eta 0:01:55 lr 0.000936 time 0.3202 (0.3280) loss 4.6867 (4.5775) grad_norm 1.5344 (1.3567) [2022-10-07 12:13:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][1000/1251] eta 0:01:22 lr 0.000940 time 0.3250 (0.3276) loss 4.6350 (4.5772) grad_norm 1.5613 (1.3528) [2022-10-07 12:14:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][1100/1251] eta 0:00:49 lr 0.000944 time 0.3228 (0.3273) loss 4.7999 (4.5790) grad_norm 1.2855 (1.3520) [2022-10-07 12:15:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [18/300][1200/1251] eta 0:00:16 lr 0.000948 time 0.3249 (0.3270) loss 4.5988 (4.5776) grad_norm 1.1841 (1.3505) [2022-10-07 12:15:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 18 training takes 0:06:49 [2022-10-07 12:15:20 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.084 (3.084) Loss 2.0045 (2.0045) Acc@1 56.836 (56.836) Acc@5 80.566 (80.566) [2022-10-07 12:15:31 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 57.726 Acc@5 81.718 [2022-10-07 12:15:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 57.7% [2022-10-07 12:15:31 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 57.73% [2022-10-07 12:15:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][0/1251] eta 0:56:54 lr 0.000950 time 2.7294 (2.7294) loss 4.2757 (4.2757) grad_norm 1.3301 (1.3301) [2022-10-07 12:16:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][100/1251] eta 0:06:43 lr 0.000954 time 0.3319 (0.3510) loss 
4.5821 (4.5563) grad_norm 1.6113 (1.3795) [2022-10-07 12:16:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][200/1251] eta 0:05:55 lr 0.000958 time 0.3281 (0.3386) loss 4.1789 (4.5546) grad_norm 1.3068 (1.3430) [2022-10-07 12:17:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][300/1251] eta 0:05:17 lr 0.000962 time 0.3269 (0.3343) loss 4.3628 (4.5514) grad_norm 1.2917 (1.3270) [2022-10-07 12:17:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][400/1251] eta 0:04:42 lr 0.000966 time 0.3204 (0.3323) loss 4.3771 (4.5560) grad_norm 1.2593 (1.3268) [2022-10-07 12:18:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][500/1251] eta 0:04:08 lr 0.000970 time 0.3251 (0.3310) loss 4.6925 (4.5535) grad_norm 1.4293 (1.3152) [2022-10-07 12:18:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][600/1251] eta 0:03:34 lr 0.000974 time 0.3257 (0.3302) loss 4.4707 (4.5547) grad_norm 1.0823 (1.3185) [2022-10-07 12:19:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][700/1251] eta 0:03:01 lr 0.000978 time 0.3278 (0.3296) loss 4.6153 (4.5515) grad_norm 1.4440 (1.3163) [2022-10-07 12:19:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][800/1251] eta 0:02:28 lr 0.000982 time 0.3260 (0.3292) loss 4.0003 (4.5477) grad_norm 1.6951 (1.3161) [2022-10-07 12:20:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][900/1251] eta 0:01:55 lr 0.000986 time 0.3274 (0.3289) loss 4.7458 (4.5480) grad_norm 1.1746 (1.3178) [2022-10-07 12:21:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][1000/1251] eta 0:01:22 lr 0.000990 time 0.3215 (0.3287) loss 4.6558 (4.5490) grad_norm 1.1005 (1.3152) [2022-10-07 12:21:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [19/300][1100/1251] eta 0:00:49 lr 0.000994 time 0.3231 (0.3285) loss 4.8891 (4.5461) grad_norm 1.1029 (1.3113) [2022-10-07 12:22:05 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [19/300][1200/1251] eta 0:00:16 lr 0.000998 time 0.3319 (0.3284) loss 4.5680 (4.5452) grad_norm 1.1470 (1.3091) [2022-10-07 12:22:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 19 training takes 0:06:51 [2022-10-07 12:22:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.672 (2.672) Loss 1.8570 (1.8570) Acc@1 60.254 (60.254) Acc@5 81.055 (81.055) [2022-10-07 12:22:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 59.178 Acc@5 82.998 [2022-10-07 12:22:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 59.2% [2022-10-07 12:22:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 59.18% [2022-10-07 12:22:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][0/1251] eta 1:02:13 lr 0.000989 time 2.9845 (2.9845) loss 4.3125 (4.3125) grad_norm 1.2221 (1.2221) [2022-10-07 12:23:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][100/1251] eta 0:06:45 lr 0.000989 time 0.3227 (0.3523) loss 4.4448 (4.4952) grad_norm 1.2493 (1.2981) [2022-10-07 12:23:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][200/1251] eta 0:05:56 lr 0.000989 time 0.3247 (0.3389) loss 4.5693 (4.5204) grad_norm 0.9985 (1.2911) [2022-10-07 12:24:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][300/1251] eta 0:05:18 lr 0.000989 time 0.3217 (0.3344) loss 4.6854 (4.5296) grad_norm 1.0447 (1.2859) [2022-10-07 12:24:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][400/1251] eta 0:04:42 lr 0.000989 time 0.3252 (0.3324) loss 4.3374 (4.5196) grad_norm 1.3348 (1.2858) [2022-10-07 12:25:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][500/1251] eta 0:04:08 lr 0.000989 time 0.3259 (0.3313) loss 4.3291 (4.5189) grad_norm 1.0307 (1.2831) [2022-10-07 12:25:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][600/1251] eta 0:03:34 lr 0.000989 time 0.3227 (0.3301) loss 4.2519 
(4.5075) grad_norm 1.4227 (1.2861) [2022-10-07 12:26:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][700/1251] eta 0:03:01 lr 0.000989 time 0.3234 (0.3293) loss 4.7274 (4.5098) grad_norm 1.0071 (1.2798) [2022-10-07 12:26:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][800/1251] eta 0:02:28 lr 0.000988 time 0.3297 (0.3287) loss 4.7343 (4.5092) grad_norm 1.2834 (1.2756) [2022-10-07 12:27:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][900/1251] eta 0:01:55 lr 0.000988 time 0.3226 (0.3282) loss 4.8701 (4.5043) grad_norm 1.4805 (1.2767) [2022-10-07 12:28:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][1000/1251] eta 0:01:22 lr 0.000988 time 0.3233 (0.3277) loss 4.5783 (4.5028) grad_norm 1.1390 (1.2726) [2022-10-07 12:28:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][1100/1251] eta 0:00:49 lr 0.000988 time 0.3237 (0.3274) loss 4.7192 (4.5010) grad_norm 1.0720 (1.2697) [2022-10-07 12:29:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [20/300][1200/1251] eta 0:00:16 lr 0.000988 time 0.3218 (0.3271) loss 4.7390 (4.5008) grad_norm 1.0659 (1.2764) [2022-10-07 12:29:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 20 training takes 0:06:49 [2022-10-07 12:29:25 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_20 saving...... [2022-10-07 12:29:25 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_20 saved !!! 
[2022-10-07 12:29:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.623 (2.623) Loss 1.7385 (1.7385) Acc@1 60.352 (60.352) Acc@5 84.277 (84.277) [2022-10-07 12:29:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 60.136 Acc@5 83.474 [2022-10-07 12:29:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 60.1% [2022-10-07 12:29:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 60.14% [2022-10-07 12:29:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][0/1251] eta 0:44:56 lr 0.000988 time 2.1553 (2.1553) loss 4.1964 (4.1964) grad_norm 1.0039 (1.0039) [2022-10-07 12:30:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][100/1251] eta 0:06:42 lr 0.000988 time 0.3342 (0.3496) loss 4.4967 (4.4429) grad_norm 1.5050 (1.2550) [2022-10-07 12:30:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][200/1251] eta 0:05:55 lr 0.000988 time 0.3269 (0.3379) loss 4.4218 (4.4506) grad_norm 1.1460 (1.2446) [2022-10-07 12:31:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][300/1251] eta 0:05:17 lr 0.000988 time 0.3256 (0.3339) loss 4.5667 (4.4443) grad_norm 1.4327 (1.2595) [2022-10-07 12:31:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][400/1251] eta 0:04:42 lr 0.000988 time 0.3274 (0.3320) loss 4.2143 (4.4515) grad_norm 1.5708 (1.2562) [2022-10-07 12:32:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][500/1251] eta 0:04:08 lr 0.000988 time 0.3225 (0.3307) loss 4.4430 (4.4570) grad_norm 1.2797 (1.2649) [2022-10-07 12:32:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][600/1251] eta 0:03:34 lr 0.000988 time 0.3319 (0.3299) loss 4.5182 (4.4580) grad_norm 1.5840 (1.2547) [2022-10-07 12:33:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][700/1251] eta 0:03:01 lr 0.000987 time 0.3241 (0.3292) loss 4.2706 (4.4553) grad_norm 1.2996 (1.2525) 
[2022-10-07 12:34:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][800/1251] eta 0:02:28 lr 0.000987 time 0.3292 (0.3287) loss 4.2539 (4.4531) grad_norm 1.7280 (1.2515) [2022-10-07 12:34:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][900/1251] eta 0:01:55 lr 0.000987 time 0.3268 (0.3283) loss 4.4866 (4.4546) grad_norm 1.0164 (1.2490) [2022-10-07 12:35:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][1000/1251] eta 0:01:22 lr 0.000987 time 0.3252 (0.3281) loss 4.6886 (4.4568) grad_norm 0.9642 (1.2462) [2022-10-07 12:35:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][1100/1251] eta 0:00:49 lr 0.000987 time 0.3296 (0.3279) loss 4.5589 (4.4565) grad_norm 1.3059 (1.2441) [2022-10-07 12:36:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [21/300][1200/1251] eta 0:00:16 lr 0.000987 time 0.3230 (0.3278) loss 4.2967 (4.4561) grad_norm 1.4041 (1.2431) [2022-10-07 12:36:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 21 training takes 0:06:50 [2022-10-07 12:36:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.949 (2.949) Loss 1.8148 (1.8148) Acc@1 60.449 (60.449) Acc@5 82.227 (82.227) [2022-10-07 12:36:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 61.004 Acc@5 84.254 [2022-10-07 12:36:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 61.0% [2022-10-07 12:36:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 61.00% [2022-10-07 12:36:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][0/1251] eta 0:51:25 lr 0.000987 time 2.4661 (2.4661) loss 4.6401 (4.6401) grad_norm 1.0742 (1.0742) [2022-10-07 12:37:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][100/1251] eta 0:06:42 lr 0.000987 time 0.3318 (0.3497) loss 4.4208 (4.4146) grad_norm 1.2847 (1.2384) [2022-10-07 12:37:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[22/300][200/1251] eta 0:05:55 lr 0.000987 time 0.3275 (0.3381) loss 4.8449 (4.4287) grad_norm 1.2066 (1.2490) [2022-10-07 12:38:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][300/1251] eta 0:05:17 lr 0.000987 time 0.3325 (0.3343) loss 4.4568 (4.4304) grad_norm 1.2386 (1.2521) [2022-10-07 12:38:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][400/1251] eta 0:04:42 lr 0.000987 time 0.3245 (0.3324) loss 4.7538 (4.4110) grad_norm 1.0755 (1.2450) [2022-10-07 12:39:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][500/1251] eta 0:04:08 lr 0.000986 time 0.3299 (0.3311) loss 4.5430 (4.4111) grad_norm 1.4653 (1.2408) [2022-10-07 12:40:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][600/1251] eta 0:03:34 lr 0.000986 time 0.3281 (0.3302) loss 4.5104 (4.4141) grad_norm 1.0372 (1.2387) [2022-10-07 12:40:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][700/1251] eta 0:03:01 lr 0.000986 time 0.3307 (0.3296) loss 4.4604 (4.4161) grad_norm 0.9754 (1.2394) [2022-10-07 12:41:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][800/1251] eta 0:02:28 lr 0.000986 time 0.3248 (0.3291) loss 4.0822 (4.4150) grad_norm 1.2132 (1.2376) [2022-10-07 12:41:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][900/1251] eta 0:01:55 lr 0.000986 time 0.3367 (0.3288) loss 4.5409 (4.4118) grad_norm 1.2648 (1.2345) [2022-10-07 12:42:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][1000/1251] eta 0:01:22 lr 0.000986 time 0.3244 (0.3285) loss 4.4009 (4.4075) grad_norm 1.1504 (1.2346) [2022-10-07 12:42:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][1100/1251] eta 0:00:49 lr 0.000986 time 0.3281 (0.3282) loss 4.4897 (4.4102) grad_norm 1.1977 (1.2304) [2022-10-07 12:43:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [22/300][1200/1251] eta 0:00:16 lr 0.000986 time 0.3200 (0.3280) loss 4.4559 (4.4091) grad_norm 1.3696 
(1.2293) [2022-10-07 12:43:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 22 training takes 0:06:50 [2022-10-07 12:43:35 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.789 (2.789) Loss 1.6645 (1.6645) Acc@1 61.523 (61.523) Acc@5 86.133 (86.133) [2022-10-07 12:43:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 61.914 Acc@5 84.812 [2022-10-07 12:43:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 61.9% [2022-10-07 12:43:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 61.91% [2022-10-07 12:43:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][0/1251] eta 0:59:21 lr 0.000986 time 2.8473 (2.8473) loss 4.4072 (4.4072) grad_norm 1.2371 (1.2371) [2022-10-07 12:44:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][100/1251] eta 0:06:43 lr 0.000986 time 0.3246 (0.3506) loss 3.9823 (4.3778) grad_norm 1.2026 (1.1667) [2022-10-07 12:44:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][200/1251] eta 0:05:55 lr 0.000986 time 0.3251 (0.3378) loss 4.1374 (4.3734) grad_norm 1.1962 (1.1995) [2022-10-07 12:45:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][300/1251] eta 0:05:17 lr 0.000985 time 0.3253 (0.3336) loss 4.2875 (4.3770) grad_norm 1.2573 (1.2034) [2022-10-07 12:45:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][400/1251] eta 0:04:42 lr 0.000985 time 0.3218 (0.3316) loss 4.3829 (4.3754) grad_norm 1.0964 (1.2048) [2022-10-07 12:46:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][500/1251] eta 0:04:07 lr 0.000985 time 0.3221 (0.3301) loss 4.5865 (4.3755) grad_norm 1.0979 (1.2089) [2022-10-07 12:47:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][600/1251] eta 0:03:34 lr 0.000985 time 0.3240 (0.3291) loss 4.5219 (4.3771) grad_norm 1.0564 (1.2060) [2022-10-07 12:47:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[23/300][700/1251] eta 0:03:00 lr 0.000985 time 0.3222 (0.3284) loss 4.6408 (4.3730) grad_norm 0.9710 (1.2042) [2022-10-07 12:48:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][800/1251] eta 0:02:27 lr 0.000985 time 0.3229 (0.3278) loss 4.4639 (4.3739) grad_norm 1.3249 (1.2039) [2022-10-07 12:48:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][900/1251] eta 0:01:54 lr 0.000985 time 0.3214 (0.3272) loss 4.5753 (4.3739) grad_norm 1.0713 (1.1997) [2022-10-07 12:49:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][1000/1251] eta 0:01:22 lr 0.000985 time 0.3205 (0.3268) loss 4.1243 (4.3742) grad_norm 1.1157 (1.1971) [2022-10-07 12:49:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][1100/1251] eta 0:00:49 lr 0.000985 time 0.3241 (0.3264) loss 4.4082 (4.3742) grad_norm 1.1124 (1.1953) [2022-10-07 12:50:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [23/300][1200/1251] eta 0:00:16 lr 0.000985 time 0.3233 (0.3263) loss 3.9241 (4.3722) grad_norm 1.0723 (1.1981) [2022-10-07 12:50:34 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 23 training takes 0:06:48 [2022-10-07 12:50:37 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.874 (2.874) Loss 1.6010 (1.6010) Acc@1 64.062 (64.062) Acc@5 87.793 (87.793) [2022-10-07 12:50:48 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 62.108 Acc@5 84.914 [2022-10-07 12:50:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 62.1% [2022-10-07 12:50:48 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 62.11% [2022-10-07 12:50:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][0/1251] eta 0:55:04 lr 0.000984 time 2.6418 (2.6418) loss 4.4810 (4.4810) grad_norm 1.3173 (1.3173) [2022-10-07 12:51:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][100/1251] eta 0:06:40 lr 0.000984 time 0.3289 (0.3481) loss 4.2734 (4.3640) 
grad_norm 1.2657 (1.2117) [2022-10-07 12:51:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][200/1251] eta 0:05:53 lr 0.000984 time 0.3242 (0.3365) loss 3.6972 (4.3611) grad_norm 1.2329 (1.1940) [2022-10-07 12:52:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][300/1251] eta 0:05:16 lr 0.000984 time 0.3235 (0.3326) loss 4.3634 (4.3588) grad_norm 1.0217 (1.1918) [2022-10-07 12:53:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][400/1251] eta 0:04:41 lr 0.000984 time 0.3238 (0.3306) loss 4.5604 (4.3543) grad_norm 1.0216 (1.1932) [2022-10-07 12:53:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][500/1251] eta 0:04:07 lr 0.000984 time 0.3271 (0.3294) loss 4.0925 (4.3571) grad_norm 1.1767 (1.1918) [2022-10-07 12:54:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][600/1251] eta 0:03:33 lr 0.000984 time 0.3157 (0.3285) loss 4.1931 (4.3537) grad_norm 0.9895 (1.1961) [2022-10-07 12:54:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][700/1251] eta 0:03:00 lr 0.000984 time 0.3269 (0.3279) loss 4.2947 (4.3501) grad_norm 1.0327 (1.1959) [2022-10-07 12:55:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][800/1251] eta 0:02:27 lr 0.000984 time 0.3238 (0.3275) loss 4.6403 (4.3527) grad_norm 1.1159 (1.1910) [2022-10-07 12:55:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][900/1251] eta 0:01:54 lr 0.000984 time 0.3287 (0.3272) loss 4.1876 (4.3500) grad_norm 1.0555 (1.1925) [2022-10-07 12:56:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][1000/1251] eta 0:01:22 lr 0.000983 time 0.3254 (0.3270) loss 4.5410 (4.3470) grad_norm 1.1249 (1.1931) [2022-10-07 12:56:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [24/300][1100/1251] eta 0:00:49 lr 0.000983 time 0.3362 (0.3268) loss 4.1064 (4.3474) grad_norm 1.1801 (1.1904) [2022-10-07 12:57:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[24/300][1200/1251] eta 0:00:16 lr 0.000983 time 0.3285 (0.3267) loss 4.2130 (4.3461) grad_norm 1.0639 (1.1887) [2022-10-07 12:57:37 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 24 training takes 0:06:48 [2022-10-07 12:57:40 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.174 (3.174) Loss 1.5078 (1.5078) Acc@1 66.113 (66.113) Acc@5 87.012 (87.012) [2022-10-07 12:57:50 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 63.254 Acc@5 85.726 [2022-10-07 12:57:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 63.3% [2022-10-07 12:57:50 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 63.25% [2022-10-07 12:57:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][0/1251] eta 1:07:10 lr 0.000983 time 3.2218 (3.2218) loss 4.4348 (4.4348) grad_norm 1.2258 (1.2258) [2022-10-07 12:58:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][100/1251] eta 0:06:47 lr 0.000983 time 0.3257 (0.3543) loss 4.6935 (4.3328) grad_norm 1.1581 (1.2124) [2022-10-07 12:58:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][200/1251] eta 0:05:57 lr 0.000983 time 0.3224 (0.3401) loss 4.2410 (4.3185) grad_norm 1.0915 (1.1906) [2022-10-07 12:59:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][300/1251] eta 0:05:18 lr 0.000983 time 0.3236 (0.3352) loss 4.3031 (4.3071) grad_norm 0.9581 (1.1873) [2022-10-07 13:00:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][400/1251] eta 0:04:43 lr 0.000983 time 0.3228 (0.3326) loss 4.0798 (4.3055) grad_norm 1.4169 (1.1904) [2022-10-07 13:00:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][500/1251] eta 0:04:08 lr 0.000983 time 0.3219 (0.3310) loss 4.5304 (4.3109) grad_norm 0.9830 (1.1860) [2022-10-07 13:01:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][600/1251] eta 0:03:34 lr 0.000982 time 0.3259 (0.3298) loss 4.7399 (4.3125) 
grad_norm 1.1339 (1.1847) [2022-10-07 13:01:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][700/1251] eta 0:03:01 lr 0.000982 time 0.3221 (0.3289) loss 4.2131 (4.3132) grad_norm 1.0280 (1.1904) [2022-10-07 13:02:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][800/1251] eta 0:02:28 lr 0.000982 time 0.3218 (0.3283) loss 4.3840 (4.3103) grad_norm 1.2948 (1.1864) [2022-10-07 13:02:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][900/1251] eta 0:01:55 lr 0.000982 time 0.3255 (0.3279) loss 4.1857 (4.3106) grad_norm 1.1709 (1.1866) [2022-10-07 13:03:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][1000/1251] eta 0:01:22 lr 0.000982 time 0.3249 (0.3275) loss 4.2923 (4.3113) grad_norm 1.2020 (1.1845) [2022-10-07 13:03:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][1100/1251] eta 0:00:49 lr 0.000982 time 0.3244 (0.3273) loss 4.2038 (4.3130) grad_norm 1.1589 (1.1833) [2022-10-07 13:04:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [25/300][1200/1251] eta 0:00:16 lr 0.000982 time 0.3242 (0.3271) loss 4.5666 (4.3129) grad_norm 1.5847 (1.1834) [2022-10-07 13:04:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 25 training takes 0:06:49 [2022-10-07 13:04:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.262 (2.262) Loss 1.7119 (1.7119) Acc@1 59.766 (59.766) Acc@5 84.277 (84.277) [2022-10-07 13:04:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 63.530 Acc@5 85.996 [2022-10-07 13:04:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 63.5% [2022-10-07 13:04:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 63.53% [2022-10-07 13:04:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][0/1251] eta 0:44:27 lr 0.000982 time 2.1327 (2.1327) loss 4.3870 (4.3870) grad_norm 1.0399 (1.0399) [2022-10-07 13:05:28 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [26/300][100/1251] eta 0:06:41 lr 0.000982 time 0.3257 (0.3485) loss 4.2679 (4.3091) grad_norm 1.1651 (1.1547) [2022-10-07 13:06:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][200/1251] eta 0:05:53 lr 0.000982 time 0.3269 (0.3367) loss 4.3346 (4.2981) grad_norm 1.0227 (1.1810) [2022-10-07 13:06:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][300/1251] eta 0:05:17 lr 0.000981 time 0.3210 (0.3334) loss 4.3990 (4.2967) grad_norm 1.2438 (1.1706) [2022-10-07 13:07:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][400/1251] eta 0:04:41 lr 0.000981 time 0.3218 (0.3312) loss 4.0886 (4.2845) grad_norm 1.4839 (1.1677) [2022-10-07 13:07:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][500/1251] eta 0:04:07 lr 0.000981 time 0.3192 (0.3299) loss 4.2392 (4.2855) grad_norm 1.3439 (1.1765) [2022-10-07 13:08:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][600/1251] eta 0:03:34 lr 0.000981 time 0.3244 (0.3290) loss 4.2629 (4.2852) grad_norm 1.0016 (1.1667) [2022-10-07 13:08:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][700/1251] eta 0:03:00 lr 0.000981 time 0.3257 (0.3284) loss 4.1300 (4.2863) grad_norm 1.3498 (1.1748) [2022-10-07 13:09:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][800/1251] eta 0:02:27 lr 0.000981 time 0.3238 (0.3280) loss 4.1676 (4.2869) grad_norm 1.1459 (1.1726) [2022-10-07 13:09:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][900/1251] eta 0:01:54 lr 0.000981 time 0.3259 (0.3276) loss 4.3713 (4.2866) grad_norm 0.9419 (1.1718) [2022-10-07 13:10:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][1000/1251] eta 0:01:22 lr 0.000981 time 0.3256 (0.3273) loss 4.2295 (4.2844) grad_norm 1.3487 (1.1675) [2022-10-07 13:10:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][1100/1251] eta 0:00:49 lr 0.000981 time 0.3236 (0.3270) loss 4.3527 (4.2845) 
grad_norm 1.1912 (1.1689) [2022-10-07 13:11:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [26/300][1200/1251] eta 0:00:16 lr 0.000980 time 0.3272 (0.3268) loss 4.3876 (4.2820) grad_norm 1.1295 (1.1689) [2022-10-07 13:11:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 26 training takes 0:06:49 [2022-10-07 13:11:44 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.431 (2.431) Loss 1.5583 (1.5583) Acc@1 65.137 (65.137) Acc@5 87.012 (87.012) [2022-10-07 13:11:55 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 64.364 Acc@5 86.466 [2022-10-07 13:11:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 64.4% [2022-10-07 13:11:55 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 64.36% [2022-10-07 13:11:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][0/1251] eta 1:08:10 lr 0.000980 time 3.2696 (3.2696) loss 4.1688 (4.1688) grad_norm 1.3862 (1.3862) [2022-10-07 13:12:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][100/1251] eta 0:06:49 lr 0.000980 time 0.3262 (0.3560) loss 4.3113 (4.2702) grad_norm 0.9770 (1.1578) [2022-10-07 13:13:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][200/1251] eta 0:05:58 lr 0.000980 time 0.3216 (0.3412) loss 4.1855 (4.2549) grad_norm 1.5824 (1.1708) [2022-10-07 13:13:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][300/1251] eta 0:05:19 lr 0.000980 time 0.3259 (0.3362) loss 4.3921 (4.2575) grad_norm 1.0745 (1.1830) [2022-10-07 13:14:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][400/1251] eta 0:04:43 lr 0.000980 time 0.3271 (0.3336) loss 4.0758 (4.2495) grad_norm 1.1579 (1.1792) [2022-10-07 13:14:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][500/1251] eta 0:04:09 lr 0.000980 time 0.3250 (0.3319) loss 4.0672 (4.2512) grad_norm 0.9879 (1.1694) [2022-10-07 13:15:14 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [27/300][600/1251] eta 0:03:35 lr 0.000980 time 0.3245 (0.3308) loss 4.4271 (4.2538) grad_norm 1.0105 (1.1680) [2022-10-07 13:15:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][700/1251] eta 0:03:01 lr 0.000980 time 0.3317 (0.3301) loss 4.3259 (4.2536) grad_norm 1.5098 (1.1666) [2022-10-07 13:16:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][800/1251] eta 0:02:28 lr 0.000979 time 0.3251 (0.3295) loss 4.1338 (4.2556) grad_norm 1.2331 (1.1653) [2022-10-07 13:16:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][900/1251] eta 0:01:55 lr 0.000979 time 0.3289 (0.3290) loss 4.0987 (4.2537) grad_norm 1.2437 (1.1632) [2022-10-07 13:17:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][1000/1251] eta 0:01:22 lr 0.000979 time 0.3236 (0.3287) loss 4.2413 (4.2560) grad_norm 1.2471 (1.1632) [2022-10-07 13:17:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][1100/1251] eta 0:00:49 lr 0.000979 time 0.3237 (0.3284) loss 4.1266 (4.2567) grad_norm 1.1530 (1.1608) [2022-10-07 13:18:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [27/300][1200/1251] eta 0:00:16 lr 0.000979 time 0.3257 (0.3282) loss 4.0679 (4.2545) grad_norm 1.0255 (1.1610) [2022-10-07 13:18:46 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 27 training takes 0:06:50 [2022-10-07 13:18:48 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.460 (2.460) Loss 1.5743 (1.5743) Acc@1 63.672 (63.672) Acc@5 86.035 (86.035) [2022-10-07 13:18:59 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 64.702 Acc@5 86.526 [2022-10-07 13:18:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 64.7% [2022-10-07 13:18:59 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 64.70% [2022-10-07 13:19:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][0/1251] eta 1:08:10 lr 0.000979 time 3.2695 (3.2695) loss 
4.3000 (4.3000) grad_norm 1.3774 (1.3774) [2022-10-07 13:19:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][100/1251] eta 0:06:49 lr 0.000979 time 0.3255 (0.3560) loss 4.5086 (4.2262) grad_norm 1.1334 (1.1427) [2022-10-07 13:20:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][200/1251] eta 0:05:58 lr 0.000979 time 0.3260 (0.3414) loss 4.3406 (4.2442) grad_norm 1.1578 (1.1478) [2022-10-07 13:20:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][300/1251] eta 0:05:19 lr 0.000979 time 0.3312 (0.3363) loss 4.3916 (4.2382) grad_norm 1.1764 (1.1411) [2022-10-07 13:21:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][400/1251] eta 0:04:43 lr 0.000978 time 0.3283 (0.3336) loss 4.3428 (4.2372) grad_norm 1.1916 (1.1541) [2022-10-07 13:21:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][500/1251] eta 0:04:09 lr 0.000978 time 0.3223 (0.3319) loss 4.2657 (4.2346) grad_norm 1.3384 (1.1512) [2022-10-07 13:22:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][600/1251] eta 0:03:35 lr 0.000978 time 0.3268 (0.3307) loss 3.9010 (4.2336) grad_norm 1.1867 (1.1505) [2022-10-07 13:22:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][700/1251] eta 0:03:01 lr 0.000978 time 0.3241 (0.3299) loss 3.9097 (4.2318) grad_norm 1.0427 (1.1560) [2022-10-07 13:23:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][800/1251] eta 0:02:28 lr 0.000978 time 0.3223 (0.3293) loss 4.4083 (4.2315) grad_norm 1.0611 (1.1551) [2022-10-07 13:23:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][900/1251] eta 0:01:55 lr 0.000978 time 0.3256 (0.3287) loss 4.1846 (4.2292) grad_norm 1.3038 (1.1550) [2022-10-07 13:24:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][1000/1251] eta 0:01:22 lr 0.000978 time 0.3175 (0.3283) loss 4.2766 (4.2283) grad_norm 1.1822 (1.1566) [2022-10-07 13:25:00 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [28/300][1100/1251] eta 0:00:49 lr 0.000978 time 0.3239 (0.3280) loss 4.5340 (4.2297) grad_norm 1.0098 (1.1556) [2022-10-07 13:25:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [28/300][1200/1251] eta 0:00:16 lr 0.000977 time 0.3213 (0.3277) loss 4.2545 (4.2311) grad_norm 1.3288 (1.1553) [2022-10-07 13:25:49 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 28 training takes 0:06:50 [2022-10-07 13:25:52 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.554 (2.554) Loss 1.5406 (1.5406) Acc@1 66.016 (66.016) Acc@5 86.426 (86.426) [2022-10-07 13:26:03 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 65.298 Acc@5 87.232 [2022-10-07 13:26:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 65.3% [2022-10-07 13:26:03 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 65.30% [2022-10-07 13:26:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][0/1251] eta 0:54:26 lr 0.000977 time 2.6113 (2.6113) loss 4.1895 (4.1895) grad_norm 1.1864 (1.1864) [2022-10-07 13:26:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][100/1251] eta 0:06:41 lr 0.000977 time 0.3282 (0.3490) loss 4.4762 (4.1852) grad_norm 1.2449 (1.1722) [2022-10-07 13:27:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][200/1251] eta 0:05:54 lr 0.000977 time 0.3204 (0.3377) loss 3.8926 (4.1743) grad_norm 0.9484 (1.1667) [2022-10-07 13:27:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][300/1251] eta 0:05:16 lr 0.000977 time 0.3235 (0.3332) loss 4.2690 (4.1895) grad_norm 0.9245 (1.1674) [2022-10-07 13:28:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][400/1251] eta 0:04:41 lr 0.000977 time 0.3261 (0.3311) loss 4.3471 (4.1953) grad_norm 1.1758 (1.1564) [2022-10-07 13:28:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][500/1251] eta 0:04:07 lr 0.000977 time 0.3285 (0.3297) loss 4.1080 
(4.1978) grad_norm 1.1151 (1.1504) [2022-10-07 13:29:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][600/1251] eta 0:03:34 lr 0.000977 time 0.3263 (0.3289) loss 3.9870 (4.1961) grad_norm 1.0354 (1.1508) [2022-10-07 13:29:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][700/1251] eta 0:03:00 lr 0.000976 time 0.3261 (0.3283) loss 4.3805 (4.1986) grad_norm 1.0210 (1.1488) [2022-10-07 13:30:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][800/1251] eta 0:02:27 lr 0.000976 time 0.3241 (0.3278) loss 4.2271 (4.1989) grad_norm 1.0933 (1.1473) [2022-10-07 13:30:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][900/1251] eta 0:01:54 lr 0.000976 time 0.3265 (0.3275) loss 4.3121 (4.2030) grad_norm 1.0829 (1.1473) [2022-10-07 13:31:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][1000/1251] eta 0:01:22 lr 0.000976 time 0.3239 (0.3272) loss 4.4199 (4.2001) grad_norm 0.9248 (1.1466) [2022-10-07 13:32:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][1100/1251] eta 0:00:49 lr 0.000976 time 0.3238 (0.3269) loss 4.5537 (4.1986) grad_norm 0.9679 (1.1454) [2022-10-07 13:32:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [29/300][1200/1251] eta 0:00:16 lr 0.000976 time 0.3249 (0.3267) loss 4.1584 (4.2012) grad_norm 1.0962 (1.1452) [2022-10-07 13:32:52 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 29 training takes 0:06:48 [2022-10-07 13:32:54 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.850 (2.850) Loss 1.5744 (1.5744) Acc@1 64.844 (64.844) Acc@5 86.719 (86.719) [2022-10-07 13:33:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 65.600 Acc@5 87.434 [2022-10-07 13:33:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 65.6% [2022-10-07 13:33:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 65.60% [2022-10-07 13:33:08 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][0/1251] eta 1:01:00 lr 0.000976 time 2.9262 (2.9262) loss 4.0653 (4.0653) grad_norm 1.0528 (1.0528) [2022-10-07 13:33:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][100/1251] eta 0:06:45 lr 0.000976 time 0.3279 (0.3520) loss 3.9341 (4.1663) grad_norm 1.3037 (1.1268) [2022-10-07 13:34:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][200/1251] eta 0:05:56 lr 0.000976 time 0.3229 (0.3389) loss 3.9053 (4.1651) grad_norm 1.2020 (1.1338) [2022-10-07 13:34:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][300/1251] eta 0:05:18 lr 0.000975 time 0.3268 (0.3345) loss 4.1456 (4.1696) grad_norm 1.1374 (1.1397) [2022-10-07 13:35:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][400/1251] eta 0:04:42 lr 0.000975 time 0.3265 (0.3322) loss 4.1872 (4.1611) grad_norm 1.0260 (1.1342) [2022-10-07 13:35:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][500/1251] eta 0:04:08 lr 0.000975 time 0.3267 (0.3307) loss 4.2458 (4.1610) grad_norm 1.3030 (1.1360) [2022-10-07 13:36:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][600/1251] eta 0:03:34 lr 0.000975 time 0.3256 (0.3298) loss 4.1700 (4.1658) grad_norm 1.1554 (1.1363) [2022-10-07 13:36:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][700/1251] eta 0:03:01 lr 0.000975 time 0.3263 (0.3291) loss 3.9455 (4.1735) grad_norm 1.0788 (1.1383) [2022-10-07 13:37:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][800/1251] eta 0:02:28 lr 0.000975 time 0.3204 (0.3286) loss 4.1646 (4.1761) grad_norm 1.4965 (1.1364) [2022-10-07 13:38:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][900/1251] eta 0:01:55 lr 0.000975 time 0.3239 (0.3282) loss 4.1792 (4.1781) grad_norm 1.2799 (1.1364) [2022-10-07 13:38:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][1000/1251] eta 0:01:22 lr 0.000974 time 
0.3224 (0.3279) loss 3.7711 (4.1809) grad_norm 1.3781 (1.1363) [2022-10-07 13:39:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][1100/1251] eta 0:00:49 lr 0.000974 time 0.3243 (0.3276) loss 4.5575 (4.1779) grad_norm 1.1077 (1.1381) [2022-10-07 13:39:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [30/300][1200/1251] eta 0:00:16 lr 0.000974 time 0.3330 (0.3275) loss 4.2343 (4.1770) grad_norm 1.0496 (1.1367) [2022-10-07 13:39:55 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 30 training takes 0:06:49 [2022-10-07 13:39:55 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_30 saving...... [2022-10-07 13:39:55 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_30 saved !!! [2022-10-07 13:39:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.097 (3.097) Loss 1.5014 (1.5014) Acc@1 66.602 (66.602) Acc@5 87.402 (87.402) [2022-10-07 13:40:09 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 65.734 Acc@5 87.280 [2022-10-07 13:40:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 65.7% [2022-10-07 13:40:09 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 65.73% [2022-10-07 13:40:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][0/1251] eta 0:53:56 lr 0.000974 time 2.5875 (2.5875) loss 3.7776 (3.7776) grad_norm 1.0427 (1.0427) [2022-10-07 13:40:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][100/1251] eta 0:06:44 lr 0.000974 time 0.3305 (0.3512) loss 4.3687 (4.1436) grad_norm 1.0060 (1.1707) [2022-10-07 13:41:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][200/1251] eta 0:05:55 lr 0.000974 time 0.3247 (0.3385) loss 4.3161 (4.1580) grad_norm 0.8807 (1.1652) [2022-10-07 13:41:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][300/1251] eta 0:05:17 lr 
0.000974 time 0.3251 (0.3340) loss 4.0469 (4.1605) grad_norm 1.0102 (1.1659) [2022-10-07 13:42:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][400/1251] eta 0:04:42 lr 0.000974 time 0.3201 (0.3317) loss 4.4935 (4.1699) grad_norm 1.6658 (1.1576) [2022-10-07 13:42:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][500/1251] eta 0:04:08 lr 0.000973 time 0.3235 (0.3303) loss 4.0721 (4.1727) grad_norm 1.4203 (1.1560) [2022-10-07 13:43:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][600/1251] eta 0:03:34 lr 0.000973 time 0.3248 (0.3294) loss 4.1454 (4.1758) grad_norm 1.3074 (1.1536) [2022-10-07 13:43:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][700/1251] eta 0:03:01 lr 0.000973 time 0.3313 (0.3287) loss 4.4456 (4.1785) grad_norm 1.0104 (1.1578) [2022-10-07 13:44:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][800/1251] eta 0:02:28 lr 0.000973 time 0.3236 (0.3282) loss 4.2821 (4.1794) grad_norm 1.0100 (1.1476) [2022-10-07 13:45:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][900/1251] eta 0:01:55 lr 0.000973 time 0.3314 (0.3278) loss 3.9376 (4.1795) grad_norm 1.1098 (1.1478) [2022-10-07 13:45:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][1000/1251] eta 0:01:22 lr 0.000973 time 0.3319 (0.3276) loss 4.0377 (4.1758) grad_norm 1.3042 (1.1501) [2022-10-07 13:46:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][1100/1251] eta 0:00:49 lr 0.000973 time 0.3233 (0.3273) loss 4.3052 (4.1736) grad_norm 1.1049 (1.1481) [2022-10-07 13:46:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [31/300][1200/1251] eta 0:00:16 lr 0.000973 time 0.3248 (0.3273) loss 4.3633 (4.1734) grad_norm 1.2626 (1.1460) [2022-10-07 13:46:58 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 31 training takes 0:06:49 [2022-10-07 13:47:01 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.284 (2.284) Loss 1.4170 
(1.4170) Acc@1 68.457 (68.457) Acc@5 89.258 (89.258) [2022-10-07 13:47:12 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 66.412 Acc@5 87.662 [2022-10-07 13:47:12 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 66.4% [2022-10-07 13:47:12 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 66.41% [2022-10-07 13:47:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][0/1251] eta 0:47:05 lr 0.000972 time 2.2590 (2.2590) loss 4.0535 (4.0535) grad_norm 1.0735 (1.0735) [2022-10-07 13:47:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][100/1251] eta 0:06:47 lr 0.000972 time 0.3282 (0.3541) loss 3.9000 (4.1245) grad_norm 1.3356 (1.1771) [2022-10-07 13:48:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][200/1251] eta 0:05:57 lr 0.000972 time 0.3234 (0.3405) loss 3.9385 (4.1444) grad_norm 1.0940 (1.1550) [2022-10-07 13:48:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][300/1251] eta 0:05:19 lr 0.000972 time 0.3234 (0.3359) loss 4.0125 (4.1414) grad_norm 1.1354 (1.1430) [2022-10-07 13:49:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][400/1251] eta 0:04:43 lr 0.000972 time 0.3272 (0.3335) loss 3.7418 (4.1495) grad_norm 1.1973 (1.1404) [2022-10-07 13:49:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][500/1251] eta 0:04:09 lr 0.000972 time 0.3261 (0.3320) loss 4.0054 (4.1575) grad_norm 1.1369 (1.1399) [2022-10-07 13:50:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][600/1251] eta 0:03:35 lr 0.000972 time 0.3265 (0.3310) loss 4.1082 (4.1572) grad_norm 1.3844 (1.1403) [2022-10-07 13:51:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][700/1251] eta 0:03:01 lr 0.000972 time 0.3247 (0.3302) loss 4.3207 (4.1600) grad_norm 0.9660 (1.1398) [2022-10-07 13:51:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][800/1251] eta 0:02:28 lr 
0.000971 time 0.3243 (0.3296) loss 4.2760 (4.1540) grad_norm 0.9971 (1.1366) [2022-10-07 13:52:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][900/1251] eta 0:01:55 lr 0.000971 time 0.3246 (0.3290) loss 4.0783 (4.1531) grad_norm 0.9746 (1.1356) [2022-10-07 13:52:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][1000/1251] eta 0:01:22 lr 0.000971 time 0.3252 (0.3286) loss 4.1859 (4.1522) grad_norm 0.9055 (1.1376) [2022-10-07 13:53:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][1100/1251] eta 0:00:49 lr 0.000971 time 0.3250 (0.3283) loss 4.3750 (4.1505) grad_norm 0.9521 (1.1356) [2022-10-07 13:53:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [32/300][1200/1251] eta 0:00:16 lr 0.000971 time 0.3339 (0.3283) loss 4.0986 (4.1506) grad_norm 1.1005 (1.1363) [2022-10-07 13:54:03 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 32 training takes 0:06:50 [2022-10-07 13:54:05 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.816 (2.816) Loss 1.3681 (1.3681) Acc@1 67.969 (67.969) Acc@5 89.258 (89.258) [2022-10-07 13:54:16 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 66.774 Acc@5 87.990 [2022-10-07 13:54:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 66.8% [2022-10-07 13:54:16 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 66.77% [2022-10-07 13:54:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][0/1251] eta 0:47:10 lr 0.000971 time 2.2627 (2.2627) loss 4.3007 (4.3007) grad_norm 1.3674 (1.3674) [2022-10-07 13:54:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][100/1251] eta 0:06:41 lr 0.000971 time 0.3282 (0.3489) loss 4.0232 (4.0959) grad_norm 1.2439 (1.1568) [2022-10-07 13:55:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][200/1251] eta 0:05:53 lr 0.000970 time 0.3253 (0.3368) loss 4.2258 (4.1351) grad_norm 1.0135 (1.1252) 
[2022-10-07 13:55:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][300/1251] eta 0:05:16 lr 0.000970 time 0.3239 (0.3328) loss 4.2463 (4.1268) grad_norm 0.9799 (1.1245) [2022-10-07 13:56:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][400/1251] eta 0:04:41 lr 0.000970 time 0.3225 (0.3308) loss 4.0925 (4.1194) grad_norm 0.9375 (1.1400) [2022-10-07 13:57:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][500/1251] eta 0:04:07 lr 0.000970 time 0.3274 (0.3295) loss 4.3827 (4.1183) grad_norm 1.0944 (1.1372) [2022-10-07 13:57:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][600/1251] eta 0:03:33 lr 0.000970 time 0.3286 (0.3286) loss 4.0490 (4.1193) grad_norm 1.1318 (1.1337) [2022-10-07 13:58:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][700/1251] eta 0:03:00 lr 0.000970 time 0.3214 (0.3280) loss 4.2016 (4.1265) grad_norm 1.0664 (1.1388) [2022-10-07 13:58:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][800/1251] eta 0:02:27 lr 0.000970 time 0.3224 (0.3276) loss 4.1530 (4.1235) grad_norm 1.0384 (1.1377) [2022-10-07 13:59:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][900/1251] eta 0:01:54 lr 0.000969 time 0.3228 (0.3272) loss 4.2242 (4.1246) grad_norm 1.2284 (1.1402) [2022-10-07 13:59:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][1000/1251] eta 0:01:22 lr 0.000969 time 0.3237 (0.3269) loss 4.1375 (4.1241) grad_norm 1.1336 (1.1406) [2022-10-07 14:00:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][1100/1251] eta 0:00:49 lr 0.000969 time 0.3255 (0.3268) loss 3.7656 (4.1257) grad_norm 1.0683 (1.1428) [2022-10-07 14:00:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [33/300][1200/1251] eta 0:00:16 lr 0.000969 time 0.3284 (0.3268) loss 4.1865 (4.1269) grad_norm 0.9819 (1.1407) [2022-10-07 14:01:05 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 33 training takes 0:06:49 
[2022-10-07 14:01:08 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.703 (2.703) Loss 1.3169 (1.3169) Acc@1 70.020 (70.020) Acc@5 88.867 (88.867) [2022-10-07 14:01:19 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 66.806 Acc@5 88.194 [2022-10-07 14:01:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 66.8% [2022-10-07 14:01:19 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 66.81% [2022-10-07 14:01:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][0/1251] eta 0:57:14 lr 0.000969 time 2.7455 (2.7455) loss 4.1016 (4.1016) grad_norm 1.4765 (1.4765) [2022-10-07 14:01:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][100/1251] eta 0:06:43 lr 0.000969 time 0.3243 (0.3502) loss 3.9319 (4.0664) grad_norm 1.1161 (1.1458) [2022-10-07 14:02:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][200/1251] eta 0:05:54 lr 0.000969 time 0.3304 (0.3377) loss 4.0164 (4.0960) grad_norm 1.0998 (1.1343) [2022-10-07 14:02:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][300/1251] eta 0:05:17 lr 0.000969 time 0.3276 (0.3336) loss 3.9813 (4.1045) grad_norm 1.1471 (1.1389) [2022-10-07 14:03:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][400/1251] eta 0:04:42 lr 0.000968 time 0.3282 (0.3315) loss 4.1454 (4.0966) grad_norm 1.0785 (1.1363) [2022-10-07 14:04:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][500/1251] eta 0:04:07 lr 0.000968 time 0.3229 (0.3302) loss 4.3142 (4.1000) grad_norm 1.2188 (1.1364) [2022-10-07 14:04:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][600/1251] eta 0:03:34 lr 0.000968 time 0.3251 (0.3293) loss 4.0976 (4.1024) grad_norm 0.9147 (1.1354) [2022-10-07 14:05:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][700/1251] eta 0:03:01 lr 0.000968 time 0.3279 (0.3287) loss 4.2933 (4.1062) grad_norm 0.9534 (1.1330) 
[2022-10-07 14:05:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][800/1251] eta 0:02:28 lr 0.000968 time 0.3266 (0.3282) loss 4.1803 (4.1109) grad_norm 1.0363 (1.1317) [2022-10-07 14:06:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][900/1251] eta 0:01:55 lr 0.000968 time 0.3239 (0.3279) loss 4.3003 (4.1115) grad_norm 1.1733 (1.1352) [2022-10-07 14:06:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][1000/1251] eta 0:01:22 lr 0.000967 time 0.3296 (0.3276) loss 3.8480 (4.1126) grad_norm 1.0786 (1.1318) [2022-10-07 14:07:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][1100/1251] eta 0:00:49 lr 0.000967 time 0.3248 (0.3275) loss 3.8021 (4.1134) grad_norm 0.9521 (1.1289) [2022-10-07 14:07:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [34/300][1200/1251] eta 0:00:16 lr 0.000967 time 0.3243 (0.3275) loss 4.5228 (4.1146) grad_norm 0.9988 (1.1257) [2022-10-07 14:08:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 34 training takes 0:06:49 [2022-10-07 14:08:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.230 (2.230) Loss 1.3537 (1.3537) Acc@1 67.090 (67.090) Acc@5 89.160 (89.160) [2022-10-07 14:08:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 67.260 Acc@5 88.228 [2022-10-07 14:08:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 67.3% [2022-10-07 14:08:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 67.26% [2022-10-07 14:08:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][0/1251] eta 1:09:39 lr 0.000967 time 3.3409 (3.3409) loss 4.0775 (4.0775) grad_norm 1.1897 (1.1897) [2022-10-07 14:08:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][100/1251] eta 0:06:47 lr 0.000967 time 0.3252 (0.3542) loss 3.9283 (4.1372) grad_norm 0.9600 (1.1426) [2022-10-07 14:09:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[35/300][200/1251] eta 0:05:56 lr 0.000967 time 0.3255 (0.3396) loss 3.7585 (4.1314) grad_norm 1.2259 (1.1426) [2022-10-07 14:10:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][300/1251] eta 0:05:18 lr 0.000967 time 0.3215 (0.3345) loss 4.1249 (4.1327) grad_norm 1.0201 (1.1314) [2022-10-07 14:10:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][400/1251] eta 0:04:42 lr 0.000967 time 0.3226 (0.3321) loss 4.4620 (4.1349) grad_norm 1.1573 (1.1336) [2022-10-07 14:11:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][500/1251] eta 0:04:08 lr 0.000966 time 0.3290 (0.3306) loss 3.7144 (4.1278) grad_norm 1.1483 (1.1297) [2022-10-07 14:11:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][600/1251] eta 0:03:34 lr 0.000966 time 0.3214 (0.3294) loss 4.3789 (4.1292) grad_norm 1.1132 (1.1249) [2022-10-07 14:12:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][700/1251] eta 0:03:01 lr 0.000966 time 0.3245 (0.3286) loss 4.2422 (4.1205) grad_norm 1.0501 (1.1308) [2022-10-07 14:12:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][800/1251] eta 0:02:27 lr 0.000966 time 0.3210 (0.3280) loss 3.7760 (4.1163) grad_norm 1.3246 (1.1359) [2022-10-07 14:13:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][900/1251] eta 0:01:54 lr 0.000966 time 0.3210 (0.3275) loss 3.7319 (4.1129) grad_norm 0.9804 (1.1306) [2022-10-07 14:13:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][1000/1251] eta 0:01:22 lr 0.000966 time 0.3267 (0.3273) loss 4.1304 (4.1123) grad_norm 1.1460 (1.1292) [2022-10-07 14:14:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][1100/1251] eta 0:00:49 lr 0.000965 time 0.3216 (0.3271) loss 4.3132 (4.1099) grad_norm 1.0176 (1.1297) [2022-10-07 14:14:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [35/300][1200/1251] eta 0:00:16 lr 0.000965 time 0.3227 (0.3269) loss 3.8925 (4.1068) grad_norm 1.0476 
(1.1291) [2022-10-07 14:15:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 35 training takes 0:06:49 [2022-10-07 14:15:14 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.353 (2.353) Loss 1.4068 (1.4068) Acc@1 68.457 (68.457) Acc@5 88.574 (88.574) [2022-10-07 14:15:25 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 67.126 Acc@5 88.350 [2022-10-07 14:15:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 67.1% [2022-10-07 14:15:25 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 67.26% [2022-10-07 14:15:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][0/1251] eta 0:58:06 lr 0.000965 time 2.7868 (2.7868) loss 4.2087 (4.2087) grad_norm 1.1376 (1.1376) [2022-10-07 14:16:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][100/1251] eta 0:06:46 lr 0.000965 time 0.3320 (0.3536) loss 4.0298 (4.0895) grad_norm 1.1209 (1.1080) [2022-10-07 14:16:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][200/1251] eta 0:05:57 lr 0.000965 time 0.3249 (0.3401) loss 4.3856 (4.0789) grad_norm 1.1344 (1.1161) [2022-10-07 14:17:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][300/1251] eta 0:05:19 lr 0.000965 time 0.3273 (0.3357) loss 4.0376 (4.0865) grad_norm 1.0813 (1.1151) [2022-10-07 14:17:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][400/1251] eta 0:04:43 lr 0.000965 time 0.3263 (0.3334) loss 4.2986 (4.0868) grad_norm 1.1267 (1.1127) [2022-10-07 14:18:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][500/1251] eta 0:04:09 lr 0.000964 time 0.3247 (0.3321) loss 4.2642 (4.0895) grad_norm 1.0112 (1.1217) [2022-10-07 14:18:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][600/1251] eta 0:03:35 lr 0.000964 time 0.3267 (0.3312) loss 4.0193 (4.0876) grad_norm 0.9979 (1.1289) [2022-10-07 14:19:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[36/300][700/1251] eta 0:03:02 lr 0.000964 time 0.3260 (0.3304) loss 3.8925 (4.0879) grad_norm 1.4506 (1.1285) [2022-10-07 14:19:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][800/1251] eta 0:02:28 lr 0.000964 time 0.3222 (0.3299) loss 3.8872 (4.0884) grad_norm 1.1507 (1.1315) [2022-10-07 14:20:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][900/1251] eta 0:01:55 lr 0.000964 time 0.3280 (0.3295) loss 3.9111 (4.0878) grad_norm 1.1006 (1.1307) [2022-10-07 14:20:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][1000/1251] eta 0:01:22 lr 0.000964 time 0.3316 (0.3292) loss 3.7038 (4.0881) grad_norm 1.1769 (1.1317) [2022-10-07 14:21:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][1100/1251] eta 0:00:49 lr 0.000964 time 0.3263 (0.3290) loss 3.8573 (4.0895) grad_norm 1.0649 (1.1305) [2022-10-07 14:22:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [36/300][1200/1251] eta 0:00:16 lr 0.000963 time 0.3332 (0.3289) loss 4.5508 (4.0896) grad_norm 1.0510 (1.1293) [2022-10-07 14:22:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 36 training takes 0:06:51 [2022-10-07 14:22:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.171 (2.171) Loss 1.4268 (1.4268) Acc@1 68.262 (68.262) Acc@5 89.062 (89.062) [2022-10-07 14:22:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 67.512 Acc@5 88.430 [2022-10-07 14:22:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 67.5% [2022-10-07 14:22:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 67.51% [2022-10-07 14:22:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][0/1251] eta 1:07:49 lr 0.000963 time 3.2533 (3.2533) loss 4.3958 (4.3958) grad_norm 1.3219 (1.3219) [2022-10-07 14:23:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][100/1251] eta 0:06:48 lr 0.000963 time 0.3186 (0.3551) loss 4.4784 (4.0579) 
grad_norm 1.0310 (1.1120) [2022-10-07 14:23:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][200/1251] eta 0:05:57 lr 0.000963 time 0.3245 (0.3402) loss 4.1279 (4.0609) grad_norm 1.1528 (1.1212) [2022-10-07 14:24:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][300/1251] eta 0:05:18 lr 0.000963 time 0.3228 (0.3354) loss 3.7799 (4.0610) grad_norm 0.9929 (1.1082) [2022-10-07 14:24:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][400/1251] eta 0:04:43 lr 0.000963 time 0.3273 (0.3327) loss 4.1809 (4.0635) grad_norm 1.1179 (1.1182) [2022-10-07 14:25:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][500/1251] eta 0:04:08 lr 0.000963 time 0.3212 (0.3312) loss 3.9359 (4.0776) grad_norm 1.0883 (1.1283) [2022-10-07 14:25:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][600/1251] eta 0:03:34 lr 0.000962 time 0.3202 (0.3300) loss 4.1672 (4.0779) grad_norm 1.1833 (1.1305) [2022-10-07 14:26:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][700/1251] eta 0:03:01 lr 0.000962 time 0.3243 (0.3292) loss 4.3236 (4.0744) grad_norm 1.0199 (1.1298) [2022-10-07 14:26:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][800/1251] eta 0:02:28 lr 0.000962 time 0.3181 (0.3285) loss 4.1584 (4.0766) grad_norm 1.2760 (1.1286) [2022-10-07 14:27:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][900/1251] eta 0:01:55 lr 0.000962 time 0.3235 (0.3279) loss 4.3497 (4.0763) grad_norm 1.1710 (1.1287) [2022-10-07 14:27:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][1000/1251] eta 0:01:22 lr 0.000962 time 0.3208 (0.3275) loss 4.1588 (4.0762) grad_norm 1.2710 (1.1285) [2022-10-07 14:28:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [37/300][1100/1251] eta 0:00:49 lr 0.000962 time 0.3215 (0.3272) loss 4.0016 (4.0752) grad_norm 1.0985 (1.1293) [2022-10-07 14:29:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[37/300][1200/1251] eta 0:00:16 lr 0.000961 time 0.3255 (0.3269) loss 3.9964 (4.0771) grad_norm 1.0457 (1.1296) [2022-10-07 14:29:20 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 37 training takes 0:06:49 [2022-10-07 14:29:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.496 (2.496) Loss 1.3942 (1.3942) Acc@1 67.285 (67.285) Acc@5 89.355 (89.355) [2022-10-07 14:29:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 67.988 Acc@5 88.846 [2022-10-07 14:29:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.0% [2022-10-07 14:29:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 67.99% [2022-10-07 14:29:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][0/1251] eta 1:06:32 lr 0.000961 time 3.1917 (3.1917) loss 4.0460 (4.0460) grad_norm 1.2547 (1.2547) [2022-10-07 14:30:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][100/1251] eta 0:06:47 lr 0.000961 time 0.3256 (0.3540) loss 4.0614 (4.0930) grad_norm 1.3680 (1.1695) [2022-10-07 14:30:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][200/1251] eta 0:05:57 lr 0.000961 time 0.3260 (0.3399) loss 4.1226 (4.0915) grad_norm 1.0040 (1.1441) [2022-10-07 14:31:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][300/1251] eta 0:05:18 lr 0.000961 time 0.3228 (0.3349) loss 3.7848 (4.0863) grad_norm 1.2198 (1.1448) [2022-10-07 14:31:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][400/1251] eta 0:04:42 lr 0.000961 time 0.3251 (0.3325) loss 3.8334 (4.0801) grad_norm 1.2040 (1.1361) [2022-10-07 14:32:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][500/1251] eta 0:04:08 lr 0.000961 time 0.3215 (0.3308) loss 3.7834 (4.0760) grad_norm 1.0034 (1.1396) [2022-10-07 14:32:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][600/1251] eta 0:03:34 lr 0.000960 time 0.3309 (0.3296) loss 4.4426 (4.0747) 
grad_norm 1.2553 (1.1332) [2022-10-07 14:33:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][700/1251] eta 0:03:01 lr 0.000960 time 0.3217 (0.3288) loss 4.0690 (4.0720) grad_norm 1.6820 (1.1325) [2022-10-07 14:33:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][800/1251] eta 0:02:28 lr 0.000960 time 0.3300 (0.3282) loss 3.9358 (4.0691) grad_norm 1.4240 (1.1337) [2022-10-07 14:34:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][900/1251] eta 0:01:55 lr 0.000960 time 0.3199 (0.3278) loss 4.1973 (4.0685) grad_norm 1.0312 (1.1329) [2022-10-07 14:35:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][1000/1251] eta 0:01:22 lr 0.000960 time 0.3256 (0.3274) loss 4.2213 (4.0694) grad_norm 1.3295 (1.1331) [2022-10-07 14:35:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][1100/1251] eta 0:00:49 lr 0.000960 time 0.3291 (0.3272) loss 3.9147 (4.0690) grad_norm 1.0263 (1.1335) [2022-10-07 14:36:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [38/300][1200/1251] eta 0:00:16 lr 0.000959 time 0.3237 (0.3271) loss 3.9079 (4.0663) grad_norm 1.1387 (1.1336) [2022-10-07 14:36:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 38 training takes 0:06:49 [2022-10-07 14:36:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.169 (3.169) Loss 1.3703 (1.3703) Acc@1 68.848 (68.848) Acc@5 88.281 (88.281) [2022-10-07 14:36:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.356 Acc@5 88.976 [2022-10-07 14:36:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-10-07 14:36:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.36% [2022-10-07 14:36:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][0/1251] eta 1:05:24 lr 0.000959 time 3.1373 (3.1373) loss 3.8913 (3.8913) grad_norm 1.1697 (1.1697) [2022-10-07 14:37:11 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [39/300][100/1251] eta 0:06:46 lr 0.000959 time 0.3210 (0.3533) loss 4.0630 (4.0344) grad_norm 1.1163 (1.1002) [2022-10-07 14:37:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][200/1251] eta 0:05:56 lr 0.000959 time 0.3240 (0.3391) loss 4.3559 (4.0452) grad_norm 1.2979 (1.1039) [2022-10-07 14:38:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][300/1251] eta 0:05:17 lr 0.000959 time 0.3276 (0.3344) loss 4.5431 (4.0426) grad_norm 1.1485 (1.1038) [2022-10-07 14:38:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][400/1251] eta 0:04:42 lr 0.000959 time 0.3240 (0.3319) loss 4.3550 (4.0445) grad_norm 1.0696 (1.1060) [2022-10-07 14:39:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][500/1251] eta 0:04:08 lr 0.000958 time 0.3204 (0.3303) loss 4.1583 (4.0478) grad_norm 1.4651 (1.1110) [2022-10-07 14:39:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][600/1251] eta 0:03:34 lr 0.000958 time 0.3241 (0.3293) loss 4.0007 (4.0483) grad_norm 1.2286 (1.1219) [2022-10-07 14:40:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][700/1251] eta 0:03:01 lr 0.000958 time 0.3232 (0.3285) loss 3.7927 (4.0433) grad_norm 1.1199 (1.1195) [2022-10-07 14:40:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][800/1251] eta 0:02:27 lr 0.000958 time 0.3238 (0.3280) loss 3.8973 (4.0431) grad_norm 1.4001 (1.1211) [2022-10-07 14:41:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][900/1251] eta 0:01:54 lr 0.000958 time 0.3210 (0.3276) loss 4.3698 (4.0435) grad_norm 1.0305 (1.1196) [2022-10-07 14:42:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][1000/1251] eta 0:01:22 lr 0.000958 time 0.3232 (0.3274) loss 3.9373 (4.0444) grad_norm 1.2336 (1.1213) [2022-10-07 14:42:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][1100/1251] eta 0:00:49 lr 0.000957 time 0.3216 (0.3272) loss 4.1042 (4.0462) 
grad_norm 1.0387 (1.1225) [2022-10-07 14:43:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [39/300][1200/1251] eta 0:00:16 lr 0.000957 time 0.3287 (0.3271) loss 3.8686 (4.0502) grad_norm 1.1631 (1.1226) [2022-10-07 14:43:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 39 training takes 0:06:49 [2022-10-07 14:43:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.773 (2.773) Loss 1.3559 (1.3559) Acc@1 69.141 (69.141) Acc@5 90.137 (90.137) [2022-10-07 14:43:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.360 Acc@5 89.024 [2022-10-07 14:43:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-10-07 14:43:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.36% [2022-10-07 14:43:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][0/1251] eta 1:00:24 lr 0.000957 time 2.8970 (2.8970) loss 3.9243 (3.9243) grad_norm 1.5184 (1.5184) [2022-10-07 14:44:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][100/1251] eta 0:06:45 lr 0.000957 time 0.3249 (0.3521) loss 4.0216 (4.0220) grad_norm 1.3007 (1.1701) [2022-10-07 14:44:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][200/1251] eta 0:05:56 lr 0.000957 time 0.3265 (0.3391) loss 4.1631 (4.0137) grad_norm 1.0354 (1.1481) [2022-10-07 14:45:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][300/1251] eta 0:05:18 lr 0.000957 time 0.3305 (0.3346) loss 4.2790 (4.0207) grad_norm 1.1142 (1.1411) [2022-10-07 14:45:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][400/1251] eta 0:04:42 lr 0.000957 time 0.3209 (0.3322) loss 4.2841 (4.0241) grad_norm 1.0435 (1.1392) [2022-10-07 14:46:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][500/1251] eta 0:04:08 lr 0.000956 time 0.3238 (0.3307) loss 4.3370 (4.0318) grad_norm 1.2396 (1.1422) [2022-10-07 14:46:57 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [40/300][600/1251] eta 0:03:34 lr 0.000956 time 0.3297 (0.3297) loss 3.5844 (4.0318) grad_norm 1.4688 (1.1453) [2022-10-07 14:47:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][700/1251] eta 0:03:01 lr 0.000956 time 0.3214 (0.3289) loss 3.9700 (4.0398) grad_norm 0.9473 (1.1400) [2022-10-07 14:48:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][800/1251] eta 0:02:28 lr 0.000956 time 0.3225 (0.3283) loss 3.7655 (4.0391) grad_norm 1.2078 (1.1369) [2022-10-07 14:48:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][900/1251] eta 0:01:55 lr 0.000956 time 0.3224 (0.3279) loss 4.1099 (4.0404) grad_norm 1.2990 (1.1346) [2022-10-07 14:49:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][1000/1251] eta 0:01:22 lr 0.000956 time 0.3250 (0.3276) loss 3.9354 (4.0377) grad_norm 1.1739 (1.1355) [2022-10-07 14:49:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][1100/1251] eta 0:00:49 lr 0.000955 time 0.3250 (0.3274) loss 4.0683 (4.0384) grad_norm 1.1966 (1.1375) [2022-10-07 14:50:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [40/300][1200/1251] eta 0:00:16 lr 0.000955 time 0.3227 (0.3272) loss 3.9571 (4.0374) grad_norm 1.6658 (1.1382) [2022-10-07 14:50:28 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 40 training takes 0:06:49 [2022-10-07 14:50:28 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_40 saving...... [2022-10-07 14:50:29 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_40 saved !!! 
[2022-10-07 14:50:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.653 (2.653) Loss 1.4049 (1.4049) Acc@1 68.359 (68.359) Acc@5 87.988 (87.988) [2022-10-07 14:50:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.742 Acc@5 89.152 [2022-10-07 14:50:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-07 14:50:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.74% [2022-10-07 14:50:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][0/1251] eta 0:51:11 lr 0.000955 time 2.4555 (2.4555) loss 3.7390 (3.7390) grad_norm 1.0270 (1.0270) [2022-10-07 14:51:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][100/1251] eta 0:06:39 lr 0.000955 time 0.3227 (0.3467) loss 4.0304 (4.0423) grad_norm 1.2824 (1.1304) [2022-10-07 14:51:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][200/1251] eta 0:05:52 lr 0.000955 time 0.3231 (0.3357) loss 3.6754 (4.0423) grad_norm 1.1366 (1.1188) [2022-10-07 14:52:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][300/1251] eta 0:05:15 lr 0.000955 time 0.3196 (0.3318) loss 4.2037 (4.0451) grad_norm 1.1470 (1.1362) [2022-10-07 14:52:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][400/1251] eta 0:04:40 lr 0.000954 time 0.3251 (0.3297) loss 4.0667 (4.0432) grad_norm 0.9530 (1.1312) [2022-10-07 14:53:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][500/1251] eta 0:04:06 lr 0.000954 time 0.3218 (0.3285) loss 4.1293 (4.0372) grad_norm 1.0450 (1.1324) [2022-10-07 14:53:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][600/1251] eta 0:03:33 lr 0.000954 time 0.3192 (0.3277) loss 4.2006 (4.0357) grad_norm 1.1439 (1.1265) [2022-10-07 14:54:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][700/1251] eta 0:03:00 lr 0.000954 time 0.3270 (0.3271) loss 4.3095 (4.0360) grad_norm 1.1758 (1.1278) 
[2022-10-07 14:55:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][800/1251] eta 0:02:27 lr 0.000954 time 0.3243 (0.3268) loss 3.9719 (4.0324) grad_norm 1.4967 (1.1330) [2022-10-07 14:55:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][900/1251] eta 0:01:54 lr 0.000954 time 0.3237 (0.3267) loss 4.0279 (4.0354) grad_norm 1.1987 (1.1364) [2022-10-07 14:56:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][1000/1251] eta 0:01:21 lr 0.000953 time 0.3290 (0.3266) loss 3.9460 (4.0377) grad_norm 1.0294 (1.1349) [2022-10-07 14:56:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][1100/1251] eta 0:00:49 lr 0.000953 time 0.3276 (0.3266) loss 4.0681 (4.0378) grad_norm 1.0711 (1.1355) [2022-10-07 14:57:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [41/300][1200/1251] eta 0:00:16 lr 0.000953 time 0.3307 (0.3266) loss 3.7773 (4.0366) grad_norm 0.9347 (1.1341) [2022-10-07 14:57:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 41 training takes 0:06:48 [2022-10-07 14:57:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.172 (3.172) Loss 1.4272 (1.4272) Acc@1 67.773 (67.773) Acc@5 88.379 (88.379) [2022-10-07 14:57:44 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.694 Acc@5 89.276 [2022-10-07 14:57:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-07 14:57:44 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.74% [2022-10-07 14:57:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][0/1251] eta 0:52:04 lr 0.000953 time 2.4977 (2.4977) loss 4.3099 (4.3099) grad_norm 0.9268 (0.9268) [2022-10-07 14:58:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][100/1251] eta 0:06:43 lr 0.000953 time 0.3318 (0.3506) loss 4.1875 (4.0036) grad_norm 1.3327 (1.1209) [2022-10-07 14:58:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[42/300][200/1251] eta 0:05:56 lr 0.000953 time 0.3273 (0.3389) loss 3.7015 (3.9855) grad_norm 1.4623 (1.1318) [2022-10-07 14:59:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][300/1251] eta 0:05:18 lr 0.000952 time 0.3241 (0.3347) loss 3.8689 (3.9961) grad_norm 0.9780 (1.1327) [2022-10-07 14:59:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][400/1251] eta 0:04:43 lr 0.000952 time 0.3278 (0.3327) loss 3.7986 (3.9958) grad_norm 1.2447 (1.1387) [2022-10-07 15:00:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][500/1251] eta 0:04:08 lr 0.000952 time 0.3305 (0.3314) loss 4.3625 (3.9996) grad_norm 1.0100 (1.1364) [2022-10-07 15:01:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][600/1251] eta 0:03:35 lr 0.000952 time 0.3233 (0.3304) loss 3.7235 (3.9998) grad_norm 1.1254 (1.1314) [2022-10-07 15:01:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][700/1251] eta 0:03:01 lr 0.000952 time 0.3253 (0.3298) loss 4.1369 (4.0032) grad_norm 1.1937 (1.1301) [2022-10-07 15:02:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][800/1251] eta 0:02:28 lr 0.000951 time 0.3255 (0.3293) loss 3.9403 (4.0059) grad_norm 1.3399 (1.1332) [2022-10-07 15:02:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][900/1251] eta 0:01:55 lr 0.000951 time 0.3335 (0.3290) loss 4.0994 (4.0124) grad_norm 1.1384 (1.1317) [2022-10-07 15:03:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][1000/1251] eta 0:01:22 lr 0.000951 time 0.3282 (0.3287) loss 3.8789 (4.0124) grad_norm 1.0637 (1.1304) [2022-10-07 15:03:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][1100/1251] eta 0:00:49 lr 0.000951 time 0.3236 (0.3286) loss 3.9448 (4.0124) grad_norm 1.0449 (1.1301) [2022-10-07 15:04:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [42/300][1200/1251] eta 0:00:16 lr 0.000951 time 0.3285 (0.3285) loss 4.1782 (4.0114) grad_norm 1.0547 
(1.1314) [2022-10-07 15:04:36 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 42 training takes 0:06:51 [2022-10-07 15:04:38 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.875 (2.875) Loss 1.4186 (1.4186) Acc@1 68.848 (68.848) Acc@5 88.672 (88.672) [2022-10-07 15:04:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.684 Acc@5 89.204 [2022-10-07 15:04:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-07 15:04:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.74% [2022-10-07 15:04:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][0/1251] eta 0:58:15 lr 0.000951 time 2.7943 (2.7943) loss 3.9191 (3.9191) grad_norm 1.1337 (1.1337) [2022-10-07 15:05:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][100/1251] eta 0:06:42 lr 0.000950 time 0.3237 (0.3496) loss 3.9489 (4.0135) grad_norm 1.0814 (1.1244) [2022-10-07 15:05:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][200/1251] eta 0:05:54 lr 0.000950 time 0.3294 (0.3370) loss 4.1619 (4.0060) grad_norm 1.0815 (1.1385) [2022-10-07 15:06:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][300/1251] eta 0:05:16 lr 0.000950 time 0.3230 (0.3326) loss 3.7860 (4.0064) grad_norm 1.0634 (1.1356) [2022-10-07 15:07:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][400/1251] eta 0:04:41 lr 0.000950 time 0.3269 (0.3304) loss 4.1176 (4.0026) grad_norm 1.1227 (1.1368) [2022-10-07 15:07:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][500/1251] eta 0:04:07 lr 0.000950 time 0.3235 (0.3292) loss 4.2519 (4.0100) grad_norm 1.0056 (1.1315) [2022-10-07 15:08:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][600/1251] eta 0:03:33 lr 0.000950 time 0.3208 (0.3283) loss 4.0610 (4.0132) grad_norm 0.9799 (1.1306) [2022-10-07 15:08:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[43/300][700/1251] eta 0:03:00 lr 0.000949 time 0.3239 (0.3277) loss 3.9722 (4.0111) grad_norm 1.0210 (1.1276) [2022-10-07 15:09:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][800/1251] eta 0:02:27 lr 0.000949 time 0.3210 (0.3274) loss 4.1488 (4.0086) grad_norm 1.5248 (1.1296) [2022-10-07 15:09:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][900/1251] eta 0:01:54 lr 0.000949 time 0.3263 (0.3274) loss 4.0430 (4.0074) grad_norm 1.0015 (1.1300) [2022-10-07 15:10:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][1000/1251] eta 0:01:22 lr 0.000949 time 0.3262 (0.3272) loss 4.0289 (4.0051) grad_norm 1.6511 (1.1288) [2022-10-07 15:10:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][1100/1251] eta 0:00:49 lr 0.000949 time 0.3249 (0.3272) loss 4.0565 (4.0077) grad_norm 1.0425 (1.1286) [2022-10-07 15:11:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [43/300][1200/1251] eta 0:00:16 lr 0.000948 time 0.3296 (0.3272) loss 3.7089 (4.0053) grad_norm 1.5033 (1.1294) [2022-10-07 15:11:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 43 training takes 0:06:49 [2022-10-07 15:11:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.437 (2.437) Loss 1.3020 (1.3020) Acc@1 70.312 (70.312) Acc@5 90.918 (90.918) [2022-10-07 15:11:52 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 68.926 Acc@5 89.296 [2022-10-07 15:11:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 68.9% [2022-10-07 15:11:52 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 68.93% [2022-10-07 15:11:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][0/1251] eta 1:07:25 lr 0.000948 time 3.2340 (3.2340) loss 3.7239 (3.7239) grad_norm 1.1882 (1.1882) [2022-10-07 15:12:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][100/1251] eta 0:06:48 lr 0.000948 time 0.3263 (0.3551) loss 4.0129 (4.0121) 
grad_norm 1.1195 (1.1592) [2022-10-07 15:13:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][200/1251] eta 0:05:57 lr 0.000948 time 0.3263 (0.3404) loss 3.6768 (3.9823) grad_norm 1.0601 (1.1456) [2022-10-07 15:13:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][300/1251] eta 0:05:18 lr 0.000948 time 0.3232 (0.3352) loss 3.9449 (3.9818) grad_norm 1.1607 (1.1371) [2022-10-07 15:14:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][400/1251] eta 0:04:42 lr 0.000948 time 0.3243 (0.3325) loss 4.2867 (3.9960) grad_norm 1.1073 (1.1311) [2022-10-07 15:14:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][500/1251] eta 0:04:08 lr 0.000947 time 0.3226 (0.3308) loss 4.1836 (3.9975) grad_norm 1.1685 (1.1383) [2022-10-07 15:15:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][600/1251] eta 0:03:34 lr 0.000947 time 0.3252 (0.3297) loss 3.9450 (3.9991) grad_norm 1.0450 (1.1339) [2022-10-07 15:15:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][700/1251] eta 0:03:01 lr 0.000947 time 0.3231 (0.3290) loss 3.7219 (3.9945) grad_norm 0.9754 (1.1356) [2022-10-07 15:16:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][800/1251] eta 0:02:28 lr 0.000947 time 0.3258 (0.3284) loss 3.9894 (3.9909) grad_norm 1.1766 (1.1405) [2022-10-07 15:16:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][900/1251] eta 0:01:55 lr 0.000947 time 0.3210 (0.3280) loss 4.0015 (3.9907) grad_norm 0.9601 (1.1429) [2022-10-07 15:17:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][1000/1251] eta 0:01:22 lr 0.000947 time 0.3183 (0.3278) loss 4.0348 (3.9944) grad_norm 1.3872 (1.1415) [2022-10-07 15:17:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [44/300][1100/1251] eta 0:00:49 lr 0.000946 time 0.3210 (0.3277) loss 4.0364 (3.9961) grad_norm 1.1236 (1.1392) [2022-10-07 15:18:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[44/300][1200/1251] eta 0:00:16 lr 0.000946 time 0.3204 (0.3277) loss 3.9411 (3.9936) grad_norm 0.9677 (1.1361) [2022-10-07 15:18:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 44 training takes 0:06:50 [2022-10-07 15:18:45 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.912 (2.912) Loss 1.3254 (1.3254) Acc@1 66.797 (66.797) Acc@5 89.551 (89.551) [2022-10-07 15:18:56 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.380 Acc@5 89.626 [2022-10-07 15:18:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.4% [2022-10-07 15:18:56 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.38% [2022-10-07 15:18:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][0/1251] eta 1:09:23 lr 0.000946 time 3.3284 (3.3284) loss 4.3064 (4.3064) grad_norm 1.4850 (1.4850) [2022-10-07 15:19:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][100/1251] eta 0:06:50 lr 0.000946 time 0.3323 (0.3571) loss 4.3165 (4.0141) grad_norm 1.0834 (1.1534) [2022-10-07 15:20:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][200/1251] eta 0:05:59 lr 0.000946 time 0.3282 (0.3422) loss 3.9424 (4.0034) grad_norm 1.3464 (1.1564) [2022-10-07 15:20:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][300/1251] eta 0:05:20 lr 0.000945 time 0.3296 (0.3370) loss 4.1242 (3.9995) grad_norm 1.1890 (1.1460) [2022-10-07 15:21:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][400/1251] eta 0:04:44 lr 0.000945 time 0.3307 (0.3345) loss 4.1396 (3.9926) grad_norm 1.3948 (1.1345) [2022-10-07 15:21:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][500/1251] eta 0:04:10 lr 0.000945 time 0.3266 (0.3330) loss 4.1788 (3.9968) grad_norm 1.1785 (1.1390) [2022-10-07 15:22:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][600/1251] eta 0:03:36 lr 0.000945 time 0.3266 (0.3318) loss 3.8215 (3.9902) 
grad_norm 1.0577 (1.1330) [2022-10-07 15:22:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][700/1251] eta 0:03:02 lr 0.000945 time 0.3255 (0.3310) loss 4.2204 (3.9942) grad_norm 1.2982 (1.1387) [2022-10-07 15:23:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][800/1251] eta 0:02:29 lr 0.000945 time 0.3295 (0.3304) loss 4.1869 (3.9952) grad_norm 1.1198 (1.1401) [2022-10-07 15:23:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][900/1251] eta 0:01:55 lr 0.000944 time 0.3239 (0.3300) loss 3.7232 (3.9922) grad_norm 1.1860 (1.1381) [2022-10-07 15:24:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][1000/1251] eta 0:01:22 lr 0.000944 time 0.3295 (0.3297) loss 4.0113 (3.9905) grad_norm 0.9721 (1.1344) [2022-10-07 15:24:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][1100/1251] eta 0:00:49 lr 0.000944 time 0.3248 (0.3295) loss 4.1665 (3.9888) grad_norm 1.1965 (1.1355) [2022-10-07 15:25:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [45/300][1200/1251] eta 0:00:16 lr 0.000944 time 0.3269 (0.3293) loss 4.0229 (3.9840) grad_norm 1.2486 (1.1361) [2022-10-07 15:25:48 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 45 training takes 0:06:52 [2022-10-07 15:25:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.972 (2.972) Loss 1.3635 (1.3635) Acc@1 70.215 (70.215) Acc@5 88.965 (88.965) [2022-10-07 15:26:02 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.234 Acc@5 89.518 [2022-10-07 15:26:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.2% [2022-10-07 15:26:02 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.38% [2022-10-07 15:26:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][0/1251] eta 1:06:05 lr 0.000944 time 3.1698 (3.1698) loss 3.8476 (3.8476) grad_norm 1.3322 (1.3322) [2022-10-07 15:26:37 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [46/300][100/1251] eta 0:06:45 lr 0.000943 time 0.3243 (0.3519) loss 4.2421 (3.9841) grad_norm 1.1125 (1.1669) [2022-10-07 15:27:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][200/1251] eta 0:05:54 lr 0.000943 time 0.3210 (0.3377) loss 4.0970 (3.9850) grad_norm 1.0427 (1.1400) [2022-10-07 15:27:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][300/1251] eta 0:05:16 lr 0.000943 time 0.3223 (0.3331) loss 4.0160 (3.9736) grad_norm 1.1162 (1.1415) [2022-10-07 15:28:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][400/1251] eta 0:04:41 lr 0.000943 time 0.3231 (0.3307) loss 3.8114 (3.9754) grad_norm 0.9144 (1.1452) [2022-10-07 15:28:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][500/1251] eta 0:04:07 lr 0.000943 time 0.3246 (0.3292) loss 4.0114 (3.9808) grad_norm 1.2171 (1.1373) [2022-10-07 15:29:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][600/1251] eta 0:03:33 lr 0.000943 time 0.3227 (0.3282) loss 3.9512 (3.9791) grad_norm 1.3455 (1.1357) [2022-10-07 15:29:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][700/1251] eta 0:03:00 lr 0.000942 time 0.3219 (0.3277) loss 4.0522 (3.9793) grad_norm 1.2157 (1.1348) [2022-10-07 15:30:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][800/1251] eta 0:02:27 lr 0.000942 time 0.3247 (0.3277) loss 3.8309 (3.9775) grad_norm 1.0460 (1.1405) [2022-10-07 15:30:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][900/1251] eta 0:01:54 lr 0.000942 time 0.3274 (0.3275) loss 4.2341 (3.9766) grad_norm 1.2386 (1.1370) [2022-10-07 15:31:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][1000/1251] eta 0:01:22 lr 0.000942 time 0.3268 (0.3274) loss 3.9102 (3.9753) grad_norm 1.1133 (1.1361) [2022-10-07 15:32:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][1100/1251] eta 0:00:49 lr 0.000942 time 0.3440 (0.3274) loss 3.9832 (3.9766) 
grad_norm 1.0674 (1.1355) [2022-10-07 15:32:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [46/300][1200/1251] eta 0:00:16 lr 0.000941 time 0.3308 (0.3275) loss 4.0753 (3.9810) grad_norm 1.5529 (1.1386) [2022-10-07 15:32:52 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 46 training takes 0:06:50 [2022-10-07 15:32:55 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.193 (3.193) Loss 1.3152 (1.3152) Acc@1 67.773 (67.773) Acc@5 89.551 (89.551) [2022-10-07 15:33:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.774 Acc@5 89.828 [2022-10-07 15:33:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-10-07 15:33:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.77% [2022-10-07 15:33:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][0/1251] eta 0:57:07 lr 0.000941 time 2.7395 (2.7395) loss 4.1195 (4.1195) grad_norm 1.1605 (1.1605) [2022-10-07 15:33:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][100/1251] eta 0:06:43 lr 0.000941 time 0.3262 (0.3510) loss 3.8883 (3.9621) grad_norm 1.3073 (1.1636) [2022-10-07 15:34:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][200/1251] eta 0:05:55 lr 0.000941 time 0.3292 (0.3385) loss 3.9458 (3.9668) grad_norm 1.0304 (1.1604) [2022-10-07 15:34:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][300/1251] eta 0:05:17 lr 0.000941 time 0.3236 (0.3339) loss 4.0647 (3.9627) grad_norm 1.3241 (1.1470) [2022-10-07 15:35:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][400/1251] eta 0:04:42 lr 0.000940 time 0.3276 (0.3315) loss 3.9297 (3.9670) grad_norm 1.1643 (1.1451) [2022-10-07 15:35:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][500/1251] eta 0:04:07 lr 0.000940 time 0.3224 (0.3302) loss 3.8279 (3.9689) grad_norm 1.3128 (1.1339) [2022-10-07 15:36:23 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [47/300][600/1251] eta 0:03:34 lr 0.000940 time 0.3283 (0.3294) loss 3.7334 (3.9691) grad_norm 1.2025 (1.1316) [2022-10-07 15:36:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][700/1251] eta 0:03:01 lr 0.000940 time 0.3253 (0.3287) loss 4.1366 (3.9678) grad_norm 1.1644 (1.1372) [2022-10-07 15:37:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][800/1251] eta 0:02:28 lr 0.000940 time 0.3249 (0.3282) loss 4.1777 (3.9656) grad_norm 0.9772 (1.1416) [2022-10-07 15:38:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][900/1251] eta 0:01:55 lr 0.000939 time 0.3211 (0.3280) loss 3.6780 (3.9648) grad_norm 1.2829 (1.1465) [2022-10-07 15:38:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][1000/1251] eta 0:01:22 lr 0.000939 time 0.3300 (0.3279) loss 4.1256 (3.9649) grad_norm 1.0478 (1.1449) [2022-10-07 15:39:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][1100/1251] eta 0:00:49 lr 0.000939 time 0.3207 (0.3279) loss 4.0023 (3.9684) grad_norm 1.0983 (1.1430) [2022-10-07 15:39:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [47/300][1200/1251] eta 0:00:16 lr 0.000939 time 0.3269 (0.3279) loss 4.0088 (3.9695) grad_norm 1.2377 (1.1444) [2022-10-07 15:39:56 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 47 training takes 0:06:50 [2022-10-07 15:39:58 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.444 (2.444) Loss 1.2653 (1.2653) Acc@1 70.020 (70.020) Acc@5 90.527 (90.527) [2022-10-07 15:40:09 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.914 Acc@5 89.822 [2022-10-07 15:40:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.9% [2022-10-07 15:40:09 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.91% [2022-10-07 15:40:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][0/1251] eta 0:53:22 lr 0.000939 time 2.5601 (2.5601) loss 
3.6749 (3.6749) grad_norm 1.1122 (1.1122) [2022-10-07 15:40:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][100/1251] eta 0:06:40 lr 0.000939 time 0.3229 (0.3481) loss 3.9424 (3.9369) grad_norm 1.2470 (1.1589) [2022-10-07 15:41:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][200/1251] eta 0:05:53 lr 0.000938 time 0.3247 (0.3361) loss 4.1227 (3.9444) grad_norm 1.1519 (1.1389) [2022-10-07 15:41:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][300/1251] eta 0:05:15 lr 0.000938 time 0.3268 (0.3320) loss 4.0970 (3.9509) grad_norm 1.5531 (1.1348) [2022-10-07 15:42:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][400/1251] eta 0:04:40 lr 0.000938 time 0.3219 (0.3299) loss 4.0074 (3.9608) grad_norm 1.0311 (1.1421) [2022-10-07 15:42:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][500/1251] eta 0:04:06 lr 0.000938 time 0.3250 (0.3286) loss 4.1745 (3.9562) grad_norm 1.1263 (1.1382) [2022-10-07 15:43:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][600/1251] eta 0:03:33 lr 0.000938 time 0.3234 (0.3278) loss 3.8980 (3.9609) grad_norm 1.3815 (1.1384) [2022-10-07 15:43:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][700/1251] eta 0:03:00 lr 0.000937 time 0.3212 (0.3273) loss 4.0996 (3.9620) grad_norm 1.0905 (1.1378) [2022-10-07 15:44:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][800/1251] eta 0:02:27 lr 0.000937 time 0.3282 (0.3270) loss 3.9406 (3.9615) grad_norm 1.2383 (1.1436) [2022-10-07 15:45:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][900/1251] eta 0:01:54 lr 0.000937 time 0.3299 (0.3268) loss 4.2133 (3.9618) grad_norm 1.0448 (1.1482) [2022-10-07 15:45:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][1000/1251] eta 0:01:22 lr 0.000937 time 0.3219 (0.3267) loss 3.8319 (3.9633) grad_norm 1.0008 (1.1457) [2022-10-07 15:46:09 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [48/300][1100/1251] eta 0:00:49 lr 0.000937 time 0.3306 (0.3267) loss 3.8604 (3.9634) grad_norm 1.0100 (1.1418) [2022-10-07 15:46:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [48/300][1200/1251] eta 0:00:16 lr 0.000936 time 0.3258 (0.3268) loss 4.0058 (3.9612) grad_norm 1.1116 (1.1385) [2022-10-07 15:46:58 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 48 training takes 0:06:49 [2022-10-07 15:47:01 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.628 (2.628) Loss 1.3013 (1.3013) Acc@1 70.898 (70.898) Acc@5 90.625 (90.625) [2022-10-07 15:47:12 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.736 Acc@5 89.666 [2022-10-07 15:47:12 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.7% [2022-10-07 15:47:12 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.91% [2022-10-07 15:47:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][0/1251] eta 1:07:03 lr 0.000936 time 3.2166 (3.2166) loss 3.7541 (3.7541) grad_norm 1.5086 (1.5086) [2022-10-07 15:47:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][100/1251] eta 0:06:50 lr 0.000936 time 0.3299 (0.3566) loss 3.7412 (3.9227) grad_norm 0.9618 (1.1586) [2022-10-07 15:48:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][200/1251] eta 0:05:59 lr 0.000936 time 0.3274 (0.3422) loss 4.0931 (3.9318) grad_norm 1.1648 (1.1676) [2022-10-07 15:48:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][300/1251] eta 0:05:20 lr 0.000936 time 0.3265 (0.3370) loss 3.4128 (3.9379) grad_norm 1.1773 (1.1649) [2022-10-07 15:49:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][400/1251] eta 0:04:44 lr 0.000935 time 0.3277 (0.3342) loss 4.0121 (3.9401) grad_norm 0.9887 (1.1538) [2022-10-07 15:49:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][500/1251] eta 0:04:09 lr 0.000935 time 0.3292 (0.3325) loss 4.0789 
(3.9474) grad_norm 1.3125 (1.1554) [2022-10-07 15:50:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][600/1251] eta 0:03:35 lr 0.000935 time 0.3283 (0.3316) loss 4.1343 (3.9448) grad_norm 1.1608 (1.1530) [2022-10-07 15:51:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][700/1251] eta 0:03:02 lr 0.000935 time 0.3220 (0.3307) loss 3.8165 (3.9464) grad_norm 1.2653 (1.1497) [2022-10-07 15:51:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][800/1251] eta 0:02:28 lr 0.000935 time 0.3287 (0.3301) loss 3.9069 (3.9473) grad_norm 1.3235 (1.1457) [2022-10-07 15:52:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][900/1251] eta 0:01:55 lr 0.000934 time 0.3250 (0.3296) loss 3.5721 (3.9497) grad_norm 0.9950 (1.1466) [2022-10-07 15:52:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][1000/1251] eta 0:01:22 lr 0.000934 time 0.3263 (0.3294) loss 4.1055 (3.9512) grad_norm 1.0206 (1.1431) [2022-10-07 15:53:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][1100/1251] eta 0:00:49 lr 0.000934 time 0.3209 (0.3293) loss 3.6594 (3.9534) grad_norm 1.1233 (1.1427) [2022-10-07 15:53:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [49/300][1200/1251] eta 0:00:16 lr 0.000934 time 0.3378 (0.3292) loss 4.2897 (3.9533) grad_norm 0.9245 (1.1445) [2022-10-07 15:54:04 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 49 training takes 0:06:52 [2022-10-07 15:54:07 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.272 (3.272) Loss 1.2783 (1.2783) Acc@1 70.312 (70.312) Acc@5 90.137 (90.137) [2022-10-07 15:54:18 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.730 Acc@5 89.930 [2022-10-07 15:54:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.7% [2022-10-07 15:54:18 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 69.91% [2022-10-07 15:54:21 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][0/1251] eta 1:04:25 lr 0.000934 time 3.0903 (3.0903) loss 3.9131 (3.9131) grad_norm 1.0071 (1.0071) [2022-10-07 15:54:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][100/1251] eta 0:06:47 lr 0.000933 time 0.3215 (0.3538) loss 4.0283 (3.9337) grad_norm 1.0530 (1.1376) [2022-10-07 15:55:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][200/1251] eta 0:05:57 lr 0.000933 time 0.3251 (0.3402) loss 4.0363 (3.9329) grad_norm 1.4535 (1.1503) [2022-10-07 15:55:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][300/1251] eta 0:05:18 lr 0.000933 time 0.3254 (0.3352) loss 3.4248 (3.9276) grad_norm 1.1741 (1.1496) [2022-10-07 15:56:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][400/1251] eta 0:04:43 lr 0.000933 time 0.3294 (0.3327) loss 4.1220 (3.9343) grad_norm 0.9856 (1.1495) [2022-10-07 15:57:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][500/1251] eta 0:04:08 lr 0.000933 time 0.3233 (0.3312) loss 3.9110 (3.9398) grad_norm 1.1965 (1.1420) [2022-10-07 15:57:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][600/1251] eta 0:03:34 lr 0.000932 time 0.3232 (0.3302) loss 3.9087 (3.9466) grad_norm 1.1756 (1.1461) [2022-10-07 15:58:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][700/1251] eta 0:03:01 lr 0.000932 time 0.3246 (0.3294) loss 4.1345 (3.9434) grad_norm 1.2171 (1.1496) [2022-10-07 15:58:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][800/1251] eta 0:02:28 lr 0.000932 time 0.3180 (0.3288) loss 3.6022 (3.9455) grad_norm 1.0836 (1.1449) [2022-10-07 15:59:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][900/1251] eta 0:01:55 lr 0.000932 time 0.3293 (0.3284) loss 3.8754 (3.9477) grad_norm 1.1699 (1.1448) [2022-10-07 15:59:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][1000/1251] eta 0:01:22 lr 0.000932 time 
0.3317 (0.3282) loss 3.9257 (3.9482) grad_norm 1.2317 (1.1428) [2022-10-07 16:00:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][1100/1251] eta 0:00:49 lr 0.000931 time 0.3322 (0.3281) loss 4.2319 (3.9470) grad_norm 1.0365 (1.1438) [2022-10-07 16:00:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [50/300][1200/1251] eta 0:00:16 lr 0.000931 time 0.3269 (0.3280) loss 3.9776 (3.9483) grad_norm 1.3157 (1.1453) [2022-10-07 16:01:08 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 50 training takes 0:06:50 [2022-10-07 16:01:08 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_50 saving...... [2022-10-07 16:01:09 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_50 saved !!! [2022-10-07 16:01:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.251 (2.251) Loss 1.2837 (1.2837) Acc@1 72.559 (72.559) Acc@5 90.625 (90.625) [2022-10-07 16:01:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.046 Acc@5 89.988 [2022-10-07 16:01:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-10-07 16:01:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.05% [2022-10-07 16:01:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][0/1251] eta 1:02:05 lr 0.000931 time 2.9778 (2.9778) loss 3.6283 (3.6283) grad_norm 1.1594 (1.1594) [2022-10-07 16:01:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][100/1251] eta 0:06:44 lr 0.000931 time 0.3184 (0.3517) loss 4.0302 (3.9174) grad_norm 0.9584 (1.1266) [2022-10-07 16:02:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][200/1251] eta 0:05:56 lr 0.000931 time 0.3284 (0.3389) loss 3.9783 (3.9328) grad_norm 1.1036 (1.1265) [2022-10-07 16:03:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][300/1251] eta 0:05:18 lr 
0.000930 time 0.3221 (0.3344) loss 4.1290 (3.9273) grad_norm 1.0718 (1.1394) [2022-10-07 16:03:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][400/1251] eta 0:04:42 lr 0.000930 time 0.3265 (0.3320) loss 3.9862 (3.9293) grad_norm 0.9850 (1.1457) [2022-10-07 16:04:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][500/1251] eta 0:04:08 lr 0.000930 time 0.3245 (0.3307) loss 4.0279 (3.9271) grad_norm 1.0717 (1.1379) [2022-10-07 16:04:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][600/1251] eta 0:03:34 lr 0.000930 time 0.3238 (0.3299) loss 3.8765 (3.9278) grad_norm 1.0520 (1.1318) [2022-10-07 16:05:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][700/1251] eta 0:03:01 lr 0.000930 time 0.3265 (0.3295) loss 3.5936 (3.9287) grad_norm 1.1761 (1.1354) [2022-10-07 16:05:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][800/1251] eta 0:02:28 lr 0.000929 time 0.3323 (0.3292) loss 3.7701 (3.9261) grad_norm 1.3535 (1.1350) [2022-10-07 16:06:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][900/1251] eta 0:01:55 lr 0.000929 time 0.3326 (0.3290) loss 3.8894 (3.9280) grad_norm 1.0704 (1.1344) [2022-10-07 16:06:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][1000/1251] eta 0:01:22 lr 0.000929 time 0.3285 (0.3289) loss 4.0250 (3.9268) grad_norm 1.6218 (1.1352) [2022-10-07 16:07:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][1100/1251] eta 0:00:49 lr 0.000929 time 0.3271 (0.3288) loss 3.8966 (3.9282) grad_norm 1.2422 (1.1364) [2022-10-07 16:07:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [51/300][1200/1251] eta 0:00:16 lr 0.000929 time 0.3371 (0.3289) loss 3.8366 (3.9297) grad_norm 1.0057 (1.1343) [2022-10-07 16:08:14 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 51 training takes 0:06:51 [2022-10-07 16:08:16 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.742 (2.742) Loss 1.3397 
(1.3397) Acc@1 69.238 (69.238) Acc@5 89.355 (89.355) [2022-10-07 16:08:27 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 69.784 Acc@5 89.886 [2022-10-07 16:08:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-10-07 16:08:27 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.05% [2022-10-07 16:08:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][0/1251] eta 0:53:28 lr 0.000928 time 2.5648 (2.5648) loss 3.8386 (3.8386) grad_norm 1.2921 (1.2921) [2022-10-07 16:09:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][100/1251] eta 0:06:43 lr 0.000928 time 0.3203 (0.3505) loss 4.1111 (3.9178) grad_norm 1.2069 (1.1884) [2022-10-07 16:09:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][200/1251] eta 0:05:55 lr 0.000928 time 0.3256 (0.3379) loss 4.0133 (3.9200) grad_norm 1.0822 (1.1647) [2022-10-07 16:10:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][300/1251] eta 0:05:17 lr 0.000928 time 0.3242 (0.3336) loss 3.9779 (3.9169) grad_norm 1.1616 (1.1680) [2022-10-07 16:10:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][400/1251] eta 0:04:41 lr 0.000928 time 0.3242 (0.3312) loss 4.1515 (3.9246) grad_norm 1.0957 (1.1612) [2022-10-07 16:11:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][500/1251] eta 0:04:08 lr 0.000927 time 0.3235 (0.3302) loss 3.9558 (3.9245) grad_norm 0.9901 (1.1564) [2022-10-07 16:11:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][600/1251] eta 0:03:34 lr 0.000927 time 0.3273 (0.3293) loss 3.7713 (3.9238) grad_norm 1.0009 (1.1490) [2022-10-07 16:12:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][700/1251] eta 0:03:01 lr 0.000927 time 0.3277 (0.3287) loss 4.0011 (3.9227) grad_norm 1.1002 (1.1512) [2022-10-07 16:12:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][800/1251] eta 0:02:28 lr 
0.000927 time 0.3366 (0.3284) loss 4.4187 (3.9212) grad_norm 1.0546 (1.1484) [2022-10-07 16:13:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][900/1251] eta 0:01:55 lr 0.000926 time 0.3274 (0.3283) loss 4.0755 (3.9240) grad_norm 1.0307 (1.1478) [2022-10-07 16:13:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][1000/1251] eta 0:01:22 lr 0.000926 time 0.3233 (0.3283) loss 4.0959 (3.9254) grad_norm 1.0588 (1.1470) [2022-10-07 16:14:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][1100/1251] eta 0:00:49 lr 0.000926 time 0.3248 (0.3284) loss 4.0490 (3.9259) grad_norm 1.0830 (1.1456) [2022-10-07 16:15:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [52/300][1200/1251] eta 0:00:16 lr 0.000926 time 0.3314 (0.3284) loss 3.7574 (3.9241) grad_norm 1.1057 (1.1465) [2022-10-07 16:15:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 52 training takes 0:06:51 [2022-10-07 16:15:21 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.782 (2.782) Loss 1.2800 (1.2800) Acc@1 68.164 (68.164) Acc@5 91.406 (91.406) [2022-10-07 16:15:32 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.456 Acc@5 90.298 [2022-10-07 16:15:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-10-07 16:15:32 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.46% [2022-10-07 16:15:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][0/1251] eta 1:06:04 lr 0.000926 time 3.1691 (3.1691) loss 3.8151 (3.8151) grad_norm 1.0659 (1.0659) [2022-10-07 16:16:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][100/1251] eta 0:06:48 lr 0.000925 time 0.3312 (0.3545) loss 4.0210 (3.9092) grad_norm 1.2297 (1.1643) [2022-10-07 16:16:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][200/1251] eta 0:05:57 lr 0.000925 time 0.3275 (0.3402) loss 4.0142 (3.8901) grad_norm 1.1554 (1.1499) 
[2022-10-07 16:17:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][300/1251] eta 0:05:18 lr 0.000925 time 0.3220 (0.3353) loss 3.6621 (3.8965) grad_norm 1.2016 (1.1552) [2022-10-07 16:17:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][400/1251] eta 0:04:43 lr 0.000925 time 0.3229 (0.3329) loss 3.8125 (3.9050) grad_norm 1.0726 (1.1559) [2022-10-07 16:18:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][500/1251] eta 0:04:08 lr 0.000925 time 0.3318 (0.3314) loss 4.0023 (3.9094) grad_norm 1.0847 (1.1513) [2022-10-07 16:18:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][600/1251] eta 0:03:35 lr 0.000924 time 0.3255 (0.3305) loss 3.8350 (3.9129) grad_norm 1.1854 (1.1474) [2022-10-07 16:19:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][700/1251] eta 0:03:01 lr 0.000924 time 0.3283 (0.3299) loss 3.9336 (3.9091) grad_norm 0.9675 (1.1525) [2022-10-07 16:19:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][800/1251] eta 0:02:28 lr 0.000924 time 0.3266 (0.3296) loss 3.7521 (3.9103) grad_norm 1.3942 (1.1567) [2022-10-07 16:20:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][900/1251] eta 0:01:55 lr 0.000924 time 0.3368 (0.3295) loss 3.7701 (3.9096) grad_norm 1.0431 (1.1563) [2022-10-07 16:21:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][1000/1251] eta 0:01:22 lr 0.000923 time 0.3357 (0.3294) loss 3.9419 (3.9103) grad_norm 1.1725 (1.1557) [2022-10-07 16:21:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][1100/1251] eta 0:00:49 lr 0.000923 time 0.3246 (0.3295) loss 3.6088 (3.9120) grad_norm 1.0606 (1.1514) [2022-10-07 16:22:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [53/300][1200/1251] eta 0:00:16 lr 0.000923 time 0.3387 (0.3295) loss 4.3462 (3.9133) grad_norm 1.1106 (1.1559) [2022-10-07 16:22:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 53 training takes 0:06:52 
[2022-10-07 16:22:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.423 (2.423) Loss 1.2203 (1.2203) Acc@1 72.754 (72.754) Acc@5 91.113 (91.113) [2022-10-07 16:22:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.560 Acc@5 90.332 [2022-10-07 16:22:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.6% [2022-10-07 16:22:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.56% [2022-10-07 16:22:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][0/1251] eta 0:56:27 lr 0.000923 time 2.7079 (2.7079) loss 3.6476 (3.6476) grad_norm 1.1083 (1.1083) [2022-10-07 16:23:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][100/1251] eta 0:06:42 lr 0.000923 time 0.3213 (0.3496) loss 4.1777 (3.9084) grad_norm 1.3700 (1.1861) [2022-10-07 16:23:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][200/1251] eta 0:05:54 lr 0.000922 time 0.3237 (0.3369) loss 3.7549 (3.9207) grad_norm 1.1113 (1.1771) [2022-10-07 16:24:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][300/1251] eta 0:05:16 lr 0.000922 time 0.3260 (0.3329) loss 3.7763 (3.9262) grad_norm 1.0372 (1.1689) [2022-10-07 16:24:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][400/1251] eta 0:04:41 lr 0.000922 time 0.3260 (0.3308) loss 4.2299 (3.9174) grad_norm 1.1900 (1.1577) [2022-10-07 16:25:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][500/1251] eta 0:04:07 lr 0.000922 time 0.3288 (0.3297) loss 3.7699 (3.9121) grad_norm 1.0531 (1.1538) [2022-10-07 16:25:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][600/1251] eta 0:03:34 lr 0.000922 time 0.3263 (0.3290) loss 3.8056 (3.9154) grad_norm 1.3345 (1.1589) [2022-10-07 16:26:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][700/1251] eta 0:03:01 lr 0.000921 time 0.3288 (0.3286) loss 3.8872 (3.9171) grad_norm 1.0842 (1.1551) 
[2022-10-07 16:27:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][800/1251] eta 0:02:28 lr 0.000921 time 0.3247 (0.3284) loss 4.1391 (3.9189) grad_norm 1.1126 (1.1545) [2022-10-07 16:27:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][900/1251] eta 0:01:55 lr 0.000921 time 0.3284 (0.3283) loss 3.7185 (3.9182) grad_norm 1.1311 (1.1535) [2022-10-07 16:28:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][1000/1251] eta 0:01:22 lr 0.000921 time 0.3199 (0.3282) loss 3.9563 (3.9160) grad_norm 1.0316 (1.1574) [2022-10-07 16:28:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][1100/1251] eta 0:00:49 lr 0.000920 time 0.3251 (0.3282) loss 3.7536 (3.9193) grad_norm 1.3989 (1.1591) [2022-10-07 16:29:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [54/300][1200/1251] eta 0:00:16 lr 0.000920 time 0.3265 (0.3282) loss 3.7464 (3.9171) grad_norm 0.9140 (1.1555) [2022-10-07 16:29:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 54 training takes 0:06:50 [2022-10-07 16:29:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.848 (2.848) Loss 1.1618 (1.1618) Acc@1 74.121 (74.121) Acc@5 90.527 (90.527) [2022-10-07 16:29:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.506 Acc@5 90.320 [2022-10-07 16:29:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-10-07 16:29:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.56% [2022-10-07 16:29:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][0/1251] eta 1:06:07 lr 0.000920 time 3.1717 (3.1717) loss 3.8293 (3.8293) grad_norm 1.0789 (1.0789) [2022-10-07 16:30:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][100/1251] eta 0:06:45 lr 0.000920 time 0.3303 (0.3525) loss 4.0127 (3.8618) grad_norm 0.9658 (1.1481) [2022-10-07 16:30:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[55/300][200/1251] eta 0:05:55 lr 0.000920 time 0.3254 (0.3386) loss 3.8505 (3.9021) grad_norm 1.3116 (1.1485) [2022-10-07 16:31:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][300/1251] eta 0:05:17 lr 0.000919 time 0.3264 (0.3339) loss 4.1026 (3.9005) grad_norm 1.3918 (1.1559) [2022-10-07 16:31:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][400/1251] eta 0:04:42 lr 0.000919 time 0.3235 (0.3319) loss 3.8212 (3.8990) grad_norm 1.0593 (1.1474) [2022-10-07 16:32:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][500/1251] eta 0:04:08 lr 0.000919 time 0.3251 (0.3303) loss 3.9385 (3.9035) grad_norm 1.0684 (1.1485) [2022-10-07 16:33:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][600/1251] eta 0:03:34 lr 0.000919 time 0.3229 (0.3293) loss 3.8929 (3.9085) grad_norm 1.1192 (1.1486) [2022-10-07 16:33:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][700/1251] eta 0:03:01 lr 0.000919 time 0.3256 (0.3286) loss 3.9584 (3.9114) grad_norm 1.1937 (1.1509) [2022-10-07 16:34:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][800/1251] eta 0:02:28 lr 0.000918 time 0.3220 (0.3284) loss 3.7077 (3.9135) grad_norm 0.9068 (1.1526) [2022-10-07 16:34:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][900/1251] eta 0:01:55 lr 0.000918 time 0.3357 (0.3283) loss 3.9833 (3.9165) grad_norm 0.9830 (1.1546) [2022-10-07 16:35:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][1000/1251] eta 0:01:22 lr 0.000918 time 0.3297 (0.3282) loss 3.8602 (3.9166) grad_norm 1.3016 (1.1558) [2022-10-07 16:35:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][1100/1251] eta 0:00:49 lr 0.000918 time 0.3275 (0.3282) loss 3.7811 (3.9169) grad_norm 1.0737 (1.1565) [2022-10-07 16:36:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [55/300][1200/1251] eta 0:00:16 lr 0.000917 time 0.3345 (0.3282) loss 4.0235 (3.9158) grad_norm 1.2383 
(1.1542) [2022-10-07 16:36:34 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 55 training takes 0:06:50 [2022-10-07 16:36:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.441 (2.441) Loss 1.2104 (1.2104) Acc@1 71.680 (71.680) Acc@5 91.406 (91.406) [2022-10-07 16:36:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.690 Acc@5 90.464 [2022-10-07 16:36:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.7% [2022-10-07 16:36:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.69% [2022-10-07 16:36:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][0/1251] eta 1:06:59 lr 0.000917 time 3.2130 (3.2130) loss 4.1353 (4.1353) grad_norm 1.0400 (1.0400) [2022-10-07 16:37:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][100/1251] eta 0:06:47 lr 0.000917 time 0.3190 (0.3538) loss 3.8439 (3.8913) grad_norm 1.2204 (1.1441) [2022-10-07 16:37:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][200/1251] eta 0:05:56 lr 0.000917 time 0.3251 (0.3395) loss 4.0817 (3.9037) grad_norm 1.0665 (1.1402) [2022-10-07 16:38:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][300/1251] eta 0:05:18 lr 0.000917 time 0.3241 (0.3349) loss 4.0643 (3.9101) grad_norm 1.2005 (1.1529) [2022-10-07 16:39:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][400/1251] eta 0:04:43 lr 0.000916 time 0.3248 (0.3327) loss 3.8883 (3.9121) grad_norm 1.2449 (1.1537) [2022-10-07 16:39:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][500/1251] eta 0:04:08 lr 0.000916 time 0.3265 (0.3313) loss 4.1315 (3.9142) grad_norm 1.2115 (1.1570) [2022-10-07 16:40:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][600/1251] eta 0:03:35 lr 0.000916 time 0.3252 (0.3305) loss 3.9432 (3.9108) grad_norm 1.1037 (1.1536) [2022-10-07 16:40:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[56/300][700/1251] eta 0:03:01 lr 0.000916 time 0.3263 (0.3301) loss 3.5682 (3.9150) grad_norm 1.2911 (1.1554) [2022-10-07 16:41:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][800/1251] eta 0:02:28 lr 0.000915 time 0.3389 (0.3297) loss 3.7059 (3.9135) grad_norm 1.1293 (1.1511) [2022-10-07 16:41:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][900/1251] eta 0:01:55 lr 0.000915 time 0.3220 (0.3296) loss 3.9763 (3.9143) grad_norm 1.1240 (1.1551) [2022-10-07 16:42:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][1000/1251] eta 0:01:22 lr 0.000915 time 0.3281 (0.3295) loss 4.0647 (3.9142) grad_norm 1.1469 (1.1570) [2022-10-07 16:42:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][1100/1251] eta 0:00:49 lr 0.000915 time 0.3236 (0.3296) loss 3.6050 (3.9125) grad_norm 0.9551 (1.1578) [2022-10-07 16:43:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [56/300][1200/1251] eta 0:00:16 lr 0.000915 time 0.3310 (0.3296) loss 3.9453 (3.9138) grad_norm 1.0543 (1.1568) [2022-10-07 16:43:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 56 training takes 0:06:52 [2022-10-07 16:43:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.861 (2.861) Loss 1.2361 (1.2361) Acc@1 72.070 (72.070) Acc@5 91.992 (91.992) [2022-10-07 16:43:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.744 Acc@5 90.286 [2022-10-07 16:43:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.7% [2022-10-07 16:43:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.74% [2022-10-07 16:43:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][0/1251] eta 0:58:51 lr 0.000914 time 2.8226 (2.8226) loss 3.6683 (3.6683) grad_norm 1.1840 (1.1840) [2022-10-07 16:44:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][100/1251] eta 0:06:44 lr 0.000914 time 0.3284 (0.3518) loss 3.9409 (3.8298) 
grad_norm 1.0504 (1.1633) [2022-10-07 16:45:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][200/1251] eta 0:05:56 lr 0.000914 time 0.3283 (0.3392) loss 3.6313 (3.8540) grad_norm 1.5941 (1.1601) [2022-10-07 16:45:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][300/1251] eta 0:05:18 lr 0.000914 time 0.3256 (0.3352) loss 3.8066 (3.8742) grad_norm 1.1082 (1.1525) [2022-10-07 16:46:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][400/1251] eta 0:04:43 lr 0.000913 time 0.3237 (0.3329) loss 3.8591 (3.8753) grad_norm 1.1688 (1.1643) [2022-10-07 16:46:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][500/1251] eta 0:04:08 lr 0.000913 time 0.3344 (0.3314) loss 3.9469 (3.8762) grad_norm 1.3237 (1.1641) [2022-10-07 16:47:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][600/1251] eta 0:03:35 lr 0.000913 time 0.3286 (0.3304) loss 3.8875 (3.8798) grad_norm 1.0902 (1.1631) [2022-10-07 16:47:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][700/1251] eta 0:03:01 lr 0.000913 time 0.3253 (0.3295) loss 4.0505 (3.8840) grad_norm 1.1615 (1.1589) [2022-10-07 16:48:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][800/1251] eta 0:02:28 lr 0.000913 time 0.3230 (0.3290) loss 4.1537 (3.8868) grad_norm 1.1266 (1.1550) [2022-10-07 16:48:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][900/1251] eta 0:01:55 lr 0.000912 time 0.3267 (0.3286) loss 3.8336 (3.8824) grad_norm 1.0897 (1.1563) [2022-10-07 16:49:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][1000/1251] eta 0:01:22 lr 0.000912 time 0.3243 (0.3284) loss 3.7712 (3.8877) grad_norm 1.3722 (1.1557) [2022-10-07 16:49:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [57/300][1100/1251] eta 0:00:49 lr 0.000912 time 0.3216 (0.3282) loss 4.0581 (3.8865) grad_norm 1.5057 (1.1594) [2022-10-07 16:50:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[57/300][1200/1251] eta 0:00:16 lr 0.000912 time 0.3288 (0.3282) loss 3.4422 (3.8865) grad_norm 1.5425 (1.1618) [2022-10-07 16:50:44 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 57 training takes 0:06:50 [2022-10-07 16:50:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.069 (3.069) Loss 1.2763 (1.2763) Acc@1 69.922 (69.922) Acc@5 90.820 (90.820) [2022-10-07 16:50:58 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 70.834 Acc@5 90.506 [2022-10-07 16:50:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 70.8% [2022-10-07 16:50:58 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 70.83% [2022-10-07 16:51:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][0/1251] eta 0:56:50 lr 0.000911 time 2.7264 (2.7264) loss 3.7958 (3.7958) grad_norm 1.4975 (1.4975) [2022-10-07 16:51:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][100/1251] eta 0:06:41 lr 0.000911 time 0.3250 (0.3490) loss 4.2718 (3.8556) grad_norm 1.2814 (1.1666) [2022-10-07 16:52:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][200/1251] eta 0:05:54 lr 0.000911 time 0.3175 (0.3370) loss 3.5738 (3.8948) grad_norm 1.2502 (1.1747) [2022-10-07 16:52:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][300/1251] eta 0:05:17 lr 0.000911 time 0.3284 (0.3334) loss 3.8010 (3.8880) grad_norm 1.2430 (1.1716) [2022-10-07 16:53:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][400/1251] eta 0:04:41 lr 0.000911 time 0.3213 (0.3310) loss 3.9237 (3.8921) grad_norm 1.1491 (1.1707) [2022-10-07 16:53:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][500/1251] eta 0:04:07 lr 0.000910 time 0.3262 (0.3295) loss 3.7341 (3.8899) grad_norm 1.0217 (1.1656) [2022-10-07 16:54:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][600/1251] eta 0:03:33 lr 0.000910 time 0.3234 (0.3285) loss 3.9137 (3.8911) 
grad_norm 0.9987 (1.1646) [2022-10-07 16:54:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][700/1251] eta 0:03:00 lr 0.000910 time 0.3244 (0.3279) loss 3.7635 (3.8910) grad_norm 1.1611 (1.1586) [2022-10-07 16:55:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][800/1251] eta 0:02:27 lr 0.000910 time 0.3250 (0.3274) loss 3.6456 (3.8930) grad_norm 1.2269 (1.1600) [2022-10-07 16:55:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][900/1251] eta 0:01:54 lr 0.000909 time 0.3229 (0.3270) loss 3.7651 (3.8982) grad_norm 1.0474 (1.1594) [2022-10-07 16:56:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][1000/1251] eta 0:01:22 lr 0.000909 time 0.3259 (0.3268) loss 4.1037 (3.8972) grad_norm 1.2673 (1.1559) [2022-10-07 16:56:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][1100/1251] eta 0:00:49 lr 0.000909 time 0.3257 (0.3266) loss 3.6949 (3.8979) grad_norm 1.1452 (1.1564) [2022-10-07 16:57:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [58/300][1200/1251] eta 0:00:16 lr 0.000909 time 0.3217 (0.3265) loss 3.8381 (3.8962) grad_norm 1.0729 (1.1605) [2022-10-07 16:57:46 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 58 training takes 0:06:48 [2022-10-07 16:57:49 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.124 (3.124) Loss 1.3184 (1.3184) Acc@1 69.434 (69.434) Acc@5 88.965 (88.965) [2022-10-07 16:58:00 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.076 Acc@5 90.548 [2022-10-07 16:58:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-10-07 16:58:00 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.08% [2022-10-07 16:58:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][0/1251] eta 1:09:11 lr 0.000908 time 3.3183 (3.3183) loss 4.0772 (4.0772) grad_norm 1.0560 (1.0560) [2022-10-07 16:58:36 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [59/300][100/1251] eta 0:06:50 lr 0.000908 time 0.3341 (0.3570) loss 3.9428 (3.9145) grad_norm 1.2694 (1.1762) [2022-10-07 16:59:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][200/1251] eta 0:05:59 lr 0.000908 time 0.3334 (0.3422) loss 4.0588 (3.9024) grad_norm 1.0906 (1.1592) [2022-10-07 16:59:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][300/1251] eta 0:05:20 lr 0.000908 time 0.3297 (0.3370) loss 3.9001 (3.8909) grad_norm 1.0238 (1.1509) [2022-10-07 17:00:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][400/1251] eta 0:04:44 lr 0.000908 time 0.3255 (0.3342) loss 3.9957 (3.8822) grad_norm 1.3733 (1.1541) [2022-10-07 17:00:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][500/1251] eta 0:04:09 lr 0.000907 time 0.3322 (0.3325) loss 3.6278 (3.8841) grad_norm 1.2199 (1.1562) [2022-10-07 17:01:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][600/1251] eta 0:03:35 lr 0.000907 time 0.3240 (0.3313) loss 3.6578 (3.8861) grad_norm 1.3477 (1.1601) [2022-10-07 17:01:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][700/1251] eta 0:03:02 lr 0.000907 time 0.3260 (0.3305) loss 4.2295 (3.8864) grad_norm 1.0691 (1.1613) [2022-10-07 17:02:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][800/1251] eta 0:02:28 lr 0.000907 time 0.3228 (0.3300) loss 3.9112 (3.8902) grad_norm 1.2618 (1.1672) [2022-10-07 17:02:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][900/1251] eta 0:01:55 lr 0.000906 time 0.3283 (0.3297) loss 3.9370 (3.8877) grad_norm 1.1449 (1.1619) [2022-10-07 17:03:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][1000/1251] eta 0:01:22 lr 0.000906 time 0.3242 (0.3295) loss 3.9036 (3.8906) grad_norm 1.3036 (1.1612) [2022-10-07 17:04:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][1100/1251] eta 0:00:49 lr 0.000906 time 0.3284 (0.3295) loss 3.7490 (3.8926) 
grad_norm 1.0817 (1.1607) [2022-10-07 17:04:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [59/300][1200/1251] eta 0:00:16 lr 0.000906 time 0.3211 (0.3295) loss 4.0948 (3.8932) grad_norm 1.1507 (1.1648) [2022-10-07 17:04:52 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 59 training takes 0:06:52 [2022-10-07 17:04:56 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.277 (3.277) Loss 1.3023 (1.3023) Acc@1 69.336 (69.336) Acc@5 89.160 (89.160) [2022-10-07 17:05:06 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.088 Acc@5 90.632 [2022-10-07 17:05:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-10-07 17:05:06 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.09% [2022-10-07 17:05:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][0/1251] eta 0:58:42 lr 0.000905 time 2.8159 (2.8159) loss 3.7269 (3.7269) grad_norm 1.1898 (1.1898) [2022-10-07 17:05:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][100/1251] eta 0:06:46 lr 0.000905 time 0.3322 (0.3528) loss 3.9881 (3.8584) grad_norm 1.3403 (1.1970) [2022-10-07 17:06:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][200/1251] eta 0:05:57 lr 0.000905 time 0.3242 (0.3403) loss 4.0340 (3.8731) grad_norm 1.1944 (1.1785) [2022-10-07 17:06:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][300/1251] eta 0:05:19 lr 0.000905 time 0.3261 (0.3359) loss 4.1268 (3.8647) grad_norm 1.5129 (1.1740) [2022-10-07 17:07:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][400/1251] eta 0:04:43 lr 0.000904 time 0.3311 (0.3336) loss 3.9726 (3.8745) grad_norm 0.9983 (1.1747) [2022-10-07 17:07:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][500/1251] eta 0:04:09 lr 0.000904 time 0.3313 (0.3323) loss 3.7616 (3.8734) grad_norm 1.1802 (1.1718) [2022-10-07 17:08:25 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [60/300][600/1251] eta 0:03:35 lr 0.000904 time 0.3264 (0.3312) loss 4.1211 (3.8763) grad_norm 1.2232 (1.1717) [2022-10-07 17:08:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][700/1251] eta 0:03:02 lr 0.000904 time 0.3258 (0.3303) loss 3.9574 (3.8733) grad_norm 1.4377 (1.1716) [2022-10-07 17:09:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][800/1251] eta 0:02:28 lr 0.000904 time 0.3306 (0.3297) loss 3.8377 (3.8755) grad_norm 1.4569 (1.1716) [2022-10-07 17:10:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][900/1251] eta 0:01:55 lr 0.000903 time 0.3243 (0.3292) loss 4.2497 (3.8726) grad_norm 1.4002 (1.1724) [2022-10-07 17:10:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][1000/1251] eta 0:01:22 lr 0.000903 time 0.3292 (0.3289) loss 3.7760 (3.8745) grad_norm 1.3400 (1.1742) [2022-10-07 17:11:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][1100/1251] eta 0:00:49 lr 0.000903 time 0.3270 (0.3286) loss 3.9285 (3.8769) grad_norm 1.1466 (1.1716) [2022-10-07 17:11:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [60/300][1200/1251] eta 0:00:16 lr 0.000903 time 0.3272 (0.3285) loss 4.0120 (3.8770) grad_norm 1.2785 (1.1728) [2022-10-07 17:11:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 60 training takes 0:06:51 [2022-10-07 17:11:57 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_60 saving...... [2022-10-07 17:11:58 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_60 saved !!! 
[2022-10-07 17:12:01 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.152 (3.152) Loss 1.1696 (1.1696) Acc@1 70.801 (70.801) Acc@5 91.699 (91.699) [2022-10-07 17:12:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.368 Acc@5 90.646 [2022-10-07 17:12:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-10-07 17:12:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.37% [2022-10-07 17:12:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][0/1251] eta 0:50:49 lr 0.000902 time 2.4376 (2.4376) loss 3.7204 (3.7204) grad_norm 1.1315 (1.1315) [2022-10-07 17:12:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][100/1251] eta 0:06:46 lr 0.000902 time 0.3269 (0.3531) loss 3.7364 (3.8416) grad_norm 1.1634 (1.1659) [2022-10-07 17:13:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][200/1251] eta 0:05:59 lr 0.000902 time 0.3277 (0.3421) loss 3.6825 (3.8416) grad_norm 1.1496 (1.1705) [2022-10-07 17:13:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][300/1251] eta 0:05:20 lr 0.000902 time 0.3242 (0.3369) loss 4.2227 (3.8581) grad_norm 1.4213 (1.1703) [2022-10-07 17:14:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][400/1251] eta 0:04:44 lr 0.000901 time 0.3220 (0.3340) loss 4.0851 (3.8656) grad_norm 1.0770 (1.1755) [2022-10-07 17:14:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][500/1251] eta 0:04:09 lr 0.000901 time 0.3273 (0.3323) loss 3.8617 (3.8698) grad_norm 1.2131 (1.1696) [2022-10-07 17:15:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][600/1251] eta 0:03:35 lr 0.000901 time 0.3242 (0.3314) loss 3.6637 (3.8737) grad_norm 1.2215 (1.1671) [2022-10-07 17:16:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][700/1251] eta 0:03:02 lr 0.000901 time 0.3260 (0.3307) loss 3.9795 (3.8747) grad_norm 0.9883 (1.1682) 
[2022-10-07 17:16:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][800/1251] eta 0:02:28 lr 0.000900 time 0.3218 (0.3302) loss 3.6749 (3.8772) grad_norm 1.0886 (1.1731) [2022-10-07 17:17:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][900/1251] eta 0:01:55 lr 0.000900 time 0.3280 (0.3297) loss 3.9365 (3.8733) grad_norm 1.1711 (1.1751) [2022-10-07 17:17:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][1000/1251] eta 0:01:22 lr 0.000900 time 0.3195 (0.3294) loss 3.7495 (3.8749) grad_norm 1.0229 (1.1748) [2022-10-07 17:18:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][1100/1251] eta 0:00:49 lr 0.000900 time 0.3282 (0.3291) loss 3.8924 (3.8742) grad_norm 1.3186 (1.1749) [2022-10-07 17:18:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [61/300][1200/1251] eta 0:00:16 lr 0.000899 time 0.3253 (0.3290) loss 3.6667 (3.8750) grad_norm 1.2650 (1.1721) [2022-10-07 17:19:03 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 61 training takes 0:06:51 [2022-10-07 17:19:05 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.310 (2.310) Loss 1.1867 (1.1867) Acc@1 72.754 (72.754) Acc@5 91.406 (91.406) [2022-10-07 17:19:17 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.232 Acc@5 90.768 [2022-10-07 17:19:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-07 17:19:17 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.37% [2022-10-07 17:19:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][0/1251] eta 0:59:58 lr 0.000899 time 2.8762 (2.8762) loss 3.7846 (3.7846) grad_norm 1.1666 (1.1666) [2022-10-07 17:19:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][100/1251] eta 0:06:44 lr 0.000899 time 0.3225 (0.3516) loss 3.6866 (3.8830) grad_norm 1.0901 (1.1546) [2022-10-07 17:20:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[62/300][200/1251] eta 0:05:56 lr 0.000899 time 0.3227 (0.3388) loss 4.2083 (3.8697) grad_norm 1.1253 (1.1747) [2022-10-07 17:20:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][300/1251] eta 0:05:18 lr 0.000899 time 0.3220 (0.3348) loss 3.6104 (3.8672) grad_norm 1.0582 (1.1706) [2022-10-07 17:21:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][400/1251] eta 0:04:43 lr 0.000898 time 0.3322 (0.3330) loss 4.3552 (3.8624) grad_norm 1.2039 (1.1675) [2022-10-07 17:22:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][500/1251] eta 0:04:09 lr 0.000898 time 0.3253 (0.3317) loss 4.1252 (3.8622) grad_norm 1.1306 (1.1664) [2022-10-07 17:22:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][600/1251] eta 0:03:35 lr 0.000898 time 0.3285 (0.3309) loss 3.7463 (3.8628) grad_norm 1.1365 (1.1661) [2022-10-07 17:23:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][700/1251] eta 0:03:02 lr 0.000898 time 0.3230 (0.3303) loss 3.8552 (3.8608) grad_norm 0.9395 (1.1641) [2022-10-07 17:23:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][800/1251] eta 0:02:28 lr 0.000897 time 0.3305 (0.3299) loss 4.1248 (3.8624) grad_norm 1.2066 (1.1656) [2022-10-07 17:24:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][900/1251] eta 0:01:55 lr 0.000897 time 0.3330 (0.3297) loss 3.6418 (3.8614) grad_norm 1.0737 (1.1631) [2022-10-07 17:24:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][1000/1251] eta 0:01:22 lr 0.000897 time 0.3274 (0.3295) loss 3.7686 (3.8636) grad_norm 1.0205 (1.1581) [2022-10-07 17:25:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][1100/1251] eta 0:00:49 lr 0.000897 time 0.3289 (0.3294) loss 4.2702 (3.8650) grad_norm 1.4168 (1.1589) [2022-10-07 17:25:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [62/300][1200/1251] eta 0:00:16 lr 0.000896 time 0.3321 (0.3294) loss 3.6794 (3.8680) grad_norm 1.0763 
(1.1612) [2022-10-07 17:26:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 62 training takes 0:06:52 [2022-10-07 17:26:12 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.185 (3.185) Loss 1.1991 (1.1991) Acc@1 72.852 (72.852) Acc@5 91.797 (91.797) [2022-10-07 17:26:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.166 Acc@5 90.792 [2022-10-07 17:26:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-07 17:26:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.37% [2022-10-07 17:26:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][0/1251] eta 1:08:14 lr 0.000896 time 3.2727 (3.2727) loss 4.0533 (4.0533) grad_norm 1.2894 (1.2894) [2022-10-07 17:26:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][100/1251] eta 0:06:47 lr 0.000896 time 0.3257 (0.3540) loss 3.9560 (3.8587) grad_norm 1.0660 (1.1636) [2022-10-07 17:27:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][200/1251] eta 0:05:57 lr 0.000896 time 0.3260 (0.3399) loss 3.7418 (3.8659) grad_norm 1.3519 (1.1796) [2022-10-07 17:28:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][300/1251] eta 0:05:18 lr 0.000895 time 0.3267 (0.3347) loss 3.9020 (3.8604) grad_norm 1.1239 (1.1804) [2022-10-07 17:28:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][400/1251] eta 0:04:42 lr 0.000895 time 0.3233 (0.3324) loss 3.9022 (3.8641) grad_norm 1.0856 (1.1777) [2022-10-07 17:29:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][500/1251] eta 0:04:08 lr 0.000895 time 0.3290 (0.3309) loss 3.7312 (3.8625) grad_norm 1.2586 (1.1732) [2022-10-07 17:29:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][600/1251] eta 0:03:34 lr 0.000895 time 0.3229 (0.3299) loss 3.8021 (3.8613) grad_norm 1.2691 (1.1804) [2022-10-07 17:30:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[63/300][700/1251] eta 0:03:01 lr 0.000894 time 0.3217 (0.3292) loss 4.1387 (3.8660) grad_norm 1.0969 (1.1805) [2022-10-07 17:30:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][800/1251] eta 0:02:28 lr 0.000894 time 0.3245 (0.3286) loss 4.0255 (3.8635) grad_norm 1.2086 (1.1837) [2022-10-07 17:31:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][900/1251] eta 0:01:55 lr 0.000894 time 0.3309 (0.3284) loss 3.5299 (3.8615) grad_norm 1.2424 (1.1853) [2022-10-07 17:31:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][1000/1251] eta 0:01:22 lr 0.000894 time 0.3217 (0.3283) loss 3.9641 (3.8603) grad_norm 1.0486 (1.1872) [2022-10-07 17:32:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][1100/1251] eta 0:00:49 lr 0.000893 time 0.3367 (0.3283) loss 3.9279 (3.8609) grad_norm 1.4184 (1.1844) [2022-10-07 17:32:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [63/300][1200/1251] eta 0:00:16 lr 0.000893 time 0.3317 (0.3284) loss 3.6257 (3.8636) grad_norm 1.1151 (1.1822) [2022-10-07 17:33:14 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 63 training takes 0:06:51 [2022-10-07 17:33:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.801 (2.801) Loss 1.3134 (1.3134) Acc@1 68.848 (68.848) Acc@5 89.844 (89.844) [2022-10-07 17:33:27 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.200 Acc@5 90.748 [2022-10-07 17:33:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-07 17:33:27 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.37% [2022-10-07 17:33:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][0/1251] eta 0:55:20 lr 0.000893 time 2.6543 (2.6543) loss 3.8666 (3.8666) grad_norm 1.1623 (1.1623) [2022-10-07 17:34:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][100/1251] eta 0:06:42 lr 0.000893 time 0.3232 (0.3496) loss 3.9310 (3.8356) 
grad_norm 1.1656 (1.1568) [2022-10-07 17:34:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][200/1251] eta 0:05:54 lr 0.000892 time 0.3239 (0.3370) loss 4.0799 (3.8433) grad_norm 1.2344 (1.1738) [2022-10-07 17:35:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][300/1251] eta 0:05:16 lr 0.000892 time 0.3211 (0.3325) loss 3.9602 (3.8493) grad_norm 1.2659 (1.1900) [2022-10-07 17:35:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][400/1251] eta 0:04:41 lr 0.000892 time 0.3214 (0.3302) loss 4.0446 (3.8484) grad_norm 1.2468 (1.1947) [2022-10-07 17:36:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][500/1251] eta 0:04:06 lr 0.000892 time 0.3219 (0.3287) loss 4.1345 (3.8498) grad_norm 1.1473 (1.1930) [2022-10-07 17:36:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][600/1251] eta 0:03:33 lr 0.000891 time 0.3250 (0.3279) loss 4.1632 (3.8563) grad_norm 1.1495 (1.1891) [2022-10-07 17:37:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][700/1251] eta 0:03:00 lr 0.000891 time 0.3197 (0.3273) loss 3.8090 (3.8571) grad_norm 1.4259 (1.1946) [2022-10-07 17:37:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][800/1251] eta 0:02:27 lr 0.000891 time 0.3268 (0.3269) loss 4.1283 (3.8560) grad_norm 1.4435 (1.1924) [2022-10-07 17:38:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][900/1251] eta 0:01:54 lr 0.000891 time 0.3220 (0.3266) loss 3.6658 (3.8607) grad_norm 1.2852 (1.1915) [2022-10-07 17:38:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][1000/1251] eta 0:01:21 lr 0.000890 time 0.3212 (0.3263) loss 3.8603 (3.8634) grad_norm 1.5177 (1.1880) [2022-10-07 17:39:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [64/300][1100/1251] eta 0:00:49 lr 0.000890 time 0.3222 (0.3262) loss 4.0131 (3.8637) grad_norm 1.0252 (1.1838) [2022-10-07 17:39:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[64/300][1200/1251] eta 0:00:16 lr 0.000890 time 0.3208 (0.3263) loss 4.2692 (3.8614) grad_norm 1.2633 (1.1837) [2022-10-07 17:40:16 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 64 training takes 0:06:48 [2022-10-07 17:40:18 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.509 (2.509) Loss 1.1283 (1.1283) Acc@1 71.191 (71.191) Acc@5 92.578 (92.578) [2022-10-07 17:40:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.498 Acc@5 90.808 [2022-10-07 17:40:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-10-07 17:40:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.50% [2022-10-07 17:40:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][0/1251] eta 1:07:29 lr 0.000890 time 3.2371 (3.2371) loss 3.8644 (3.8644) grad_norm 1.2959 (1.2959) [2022-10-07 17:41:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][100/1251] eta 0:06:47 lr 0.000889 time 0.3339 (0.3544) loss 4.0639 (3.8483) grad_norm 1.0504 (1.2326) [2022-10-07 17:41:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][200/1251] eta 0:05:57 lr 0.000889 time 0.3374 (0.3403) loss 3.5900 (3.8431) grad_norm 1.3614 (1.1953) [2022-10-07 17:42:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][300/1251] eta 0:05:18 lr 0.000889 time 0.3279 (0.3354) loss 3.9750 (3.8528) grad_norm 1.3021 (1.1935) [2022-10-07 17:42:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][400/1251] eta 0:04:43 lr 0.000889 time 0.3248 (0.3329) loss 4.0958 (3.8532) grad_norm 1.0101 (1.1891) [2022-10-07 17:43:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][500/1251] eta 0:04:08 lr 0.000888 time 0.3263 (0.3313) loss 3.5204 (3.8550) grad_norm 1.4721 (1.1928) [2022-10-07 17:43:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][600/1251] eta 0:03:34 lr 0.000888 time 0.3263 (0.3301) loss 3.8839 (3.8533) 
grad_norm 1.2894 (1.1916) [2022-10-07 17:44:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][700/1251] eta 0:03:01 lr 0.000888 time 0.3236 (0.3292) loss 3.7938 (3.8524) grad_norm 1.4596 (1.1943) [2022-10-07 17:44:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][800/1251] eta 0:02:28 lr 0.000888 time 0.3239 (0.3285) loss 3.8477 (3.8540) grad_norm 1.0422 (1.1914) [2022-10-07 17:45:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][900/1251] eta 0:01:55 lr 0.000887 time 0.3271 (0.3281) loss 3.8888 (3.8561) grad_norm 1.5023 (1.1909) [2022-10-07 17:45:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][1000/1251] eta 0:01:22 lr 0.000887 time 0.3332 (0.3279) loss 3.8702 (3.8569) grad_norm 1.0458 (1.1929) [2022-10-07 17:46:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][1100/1251] eta 0:00:49 lr 0.000887 time 0.3236 (0.3278) loss 3.6299 (3.8569) grad_norm 1.3029 (1.1919) [2022-10-07 17:47:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [65/300][1200/1251] eta 0:00:16 lr 0.000887 time 0.3264 (0.3278) loss 3.7526 (3.8553) grad_norm 1.4103 (1.1925) [2022-10-07 17:47:20 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 65 training takes 0:06:50 [2022-10-07 17:47:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.792 (2.792) Loss 1.1755 (1.1755) Acc@1 72.461 (72.461) Acc@5 91.895 (91.895) [2022-10-07 17:47:34 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.178 Acc@5 90.794 [2022-10-07 17:47:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-07 17:47:34 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.50% [2022-10-07 17:47:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][0/1251] eta 1:06:36 lr 0.000886 time 3.1944 (3.1944) loss 3.3909 (3.3909) grad_norm 1.3446 (1.3446) [2022-10-07 17:48:09 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [66/300][100/1251] eta 0:06:44 lr 0.000886 time 0.3209 (0.3516) loss 3.8456 (3.8547) grad_norm 1.2384 (1.2160) [2022-10-07 17:48:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][200/1251] eta 0:05:54 lr 0.000886 time 0.3250 (0.3376) loss 3.9625 (3.8568) grad_norm 1.1276 (1.1989) [2022-10-07 17:49:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][300/1251] eta 0:05:16 lr 0.000886 time 0.3242 (0.3328) loss 4.0724 (3.8621) grad_norm 1.2128 (1.1900) [2022-10-07 17:49:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][400/1251] eta 0:04:41 lr 0.000885 time 0.3238 (0.3306) loss 3.7598 (3.8557) grad_norm 1.3747 (1.1864) [2022-10-07 17:50:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][500/1251] eta 0:04:07 lr 0.000885 time 0.3236 (0.3292) loss 3.9792 (3.8487) grad_norm 1.3924 (1.1864) [2022-10-07 17:50:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][600/1251] eta 0:03:33 lr 0.000885 time 0.3222 (0.3283) loss 3.9666 (3.8537) grad_norm 1.4794 (1.1868) [2022-10-07 17:51:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][700/1251] eta 0:03:00 lr 0.000885 time 0.3198 (0.3276) loss 3.9115 (3.8524) grad_norm 1.1926 (1.1841) [2022-10-07 17:51:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][800/1251] eta 0:02:27 lr 0.000884 time 0.3269 (0.3271) loss 3.7589 (3.8526) grad_norm 1.1153 (1.1840) [2022-10-07 17:52:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][900/1251] eta 0:01:54 lr 0.000884 time 0.3232 (0.3267) loss 3.7564 (3.8513) grad_norm 1.1919 (1.1831) [2022-10-07 17:53:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][1000/1251] eta 0:01:21 lr 0.000884 time 0.3237 (0.3265) loss 3.6046 (3.8532) grad_norm 1.1746 (1.1799) [2022-10-07 17:53:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][1100/1251] eta 0:00:49 lr 0.000883 time 0.3250 (0.3264) loss 3.9753 (3.8506) 
grad_norm 1.1574 (1.1842) [2022-10-07 17:54:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [66/300][1200/1251] eta 0:00:16 lr 0.000883 time 0.3248 (0.3266) loss 3.9580 (3.8510) grad_norm 1.2148 (1.1865) [2022-10-07 17:54:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 66 training takes 0:06:48 [2022-10-07 17:54:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.958 (2.958) Loss 1.1312 (1.1312) Acc@1 73.145 (73.145) Acc@5 92.285 (92.285) [2022-10-07 17:54:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.416 Acc@5 90.742 [2022-10-07 17:54:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-10-07 17:54:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.50% [2022-10-07 17:54:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][0/1251] eta 1:05:00 lr 0.000883 time 3.1179 (3.1179) loss 3.8996 (3.8996) grad_norm 1.0178 (1.0178) [2022-10-07 17:55:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][100/1251] eta 0:06:45 lr 0.000883 time 0.3264 (0.3523) loss 3.6481 (3.8355) grad_norm 1.3155 (1.1598) [2022-10-07 17:55:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][200/1251] eta 0:05:55 lr 0.000883 time 0.3238 (0.3385) loss 3.7591 (3.8347) grad_norm 1.3907 (1.1770) [2022-10-07 17:56:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][300/1251] eta 0:05:17 lr 0.000882 time 0.3242 (0.3338) loss 3.6576 (3.8386) grad_norm 1.1113 (1.1733) [2022-10-07 17:56:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][400/1251] eta 0:04:41 lr 0.000882 time 0.3246 (0.3313) loss 3.7270 (3.8429) grad_norm 1.0142 (1.1803) [2022-10-07 17:57:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][500/1251] eta 0:04:07 lr 0.000882 time 0.3217 (0.3298) loss 3.8727 (3.8443) grad_norm 1.2149 (1.1818) [2022-10-07 17:57:54 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [67/300][600/1251] eta 0:03:34 lr 0.000881 time 0.3237 (0.3289) loss 3.8130 (3.8450) grad_norm 1.0597 (1.1795) [2022-10-07 17:58:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][700/1251] eta 0:03:00 lr 0.000881 time 0.3303 (0.3282) loss 3.9463 (3.8451) grad_norm 1.4146 (1.1788) [2022-10-07 17:58:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][800/1251] eta 0:02:27 lr 0.000881 time 0.3310 (0.3279) loss 3.9343 (3.8440) grad_norm 0.9776 (1.1765) [2022-10-07 17:59:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][900/1251] eta 0:01:55 lr 0.000881 time 0.3216 (0.3277) loss 3.8092 (3.8478) grad_norm 1.0867 (1.1814) [2022-10-07 18:00:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][1000/1251] eta 0:01:22 lr 0.000880 time 0.3303 (0.3276) loss 3.7645 (3.8498) grad_norm 1.1987 (1.1809) [2022-10-07 18:00:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][1100/1251] eta 0:00:49 lr 0.000880 time 0.3247 (0.3277) loss 3.4019 (3.8493) grad_norm 1.2084 (1.1812) [2022-10-07 18:01:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [67/300][1200/1251] eta 0:00:16 lr 0.000880 time 0.3390 (0.3278) loss 3.9918 (3.8485) grad_norm 1.1195 (1.1829) [2022-10-07 18:01:27 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 67 training takes 0:06:50 [2022-10-07 18:01:29 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.778 (2.778) Loss 1.2357 (1.2357) Acc@1 71.777 (71.777) Acc@5 91.309 (91.309) [2022-10-07 18:01:40 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.672 Acc@5 90.916 [2022-10-07 18:01:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-10-07 18:01:40 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.67% [2022-10-07 18:01:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][0/1251] eta 1:06:13 lr 0.000880 time 3.1761 (3.1761) loss 
3.8728 (3.8728) grad_norm 1.6573 (1.6573) [2022-10-07 18:02:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][100/1251] eta 0:06:49 lr 0.000879 time 0.3304 (0.3556) loss 3.5784 (3.8256) grad_norm 1.1405 (1.1779) [2022-10-07 18:02:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][200/1251] eta 0:05:58 lr 0.000879 time 0.3245 (0.3413) loss 4.0206 (3.8364) grad_norm 1.0957 (1.1917) [2022-10-07 18:03:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][300/1251] eta 0:05:19 lr 0.000879 time 0.3227 (0.3365) loss 4.0172 (3.8409) grad_norm 0.9947 (1.1836) [2022-10-07 18:03:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][400/1251] eta 0:04:44 lr 0.000879 time 0.3235 (0.3340) loss 3.8675 (3.8428) grad_norm 1.0290 (1.1863) [2022-10-07 18:04:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][500/1251] eta 0:04:09 lr 0.000878 time 0.3268 (0.3325) loss 3.9966 (3.8406) grad_norm 1.2386 (1.1918) [2022-10-07 18:04:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][600/1251] eta 0:03:35 lr 0.000878 time 0.3263 (0.3315) loss 3.8230 (3.8403) grad_norm 1.1829 (1.1917) [2022-10-07 18:05:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][700/1251] eta 0:03:02 lr 0.000878 time 0.3214 (0.3308) loss 4.0452 (3.8436) grad_norm 1.1819 (1.1903) [2022-10-07 18:06:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][800/1251] eta 0:02:28 lr 0.000878 time 0.3278 (0.3303) loss 3.7558 (3.8407) grad_norm 1.0614 (1.1862) [2022-10-07 18:06:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][900/1251] eta 0:01:55 lr 0.000877 time 0.3320 (0.3298) loss 3.7933 (3.8363) grad_norm 0.9470 (1.1880) [2022-10-07 18:07:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][1000/1251] eta 0:01:22 lr 0.000877 time 0.3276 (0.3296) loss 4.1302 (3.8383) grad_norm 1.0925 (1.1870) [2022-10-07 18:07:43 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [68/300][1100/1251] eta 0:00:49 lr 0.000877 time 0.3258 (0.3294) loss 3.4870 (3.8387) grad_norm 1.2656 (1.1857) [2022-10-07 18:08:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [68/300][1200/1251] eta 0:00:16 lr 0.000876 time 0.3292 (0.3292) loss 3.6682 (3.8413) grad_norm 1.0548 (1.1854) [2022-10-07 18:08:32 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 68 training takes 0:06:52 [2022-10-07 18:08:35 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.817 (2.817) Loss 1.2242 (1.2242) Acc@1 72.168 (72.168) Acc@5 90.430 (90.430) [2022-10-07 18:08:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.610 Acc@5 91.016 [2022-10-07 18:08:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-10-07 18:08:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.67% [2022-10-07 18:08:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][0/1251] eta 1:02:49 lr 0.000876 time 3.0134 (3.0134) loss 3.9237 (3.9237) grad_norm 1.2477 (1.2477) [2022-10-07 18:09:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][100/1251] eta 0:06:48 lr 0.000876 time 0.3225 (0.3546) loss 3.8430 (3.8451) grad_norm 1.0750 (1.1998) [2022-10-07 18:09:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][200/1251] eta 0:05:58 lr 0.000876 time 0.3289 (0.3415) loss 3.9907 (3.8390) grad_norm 1.1615 (1.2032) [2022-10-07 18:10:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][300/1251] eta 0:05:19 lr 0.000875 time 0.3227 (0.3365) loss 4.0082 (3.8445) grad_norm 1.2138 (1.2095) [2022-10-07 18:11:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][400/1251] eta 0:04:44 lr 0.000875 time 0.3225 (0.3339) loss 4.0539 (3.8410) grad_norm 1.0443 (1.1997) [2022-10-07 18:11:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][500/1251] eta 0:04:09 lr 0.000875 time 0.3203 (0.3322) loss 4.0440 
(3.8382) grad_norm 1.1237 (1.1975) [2022-10-07 18:12:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][600/1251] eta 0:03:35 lr 0.000875 time 0.3220 (0.3310) loss 4.1298 (3.8414) grad_norm 1.1719 (1.1935) [2022-10-07 18:12:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][700/1251] eta 0:03:01 lr 0.000874 time 0.3225 (0.3301) loss 3.8930 (3.8468) grad_norm 1.0521 (1.1936) [2022-10-07 18:13:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][800/1251] eta 0:02:28 lr 0.000874 time 0.3255 (0.3294) loss 3.8267 (3.8436) grad_norm 1.0809 (1.1892) [2022-10-07 18:13:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][900/1251] eta 0:01:55 lr 0.000874 time 0.3254 (0.3289) loss 3.8114 (3.8463) grad_norm 1.4050 (1.1917) [2022-10-07 18:14:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][1000/1251] eta 0:01:22 lr 0.000874 time 0.3192 (0.3284) loss 3.8988 (3.8473) grad_norm 1.4240 (1.1904) [2022-10-07 18:14:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][1100/1251] eta 0:00:49 lr 0.000873 time 0.3227 (0.3282) loss 3.6861 (3.8500) grad_norm 1.0331 (1.1910) [2022-10-07 18:15:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [69/300][1200/1251] eta 0:00:16 lr 0.000873 time 0.3286 (0.3279) loss 3.8889 (3.8468) grad_norm 1.1702 (1.1921) [2022-10-07 18:15:36 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 69 training takes 0:06:50 [2022-10-07 18:15:39 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.528 (2.528) Loss 1.1476 (1.1476) Acc@1 74.707 (74.707) Acc@5 91.113 (91.113) [2022-10-07 18:15:50 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.830 Acc@5 91.228 [2022-10-07 18:15:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.8% [2022-10-07 18:15:50 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.83% [2022-10-07 18:15:53 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][0/1251] eta 0:57:56 lr 0.000873 time 2.7793 (2.7793) loss 3.9881 (3.9881) grad_norm 1.0261 (1.0261) [2022-10-07 18:16:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][100/1251] eta 0:06:43 lr 0.000873 time 0.3255 (0.3510) loss 3.7692 (3.8308) grad_norm 1.3844 (1.1868) [2022-10-07 18:16:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][200/1251] eta 0:05:56 lr 0.000872 time 0.3266 (0.3389) loss 3.4788 (3.8330) grad_norm 1.3070 (1.1951) [2022-10-07 18:17:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][300/1251] eta 0:05:18 lr 0.000872 time 0.3264 (0.3352) loss 3.6645 (3.8228) grad_norm 0.9680 (1.1889) [2022-10-07 18:18:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][400/1251] eta 0:04:43 lr 0.000872 time 0.3288 (0.3334) loss 3.4425 (3.8260) grad_norm 1.2819 (1.1873) [2022-10-07 18:18:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][500/1251] eta 0:04:09 lr 0.000871 time 0.3263 (0.3324) loss 3.8058 (3.8242) grad_norm 1.1310 (1.1858) [2022-10-07 18:19:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][600/1251] eta 0:03:36 lr 0.000871 time 0.3208 (0.3318) loss 3.7392 (3.8220) grad_norm 1.1302 (1.1850) [2022-10-07 18:19:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][700/1251] eta 0:03:02 lr 0.000871 time 0.3445 (0.3315) loss 3.6261 (3.8259) grad_norm 1.6392 (1.1811) [2022-10-07 18:20:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][800/1251] eta 0:02:29 lr 0.000871 time 0.3308 (0.3312) loss 4.1438 (3.8268) grad_norm 0.9812 (1.1839) [2022-10-07 18:20:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][900/1251] eta 0:01:56 lr 0.000870 time 0.3347 (0.3311) loss 3.9491 (3.8282) grad_norm 1.1993 (1.1845) [2022-10-07 18:21:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][1000/1251] eta 0:01:23 lr 0.000870 time 
0.3314 (0.3311) loss 3.9687 (3.8295) grad_norm 1.6900 (1.1886) [2022-10-07 18:21:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][1100/1251] eta 0:00:49 lr 0.000870 time 0.3245 (0.3311) loss 4.3027 (3.8323) grad_norm 1.1639 (1.1886) [2022-10-07 18:22:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [70/300][1200/1251] eta 0:00:16 lr 0.000870 time 0.3312 (0.3311) loss 3.7849 (3.8327) grad_norm 1.1331 (1.1900) [2022-10-07 18:22:44 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 70 training takes 0:06:54 [2022-10-07 18:22:44 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_70 saving...... [2022-10-07 18:22:45 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_70 saved !!! [2022-10-07 18:22:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.315 (2.315) Loss 1.2438 (1.2438) Acc@1 69.727 (69.727) Acc@5 90.918 (90.918) [2022-10-07 18:22:58 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.892 Acc@5 90.994 [2022-10-07 18:22:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-10-07 18:22:58 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.89% [2022-10-07 18:23:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][0/1251] eta 1:06:54 lr 0.000869 time 3.2088 (3.2088) loss 3.4753 (3.4753) grad_norm 1.1271 (1.1271) [2022-10-07 18:23:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][100/1251] eta 0:06:52 lr 0.000869 time 0.3343 (0.3581) loss 4.0042 (3.8041) grad_norm 1.0629 (1.2031) [2022-10-07 18:24:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][200/1251] eta 0:06:01 lr 0.000869 time 0.3306 (0.3442) loss 3.9143 (3.8054) grad_norm 1.0248 (1.1944) [2022-10-07 18:24:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][300/1251] eta 0:05:22 lr 
0.000869 time 0.3277 (0.3391) loss 3.6095 (3.8143) grad_norm 1.4894 (1.1895) [2022-10-07 18:25:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][400/1251] eta 0:04:46 lr 0.000868 time 0.3298 (0.3365) loss 3.6134 (3.8101) grad_norm 1.1616 (1.1935) [2022-10-07 18:25:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][500/1251] eta 0:04:11 lr 0.000868 time 0.3329 (0.3346) loss 3.9418 (3.8108) grad_norm 1.2001 (1.1893) [2022-10-07 18:26:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][600/1251] eta 0:03:36 lr 0.000868 time 0.3241 (0.3332) loss 3.9675 (3.8134) grad_norm 1.1927 (1.1905) [2022-10-07 18:26:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][700/1251] eta 0:03:02 lr 0.000867 time 0.3264 (0.3321) loss 3.7527 (3.8165) grad_norm 1.2158 (1.1900) [2022-10-07 18:27:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][800/1251] eta 0:02:29 lr 0.000867 time 0.3185 (0.3313) loss 3.7020 (3.8175) grad_norm 1.1167 (1.1908) [2022-10-07 18:27:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][900/1251] eta 0:01:56 lr 0.000867 time 0.3266 (0.3308) loss 3.8698 (3.8168) grad_norm 0.9886 (1.1898) [2022-10-07 18:28:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][1000/1251] eta 0:01:22 lr 0.000867 time 0.3274 (0.3303) loss 3.7298 (3.8189) grad_norm 1.4216 (1.1895) [2022-10-07 18:29:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][1100/1251] eta 0:00:49 lr 0.000866 time 0.3237 (0.3300) loss 4.0948 (3.8193) grad_norm 1.5722 (1.1888) [2022-10-07 18:29:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [71/300][1200/1251] eta 0:00:16 lr 0.000866 time 0.3248 (0.3298) loss 3.9907 (3.8221) grad_norm 1.4678 (1.1907) [2022-10-07 18:29:51 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 71 training takes 0:06:52 [2022-10-07 18:29:54 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.229 (3.229) Loss 1.2147 
(1.2147) Acc@1 70.996 (70.996) Acc@5 90.625 (90.625) [2022-10-07 18:30:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.906 Acc@5 91.164 [2022-10-07 18:30:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-10-07 18:30:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 71.91% [2022-10-07 18:30:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][0/1251] eta 1:08:16 lr 0.000866 time 3.2747 (3.2747) loss 3.8529 (3.8529) grad_norm 1.1447 (1.1447) [2022-10-07 18:30:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][100/1251] eta 0:06:48 lr 0.000866 time 0.3268 (0.3546) loss 3.6579 (3.7931) grad_norm 1.3214 (1.1918) [2022-10-07 18:31:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][200/1251] eta 0:05:57 lr 0.000865 time 0.3286 (0.3397) loss 4.0491 (3.8059) grad_norm 1.4048 (1.2009) [2022-10-07 18:31:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][300/1251] eta 0:05:18 lr 0.000865 time 0.3240 (0.3347) loss 3.9441 (3.8093) grad_norm 1.1584 (1.2044) [2022-10-07 18:32:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][400/1251] eta 0:04:42 lr 0.000865 time 0.3283 (0.3323) loss 3.6829 (3.8149) grad_norm 1.0506 (1.2089) [2022-10-07 18:32:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][500/1251] eta 0:04:08 lr 0.000864 time 0.3219 (0.3307) loss 3.8961 (3.8172) grad_norm 1.0207 (1.2142) [2022-10-07 18:33:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][600/1251] eta 0:03:34 lr 0.000864 time 0.3229 (0.3297) loss 4.1096 (3.8148) grad_norm 1.5535 (1.2073) [2022-10-07 18:33:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][700/1251] eta 0:03:01 lr 0.000864 time 0.3239 (0.3289) loss 4.0926 (3.8184) grad_norm 1.2023 (1.2051) [2022-10-07 18:34:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][800/1251] eta 0:02:28 lr 
0.000864 time 0.3213 (0.3283) loss 4.0557 (3.8208) grad_norm 1.0395 (1.2080) [2022-10-07 18:35:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][900/1251] eta 0:01:55 lr 0.000863 time 0.3215 (0.3278) loss 3.5079 (3.8216) grad_norm 1.2994 (1.2075) [2022-10-07 18:35:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][1000/1251] eta 0:01:22 lr 0.000863 time 0.3274 (0.3277) loss 3.6926 (3.8212) grad_norm 1.0347 (1.2054) [2022-10-07 18:36:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][1100/1251] eta 0:00:49 lr 0.000863 time 0.3206 (0.3273) loss 4.0802 (3.8221) grad_norm 1.3299 (1.2056) [2022-10-07 18:36:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [72/300][1200/1251] eta 0:00:16 lr 0.000862 time 0.3187 (0.3272) loss 3.7500 (3.8236) grad_norm 1.1111 (1.2034) [2022-10-07 18:36:54 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 72 training takes 0:06:49 [2022-10-07 18:36:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.777 (2.777) Loss 1.2860 (1.2860) Acc@1 70.508 (70.508) Acc@5 89.258 (89.258) [2022-10-07 18:37:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.016 Acc@5 91.222 [2022-10-07 18:37:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-10-07 18:37:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.02% [2022-10-07 18:37:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][0/1251] eta 1:03:30 lr 0.000862 time 3.0461 (3.0461) loss 3.5965 (3.5965) grad_norm 1.4328 (1.4328) [2022-10-07 18:37:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][100/1251] eta 0:06:46 lr 0.000862 time 0.3262 (0.3534) loss 3.7324 (3.7952) grad_norm 1.1054 (1.2187) [2022-10-07 18:38:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][200/1251] eta 0:05:57 lr 0.000862 time 0.3211 (0.3400) loss 3.4720 (3.7950) grad_norm 1.0300 (1.2121) 
[2022-10-07 18:38:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][300/1251] eta 0:05:19 lr 0.000861 time 0.3235 (0.3355) loss 4.2612 (3.7955) grad_norm 1.2202 (1.2130) [2022-10-07 18:39:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][400/1251] eta 0:04:43 lr 0.000861 time 0.3275 (0.3332) loss 3.5954 (3.7966) grad_norm 1.2133 (1.2097) [2022-10-07 18:39:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][500/1251] eta 0:04:09 lr 0.000861 time 0.3245 (0.3321) loss 3.9599 (3.7930) grad_norm 1.1283 (1.2124) [2022-10-07 18:40:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][600/1251] eta 0:03:35 lr 0.000861 time 0.3264 (0.3314) loss 4.1464 (3.7977) grad_norm 1.2899 (1.2129) [2022-10-07 18:41:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][700/1251] eta 0:03:02 lr 0.000860 time 0.3267 (0.3308) loss 3.8561 (3.8024) grad_norm 1.2858 (1.2082) [2022-10-07 18:41:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][800/1251] eta 0:02:29 lr 0.000860 time 0.3266 (0.3307) loss 3.6813 (3.8019) grad_norm 1.2351 (1.2127) [2022-10-07 18:42:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][900/1251] eta 0:01:55 lr 0.000860 time 0.3300 (0.3304) loss 3.9070 (3.8055) grad_norm 1.1903 (1.2103) [2022-10-07 18:42:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][1000/1251] eta 0:01:22 lr 0.000859 time 0.3245 (0.3302) loss 3.8331 (3.8059) grad_norm 1.1848 (1.2093) [2022-10-07 18:43:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][1100/1251] eta 0:00:49 lr 0.000859 time 0.3400 (0.3301) loss 3.5735 (3.8090) grad_norm 1.0170 (1.2092) [2022-10-07 18:43:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [73/300][1200/1251] eta 0:00:16 lr 0.000859 time 0.3276 (0.3301) loss 3.7123 (3.8119) grad_norm 1.3562 (1.2085) [2022-10-07 18:44:01 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 73 training takes 0:06:53 
[2022-10-07 18:44:04 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.647 (2.647) Loss 1.2326 (1.2326) Acc@1 71.191 (71.191) Acc@5 90.430 (90.430) [2022-10-07 18:44:15 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.236 Acc@5 91.304 [2022-10-07 18:44:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-10-07 18:44:15 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.24% [2022-10-07 18:44:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][0/1251] eta 1:06:16 lr 0.000859 time 3.1789 (3.1789) loss 3.5754 (3.5754) grad_norm 1.4631 (1.4631) [2022-10-07 18:44:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][100/1251] eta 0:06:48 lr 0.000858 time 0.3236 (0.3545) loss 3.9648 (3.7916) grad_norm 1.1657 (1.1961) [2022-10-07 18:45:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][200/1251] eta 0:05:57 lr 0.000858 time 0.3242 (0.3399) loss 3.5085 (3.8034) grad_norm 1.1324 (1.1844) [2022-10-07 18:45:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][300/1251] eta 0:05:18 lr 0.000858 time 0.3213 (0.3350) loss 3.9330 (3.8079) grad_norm 1.3842 (1.1958) [2022-10-07 18:46:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][400/1251] eta 0:04:42 lr 0.000858 time 0.3241 (0.3325) loss 3.9580 (3.8119) grad_norm 1.2263 (1.1990) [2022-10-07 18:47:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][500/1251] eta 0:04:08 lr 0.000857 time 0.3272 (0.3306) loss 4.1355 (3.8111) grad_norm 1.0272 (1.2060) [2022-10-07 18:47:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][600/1251] eta 0:03:34 lr 0.000857 time 0.3207 (0.3295) loss 3.9884 (3.8148) grad_norm 1.5200 (1.2119) [2022-10-07 18:48:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][700/1251] eta 0:03:01 lr 0.000857 time 0.3221 (0.3287) loss 4.0480 (3.8160) grad_norm 0.9585 (1.2140) 
[2022-10-07 18:48:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][800/1251] eta 0:02:27 lr 0.000856 time 0.3235 (0.3281) loss 3.9203 (3.8141) grad_norm 1.5427 (1.2120) [2022-10-07 18:49:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][900/1251] eta 0:01:55 lr 0.000856 time 0.3258 (0.3277) loss 3.9751 (3.8157) grad_norm 1.2251 (1.2123) [2022-10-07 18:49:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][1000/1251] eta 0:01:22 lr 0.000856 time 0.3269 (0.3273) loss 3.9743 (3.8158) grad_norm 1.0577 (1.2130) [2022-10-07 18:50:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][1100/1251] eta 0:00:49 lr 0.000855 time 0.3317 (0.3270) loss 3.6620 (3.8146) grad_norm 1.1624 (1.2127) [2022-10-07 18:50:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [74/300][1200/1251] eta 0:00:16 lr 0.000855 time 0.3232 (0.3269) loss 3.6516 (3.8152) grad_norm 1.2115 (1.2112) [2022-10-07 18:51:04 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 74 training takes 0:06:49 [2022-10-07 18:51:07 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.959 (2.959) Loss 1.2105 (1.2105) Acc@1 73.047 (73.047) Acc@5 92.383 (92.383) [2022-10-07 18:51:18 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.172 Acc@5 91.262 [2022-10-07 18:51:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-10-07 18:51:18 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.24% [2022-10-07 18:51:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][0/1251] eta 1:01:10 lr 0.000855 time 2.9342 (2.9342) loss 3.7018 (3.7018) grad_norm 0.9983 (0.9983) [2022-10-07 18:51:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][100/1251] eta 0:06:46 lr 0.000855 time 0.3321 (0.3528) loss 4.0529 (3.7991) grad_norm 1.0682 (1.1997) [2022-10-07 18:52:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[75/300][200/1251] eta 0:05:57 lr 0.000854 time 0.3241 (0.3398) loss 3.9761 (3.8012) grad_norm 1.1408 (1.1955) [2022-10-07 18:52:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][300/1251] eta 0:05:18 lr 0.000854 time 0.3246 (0.3350) loss 3.5346 (3.7978) grad_norm 1.1476 (1.2043) [2022-10-07 18:53:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][400/1251] eta 0:04:43 lr 0.000854 time 0.3265 (0.3326) loss 3.9146 (3.8085) grad_norm 1.1786 (1.2070) [2022-10-07 18:54:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][500/1251] eta 0:04:08 lr 0.000854 time 0.3289 (0.3311) loss 3.7453 (3.8056) grad_norm 1.3291 (1.2051) [2022-10-07 18:54:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][600/1251] eta 0:03:34 lr 0.000853 time 0.3212 (0.3300) loss 3.6712 (3.8062) grad_norm 1.1563 (1.2092) [2022-10-07 18:55:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][700/1251] eta 0:03:01 lr 0.000853 time 0.3279 (0.3292) loss 3.7543 (3.8102) grad_norm 1.2395 (1.2066) [2022-10-07 18:55:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][800/1251] eta 0:02:28 lr 0.000853 time 0.3217 (0.3287) loss 3.2948 (3.8120) grad_norm 1.0004 (1.2098) [2022-10-07 18:56:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][900/1251] eta 0:01:55 lr 0.000852 time 0.3244 (0.3284) loss 3.8617 (3.8136) grad_norm 1.1496 (1.2106) [2022-10-07 18:56:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][1000/1251] eta 0:01:22 lr 0.000852 time 0.3268 (0.3280) loss 3.6043 (3.8099) grad_norm 1.2123 (1.2117) [2022-10-07 18:57:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][1100/1251] eta 0:00:49 lr 0.000852 time 0.3248 (0.3276) loss 4.0554 (3.8114) grad_norm 1.2109 (1.2117) [2022-10-07 18:57:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [75/300][1200/1251] eta 0:00:16 lr 0.000851 time 0.3209 (0.3273) loss 3.7055 (3.8104) grad_norm 1.3804 
(1.2100) [2022-10-07 18:58:07 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 75 training takes 0:06:49 [2022-10-07 18:58:10 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.261 (2.261) Loss 1.2400 (1.2400) Acc@1 72.461 (72.461) Acc@5 90.820 (90.820) [2022-10-07 18:58:21 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 71.884 Acc@5 91.330 [2022-10-07 18:58:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-10-07 18:58:21 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.24% [2022-10-07 18:58:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][0/1251] eta 0:54:33 lr 0.000851 time 2.6163 (2.6163) loss 3.4762 (3.4762) grad_norm 1.2905 (1.2905) [2022-10-07 18:58:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][100/1251] eta 0:06:42 lr 0.000851 time 0.3287 (0.3496) loss 3.9397 (3.8032) grad_norm 1.1449 (1.2028) [2022-10-07 18:59:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][200/1251] eta 0:05:55 lr 0.000851 time 0.3311 (0.3378) loss 3.5872 (3.7965) grad_norm 1.2983 (1.2133) [2022-10-07 19:00:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][300/1251] eta 0:05:17 lr 0.000850 time 0.3277 (0.3340) loss 3.9825 (3.7992) grad_norm 1.0593 (1.2096) [2022-10-07 19:00:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][400/1251] eta 0:04:42 lr 0.000850 time 0.3272 (0.3320) loss 3.6462 (3.7974) grad_norm 1.2754 (1.2107) [2022-10-07 19:01:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][500/1251] eta 0:04:08 lr 0.000850 time 0.3269 (0.3309) loss 3.7932 (3.8033) grad_norm 1.2258 (1.2092) [2022-10-07 19:01:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][600/1251] eta 0:03:34 lr 0.000850 time 0.3263 (0.3300) loss 3.7094 (3.8043) grad_norm 1.3541 (1.2100) [2022-10-07 19:02:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[76/300][700/1251] eta 0:03:01 lr 0.000849 time 0.3288 (0.3295) loss 4.2221 (3.8067) grad_norm 1.0210 (1.2128) [2022-10-07 19:02:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][800/1251] eta 0:02:28 lr 0.000849 time 0.3233 (0.3291) loss 3.9428 (3.8041) grad_norm 1.2393 (1.2137) [2022-10-07 19:03:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][900/1251] eta 0:01:55 lr 0.000849 time 0.3250 (0.3288) loss 4.0687 (3.8045) grad_norm 1.3575 (1.2117) [2022-10-07 19:03:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][1000/1251] eta 0:01:22 lr 0.000848 time 0.3219 (0.3286) loss 3.7986 (3.8058) grad_norm 1.0783 (1.2119) [2022-10-07 19:04:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][1100/1251] eta 0:00:49 lr 0.000848 time 0.3254 (0.3284) loss 3.8674 (3.8050) grad_norm 1.1086 (1.2127) [2022-10-07 19:04:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [76/300][1200/1251] eta 0:00:16 lr 0.000848 time 0.3279 (0.3283) loss 3.8890 (3.8042) grad_norm 1.1744 (1.2132) [2022-10-07 19:05:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 76 training takes 0:06:50 [2022-10-07 19:05:14 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.628 (2.628) Loss 1.1231 (1.1231) Acc@1 72.852 (72.852) Acc@5 92.285 (92.285) [2022-10-07 19:05:25 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.482 Acc@5 91.412 [2022-10-07 19:05:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-07 19:05:25 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.48% [2022-10-07 19:05:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][0/1251] eta 0:58:47 lr 0.000848 time 2.8197 (2.8197) loss 3.5463 (3.5463) grad_norm 1.2835 (1.2835) [2022-10-07 19:06:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][100/1251] eta 0:06:49 lr 0.000847 time 0.3281 (0.3562) loss 3.6145 (3.8239) 
grad_norm 1.1051 (1.2360) [2022-10-07 19:06:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][200/1251] eta 0:05:59 lr 0.000847 time 0.3267 (0.3423) loss 3.5704 (3.8080) grad_norm 1.1612 (1.2288) [2022-10-07 19:07:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][300/1251] eta 0:05:20 lr 0.000847 time 0.3316 (0.3374) loss 3.6369 (3.8107) grad_norm 1.1953 (1.2190) [2022-10-07 19:07:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][400/1251] eta 0:04:44 lr 0.000846 time 0.3243 (0.3348) loss 3.9618 (3.8057) grad_norm 1.1607 (1.2156) [2022-10-07 19:08:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][500/1251] eta 0:04:10 lr 0.000846 time 0.3339 (0.3332) loss 3.3289 (3.8066) grad_norm 1.0827 (1.2137) [2022-10-07 19:08:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][600/1251] eta 0:03:36 lr 0.000846 time 0.3281 (0.3319) loss 3.6232 (3.8053) grad_norm 1.1389 (1.2135) [2022-10-07 19:09:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][700/1251] eta 0:03:02 lr 0.000846 time 0.3275 (0.3310) loss 3.7867 (3.8046) grad_norm 1.3385 (1.2119) [2022-10-07 19:09:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][800/1251] eta 0:02:28 lr 0.000845 time 0.3210 (0.3302) loss 3.3601 (3.8076) grad_norm 1.1544 (1.2136) [2022-10-07 19:10:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][900/1251] eta 0:01:55 lr 0.000845 time 0.3284 (0.3297) loss 3.6469 (3.8064) grad_norm 1.2747 (1.2139) [2022-10-07 19:10:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][1000/1251] eta 0:01:22 lr 0.000845 time 0.3243 (0.3293) loss 3.5202 (3.8038) grad_norm 1.1350 (1.2181) [2022-10-07 19:11:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [77/300][1100/1251] eta 0:00:49 lr 0.000844 time 0.3245 (0.3289) loss 4.0706 (3.8060) grad_norm 1.2145 (1.2164) [2022-10-07 19:12:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[77/300][1200/1251] eta 0:00:16 lr 0.000844 time 0.3262 (0.3287) loss 3.7766 (3.8057) grad_norm 1.1424 (1.2164) [2022-10-07 19:12:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 77 training takes 0:06:51 [2022-10-07 19:12:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.494 (2.494) Loss 1.1662 (1.1662) Acc@1 72.656 (72.656) Acc@5 92.773 (92.773) [2022-10-07 19:12:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.362 Acc@5 91.316 [2022-10-07 19:12:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-10-07 19:12:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.48% [2022-10-07 19:12:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][0/1251] eta 1:14:54 lr 0.000844 time 3.5928 (3.5928) loss 3.8429 (3.8429) grad_norm 1.0981 (1.0981) [2022-10-07 19:13:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][100/1251] eta 0:06:54 lr 0.000844 time 0.3278 (0.3598) loss 3.9841 (3.7769) grad_norm 1.1422 (1.2431) [2022-10-07 19:13:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][200/1251] eta 0:06:01 lr 0.000843 time 0.3246 (0.3437) loss 3.7161 (3.7785) grad_norm 1.1631 (1.2389) [2022-10-07 19:14:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][300/1251] eta 0:05:21 lr 0.000843 time 0.3256 (0.3383) loss 3.5887 (3.7768) grad_norm 1.1981 (1.2292) [2022-10-07 19:14:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][400/1251] eta 0:04:45 lr 0.000843 time 0.3289 (0.3356) loss 4.0695 (3.7793) grad_norm 1.1426 (1.2410) [2022-10-07 19:15:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][500/1251] eta 0:04:10 lr 0.000842 time 0.3245 (0.3338) loss 4.1078 (3.7817) grad_norm 1.0922 (1.2311) [2022-10-07 19:15:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][600/1251] eta 0:03:36 lr 0.000842 time 0.3319 (0.3325) loss 3.6073 (3.7805) 
grad_norm 1.4167 (1.2304) [2022-10-07 19:16:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][700/1251] eta 0:03:02 lr 0.000842 time 0.3266 (0.3316) loss 3.8669 (3.7836) grad_norm 1.1775 (1.2282) [2022-10-07 19:16:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][800/1251] eta 0:02:29 lr 0.000841 time 0.3267 (0.3311) loss 3.3484 (3.7856) grad_norm 1.6840 (1.2245) [2022-10-07 19:17:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][900/1251] eta 0:01:56 lr 0.000841 time 0.3225 (0.3305) loss 3.5581 (3.7865) grad_norm 1.2187 (1.2241) [2022-10-07 19:18:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][1000/1251] eta 0:01:22 lr 0.000841 time 0.3279 (0.3300) loss 3.5500 (3.7876) grad_norm 1.1532 (1.2230) [2022-10-07 19:18:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][1100/1251] eta 0:00:49 lr 0.000841 time 0.3269 (0.3296) loss 3.5167 (3.7905) grad_norm 0.9700 (1.2253) [2022-10-07 19:19:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [78/300][1200/1251] eta 0:00:16 lr 0.000840 time 0.3278 (0.3293) loss 3.9732 (3.7914) grad_norm 1.0834 (1.2239) [2022-10-07 19:19:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 78 training takes 0:06:52 [2022-10-07 19:19:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.613 (2.613) Loss 1.0906 (1.0906) Acc@1 72.852 (72.852) Acc@5 92.969 (92.969) [2022-10-07 19:19:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.670 Acc@5 91.426 [2022-10-07 19:19:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-07 19:19:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.67% [2022-10-07 19:19:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][0/1251] eta 1:08:14 lr 0.000840 time 3.2729 (3.2729) loss 3.7096 (3.7096) grad_norm 1.1298 (1.1298) [2022-10-07 19:20:11 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [79/300][100/1251] eta 0:06:46 lr 0.000840 time 0.3237 (0.3532) loss 3.5979 (3.7497) grad_norm 1.5498 (1.2099) [2022-10-07 19:20:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][200/1251] eta 0:05:56 lr 0.000839 time 0.3262 (0.3392) loss 3.8032 (3.7604) grad_norm 1.2309 (1.1946) [2022-10-07 19:21:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][300/1251] eta 0:05:17 lr 0.000839 time 0.3235 (0.3342) loss 3.9450 (3.7589) grad_norm 1.9364 (1.1983) [2022-10-07 19:21:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][400/1251] eta 0:04:42 lr 0.000839 time 0.3224 (0.3316) loss 3.8007 (3.7649) grad_norm 1.1396 (1.2099) [2022-10-07 19:22:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][500/1251] eta 0:04:07 lr 0.000839 time 0.3252 (0.3299) loss 3.7320 (3.7747) grad_norm 1.3388 (1.2161) [2022-10-07 19:22:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][600/1251] eta 0:03:34 lr 0.000838 time 0.3300 (0.3289) loss 3.7364 (3.7796) grad_norm 1.3350 (1.2150) [2022-10-07 19:23:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][700/1251] eta 0:03:00 lr 0.000838 time 0.3219 (0.3283) loss 3.8851 (3.7842) grad_norm 1.2569 (1.2184) [2022-10-07 19:23:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][800/1251] eta 0:02:27 lr 0.000838 time 0.3256 (0.3280) loss 3.9936 (3.7859) grad_norm 1.1129 (1.2176) [2022-10-07 19:24:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][900/1251] eta 0:01:55 lr 0.000837 time 0.3281 (0.3279) loss 3.3710 (3.7905) grad_norm 1.5054 (1.2198) [2022-10-07 19:25:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][1000/1251] eta 0:01:22 lr 0.000837 time 0.3271 (0.3279) loss 3.9386 (3.7924) grad_norm 1.3217 (1.2215) [2022-10-07 19:25:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][1100/1251] eta 0:00:49 lr 0.000837 time 0.3201 (0.3278) loss 3.9081 (3.7937) 
grad_norm 1.2355 (1.2239) [2022-10-07 19:26:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [79/300][1200/1251] eta 0:00:16 lr 0.000836 time 0.3276 (0.3278) loss 3.8090 (3.7936) grad_norm 1.1918 (1.2266) [2022-10-07 19:26:26 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 79 training takes 0:06:50 [2022-10-07 19:26:29 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.201 (3.201) Loss 1.2033 (1.2033) Acc@1 73.438 (73.438) Acc@5 91.113 (91.113) [2022-10-07 19:26:40 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.532 Acc@5 91.432 [2022-10-07 19:26:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-07 19:26:40 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.67% [2022-10-07 19:26:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][0/1251] eta 0:57:06 lr 0.000836 time 2.7391 (2.7391) loss 3.7399 (3.7399) grad_norm 1.0873 (1.0873) [2022-10-07 19:27:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][100/1251] eta 0:06:45 lr 0.000836 time 0.3272 (0.3525) loss 3.8980 (3.7550) grad_norm 1.1567 (1.2158) [2022-10-07 19:27:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][200/1251] eta 0:05:56 lr 0.000836 time 0.3218 (0.3396) loss 3.9458 (3.7762) grad_norm 1.2045 (1.2138) [2022-10-07 19:28:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][300/1251] eta 0:05:18 lr 0.000835 time 0.3260 (0.3351) loss 3.6483 (3.7725) grad_norm 1.0784 (1.2364) [2022-10-07 19:28:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][400/1251] eta 0:04:43 lr 0.000835 time 0.3286 (0.3327) loss 3.9448 (3.7740) grad_norm 1.1705 (1.2376) [2022-10-07 19:29:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][500/1251] eta 0:04:08 lr 0.000835 time 0.3280 (0.3312) loss 4.0292 (3.7752) grad_norm 1.1429 (1.2315) [2022-10-07 19:29:58 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [80/300][600/1251] eta 0:03:35 lr 0.000834 time 0.3269 (0.3303) loss 3.8464 (3.7760) grad_norm 1.2736 (1.2288) [2022-10-07 19:30:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][700/1251] eta 0:03:01 lr 0.000834 time 0.3275 (0.3296) loss 3.5616 (3.7745) grad_norm 1.1694 (1.2262) [2022-10-07 19:31:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][800/1251] eta 0:02:28 lr 0.000834 time 0.3229 (0.3290) loss 3.5936 (3.7762) grad_norm 1.2613 (1.2285) [2022-10-07 19:31:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][900/1251] eta 0:01:55 lr 0.000833 time 0.3221 (0.3286) loss 3.7685 (3.7721) grad_norm 1.2522 (1.2318) [2022-10-07 19:32:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][1000/1251] eta 0:01:22 lr 0.000833 time 0.3236 (0.3283) loss 3.8697 (3.7737) grad_norm 1.1978 (1.2354) [2022-10-07 19:32:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][1100/1251] eta 0:00:49 lr 0.000833 time 0.3246 (0.3279) loss 3.8710 (3.7736) grad_norm 1.1703 (1.2344) [2022-10-07 19:33:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [80/300][1200/1251] eta 0:00:16 lr 0.000833 time 0.3227 (0.3277) loss 3.9961 (3.7761) grad_norm 1.2173 (1.2342) [2022-10-07 19:33:30 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 80 training takes 0:06:50 [2022-10-07 19:33:30 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_80 saving...... [2022-10-07 19:33:30 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_80 saved !!! 
[2022-10-07 19:33:33 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.924 (2.924) Loss 1.2845 (1.2845) Acc@1 70.312 (70.312) Acc@5 89.746 (89.746) [2022-10-07 19:33:44 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.718 Acc@5 91.482 [2022-10-07 19:33:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-07 19:33:44 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 19:33:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][0/1251] eta 1:03:11 lr 0.000832 time 3.0311 (3.0311) loss 3.9087 (3.9087) grad_norm 1.0503 (1.0503) [2022-10-07 19:34:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][100/1251] eta 0:06:44 lr 0.000832 time 0.3250 (0.3512) loss 3.8490 (3.7644) grad_norm 1.1538 (1.2322) [2022-10-07 19:34:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][200/1251] eta 0:05:55 lr 0.000832 time 0.3271 (0.3381) loss 3.5126 (3.7644) grad_norm 1.0267 (1.2361) [2022-10-07 19:35:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][300/1251] eta 0:05:17 lr 0.000831 time 0.3245 (0.3337) loss 3.8665 (3.7641) grad_norm 1.3302 (1.2396) [2022-10-07 19:35:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][400/1251] eta 0:04:42 lr 0.000831 time 0.3274 (0.3317) loss 3.9553 (3.7697) grad_norm 1.1126 (1.2348) [2022-10-07 19:36:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][500/1251] eta 0:04:08 lr 0.000831 time 0.3248 (0.3303) loss 3.9344 (3.7729) grad_norm 1.3494 (1.2340) [2022-10-07 19:37:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][600/1251] eta 0:03:34 lr 0.000830 time 0.3271 (0.3294) loss 3.8402 (3.7740) grad_norm 1.2496 (1.2395) [2022-10-07 19:37:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][700/1251] eta 0:03:01 lr 0.000830 time 0.3272 (0.3289) loss 3.6692 (3.7752) grad_norm 1.0019 (1.2355) 
[2022-10-07 19:38:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][800/1251] eta 0:02:28 lr 0.000830 time 0.3206 (0.3283) loss 3.8397 (3.7762) grad_norm 1.5985 (1.2337) [2022-10-07 19:38:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][900/1251] eta 0:01:55 lr 0.000830 time 0.3241 (0.3278) loss 4.0296 (3.7764) grad_norm 1.0888 (1.2308) [2022-10-07 19:39:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][1000/1251] eta 0:01:22 lr 0.000829 time 0.3226 (0.3274) loss 3.5426 (3.7753) grad_norm 1.1931 (1.2327) [2022-10-07 19:39:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][1100/1251] eta 0:00:49 lr 0.000829 time 0.3240 (0.3271) loss 4.0665 (3.7746) grad_norm 1.2706 (1.2315) [2022-10-07 19:40:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [81/300][1200/1251] eta 0:00:16 lr 0.000829 time 0.3266 (0.3269) loss 3.6164 (3.7761) grad_norm 1.1674 (1.2332) [2022-10-07 19:40:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 81 training takes 0:06:49 [2022-10-07 19:40:35 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.204 (2.204) Loss 1.1676 (1.1676) Acc@1 72.363 (72.363) Acc@5 92.188 (92.188) [2022-10-07 19:40:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.350 Acc@5 91.534 [2022-10-07 19:40:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-10-07 19:40:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 19:40:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][0/1251] eta 0:59:46 lr 0.000828 time 2.8668 (2.8668) loss 3.5248 (3.5248) grad_norm 1.0196 (1.0196) [2022-10-07 19:41:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][100/1251] eta 0:06:47 lr 0.000828 time 0.3259 (0.3536) loss 3.9009 (3.7552) grad_norm 1.3267 (1.2438) [2022-10-07 19:41:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[82/300][200/1251] eta 0:05:57 lr 0.000828 time 0.3274 (0.3405) loss 3.4348 (3.7624) grad_norm 1.1974 (1.2266) [2022-10-07 19:42:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][300/1251] eta 0:05:19 lr 0.000828 time 0.3251 (0.3357) loss 3.9032 (3.7669) grad_norm 1.3595 (1.2299) [2022-10-07 19:43:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][400/1251] eta 0:04:43 lr 0.000827 time 0.3239 (0.3330) loss 3.5237 (3.7733) grad_norm 1.4465 (1.2265) [2022-10-07 19:43:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][500/1251] eta 0:04:08 lr 0.000827 time 0.3248 (0.3315) loss 3.5645 (3.7774) grad_norm 1.2264 (1.2310) [2022-10-07 19:44:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][600/1251] eta 0:03:35 lr 0.000827 time 0.3281 (0.3305) loss 3.7428 (3.7723) grad_norm 1.2348 (1.2294) [2022-10-07 19:44:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][700/1251] eta 0:03:01 lr 0.000826 time 0.3323 (0.3298) loss 3.8718 (3.7742) grad_norm 1.1577 (1.2307) [2022-10-07 19:45:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][800/1251] eta 0:02:28 lr 0.000826 time 0.3193 (0.3294) loss 3.4190 (3.7765) grad_norm 1.1046 (1.2276) [2022-10-07 19:45:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][900/1251] eta 0:01:55 lr 0.000826 time 0.3268 (0.3291) loss 3.9140 (3.7784) grad_norm 1.3155 (1.2285) [2022-10-07 19:46:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][1000/1251] eta 0:01:22 lr 0.000825 time 0.3249 (0.3288) loss 3.4932 (3.7829) grad_norm 1.1751 (1.2317) [2022-10-07 19:46:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][1100/1251] eta 0:00:49 lr 0.000825 time 0.3271 (0.3287) loss 3.6424 (3.7835) grad_norm 1.1581 (1.2308) [2022-10-07 19:47:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [82/300][1200/1251] eta 0:00:16 lr 0.000825 time 0.3333 (0.3286) loss 3.9695 (3.7841) grad_norm 1.2895 
(1.2311) [2022-10-07 19:47:38 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 82 training takes 0:06:51 [2022-10-07 19:47:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.144 (3.144) Loss 1.2538 (1.2538) Acc@1 70.703 (70.703) Acc@5 90.918 (90.918) [2022-10-07 19:47:51 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.528 Acc@5 91.564 [2022-10-07 19:47:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-07 19:47:51 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 19:47:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][0/1251] eta 1:04:14 lr 0.000825 time 3.0813 (3.0813) loss 3.5638 (3.5638) grad_norm 1.1488 (1.1488) [2022-10-07 19:48:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][100/1251] eta 0:06:48 lr 0.000824 time 0.3339 (0.3550) loss 3.6901 (3.7617) grad_norm 1.4809 (1.2076) [2022-10-07 19:49:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][200/1251] eta 0:05:58 lr 0.000824 time 0.3237 (0.3409) loss 3.9542 (3.7637) grad_norm 1.2321 (1.2075) [2022-10-07 19:49:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][300/1251] eta 0:05:19 lr 0.000824 time 0.3296 (0.3360) loss 3.8846 (3.7670) grad_norm 1.2203 (1.2190) [2022-10-07 19:50:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][400/1251] eta 0:04:43 lr 0.000823 time 0.3240 (0.3335) loss 3.7769 (3.7652) grad_norm 1.1354 (1.2243) [2022-10-07 19:50:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][500/1251] eta 0:04:09 lr 0.000823 time 0.3257 (0.3320) loss 3.6955 (3.7631) grad_norm 1.1256 (1.2283) [2022-10-07 19:51:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][600/1251] eta 0:03:35 lr 0.000823 time 0.3244 (0.3309) loss 3.4737 (3.7576) grad_norm 1.2600 (1.2248) [2022-10-07 19:51:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[83/300][700/1251] eta 0:03:01 lr 0.000822 time 0.3235 (0.3302) loss 3.6415 (3.7639) grad_norm 1.0424 (1.2258) [2022-10-07 19:52:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][800/1251] eta 0:02:28 lr 0.000822 time 0.3330 (0.3296) loss 3.6205 (3.7633) grad_norm 1.2071 (1.2266) [2022-10-07 19:52:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][900/1251] eta 0:01:55 lr 0.000822 time 0.3257 (0.3292) loss 3.7843 (3.7669) grad_norm 1.0531 (1.2247) [2022-10-07 19:53:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][1000/1251] eta 0:01:22 lr 0.000821 time 0.3246 (0.3288) loss 3.5893 (3.7674) grad_norm 1.3621 (1.2231) [2022-10-07 19:53:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][1100/1251] eta 0:00:49 lr 0.000821 time 0.3308 (0.3285) loss 3.9441 (3.7689) grad_norm 1.1191 (1.2270) [2022-10-07 19:54:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [83/300][1200/1251] eta 0:00:16 lr 0.000821 time 0.3234 (0.3283) loss 3.3832 (3.7711) grad_norm 1.2156 (1.2251) [2022-10-07 19:54:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 83 training takes 0:06:50 [2022-10-07 19:54:44 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.367 (2.367) Loss 1.0844 (1.0844) Acc@1 73.926 (73.926) Acc@5 92.578 (92.578) [2022-10-07 19:54:56 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.622 Acc@5 91.564 [2022-10-07 19:54:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-10-07 19:54:56 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 19:54:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][0/1251] eta 1:01:15 lr 0.000821 time 2.9377 (2.9377) loss 3.9361 (3.9361) grad_norm 1.0665 (1.0665) [2022-10-07 19:55:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][100/1251] eta 0:06:44 lr 0.000820 time 0.3203 (0.3518) loss 3.5368 (3.7588) 
grad_norm 1.0930 (1.2344) [2022-10-07 19:56:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][200/1251] eta 0:05:55 lr 0.000820 time 0.3269 (0.3385) loss 4.0441 (3.7566) grad_norm 1.2133 (1.2389) [2022-10-07 19:56:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][300/1251] eta 0:05:17 lr 0.000820 time 0.3252 (0.3341) loss 3.5114 (3.7616) grad_norm 1.2801 (1.2472) [2022-10-07 19:57:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][400/1251] eta 0:04:42 lr 0.000819 time 0.3205 (0.3318) loss 3.9798 (3.7643) grad_norm 1.0762 (1.2535) [2022-10-07 19:57:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][500/1251] eta 0:04:08 lr 0.000819 time 0.3505 (0.3305) loss 3.8419 (3.7687) grad_norm 1.2133 (1.2474) [2022-10-07 19:58:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][600/1251] eta 0:03:34 lr 0.000819 time 0.3286 (0.3298) loss 3.8469 (3.7697) grad_norm 1.3448 (1.2411) [2022-10-07 19:58:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][700/1251] eta 0:03:01 lr 0.000818 time 0.3282 (0.3291) loss 3.5931 (3.7712) grad_norm 1.1254 (1.2349) [2022-10-07 19:59:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][800/1251] eta 0:02:28 lr 0.000818 time 0.3227 (0.3284) loss 3.7710 (3.7690) grad_norm 1.3854 (1.2319) [2022-10-07 19:59:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][900/1251] eta 0:01:55 lr 0.000818 time 0.3253 (0.3279) loss 4.1859 (3.7746) grad_norm 1.2246 (1.2382) [2022-10-07 20:00:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][1000/1251] eta 0:01:22 lr 0.000817 time 0.3226 (0.3275) loss 3.3761 (3.7748) grad_norm 1.4425 (1.2371) [2022-10-07 20:00:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [84/300][1100/1251] eta 0:00:49 lr 0.000817 time 0.3244 (0.3272) loss 3.4415 (3.7710) grad_norm 1.1505 (1.2360) [2022-10-07 20:01:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[84/300][1200/1251] eta 0:00:16 lr 0.000817 time 0.3270 (0.3269) loss 3.9550 (3.7705) grad_norm 1.0994 (1.2375) [2022-10-07 20:01:45 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 84 training takes 0:06:49 [2022-10-07 20:01:48 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.212 (3.212) Loss 1.1393 (1.1393) Acc@1 74.512 (74.512) Acc@5 91.895 (91.895) [2022-10-07 20:01:59 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.678 Acc@5 91.490 [2022-10-07 20:01:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-07 20:01:59 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 20:02:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][0/1251] eta 0:47:13 lr 0.000817 time 2.2650 (2.2650) loss 3.5960 (3.5960) grad_norm 1.2763 (1.2763) [2022-10-07 20:02:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][100/1251] eta 0:06:38 lr 0.000816 time 0.3246 (0.3465) loss 3.6883 (3.7503) grad_norm 1.1994 (1.2264) [2022-10-07 20:03:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][200/1251] eta 0:05:53 lr 0.000816 time 0.3222 (0.3361) loss 3.7369 (3.7554) grad_norm 1.5628 (1.2383) [2022-10-07 20:03:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][300/1251] eta 0:05:16 lr 0.000816 time 0.3289 (0.3325) loss 3.9142 (3.7524) grad_norm 1.4703 (1.2463) [2022-10-07 20:04:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][400/1251] eta 0:04:41 lr 0.000815 time 0.3279 (0.3307) loss 3.9666 (3.7501) grad_norm 1.3307 (1.2462) [2022-10-07 20:04:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][500/1251] eta 0:04:07 lr 0.000815 time 0.3245 (0.3296) loss 3.4456 (3.7540) grad_norm 1.5479 (1.2441) [2022-10-07 20:05:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][600/1251] eta 0:03:34 lr 0.000815 time 0.3270 (0.3290) loss 3.6939 (3.7559) 
grad_norm 1.1269 (1.2468) [2022-10-07 20:05:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][700/1251] eta 0:03:01 lr 0.000814 time 0.3218 (0.3287) loss 4.1215 (3.7582) grad_norm 1.2230 (1.2415) [2022-10-07 20:06:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][800/1251] eta 0:02:28 lr 0.000814 time 0.3225 (0.3286) loss 3.7747 (3.7595) grad_norm 1.4629 (1.2411) [2022-10-07 20:06:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][900/1251] eta 0:01:55 lr 0.000814 time 0.3313 (0.3285) loss 3.5243 (3.7639) grad_norm 1.0553 (1.2415) [2022-10-07 20:07:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][1000/1251] eta 0:01:22 lr 0.000813 time 0.3244 (0.3286) loss 3.9512 (3.7607) grad_norm 1.2221 (1.2370) [2022-10-07 20:08:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][1100/1251] eta 0:00:49 lr 0.000813 time 0.3239 (0.3285) loss 3.6287 (3.7625) grad_norm 1.0841 (1.2387) [2022-10-07 20:08:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [85/300][1200/1251] eta 0:00:16 lr 0.000813 time 0.3293 (0.3286) loss 3.8761 (3.7642) grad_norm 1.2643 (1.2380) [2022-10-07 20:08:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 85 training takes 0:06:51 [2022-10-07 20:08:53 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.350 (2.350) Loss 1.2117 (1.2117) Acc@1 72.363 (72.363) Acc@5 91.602 (91.602) [2022-10-07 20:09:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.688 Acc@5 91.636 [2022-10-07 20:09:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-07 20:09:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.72% [2022-10-07 20:09:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][0/1251] eta 1:05:29 lr 0.000812 time 3.1413 (3.1413) loss 3.9473 (3.9473) grad_norm 1.1321 (1.1321) [2022-10-07 20:09:40 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [86/300][100/1251] eta 0:06:49 lr 0.000812 time 0.3277 (0.3561) loss 3.8742 (3.7547) grad_norm 1.1863 (1.2253) [2022-10-07 20:10:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][200/1251] eta 0:05:59 lr 0.000812 time 0.3250 (0.3419) loss 3.4831 (3.7552) grad_norm 1.1327 (1.2281) [2022-10-07 20:10:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][300/1251] eta 0:05:20 lr 0.000811 time 0.3301 (0.3369) loss 3.8879 (3.7682) grad_norm 1.4549 (1.2268) [2022-10-07 20:11:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][400/1251] eta 0:04:44 lr 0.000811 time 0.3248 (0.3343) loss 3.7446 (3.7617) grad_norm 1.1085 (1.2296) [2022-10-07 20:11:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][500/1251] eta 0:04:09 lr 0.000811 time 0.3279 (0.3327) loss 3.4685 (3.7589) grad_norm 1.1112 (1.2336) [2022-10-07 20:12:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][600/1251] eta 0:03:35 lr 0.000811 time 0.3271 (0.3316) loss 4.0556 (3.7617) grad_norm 1.3904 (1.2338) [2022-10-07 20:12:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][700/1251] eta 0:03:02 lr 0.000810 time 0.3244 (0.3309) loss 3.3920 (3.7612) grad_norm 1.0333 (1.2369) [2022-10-07 20:13:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][800/1251] eta 0:02:28 lr 0.000810 time 0.3271 (0.3303) loss 3.5772 (3.7630) grad_norm 1.2262 (1.2375) [2022-10-07 20:14:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][900/1251] eta 0:01:55 lr 0.000810 time 0.3243 (0.3299) loss 3.8082 (3.7619) grad_norm 1.1990 (1.2388) [2022-10-07 20:14:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][1000/1251] eta 0:01:22 lr 0.000809 time 0.3275 (0.3295) loss 3.7518 (3.7605) grad_norm 1.0447 (1.2396) [2022-10-07 20:15:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][1100/1251] eta 0:00:49 lr 0.000809 time 0.3282 (0.3292) loss 3.7830 (3.7591) 
grad_norm 1.2353 (1.2391) [2022-10-07 20:15:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [86/300][1200/1251] eta 0:00:16 lr 0.000809 time 0.3297 (0.3290) loss 4.1188 (3.7593) grad_norm 1.2928 (1.2395) [2022-10-07 20:15:56 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 86 training takes 0:06:51 [2022-10-07 20:15:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.856 (2.856) Loss 1.1567 (1.1567) Acc@1 73.340 (73.340) Acc@5 92.383 (92.383) [2022-10-07 20:16:09 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.952 Acc@5 91.690 [2022-10-07 20:16:09 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-07 20:16:09 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 72.95% [2022-10-07 20:16:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][0/1251] eta 1:03:45 lr 0.000808 time 3.0579 (3.0579) loss 3.7555 (3.7555) grad_norm 1.1105 (1.1105) [2022-10-07 20:16:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][100/1251] eta 0:06:48 lr 0.000808 time 0.3256 (0.3548) loss 3.9386 (3.7321) grad_norm 1.0264 (1.2348) [2022-10-07 20:17:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][200/1251] eta 0:05:57 lr 0.000808 time 0.3255 (0.3405) loss 3.8715 (3.7518) grad_norm 1.3222 (1.2570) [2022-10-07 20:17:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][300/1251] eta 0:05:18 lr 0.000807 time 0.3221 (0.3353) loss 3.6429 (3.7521) grad_norm 1.0604 (1.2538) [2022-10-07 20:18:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][400/1251] eta 0:04:43 lr 0.000807 time 0.3283 (0.3331) loss 3.2692 (3.7490) grad_norm 1.2492 (1.2674) [2022-10-07 20:18:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][500/1251] eta 0:04:08 lr 0.000807 time 0.3183 (0.3314) loss 4.0028 (3.7536) grad_norm 1.1728 (1.2652) [2022-10-07 20:19:28 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [87/300][600/1251] eta 0:03:35 lr 0.000806 time 0.3277 (0.3303) loss 3.5169 (3.7522) grad_norm 1.2306 (1.2679) [2022-10-07 20:20:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][700/1251] eta 0:03:01 lr 0.000806 time 0.3231 (0.3294) loss 3.6615 (3.7526) grad_norm 1.1328 (1.2715) [2022-10-07 20:20:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][800/1251] eta 0:02:28 lr 0.000806 time 0.3249 (0.3287) loss 3.9515 (3.7530) grad_norm 1.3879 (1.2731) [2022-10-07 20:21:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][900/1251] eta 0:01:55 lr 0.000805 time 0.3220 (0.3281) loss 3.6376 (3.7502) grad_norm 1.3252 (1.2675) [2022-10-07 20:21:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][1000/1251] eta 0:01:22 lr 0.000805 time 0.3269 (0.3278) loss 3.5712 (3.7517) grad_norm 1.2566 (1.2659) [2022-10-07 20:22:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][1100/1251] eta 0:00:49 lr 0.000805 time 0.3318 (0.3276) loss 3.6056 (3.7533) grad_norm 1.1313 (1.2615) [2022-10-07 20:22:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [87/300][1200/1251] eta 0:00:16 lr 0.000804 time 0.3291 (0.3274) loss 3.8182 (3.7566) grad_norm 1.1185 (1.2596) [2022-10-07 20:22:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 87 training takes 0:06:49 [2022-10-07 20:23:01 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.144 (2.144) Loss 1.1736 (1.1736) Acc@1 72.656 (72.656) Acc@5 91.211 (91.211) [2022-10-07 20:23:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.030 Acc@5 91.792 [2022-10-07 20:23:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-07 20:23:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.03% [2022-10-07 20:23:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][0/1251] eta 0:54:52 lr 0.000804 time 2.6319 (2.6319) loss 
3.7245 (3.7245) grad_norm 1.3063 (1.3063) [2022-10-07 20:23:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][100/1251] eta 0:06:41 lr 0.000804 time 0.3226 (0.3491) loss 3.7613 (3.7335) grad_norm 1.1085 (1.2659) [2022-10-07 20:24:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][200/1251] eta 0:05:53 lr 0.000804 time 0.3299 (0.3368) loss 3.8385 (3.7331) grad_norm 1.2761 (1.2510) [2022-10-07 20:24:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][300/1251] eta 0:05:16 lr 0.000803 time 0.3216 (0.3326) loss 3.8472 (3.7366) grad_norm 1.1595 (1.2464) [2022-10-07 20:25:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][400/1251] eta 0:04:41 lr 0.000803 time 0.3314 (0.3304) loss 3.5818 (3.7385) grad_norm 1.2027 (1.2459) [2022-10-07 20:25:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][500/1251] eta 0:04:07 lr 0.000803 time 0.3295 (0.3292) loss 3.8151 (3.7407) grad_norm 1.4124 (1.2466) [2022-10-07 20:26:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][600/1251] eta 0:03:33 lr 0.000802 time 0.3290 (0.3284) loss 3.9361 (3.7418) grad_norm 1.1789 (1.2473) [2022-10-07 20:27:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][700/1251] eta 0:03:00 lr 0.000802 time 0.3231 (0.3280) loss 4.0138 (3.7424) grad_norm 1.2896 (1.2483) [2022-10-07 20:27:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][800/1251] eta 0:02:27 lr 0.000802 time 0.3275 (0.3277) loss 3.5672 (3.7437) grad_norm 1.2200 (1.2517) [2022-10-07 20:28:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][900/1251] eta 0:01:54 lr 0.000801 time 0.3213 (0.3276) loss 3.5908 (3.7478) grad_norm 1.3436 (1.2508) [2022-10-07 20:28:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][1000/1251] eta 0:01:22 lr 0.000801 time 0.3237 (0.3275) loss 3.7531 (3.7499) grad_norm 1.2624 (1.2515) [2022-10-07 20:29:13 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [88/300][1100/1251] eta 0:00:49 lr 0.000801 time 0.3296 (0.3276) loss 3.6177 (3.7500) grad_norm 1.1372 (1.2541) [2022-10-07 20:29:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [88/300][1200/1251] eta 0:00:16 lr 0.000800 time 0.3319 (0.3277) loss 3.5504 (3.7514) grad_norm 1.2775 (1.2556) [2022-10-07 20:30:03 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 88 training takes 0:06:50 [2022-10-07 20:30:06 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.861 (2.861) Loss 1.1541 (1.1541) Acc@1 74.121 (74.121) Acc@5 91.992 (91.992) [2022-10-07 20:30:17 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.908 Acc@5 91.674 [2022-10-07 20:30:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-10-07 20:30:17 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.03% [2022-10-07 20:30:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][0/1251] eta 0:51:19 lr 0.000800 time 2.4618 (2.4618) loss 3.7398 (3.7398) grad_norm 1.0731 (1.0731) [2022-10-07 20:30:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][100/1251] eta 0:06:40 lr 0.000800 time 0.3222 (0.3481) loss 3.6112 (3.7080) grad_norm 1.0258 (1.2466) [2022-10-07 20:31:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][200/1251] eta 0:05:53 lr 0.000799 time 0.3244 (0.3363) loss 3.6064 (3.7307) grad_norm 1.2174 (1.2595) [2022-10-07 20:31:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][300/1251] eta 0:05:16 lr 0.000799 time 0.3281 (0.3323) loss 3.7578 (3.7361) grad_norm 1.4949 (1.2577) [2022-10-07 20:32:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][400/1251] eta 0:04:41 lr 0.000799 time 0.3253 (0.3305) loss 3.7732 (3.7442) grad_norm 1.1471 (1.2658) [2022-10-07 20:33:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][500/1251] eta 0:04:07 lr 0.000798 time 0.3241 (0.3294) loss 4.1375 
(3.7452) grad_norm 1.0144 (1.2675) [2022-10-07 20:33:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][600/1251] eta 0:03:33 lr 0.000798 time 0.3286 (0.3286) loss 3.8866 (3.7467) grad_norm 1.0756 (1.2634) [2022-10-07 20:34:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][700/1251] eta 0:03:00 lr 0.000798 time 0.3227 (0.3280) loss 3.7572 (3.7463) grad_norm 1.2777 (1.2641) [2022-10-07 20:34:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][800/1251] eta 0:02:27 lr 0.000797 time 0.3262 (0.3276) loss 3.5122 (3.7465) grad_norm 1.1693 (1.2657) [2022-10-07 20:35:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][900/1251] eta 0:01:54 lr 0.000797 time 0.3289 (0.3273) loss 4.1505 (3.7483) grad_norm 1.2335 (1.2632) [2022-10-07 20:35:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][1000/1251] eta 0:01:22 lr 0.000797 time 0.3285 (0.3271) loss 3.5516 (3.7478) grad_norm 1.1693 (1.2623) [2022-10-07 20:36:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][1100/1251] eta 0:00:49 lr 0.000796 time 0.3240 (0.3270) loss 3.8113 (3.7488) grad_norm 1.2781 (1.2624) [2022-10-07 20:36:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [89/300][1200/1251] eta 0:00:16 lr 0.000796 time 0.3250 (0.3270) loss 3.7536 (3.7471) grad_norm 1.5340 (1.2620) [2022-10-07 20:37:06 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 89 training takes 0:06:49 [2022-10-07 20:37:09 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.167 (3.167) Loss 1.1910 (1.1910) Acc@1 74.316 (74.316) Acc@5 90.918 (90.918) [2022-10-07 20:37:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 72.948 Acc@5 91.802 [2022-10-07 20:37:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-10-07 20:37:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.03% [2022-10-07 20:37:23 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][0/1251] eta 1:08:11 lr 0.000796 time 3.2702 (3.2702) loss 3.7594 (3.7594) grad_norm 1.2568 (1.2568) [2022-10-07 20:37:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][100/1251] eta 0:06:48 lr 0.000796 time 0.3220 (0.3547) loss 3.8469 (3.7094) grad_norm 1.4041 (1.2351) [2022-10-07 20:38:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][200/1251] eta 0:05:57 lr 0.000795 time 0.3255 (0.3400) loss 3.7476 (3.7337) grad_norm 1.2462 (1.2431) [2022-10-07 20:39:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][300/1251] eta 0:05:19 lr 0.000795 time 0.3291 (0.3358) loss 3.5774 (3.7340) grad_norm 1.1444 (1.2371) [2022-10-07 20:39:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][400/1251] eta 0:04:43 lr 0.000795 time 0.3240 (0.3332) loss 3.7968 (3.7385) grad_norm 1.7829 (1.2418) [2022-10-07 20:40:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][500/1251] eta 0:04:08 lr 0.000794 time 0.3284 (0.3315) loss 3.6169 (3.7454) grad_norm 1.3291 (1.2432) [2022-10-07 20:40:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][600/1251] eta 0:03:35 lr 0.000794 time 0.3223 (0.3304) loss 3.5253 (3.7458) grad_norm 1.2793 (1.2472) [2022-10-07 20:41:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][700/1251] eta 0:03:01 lr 0.000794 time 0.3207 (0.3295) loss 3.5439 (3.7492) grad_norm 1.4467 (1.2524) [2022-10-07 20:41:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][800/1251] eta 0:02:28 lr 0.000793 time 0.3232 (0.3288) loss 3.6703 (3.7481) grad_norm 1.1584 (1.2498) [2022-10-07 20:42:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][900/1251] eta 0:01:55 lr 0.000793 time 0.3219 (0.3283) loss 3.6278 (3.7503) grad_norm 1.4771 (1.2516) [2022-10-07 20:42:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][1000/1251] eta 0:01:22 lr 0.000793 time 
0.3322 (0.3280) loss 3.7900 (3.7521) grad_norm 1.3062 (1.2532) [2022-10-07 20:43:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][1100/1251] eta 0:00:49 lr 0.000792 time 0.3214 (0.3276) loss 4.0056 (3.7537) grad_norm 1.1760 (1.2517) [2022-10-07 20:43:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [90/300][1200/1251] eta 0:00:16 lr 0.000792 time 0.3256 (0.3274) loss 3.5328 (3.7542) grad_norm 0.9980 (1.2520) [2022-10-07 20:44:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 90 training takes 0:06:49 [2022-10-07 20:44:09 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_90 saving...... [2022-10-07 20:44:10 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_90 saved !!! [2022-10-07 20:44:13 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.014 (3.014) Loss 1.1238 (1.1238) Acc@1 73.145 (73.145) Acc@5 93.164 (93.164) [2022-10-07 20:44:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.296 Acc@5 91.714 [2022-10-07 20:44:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-10-07 20:44:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.30% [2022-10-07 20:44:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][0/1251] eta 1:08:16 lr 0.000792 time 3.2746 (3.2746) loss 3.7563 (3.7563) grad_norm 1.1452 (1.1452) [2022-10-07 20:44:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][100/1251] eta 0:06:49 lr 0.000791 time 0.3275 (0.3555) loss 3.7400 (3.7415) grad_norm 1.2269 (1.2445) [2022-10-07 20:45:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][200/1251] eta 0:05:58 lr 0.000791 time 0.3250 (0.3410) loss 3.7914 (3.7479) grad_norm 1.6044 (1.2577) [2022-10-07 20:46:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][300/1251] eta 0:05:19 lr 
0.000791 time 0.3218 (0.3359) loss 3.9219 (3.7448) grad_norm 1.3430 (1.2734) [2022-10-07 20:46:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][400/1251] eta 0:04:43 lr 0.000790 time 0.3286 (0.3335) loss 3.5578 (3.7524) grad_norm 1.2747 (1.2703) [2022-10-07 20:47:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][500/1251] eta 0:04:09 lr 0.000790 time 0.3248 (0.3322) loss 3.4311 (3.7496) grad_norm 1.0944 (1.2694) [2022-10-07 20:47:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][600/1251] eta 0:03:35 lr 0.000790 time 0.3230 (0.3313) loss 3.9531 (3.7557) grad_norm 1.3017 (1.2646) [2022-10-07 20:48:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][700/1251] eta 0:03:02 lr 0.000789 time 0.3289 (0.3307) loss 3.7201 (3.7524) grad_norm 1.3808 (1.2679) [2022-10-07 20:48:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][800/1251] eta 0:02:28 lr 0.000789 time 0.3225 (0.3302) loss 3.7681 (3.7525) grad_norm 0.9976 (1.2713) [2022-10-07 20:49:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][900/1251] eta 0:01:55 lr 0.000789 time 0.3345 (0.3298) loss 3.9433 (3.7517) grad_norm 1.1870 (1.2674) [2022-10-07 20:49:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][1000/1251] eta 0:01:22 lr 0.000788 time 0.3274 (0.3296) loss 3.8399 (3.7537) grad_norm 1.5178 (1.2632) [2022-10-07 20:50:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][1100/1251] eta 0:00:49 lr 0.000788 time 0.3354 (0.3294) loss 3.7013 (3.7557) grad_norm 1.2916 (1.2616) [2022-10-07 20:50:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [91/300][1200/1251] eta 0:00:16 lr 0.000788 time 0.3398 (0.3293) loss 3.5336 (3.7547) grad_norm 1.1029 (1.2632) [2022-10-07 20:51:15 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 91 training takes 0:06:52 [2022-10-07 20:51:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.342 (3.342) Loss 1.1265 
(1.1265) Acc@1 73.242 (73.242) Acc@5 91.406 (91.406) [2022-10-07 20:51:29 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.438 Acc@5 91.840 [2022-10-07 20:51:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-07 20:51:29 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.44% [2022-10-07 20:51:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][0/1251] eta 1:06:03 lr 0.000788 time 3.1684 (3.1684) loss 3.6879 (3.6879) grad_norm 1.0525 (1.0525) [2022-10-07 20:52:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][100/1251] eta 0:06:49 lr 0.000787 time 0.3251 (0.3557) loss 3.7225 (3.7312) grad_norm 1.5516 (1.2513) [2022-10-07 20:52:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][200/1251] eta 0:05:58 lr 0.000787 time 0.3234 (0.3415) loss 3.7305 (3.7408) grad_norm 1.0586 (1.2596) [2022-10-07 20:53:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][300/1251] eta 0:05:20 lr 0.000786 time 0.3314 (0.3368) loss 3.8556 (3.7411) grad_norm 1.0197 (1.2525) [2022-10-07 20:53:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][400/1251] eta 0:04:44 lr 0.000786 time 0.3263 (0.3341) loss 3.5652 (3.7419) grad_norm 1.1871 (1.2538) [2022-10-07 20:54:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][500/1251] eta 0:04:09 lr 0.000786 time 0.3285 (0.3322) loss 3.9661 (3.7478) grad_norm 1.1167 (1.2565) [2022-10-07 20:54:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][600/1251] eta 0:03:35 lr 0.000785 time 0.3201 (0.3310) loss 3.9550 (3.7452) grad_norm 1.1046 (1.2555) [2022-10-07 20:55:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][700/1251] eta 0:03:01 lr 0.000785 time 0.3239 (0.3301) loss 3.7122 (3.7428) grad_norm 1.2247 (1.2578) [2022-10-07 20:55:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][800/1251] eta 0:02:28 lr 
0.000785 time 0.3331 (0.3294) loss 4.0704 (3.7376) grad_norm 1.1195 (1.2566) [2022-10-07 20:56:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][900/1251] eta 0:01:55 lr 0.000784 time 0.3278 (0.3289) loss 4.1923 (3.7411) grad_norm 1.2775 (1.2578) [2022-10-07 20:56:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][1000/1251] eta 0:01:22 lr 0.000784 time 0.3243 (0.3286) loss 4.0461 (3.7405) grad_norm 1.1401 (1.2570) [2022-10-07 20:57:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][1100/1251] eta 0:00:49 lr 0.000784 time 0.3268 (0.3284) loss 3.7187 (3.7389) grad_norm 1.1033 (1.2564) [2022-10-07 20:58:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [92/300][1200/1251] eta 0:00:16 lr 0.000783 time 0.3270 (0.3283) loss 3.5306 (3.7402) grad_norm 1.3517 (1.2566) [2022-10-07 20:58:20 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 92 training takes 0:06:50 [2022-10-07 20:58:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.519 (2.519) Loss 1.1168 (1.1168) Acc@1 75.195 (75.195) Acc@5 92.285 (92.285) [2022-10-07 20:58:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.336 Acc@5 91.774 [2022-10-07 20:58:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-10-07 20:58:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.44% [2022-10-07 20:58:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][0/1251] eta 0:52:31 lr 0.000783 time 2.5192 (2.5192) loss 3.8830 (3.8830) grad_norm 1.1441 (1.1441) [2022-10-07 20:59:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][100/1251] eta 0:06:40 lr 0.000783 time 0.3289 (0.3477) loss 3.9594 (3.7577) grad_norm 1.2131 (1.2297) [2022-10-07 20:59:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][200/1251] eta 0:05:54 lr 0.000783 time 0.3239 (0.3369) loss 3.9008 (3.7513) grad_norm 1.3837 (1.2569) 
[2022-10-07 21:00:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][300/1251] eta 0:05:16 lr 0.000782 time 0.3246 (0.3328) loss 3.8057 (3.7452) grad_norm 1.4269 (1.2581) [2022-10-07 21:00:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][400/1251] eta 0:04:41 lr 0.000782 time 0.3209 (0.3306) loss 4.0312 (3.7387) grad_norm 1.5423 (1.2646) [2022-10-07 21:01:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][500/1251] eta 0:04:07 lr 0.000782 time 0.3261 (0.3293) loss 3.7124 (3.7403) grad_norm 1.1831 (1.2698) [2022-10-07 21:01:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][600/1251] eta 0:03:33 lr 0.000781 time 0.3229 (0.3282) loss 3.7429 (3.7387) grad_norm 1.2265 (1.2691) [2022-10-07 21:02:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][700/1251] eta 0:03:00 lr 0.000781 time 0.3181 (0.3275) loss 3.8485 (3.7390) grad_norm 1.2145 (1.2677) [2022-10-07 21:02:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][800/1251] eta 0:02:27 lr 0.000780 time 0.3258 (0.3270) loss 3.4568 (3.7400) grad_norm 1.2277 (1.2685) [2022-10-07 21:03:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][900/1251] eta 0:01:54 lr 0.000780 time 0.3238 (0.3267) loss 3.8793 (3.7418) grad_norm 1.2062 (1.2632) [2022-10-07 21:04:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][1000/1251] eta 0:01:21 lr 0.000780 time 0.3212 (0.3263) loss 4.0367 (3.7407) grad_norm 1.3246 (1.2673) [2022-10-07 21:04:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][1100/1251] eta 0:00:49 lr 0.000779 time 0.3238 (0.3260) loss 3.4233 (3.7401) grad_norm 1.1882 (1.2671) [2022-10-07 21:05:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [93/300][1200/1251] eta 0:00:16 lr 0.000779 time 0.3224 (0.3257) loss 3.5725 (3.7400) grad_norm 1.0846 (1.2689) [2022-10-07 21:05:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 93 training takes 0:06:47 
[2022-10-07 21:05:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.456 (2.456) Loss 1.1141 (1.1141) Acc@1 74.121 (74.121) Acc@5 92.285 (92.285) [2022-10-07 21:05:34 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.454 Acc@5 91.882 [2022-10-07 21:05:34 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-07 21:05:34 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.45% [2022-10-07 21:05:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][0/1251] eta 1:05:47 lr 0.000779 time 3.1556 (3.1556) loss 3.7501 (3.7501) grad_norm 1.1437 (1.1437) [2022-10-07 21:06:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][100/1251] eta 0:06:48 lr 0.000779 time 0.3246 (0.3546) loss 3.8150 (3.6981) grad_norm 1.2465 (1.2965) [2022-10-07 21:06:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][200/1251] eta 0:05:58 lr 0.000778 time 0.3246 (0.3407) loss 3.3581 (3.7069) grad_norm 1.4732 (1.2841) [2022-10-07 21:07:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][300/1251] eta 0:05:19 lr 0.000778 time 0.3301 (0.3359) loss 3.7505 (3.7051) grad_norm 0.9962 (1.2771) [2022-10-07 21:07:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][400/1251] eta 0:04:43 lr 0.000778 time 0.3234 (0.3333) loss 3.8269 (3.7215) grad_norm 1.0318 (1.2743) [2022-10-07 21:08:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][500/1251] eta 0:04:09 lr 0.000777 time 0.3240 (0.3317) loss 4.0332 (3.7147) grad_norm 1.5424 (1.2757) [2022-10-07 21:08:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][600/1251] eta 0:03:35 lr 0.000777 time 0.3245 (0.3305) loss 3.7941 (3.7209) grad_norm 1.2941 (1.2712) [2022-10-07 21:09:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][700/1251] eta 0:03:01 lr 0.000777 time 0.3252 (0.3296) loss 3.8296 (3.7240) grad_norm 1.1131 (1.2720) 
[2022-10-07 21:09:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][800/1251] eta 0:02:28 lr 0.000776 time 0.3264 (0.3290) loss 3.7505 (3.7242) grad_norm 1.4726 (1.2708) [2022-10-07 21:10:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][900/1251] eta 0:01:55 lr 0.000776 time 0.3216 (0.3286) loss 3.6919 (3.7259) grad_norm 1.0846 (1.2682) [2022-10-07 21:11:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][1000/1251] eta 0:01:22 lr 0.000775 time 0.3408 (0.3285) loss 3.7860 (3.7268) grad_norm 1.3006 (1.2734) [2022-10-07 21:11:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][1100/1251] eta 0:00:49 lr 0.000775 time 0.3218 (0.3284) loss 3.5616 (3.7274) grad_norm 1.4019 (1.2714) [2022-10-07 21:12:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [94/300][1200/1251] eta 0:00:16 lr 0.000775 time 0.3276 (0.3284) loss 3.9382 (3.7263) grad_norm 1.2725 (1.2696) [2022-10-07 21:12:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 94 training takes 0:06:51 [2022-10-07 21:12:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.301 (2.301) Loss 1.1255 (1.1255) Acc@1 73.438 (73.438) Acc@5 92.578 (92.578) [2022-10-07 21:12:39 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.220 Acc@5 91.964 [2022-10-07 21:12:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.2% [2022-10-07 21:12:39 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.45% [2022-10-07 21:12:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][0/1251] eta 0:51:54 lr 0.000775 time 2.4897 (2.4897) loss 3.8930 (3.8930) grad_norm 1.8679 (1.8679) [2022-10-07 21:13:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][100/1251] eta 0:06:41 lr 0.000774 time 0.3241 (0.3492) loss 3.5602 (3.7221) grad_norm 1.3469 (1.2791) [2022-10-07 21:13:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[95/300][200/1251] eta 0:05:54 lr 0.000774 time 0.3221 (0.3370) loss 3.5247 (3.7276) grad_norm 1.4135 (1.2892) [2022-10-07 21:14:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][300/1251] eta 0:05:16 lr 0.000774 time 0.3243 (0.3331) loss 3.7558 (3.7266) grad_norm 1.1804 (1.2800) [2022-10-07 21:14:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][400/1251] eta 0:04:41 lr 0.000773 time 0.3247 (0.3311) loss 4.1346 (3.7353) grad_norm 1.1420 (1.2782) [2022-10-07 21:15:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][500/1251] eta 0:04:07 lr 0.000773 time 0.3265 (0.3299) loss 3.7473 (3.7386) grad_norm 1.0575 (1.2744) [2022-10-07 21:15:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][600/1251] eta 0:03:34 lr 0.000773 time 0.3266 (0.3290) loss 3.4928 (3.7378) grad_norm 1.3307 (1.2709) [2022-10-07 21:16:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][700/1251] eta 0:03:00 lr 0.000772 time 0.3235 (0.3284) loss 3.6568 (3.7366) grad_norm 1.2198 (1.2725) [2022-10-07 21:17:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][800/1251] eta 0:02:27 lr 0.000772 time 0.3230 (0.3279) loss 3.8530 (3.7357) grad_norm 1.2901 (1.2684) [2022-10-07 21:17:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][900/1251] eta 0:01:54 lr 0.000771 time 0.3219 (0.3276) loss 3.4896 (3.7337) grad_norm 1.3804 (1.2712) [2022-10-07 21:18:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][1000/1251] eta 0:01:22 lr 0.000771 time 0.3258 (0.3275) loss 3.6952 (3.7332) grad_norm 1.1182 (1.2709) [2022-10-07 21:18:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][1100/1251] eta 0:00:49 lr 0.000771 time 0.3265 (0.3274) loss 3.9426 (3.7327) grad_norm 1.2028 (1.2711) [2022-10-07 21:19:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [95/300][1200/1251] eta 0:00:16 lr 0.000770 time 0.3361 (0.3275) loss 3.4819 (3.7320) grad_norm 1.2567 
(1.2699) [2022-10-07 21:19:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 95 training takes 0:06:50 [2022-10-07 21:19:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.457 (2.457) Loss 1.1425 (1.1425) Acc@1 72.559 (72.559) Acc@5 91.797 (91.797) [2022-10-07 21:19:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.432 Acc@5 92.122 [2022-10-07 21:19:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-07 21:19:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.45% [2022-10-07 21:19:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][0/1251] eta 1:04:48 lr 0.000770 time 3.1085 (3.1085) loss 4.1324 (4.1324) grad_norm 1.1955 (1.1955) [2022-10-07 21:20:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][100/1251] eta 0:06:48 lr 0.000770 time 0.3250 (0.3552) loss 4.0529 (3.7064) grad_norm 1.3339 (1.2582) [2022-10-07 21:20:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][200/1251] eta 0:05:57 lr 0.000770 time 0.3247 (0.3399) loss 3.6570 (3.7243) grad_norm 1.2422 (1.2577) [2022-10-07 21:21:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][300/1251] eta 0:05:18 lr 0.000769 time 0.3255 (0.3348) loss 3.3755 (3.7154) grad_norm 1.2822 (1.2610) [2022-10-07 21:21:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][400/1251] eta 0:04:42 lr 0.000769 time 0.3210 (0.3322) loss 3.5479 (3.7229) grad_norm 1.4357 (1.2628) [2022-10-07 21:22:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][500/1251] eta 0:04:08 lr 0.000768 time 0.3248 (0.3305) loss 3.4060 (3.7163) grad_norm 1.1954 (1.2676) [2022-10-07 21:23:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][600/1251] eta 0:03:34 lr 0.000768 time 0.3235 (0.3295) loss 4.0069 (3.7188) grad_norm 1.1324 (1.2711) [2022-10-07 21:23:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[96/300][700/1251] eta 0:03:01 lr 0.000768 time 0.3262 (0.3288) loss 3.5187 (3.7227) grad_norm 1.3822 (1.2730) [2022-10-07 21:24:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][800/1251] eta 0:02:28 lr 0.000767 time 0.3251 (0.3284) loss 3.8191 (3.7250) grad_norm 1.3056 (1.2702) [2022-10-07 21:24:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][900/1251] eta 0:01:55 lr 0.000767 time 0.3238 (0.3280) loss 3.7136 (3.7249) grad_norm 1.4370 (1.2719) [2022-10-07 21:25:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][1000/1251] eta 0:01:22 lr 0.000767 time 0.3271 (0.3277) loss 3.6947 (3.7251) grad_norm 1.2065 (1.2715) [2022-10-07 21:25:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][1100/1251] eta 0:00:49 lr 0.000766 time 0.3254 (0.3275) loss 3.6620 (3.7259) grad_norm 1.2097 (1.2730) [2022-10-07 21:26:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [96/300][1200/1251] eta 0:00:16 lr 0.000766 time 0.3293 (0.3276) loss 3.4725 (3.7260) grad_norm 1.0664 (1.2746) [2022-10-07 21:26:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 96 training takes 0:06:50 [2022-10-07 21:26:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.340 (3.340) Loss 1.0718 (1.0718) Acc@1 75.586 (75.586) Acc@5 92.188 (92.188) [2022-10-07 21:26:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.408 Acc@5 92.034 [2022-10-07 21:26:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-07 21:26:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.45% [2022-10-07 21:26:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][0/1251] eta 0:46:05 lr 0.000766 time 2.2109 (2.2109) loss 3.5533 (3.5533) grad_norm 1.5270 (1.5270) [2022-10-07 21:27:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][100/1251] eta 0:06:45 lr 0.000765 time 0.3260 (0.3526) loss 3.4366 (3.7018) 
grad_norm 1.3951 (1.2974) [2022-10-07 21:27:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][200/1251] eta 0:05:56 lr 0.000765 time 0.3318 (0.3393) loss 3.7448 (3.7224) grad_norm 1.3909 (1.2965) [2022-10-07 21:28:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][300/1251] eta 0:05:18 lr 0.000765 time 0.3280 (0.3346) loss 4.0661 (3.7262) grad_norm 1.5159 (1.2948) [2022-10-07 21:28:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][400/1251] eta 0:04:42 lr 0.000764 time 0.3231 (0.3322) loss 3.8293 (3.7177) grad_norm 1.0362 (1.2831) [2022-10-07 21:29:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][500/1251] eta 0:04:08 lr 0.000764 time 0.3221 (0.3306) loss 3.8456 (3.7172) grad_norm 1.5291 (1.2838) [2022-10-07 21:30:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][600/1251] eta 0:03:34 lr 0.000764 time 0.3238 (0.3297) loss 3.7708 (3.7178) grad_norm 1.4565 (1.2793) [2022-10-07 21:30:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][700/1251] eta 0:03:01 lr 0.000763 time 0.3308 (0.3290) loss 3.4660 (3.7171) grad_norm 1.2980 (1.2803) [2022-10-07 21:31:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][800/1251] eta 0:02:28 lr 0.000763 time 0.3319 (0.3287) loss 3.8773 (3.7172) grad_norm 1.3571 (1.2820) [2022-10-07 21:31:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][900/1251] eta 0:01:55 lr 0.000763 time 0.3227 (0.3285) loss 4.0053 (3.7206) grad_norm 1.1863 (1.2829) [2022-10-07 21:32:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][1000/1251] eta 0:01:22 lr 0.000762 time 0.3228 (0.3284) loss 3.7841 (3.7222) grad_norm 1.4754 (1.2825) [2022-10-07 21:32:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [97/300][1100/1251] eta 0:00:49 lr 0.000762 time 0.3345 (0.3285) loss 3.8679 (3.7244) grad_norm 1.2517 (1.2822) [2022-10-07 21:33:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[97/300][1200/1251] eta 0:00:16 lr 0.000762 time 0.3413 (0.3287) loss 3.7745 (3.7261) grad_norm 1.2140 (1.2795) [2022-10-07 21:33:38 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 97 training takes 0:06:51 [2022-10-07 21:33:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.270 (3.270) Loss 1.1448 (1.1448) Acc@1 74.023 (74.023) Acc@5 91.895 (91.895) [2022-10-07 21:33:51 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.640 Acc@5 91.968 [2022-10-07 21:33:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-07 21:33:51 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.64% [2022-10-07 21:33:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][0/1251] eta 0:49:51 lr 0.000761 time 2.3910 (2.3910) loss 3.8007 (3.8007) grad_norm 1.2888 (1.2888) [2022-10-07 21:34:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][100/1251] eta 0:06:40 lr 0.000761 time 0.3299 (0.3479) loss 3.8723 (3.7117) grad_norm 1.1885 (1.3301) [2022-10-07 21:34:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][200/1251] eta 0:05:53 lr 0.000761 time 0.3396 (0.3368) loss 3.6502 (3.7175) grad_norm 1.0993 (1.2943) [2022-10-07 21:35:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][300/1251] eta 0:05:16 lr 0.000760 time 0.3271 (0.3331) loss 3.5410 (3.7156) grad_norm 1.3069 (1.2740) [2022-10-07 21:36:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][400/1251] eta 0:04:41 lr 0.000760 time 0.3238 (0.3310) loss 3.5037 (3.7159) grad_norm 1.5071 (1.2758) [2022-10-07 21:36:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][500/1251] eta 0:04:07 lr 0.000760 time 0.3246 (0.3298) loss 3.8213 (3.7153) grad_norm 1.1862 (1.2749) [2022-10-07 21:37:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][600/1251] eta 0:03:34 lr 0.000759 time 0.3278 (0.3291) loss 3.7030 (3.7194) 
grad_norm 1.3416 (1.2771) [2022-10-07 21:37:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][700/1251] eta 0:03:00 lr 0.000759 time 0.3271 (0.3285) loss 3.5638 (3.7178) grad_norm 1.2500 (1.2794) [2022-10-07 21:38:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][800/1251] eta 0:02:27 lr 0.000759 time 0.3286 (0.3280) loss 3.6092 (3.7159) grad_norm 1.1792 (1.2816) [2022-10-07 21:38:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][900/1251] eta 0:01:55 lr 0.000758 time 0.3277 (0.3277) loss 3.5943 (3.7146) grad_norm 1.3512 (1.2808) [2022-10-07 21:39:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][1000/1251] eta 0:01:22 lr 0.000758 time 0.3210 (0.3276) loss 3.3990 (3.7142) grad_norm 1.4085 (1.2812) [2022-10-07 21:39:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][1100/1251] eta 0:00:49 lr 0.000758 time 0.3283 (0.3276) loss 3.8553 (3.7129) grad_norm 1.2721 (1.2828) [2022-10-07 21:40:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [98/300][1200/1251] eta 0:00:16 lr 0.000757 time 0.3331 (0.3278) loss 3.7215 (3.7159) grad_norm 1.1867 (1.2818) [2022-10-07 21:40:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 98 training takes 0:06:50 [2022-10-07 21:40:44 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.816 (2.816) Loss 1.0853 (1.0853) Acc@1 73.926 (73.926) Acc@5 92.871 (92.871) [2022-10-07 21:40:55 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.398 Acc@5 91.862 [2022-10-07 21:40:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-07 21:40:55 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.64% [2022-10-07 21:40:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][0/1251] eta 1:01:52 lr 0.000757 time 2.9674 (2.9674) loss 3.8111 (3.8111) grad_norm 1.5948 (1.5948) [2022-10-07 21:41:31 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [99/300][100/1251] eta 0:06:47 lr 0.000757 time 0.3336 (0.3540) loss 3.3383 (3.6675) grad_norm 1.1622 (1.2709) [2022-10-07 21:42:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][200/1251] eta 0:05:58 lr 0.000756 time 0.3238 (0.3409) loss 3.3930 (3.6831) grad_norm 1.3971 (1.2791) [2022-10-07 21:42:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][300/1251] eta 0:05:20 lr 0.000756 time 0.3263 (0.3366) loss 3.7552 (3.6996) grad_norm 1.3985 (1.2822) [2022-10-07 21:43:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][400/1251] eta 0:04:44 lr 0.000756 time 0.3293 (0.3347) loss 3.7911 (3.7048) grad_norm 1.1454 (1.2886) [2022-10-07 21:43:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][500/1251] eta 0:04:10 lr 0.000755 time 0.3278 (0.3338) loss 3.6149 (3.7060) grad_norm 1.1558 (1.2872) [2022-10-07 21:44:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][600/1251] eta 0:03:36 lr 0.000755 time 0.3322 (0.3333) loss 3.5827 (3.7091) grad_norm 1.1873 (1.2893) [2022-10-07 21:44:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][700/1251] eta 0:03:03 lr 0.000754 time 0.3255 (0.3330) loss 3.8650 (3.7107) grad_norm 1.3906 (1.2934) [2022-10-07 21:45:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][800/1251] eta 0:02:30 lr 0.000754 time 0.3344 (0.3328) loss 3.7911 (3.7132) grad_norm 1.4126 (1.2928) [2022-10-07 21:45:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][900/1251] eta 0:01:56 lr 0.000754 time 0.3313 (0.3326) loss 3.6278 (3.7143) grad_norm 1.2009 (1.2922) [2022-10-07 21:46:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][1000/1251] eta 0:01:23 lr 0.000753 time 0.3271 (0.3324) loss 3.5575 (3.7164) grad_norm 1.1899 (1.2949) [2022-10-07 21:47:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][1100/1251] eta 0:00:50 lr 0.000753 time 0.3333 (0.3324) loss 3.7409 (3.7176) 
grad_norm 1.2482 (1.2953) [2022-10-07 21:47:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [99/300][1200/1251] eta 0:00:16 lr 0.000753 time 0.3272 (0.3323) loss 3.6350 (3.7182) grad_norm 1.0563 (1.2951) [2022-10-07 21:47:51 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 99 training takes 0:06:56 [2022-10-07 21:47:54 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.756 (2.756) Loss 1.0831 (1.0831) Acc@1 75.293 (75.293) Acc@5 93.555 (93.555) [2022-10-07 21:48:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.256 Acc@5 92.030 [2022-10-07 21:48:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-10-07 21:48:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.64% [2022-10-07 21:48:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][0/1251] eta 1:05:14 lr 0.000753 time 3.1289 (3.1289) loss 3.3794 (3.3794) grad_norm 1.6184 (1.6184) [2022-10-07 21:48:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][100/1251] eta 0:06:47 lr 0.000752 time 0.3261 (0.3542) loss 3.4317 (3.7264) grad_norm 1.1996 (1.3045) [2022-10-07 21:49:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][200/1251] eta 0:05:56 lr 0.000752 time 0.3254 (0.3396) loss 3.9399 (3.7035) grad_norm 1.1193 (1.3091) [2022-10-07 21:49:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][300/1251] eta 0:05:18 lr 0.000751 time 0.3266 (0.3347) loss 3.7255 (3.7035) grad_norm 1.1400 (1.3090) [2022-10-07 21:50:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][400/1251] eta 0:04:42 lr 0.000751 time 0.3275 (0.3322) loss 3.4430 (3.7032) grad_norm 1.3886 (1.3028) [2022-10-07 21:50:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][500/1251] eta 0:04:08 lr 0.000751 time 0.3212 (0.3307) loss 3.7406 (3.7049) grad_norm 1.2406 (1.2994) [2022-10-07 21:51:23 swin_tiny_patch4_window7_224] 
(main.py 188): INFO Train: [100/300][600/1251] eta 0:03:34 lr 0.000750 time 0.3241 (0.3296) loss 3.6169 (3.7054) grad_norm 1.0970 (1.2991) [2022-10-07 21:51:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][700/1251] eta 0:03:01 lr 0.000750 time 0.3219 (0.3289) loss 3.9063 (3.7066) grad_norm 1.2019 (1.2973) [2022-10-07 21:52:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][800/1251] eta 0:02:28 lr 0.000750 time 0.3244 (0.3284) loss 3.5892 (3.7071) grad_norm 1.2511 (1.2963) [2022-10-07 21:53:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][900/1251] eta 0:01:55 lr 0.000749 time 0.3252 (0.3283) loss 3.7922 (3.7040) grad_norm 1.2129 (1.2942) [2022-10-07 21:53:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][1000/1251] eta 0:01:22 lr 0.000749 time 0.3253 (0.3282) loss 3.8744 (3.7071) grad_norm 1.0820 (1.2968) [2022-10-07 21:54:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][1100/1251] eta 0:00:49 lr 0.000749 time 0.3359 (0.3283) loss 3.6436 (3.7100) grad_norm 1.3422 (1.2946) [2022-10-07 21:54:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [100/300][1200/1251] eta 0:00:16 lr 0.000748 time 0.3258 (0.3283) loss 3.3325 (3.7115) grad_norm 1.4634 (1.2985) [2022-10-07 21:54:56 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 100 training takes 0:06:51 [2022-10-07 21:54:56 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_100 saving...... [2022-10-07 21:54:57 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_100 saved !!! 
[2022-10-07 21:55:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.856 (2.856) Loss 1.1347 (1.1347) Acc@1 72.559 (72.559) Acc@5 92.773 (92.773) [2022-10-07 21:55:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.522 Acc@5 92.006 [2022-10-07 21:55:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-07 21:55:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.64% [2022-10-07 21:55:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][0/1251] eta 0:57:35 lr 0.000748 time 2.7624 (2.7624) loss 3.5867 (3.5867) grad_norm 1.2363 (1.2363) [2022-10-07 21:55:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][100/1251] eta 0:06:44 lr 0.000748 time 0.3244 (0.3512) loss 3.8024 (3.7307) grad_norm 1.5877 (1.3123) [2022-10-07 21:56:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][200/1251] eta 0:05:55 lr 0.000747 time 0.3273 (0.3384) loss 3.7384 (3.7213) grad_norm 1.1403 (1.3136) [2022-10-07 21:56:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][300/1251] eta 0:05:17 lr 0.000747 time 0.3234 (0.3340) loss 3.6912 (3.7012) grad_norm 1.2305 (1.3127) [2022-10-07 21:57:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][400/1251] eta 0:04:42 lr 0.000747 time 0.3222 (0.3318) loss 3.5158 (3.7042) grad_norm 1.2679 (1.3062) [2022-10-07 21:57:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][500/1251] eta 0:04:08 lr 0.000746 time 0.3218 (0.3303) loss 3.3726 (3.7064) grad_norm 1.1567 (1.2990) [2022-10-07 21:58:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][600/1251] eta 0:03:34 lr 0.000746 time 0.3228 (0.3293) loss 3.7534 (3.7056) grad_norm 1.4960 (1.2995) [2022-10-07 21:59:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][700/1251] eta 0:03:01 lr 0.000745 time 0.3248 (0.3285) loss 3.5727 (3.7062) grad_norm 1.3185 
(1.2980) [2022-10-07 21:59:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][800/1251] eta 0:02:28 lr 0.000745 time 0.3195 (0.3282) loss 3.7796 (3.7101) grad_norm 1.3592 (1.2967) [2022-10-07 22:00:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][900/1251] eta 0:01:55 lr 0.000745 time 0.3253 (0.3282) loss 3.6081 (3.7063) grad_norm 1.1683 (1.2968) [2022-10-07 22:00:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][1000/1251] eta 0:01:22 lr 0.000744 time 0.3314 (0.3283) loss 3.6928 (3.7085) grad_norm 1.1850 (1.2947) [2022-10-07 22:01:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][1100/1251] eta 0:00:49 lr 0.000744 time 0.3327 (0.3285) loss 3.8824 (3.7089) grad_norm 1.5679 (1.2984) [2022-10-07 22:01:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [101/300][1200/1251] eta 0:00:16 lr 0.000744 time 0.3277 (0.3286) loss 3.7031 (3.7087) grad_norm 1.1766 (1.2982) [2022-10-07 22:02:01 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 101 training takes 0:06:51 [2022-10-07 22:02:04 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.904 (2.904) Loss 1.0984 (1.0984) Acc@1 75.195 (75.195) Acc@5 91.699 (91.699) [2022-10-07 22:02:15 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.648 Acc@5 92.128 [2022-10-07 22:02:15 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-07 22:02:15 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.65% [2022-10-07 22:02:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][0/1251] eta 1:04:06 lr 0.000743 time 3.0748 (3.0748) loss 3.6077 (3.6077) grad_norm 1.2578 (1.2578) [2022-10-07 22:02:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][100/1251] eta 0:06:44 lr 0.000743 time 0.3247 (0.3516) loss 4.0263 (3.6576) grad_norm 1.4915 (1.3340) [2022-10-07 22:03:23 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [102/300][200/1251] eta 0:05:55 lr 0.000743 time 0.3233 (0.3384) loss 3.9610 (3.6762) grad_norm 1.1529 (1.3039) [2022-10-07 22:03:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][300/1251] eta 0:05:17 lr 0.000742 time 0.3222 (0.3339) loss 3.7103 (3.6847) grad_norm 1.3586 (1.3074) [2022-10-07 22:04:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][400/1251] eta 0:04:42 lr 0.000742 time 0.3266 (0.3318) loss 3.7933 (3.6896) grad_norm 1.2289 (1.3017) [2022-10-07 22:05:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][500/1251] eta 0:04:08 lr 0.000742 time 0.3227 (0.3305) loss 3.5331 (3.6874) grad_norm 1.5320 (1.3072) [2022-10-07 22:05:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][600/1251] eta 0:03:34 lr 0.000741 time 0.3256 (0.3298) loss 3.6716 (3.6924) grad_norm 1.3423 (1.3057) [2022-10-07 22:06:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][700/1251] eta 0:03:01 lr 0.000741 time 0.3305 (0.3294) loss 3.6219 (3.6933) grad_norm 1.5892 (1.3026) [2022-10-07 22:06:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][800/1251] eta 0:02:28 lr 0.000741 time 0.3268 (0.3291) loss 3.5203 (3.6948) grad_norm 1.1769 (1.3029) [2022-10-07 22:07:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][900/1251] eta 0:01:55 lr 0.000740 time 0.3269 (0.3289) loss 3.7389 (3.6966) grad_norm 1.1390 (1.3011) [2022-10-07 22:07:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][1000/1251] eta 0:01:22 lr 0.000740 time 0.3306 (0.3287) loss 3.7348 (3.7004) grad_norm 1.2407 (1.2991) [2022-10-07 22:08:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][1100/1251] eta 0:00:49 lr 0.000739 time 0.3221 (0.3286) loss 3.6337 (3.6997) grad_norm 1.4004 (1.3004) [2022-10-07 22:08:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [102/300][1200/1251] eta 0:00:16 lr 0.000739 time 0.3407 (0.3286) loss 3.5947 
(3.6996) grad_norm 1.1442 (1.3006) [2022-10-07 22:09:07 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 102 training takes 0:06:51 [2022-10-07 22:09:10 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.008 (3.008) Loss 1.1519 (1.1519) Acc@1 72.168 (72.168) Acc@5 93.164 (93.164) [2022-10-07 22:09:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.490 Acc@5 91.940 [2022-10-07 22:09:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-07 22:09:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.65% [2022-10-07 22:09:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][0/1251] eta 0:57:45 lr 0.000739 time 2.7699 (2.7699) loss 3.7307 (3.7307) grad_norm 1.2080 (1.2080) [2022-10-07 22:09:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][100/1251] eta 0:06:44 lr 0.000739 time 0.3236 (0.3514) loss 3.7253 (3.6647) grad_norm 1.1506 (1.2830) [2022-10-07 22:10:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][200/1251] eta 0:05:55 lr 0.000738 time 0.3254 (0.3380) loss 3.8649 (3.6726) grad_norm 1.2666 (1.2993) [2022-10-07 22:11:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][300/1251] eta 0:05:17 lr 0.000738 time 0.3332 (0.3337) loss 3.4064 (3.6799) grad_norm 1.2498 (1.3028) [2022-10-07 22:11:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][400/1251] eta 0:04:42 lr 0.000737 time 0.3237 (0.3315) loss 3.4994 (3.6820) grad_norm 1.1433 (1.3019) [2022-10-07 22:12:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][500/1251] eta 0:04:07 lr 0.000737 time 0.3207 (0.3300) loss 3.7147 (3.6869) grad_norm 1.1458 (1.2988) [2022-10-07 22:12:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][600/1251] eta 0:03:34 lr 0.000737 time 0.3251 (0.3291) loss 3.3215 (3.6945) grad_norm 1.2556 (1.2980) [2022-10-07 22:13:10 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][700/1251] eta 0:03:00 lr 0.000736 time 0.3268 (0.3284) loss 3.5503 (3.6980) grad_norm 1.5247 (1.2993) [2022-10-07 22:13:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][800/1251] eta 0:02:27 lr 0.000736 time 0.3260 (0.3279) loss 4.0441 (3.6988) grad_norm 1.2992 (1.3009) [2022-10-07 22:14:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][900/1251] eta 0:01:55 lr 0.000736 time 0.3271 (0.3277) loss 3.6123 (3.6986) grad_norm 1.2021 (1.2969) [2022-10-07 22:14:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][1000/1251] eta 0:01:22 lr 0.000735 time 0.3276 (0.3276) loss 3.7325 (3.7042) grad_norm 1.1952 (1.2975) [2022-10-07 22:15:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][1100/1251] eta 0:00:49 lr 0.000735 time 0.3272 (0.3275) loss 3.9546 (3.7034) grad_norm 1.4959 (1.3026) [2022-10-07 22:15:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [103/300][1200/1251] eta 0:00:16 lr 0.000735 time 0.3319 (0.3276) loss 3.7350 (3.7054) grad_norm 1.8082 (1.3048) [2022-10-07 22:16:10 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 103 training takes 0:06:50 [2022-10-07 22:16:13 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.786 (2.786) Loss 1.1342 (1.1342) Acc@1 72.949 (72.949) Acc@5 91.113 (91.113) [2022-10-07 22:16:24 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.560 Acc@5 92.048 [2022-10-07 22:16:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-07 22:16:24 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.65% [2022-10-07 22:16:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][0/1251] eta 1:07:18 lr 0.000734 time 3.2279 (3.2279) loss 3.4000 (3.4000) grad_norm 1.3068 (1.3068) [2022-10-07 22:17:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][100/1251] 
eta 0:06:49 lr 0.000734 time 0.3278 (0.3558) loss 3.5638 (3.7010) grad_norm 1.2924 (1.3216) [2022-10-07 22:17:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][200/1251] eta 0:05:58 lr 0.000734 time 0.3300 (0.3413) loss 3.7049 (3.6921) grad_norm 1.3257 (1.2866) [2022-10-07 22:18:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][300/1251] eta 0:05:19 lr 0.000733 time 0.3232 (0.3364) loss 3.6881 (3.6912) grad_norm 1.2473 (1.2894) [2022-10-07 22:18:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][400/1251] eta 0:04:44 lr 0.000733 time 0.3341 (0.3339) loss 3.5563 (3.6944) grad_norm 1.1942 (1.2980) [2022-10-07 22:19:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][500/1251] eta 0:04:09 lr 0.000732 time 0.3241 (0.3323) loss 3.8911 (3.6999) grad_norm 1.1185 (1.2976) [2022-10-07 22:19:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][600/1251] eta 0:03:35 lr 0.000732 time 0.3222 (0.3312) loss 3.5419 (3.7018) grad_norm 1.3952 (1.3075) [2022-10-07 22:20:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][700/1251] eta 0:03:02 lr 0.000732 time 0.3250 (0.3304) loss 3.8468 (3.6975) grad_norm 1.2205 (1.3078) [2022-10-07 22:20:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][800/1251] eta 0:02:28 lr 0.000731 time 0.3224 (0.3299) loss 3.8902 (3.7008) grad_norm 1.2134 (1.3058) [2022-10-07 22:21:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][900/1251] eta 0:01:55 lr 0.000731 time 0.3278 (0.3295) loss 3.6263 (3.7008) grad_norm 1.1719 (1.3055) [2022-10-07 22:21:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][1000/1251] eta 0:01:22 lr 0.000731 time 0.3312 (0.3293) loss 3.7783 (3.7001) grad_norm 1.4264 (1.3059) [2022-10-07 22:22:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][1100/1251] eta 0:00:49 lr 0.000730 time 0.3263 (0.3292) loss 3.6340 (3.6982) grad_norm 1.3714 (1.3092) 
[2022-10-07 22:22:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [104/300][1200/1251] eta 0:00:16 lr 0.000730 time 0.3197 (0.3290) loss 3.8245 (3.7012) grad_norm 1.1589 (1.3126) [2022-10-07 22:23:16 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 104 training takes 0:06:51 [2022-10-07 22:23:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.044 (3.044) Loss 1.1493 (1.1493) Acc@1 72.266 (72.266) Acc@5 92.188 (92.188) [2022-10-07 22:23:29 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.744 Acc@5 92.232 [2022-10-07 22:23:29 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-10-07 22:23:29 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.74% [2022-10-07 22:23:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][0/1251] eta 0:55:23 lr 0.000730 time 2.6570 (2.6570) loss 3.5234 (3.5234) grad_norm 1.4679 (1.4679) [2022-10-07 22:24:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][100/1251] eta 0:06:40 lr 0.000729 time 0.3209 (0.3481) loss 3.6801 (3.6627) grad_norm 1.2329 (1.3359) [2022-10-07 22:24:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][200/1251] eta 0:05:53 lr 0.000729 time 0.3232 (0.3364) loss 3.7026 (3.6686) grad_norm 1.2155 (1.3139) [2022-10-07 22:25:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][300/1251] eta 0:05:16 lr 0.000729 time 0.3266 (0.3326) loss 3.7036 (3.6813) grad_norm 1.0927 (1.3208) [2022-10-07 22:25:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][400/1251] eta 0:04:41 lr 0.000728 time 0.3264 (0.3307) loss 3.8307 (3.6877) grad_norm 1.6076 (1.3188) [2022-10-07 22:26:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][500/1251] eta 0:04:07 lr 0.000728 time 0.3225 (0.3297) loss 3.6199 (3.6857) grad_norm 1.1300 (1.3232) [2022-10-07 22:26:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[105/300][600/1251] eta 0:03:34 lr 0.000728 time 0.3243 (0.3291) loss 3.5251 (3.6799) grad_norm 1.2820 (1.3203) [2022-10-07 22:27:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][700/1251] eta 0:03:01 lr 0.000727 time 0.3291 (0.3285) loss 3.9668 (3.6841) grad_norm 1.4018 (1.3194) [2022-10-07 22:27:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][800/1251] eta 0:02:28 lr 0.000727 time 0.3281 (0.3282) loss 3.7520 (3.6870) grad_norm 1.2766 (1.3170) [2022-10-07 22:28:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][900/1251] eta 0:01:55 lr 0.000726 time 0.3251 (0.3280) loss 3.7005 (3.6914) grad_norm 1.1946 (1.3152) [2022-10-07 22:28:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][1000/1251] eta 0:01:22 lr 0.000726 time 0.3210 (0.3279) loss 3.6855 (3.6900) grad_norm 1.7146 (1.3159) [2022-10-07 22:29:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][1100/1251] eta 0:00:49 lr 0.000726 time 0.3254 (0.3279) loss 3.6076 (3.6907) grad_norm 1.4258 (1.3174) [2022-10-07 22:30:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [105/300][1200/1251] eta 0:00:16 lr 0.000725 time 0.3238 (0.3279) loss 3.8640 (3.6928) grad_norm 1.1907 (1.3157) [2022-10-07 22:30:20 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 105 training takes 0:06:50 [2022-10-07 22:30:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.223 (3.223) Loss 1.1179 (1.1179) Acc@1 74.023 (74.023) Acc@5 92.676 (92.676) [2022-10-07 22:30:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.884 Acc@5 92.208 [2022-10-07 22:30:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-10-07 22:30:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.88% [2022-10-07 22:30:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][0/1251] eta 1:08:31 lr 0.000725 time 3.2869 (3.2869) loss 3.8195 
(3.8195) grad_norm 1.4551 (1.4551) [2022-10-07 22:31:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][100/1251] eta 0:06:46 lr 0.000725 time 0.3192 (0.3528) loss 3.6526 (3.6667) grad_norm 1.1942 (1.2768) [2022-10-07 22:31:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][200/1251] eta 0:05:55 lr 0.000724 time 0.3221 (0.3385) loss 3.9248 (3.6805) grad_norm 1.1664 (1.2770) [2022-10-07 22:32:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][300/1251] eta 0:05:17 lr 0.000724 time 0.3184 (0.3336) loss 3.8039 (3.6760) grad_norm 1.4916 (1.2853) [2022-10-07 22:32:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][400/1251] eta 0:04:41 lr 0.000724 time 0.3257 (0.3311) loss 3.8527 (3.6786) grad_norm 1.2360 (1.2940) [2022-10-07 22:33:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][500/1251] eta 0:04:07 lr 0.000723 time 0.3237 (0.3296) loss 3.8741 (3.6867) grad_norm 1.2974 (1.2964) [2022-10-07 22:33:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][600/1251] eta 0:03:33 lr 0.000723 time 0.3308 (0.3287) loss 3.8941 (3.6903) grad_norm 1.3034 (1.2973) [2022-10-07 22:34:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][700/1251] eta 0:03:00 lr 0.000722 time 0.3321 (0.3281) loss 3.4369 (3.6942) grad_norm 1.2823 (1.2983) [2022-10-07 22:34:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][800/1251] eta 0:02:27 lr 0.000722 time 0.3229 (0.3279) loss 3.7506 (3.6950) grad_norm 1.2897 (1.3039) [2022-10-07 22:35:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][900/1251] eta 0:01:55 lr 0.000722 time 0.3217 (0.3277) loss 3.6888 (3.7020) grad_norm 1.2637 (1.3046) [2022-10-07 22:36:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][1000/1251] eta 0:01:22 lr 0.000721 time 0.3254 (0.3277) loss 3.7270 (3.7016) grad_norm 1.5090 (1.3068) [2022-10-07 22:36:34 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [106/300][1100/1251] eta 0:00:49 lr 0.000721 time 0.3305 (0.3277) loss 3.9174 (3.7005) grad_norm 1.2464 (1.3106) [2022-10-07 22:37:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [106/300][1200/1251] eta 0:00:16 lr 0.000721 time 0.3244 (0.3278) loss 3.8193 (3.7013) grad_norm 1.6736 (1.3122) [2022-10-07 22:37:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 106 training takes 0:06:50 [2022-10-07 22:37:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.851 (2.851) Loss 1.1030 (1.1030) Acc@1 73.047 (73.047) Acc@5 93.164 (93.164) [2022-10-07 22:37:37 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.838 Acc@5 92.234 [2022-10-07 22:37:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-07 22:37:37 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 73.88% [2022-10-07 22:37:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][0/1251] eta 0:54:44 lr 0.000720 time 2.6251 (2.6251) loss 3.7452 (3.7452) grad_norm 1.2775 (1.2775) [2022-10-07 22:38:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][100/1251] eta 0:06:41 lr 0.000720 time 0.3274 (0.3486) loss 3.7593 (3.6641) grad_norm 1.2567 (1.3212) [2022-10-07 22:38:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][200/1251] eta 0:05:53 lr 0.000720 time 0.3253 (0.3366) loss 3.3951 (3.6807) grad_norm 1.3680 (1.3266) [2022-10-07 22:39:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][300/1251] eta 0:05:16 lr 0.000719 time 0.3276 (0.3325) loss 3.6069 (3.6854) grad_norm 1.3862 (1.3242) [2022-10-07 22:39:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][400/1251] eta 0:04:41 lr 0.000719 time 0.3239 (0.3302) loss 3.8581 (3.6825) grad_norm 1.3175 (1.3282) [2022-10-07 22:40:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][500/1251] eta 0:04:06 lr 0.000719 time 0.3213 
(0.3288) loss 3.3650 (3.6850) grad_norm 1.5876 (1.3298) [2022-10-07 22:40:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][600/1251] eta 0:03:33 lr 0.000718 time 0.3266 (0.3279) loss 3.1995 (3.6852) grad_norm 1.2587 (1.3308) [2022-10-07 22:41:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][700/1251] eta 0:03:00 lr 0.000718 time 0.3182 (0.3273) loss 3.7312 (3.6855) grad_norm 1.2119 (1.3320) [2022-10-07 22:41:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][800/1251] eta 0:02:27 lr 0.000717 time 0.3282 (0.3269) loss 3.6144 (3.6873) grad_norm 1.3165 (1.3274) [2022-10-07 22:42:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][900/1251] eta 0:01:54 lr 0.000717 time 0.3299 (0.3268) loss 3.5013 (3.6890) grad_norm 1.2812 (1.3287) [2022-10-07 22:43:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][1000/1251] eta 0:01:22 lr 0.000717 time 0.3212 (0.3267) loss 3.3890 (3.6918) grad_norm 1.5800 (1.3313) [2022-10-07 22:43:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][1100/1251] eta 0:00:49 lr 0.000716 time 0.3306 (0.3266) loss 3.6474 (3.6915) grad_norm 1.2297 (1.3304) [2022-10-07 22:44:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [107/300][1200/1251] eta 0:00:16 lr 0.000716 time 0.3198 (0.3265) loss 3.9517 (3.6932) grad_norm 1.4956 (1.3330) [2022-10-07 22:44:26 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 107 training takes 0:06:48 [2022-10-07 22:44:29 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.471 (2.471) Loss 1.1048 (1.1048) Acc@1 74.902 (74.902) Acc@5 90.918 (90.918) [2022-10-07 22:44:40 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.996 Acc@5 92.350 [2022-10-07 22:44:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-07 22:44:40 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.00% [2022-10-07 22:44:43 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][0/1251] eta 0:59:49 lr 0.000716 time 2.8694 (2.8694) loss 3.6600 (3.6600) grad_norm 1.2057 (1.2057) [2022-10-07 22:45:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][100/1251] eta 0:06:45 lr 0.000715 time 0.3265 (0.3527) loss 3.7622 (3.6872) grad_norm 1.2908 (1.3265) [2022-10-07 22:45:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][200/1251] eta 0:05:56 lr 0.000715 time 0.3289 (0.3396) loss 3.9880 (3.6900) grad_norm 1.2815 (1.3452) [2022-10-07 22:46:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][300/1251] eta 0:05:18 lr 0.000715 time 0.3249 (0.3351) loss 3.8252 (3.6918) grad_norm 1.2847 (1.3424) [2022-10-07 22:46:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][400/1251] eta 0:04:43 lr 0.000714 time 0.3259 (0.3326) loss 3.4998 (3.6823) grad_norm 1.1059 (1.3385) [2022-10-07 22:47:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][500/1251] eta 0:04:08 lr 0.000714 time 0.3309 (0.3311) loss 3.8509 (3.6852) grad_norm 1.3266 (1.3433) [2022-10-07 22:47:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][600/1251] eta 0:03:34 lr 0.000714 time 0.3277 (0.3301) loss 3.6396 (3.6861) grad_norm 1.3537 (1.3431) [2022-10-07 22:48:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][700/1251] eta 0:03:01 lr 0.000713 time 0.3283 (0.3294) loss 3.3259 (3.6858) grad_norm 1.4400 (1.3404) [2022-10-07 22:49:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][800/1251] eta 0:02:28 lr 0.000713 time 0.3282 (0.3289) loss 3.7922 (3.6855) grad_norm 1.1956 (1.3391) [2022-10-07 22:49:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][900/1251] eta 0:01:55 lr 0.000712 time 0.3258 (0.3287) loss 3.5666 (3.6854) grad_norm 1.3286 (1.3370) [2022-10-07 22:50:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][1000/1251] eta 0:01:22 lr 0.000712 
time 0.3284 (0.3286) loss 3.4782 (3.6855) grad_norm 1.0982 (1.3368) [2022-10-07 22:50:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][1100/1251] eta 0:00:49 lr 0.000712 time 0.3271 (0.3285) loss 3.6647 (3.6845) grad_norm 1.2002 (1.3349) [2022-10-07 22:51:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [108/300][1200/1251] eta 0:00:16 lr 0.000711 time 0.3258 (0.3284) loss 3.4545 (3.6863) grad_norm 1.4853 (1.3328) [2022-10-07 22:51:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 108 training takes 0:06:51 [2022-10-07 22:51:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.000 (3.000) Loss 1.0633 (1.0633) Acc@1 74.121 (74.121) Acc@5 93.359 (93.359) [2022-10-07 22:51:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 73.806 Acc@5 92.278 [2022-10-07 22:51:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-07 22:51:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.00% [2022-10-07 22:51:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][0/1251] eta 0:50:31 lr 0.000711 time 2.4230 (2.4230) loss 3.7734 (3.7734) grad_norm 1.3336 (1.3336) [2022-10-07 22:52:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][100/1251] eta 0:06:40 lr 0.000711 time 0.3226 (0.3478) loss 3.9080 (3.6896) grad_norm 1.2321 (1.3485) [2022-10-07 22:52:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][200/1251] eta 0:05:52 lr 0.000710 time 0.3261 (0.3354) loss 3.9797 (3.6911) grad_norm 1.1710 (1.3567) [2022-10-07 22:53:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][300/1251] eta 0:05:15 lr 0.000710 time 0.3235 (0.3315) loss 3.3970 (3.6813) grad_norm 1.3183 (1.3553) [2022-10-07 22:53:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][400/1251] eta 0:04:40 lr 0.000710 time 0.3205 (0.3296) loss 3.3441 (3.6858) grad_norm 1.2214 (1.3477) [2022-10-07 
22:54:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][500/1251] eta 0:04:06 lr 0.000709 time 0.3214 (0.3284) loss 3.6397 (3.6820) grad_norm 1.1420 (1.3376) [2022-10-07 22:55:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][600/1251] eta 0:03:33 lr 0.000709 time 0.3195 (0.3276) loss 3.7723 (3.6792) grad_norm 1.4293 (1.3378) [2022-10-07 22:55:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][700/1251] eta 0:03:00 lr 0.000708 time 0.3241 (0.3273) loss 3.7595 (3.6805) grad_norm 1.3680 (1.3396) [2022-10-07 22:56:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][800/1251] eta 0:02:27 lr 0.000708 time 0.3289 (0.3272) loss 3.7511 (3.6800) grad_norm 1.2983 (1.3405) [2022-10-07 22:56:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][900/1251] eta 0:01:54 lr 0.000708 time 0.3199 (0.3273) loss 3.5108 (3.6806) grad_norm 1.2105 (1.3390) [2022-10-07 22:57:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][1000/1251] eta 0:01:22 lr 0.000707 time 0.3302 (0.3274) loss 3.4810 (3.6819) grad_norm 1.2516 (1.3344) [2022-10-07 22:57:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][1100/1251] eta 0:00:49 lr 0.000707 time 0.3222 (0.3275) loss 3.4506 (3.6808) grad_norm 1.3291 (1.3331) [2022-10-07 22:58:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [109/300][1200/1251] eta 0:00:16 lr 0.000707 time 0.3217 (0.3276) loss 3.7089 (3.6805) grad_norm 1.2288 (1.3343) [2022-10-07 22:58:35 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 109 training takes 0:06:50 [2022-10-07 22:58:38 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.921 (2.921) Loss 0.9869 (0.9869) Acc@1 76.465 (76.465) Acc@5 93.555 (93.555) [2022-10-07 22:58:48 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.032 Acc@5 92.264 [2022-10-07 22:58:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 74.0% [2022-10-07 22:58:48 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.03% [2022-10-07 22:58:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][0/1251] eta 1:07:27 lr 0.000706 time 3.2355 (3.2355) loss 3.8744 (3.8744) grad_norm 1.0659 (1.0659) [2022-10-07 22:59:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][100/1251] eta 0:06:47 lr 0.000706 time 0.3243 (0.3541) loss 3.6317 (3.6795) grad_norm 1.3675 (1.3435) [2022-10-07 22:59:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][200/1251] eta 0:05:56 lr 0.000706 time 0.3267 (0.3396) loss 3.5994 (3.6680) grad_norm 1.1482 (1.3481) [2022-10-07 23:00:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][300/1251] eta 0:05:18 lr 0.000705 time 0.3243 (0.3346) loss 3.6977 (3.6757) grad_norm 1.2603 (1.3426) [2022-10-07 23:01:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][400/1251] eta 0:04:42 lr 0.000705 time 0.3268 (0.3322) loss 3.6243 (3.6761) grad_norm 1.1957 (1.3426) [2022-10-07 23:01:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][500/1251] eta 0:04:08 lr 0.000704 time 0.3233 (0.3308) loss 3.7291 (3.6722) grad_norm 1.3426 (1.3400) [2022-10-07 23:02:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][600/1251] eta 0:03:34 lr 0.000704 time 0.3234 (0.3297) loss 3.8111 (3.6715) grad_norm 1.5057 (1.3385) [2022-10-07 23:02:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][700/1251] eta 0:03:01 lr 0.000704 time 0.3287 (0.3291) loss 3.8839 (3.6727) grad_norm 1.3122 (1.3406) [2022-10-07 23:03:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][800/1251] eta 0:02:28 lr 0.000703 time 0.3230 (0.3288) loss 3.7930 (3.6733) grad_norm 1.4505 (1.3349) [2022-10-07 23:03:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][900/1251] eta 0:01:55 lr 0.000703 time 0.3314 (0.3284) loss 3.2756 (3.6725) grad_norm 2.1525 
(1.3361) [2022-10-07 23:04:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][1000/1251] eta 0:01:22 lr 0.000703 time 0.3233 (0.3281) loss 3.8293 (3.6716) grad_norm 1.5439 (1.3336) [2022-10-07 23:04:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][1100/1251] eta 0:00:49 lr 0.000702 time 0.3270 (0.3280) loss 3.5482 (3.6743) grad_norm 1.2386 (1.3351) [2022-10-07 23:05:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [110/300][1200/1251] eta 0:00:16 lr 0.000702 time 0.3228 (0.3279) loss 3.6285 (3.6740) grad_norm 1.1798 (1.3340) [2022-10-07 23:05:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 110 training takes 0:06:50 [2022-10-07 23:05:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_110 saving...... [2022-10-07 23:05:40 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_110 saved !!! [2022-10-07 23:05:43 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.023 (3.023) Loss 1.0860 (1.0860) Acc@1 75.488 (75.488) Acc@5 92.188 (92.188) [2022-10-07 23:05:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.044 Acc@5 92.172 [2022-10-07 23:05:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-07 23:05:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.04% [2022-10-07 23:05:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][0/1251] eta 0:56:02 lr 0.000702 time 2.6882 (2.6882) loss 3.7386 (3.7386) grad_norm 1.1902 (1.1902) [2022-10-07 23:06:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][100/1251] eta 0:06:46 lr 0.000701 time 0.3238 (0.3532) loss 3.7129 (3.6558) grad_norm 1.1793 (1.3589) [2022-10-07 23:07:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][200/1251] eta 0:05:57 lr 0.000701 time 0.3291 (0.3402) loss 3.6814 
(3.6586) grad_norm 1.2468 (1.3489) [2022-10-07 23:07:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][300/1251] eta 0:05:19 lr 0.000700 time 0.3310 (0.3363) loss 3.6030 (3.6629) grad_norm 1.5408 (1.3459) [2022-10-07 23:08:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][400/1251] eta 0:04:44 lr 0.000700 time 0.3267 (0.3345) loss 3.3336 (3.6659) grad_norm 1.3416 (1.3431) [2022-10-07 23:08:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][500/1251] eta 0:04:10 lr 0.000700 time 0.3370 (0.3334) loss 3.7799 (3.6672) grad_norm 1.6212 (1.3419) [2022-10-07 23:09:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][600/1251] eta 0:03:36 lr 0.000699 time 0.3218 (0.3327) loss 3.5982 (3.6691) grad_norm 1.3869 (1.3442) [2022-10-07 23:09:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][700/1251] eta 0:03:03 lr 0.000699 time 0.3313 (0.3323) loss 3.6623 (3.6705) grad_norm 1.5992 (1.3441) [2022-10-07 23:10:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][800/1251] eta 0:02:29 lr 0.000699 time 0.3235 (0.3320) loss 3.4533 (3.6692) grad_norm 1.2376 (1.3483) [2022-10-07 23:10:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][900/1251] eta 0:01:56 lr 0.000698 time 0.3242 (0.3319) loss 3.5259 (3.6713) grad_norm 1.1400 (1.3502) [2022-10-07 23:11:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][1000/1251] eta 0:01:23 lr 0.000698 time 0.3316 (0.3318) loss 3.6194 (3.6719) grad_norm 1.1306 (1.3491) [2022-10-07 23:11:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][1100/1251] eta 0:00:50 lr 0.000697 time 0.3288 (0.3318) loss 3.5153 (3.6730) grad_norm 1.2752 (1.3494) [2022-10-07 23:12:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [111/300][1200/1251] eta 0:00:16 lr 0.000697 time 0.3260 (0.3319) loss 3.7748 (3.6734) grad_norm 1.3152 (1.3508) [2022-10-07 23:12:49 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 111 training takes 0:06:55 [2022-10-07 23:12:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.664 (2.664) Loss 1.1136 (1.1136) Acc@1 73.633 (73.633) Acc@5 92.578 (92.578) [2022-10-07 23:13:02 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.030 Acc@5 92.384 [2022-10-07 23:13:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-07 23:13:02 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.04% [2022-10-07 23:13:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][0/1251] eta 0:47:54 lr 0.000697 time 2.2979 (2.2979) loss 3.7411 (3.7411) grad_norm 1.7391 (1.7391) [2022-10-07 23:13:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][100/1251] eta 0:06:46 lr 0.000696 time 0.3264 (0.3535) loss 3.5786 (3.6528) grad_norm 1.2305 (1.3721) [2022-10-07 23:14:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][200/1251] eta 0:05:58 lr 0.000696 time 0.3333 (0.3408) loss 3.6326 (3.6713) grad_norm 1.1176 (1.3527) [2022-10-07 23:14:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][300/1251] eta 0:05:19 lr 0.000696 time 0.3275 (0.3363) loss 3.6341 (3.6704) grad_norm 1.2753 (1.3440) [2022-10-07 23:15:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][400/1251] eta 0:04:44 lr 0.000695 time 0.3297 (0.3340) loss 3.8771 (3.6765) grad_norm 1.3261 (1.3462) [2022-10-07 23:15:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][500/1251] eta 0:04:09 lr 0.000695 time 0.3276 (0.3328) loss 3.7399 (3.6777) grad_norm 1.4001 (1.3534) [2022-10-07 23:16:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][600/1251] eta 0:03:36 lr 0.000695 time 0.3437 (0.3322) loss 3.8374 (3.6813) grad_norm 1.2124 (1.3493) [2022-10-07 23:16:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][700/1251] eta 0:03:02 lr 0.000694 time 0.3258 
(0.3319) loss 3.7874 (3.6809) grad_norm 1.2298 (1.3508) [2022-10-07 23:17:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][800/1251] eta 0:02:29 lr 0.000694 time 0.3299 (0.3317) loss 3.8661 (3.6794) grad_norm 1.8946 (1.3489) [2022-10-07 23:18:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][900/1251] eta 0:01:56 lr 0.000693 time 0.3325 (0.3316) loss 3.7236 (3.6760) grad_norm 1.3673 (1.3490) [2022-10-07 23:18:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][1000/1251] eta 0:01:23 lr 0.000693 time 0.3267 (0.3315) loss 3.5175 (3.6780) grad_norm 1.2724 (1.3505) [2022-10-07 23:19:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][1100/1251] eta 0:00:50 lr 0.000693 time 0.3237 (0.3314) loss 3.9489 (3.6776) grad_norm 1.4587 (1.3508) [2022-10-07 23:19:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [112/300][1200/1251] eta 0:00:16 lr 0.000692 time 0.3378 (0.3313) loss 3.6323 (3.6758) grad_norm 1.2164 (1.3499) [2022-10-07 23:19:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 112 training takes 0:06:54 [2022-10-07 23:20:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.065 (3.065) Loss 1.1940 (1.1940) Acc@1 71.387 (71.387) Acc@5 92.188 (92.188) [2022-10-07 23:20:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.266 Acc@5 92.416 [2022-10-07 23:20:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-07 23:20:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.27% [2022-10-07 23:20:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][0/1251] eta 1:07:16 lr 0.000692 time 3.2269 (3.2269) loss 3.7053 (3.7053) grad_norm 1.1761 (1.1761) [2022-10-07 23:20:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][100/1251] eta 0:06:46 lr 0.000692 time 0.3234 (0.3529) loss 3.4503 (3.6373) grad_norm 1.4440 (1.3526) [2022-10-07 23:21:19 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][200/1251] eta 0:05:55 lr 0.000691 time 0.3226 (0.3386) loss 3.5793 (3.6531) grad_norm 1.1744 (1.3568) [2022-10-07 23:21:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][300/1251] eta 0:05:17 lr 0.000691 time 0.3196 (0.3338) loss 3.5589 (3.6669) grad_norm 1.2521 (1.3514) [2022-10-07 23:22:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][400/1251] eta 0:04:42 lr 0.000690 time 0.3235 (0.3314) loss 3.6760 (3.6700) grad_norm 1.3121 (1.3473) [2022-10-07 23:22:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][500/1251] eta 0:04:07 lr 0.000690 time 0.3217 (0.3299) loss 3.7137 (3.6709) grad_norm 1.1976 (1.3571) [2022-10-07 23:23:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][600/1251] eta 0:03:34 lr 0.000690 time 0.3231 (0.3288) loss 3.5003 (3.6676) grad_norm 1.3794 (1.3606) [2022-10-07 23:24:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][700/1251] eta 0:03:00 lr 0.000689 time 0.3211 (0.3284) loss 3.5631 (3.6668) grad_norm 1.4838 (1.3633) [2022-10-07 23:24:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][800/1251] eta 0:02:27 lr 0.000689 time 0.3222 (0.3276) loss 3.4090 (3.6669) grad_norm 1.4289 (1.3614) [2022-10-07 23:25:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][900/1251] eta 0:01:54 lr 0.000689 time 0.3222 (0.3271) loss 3.4444 (3.6664) grad_norm 1.3706 (1.3647) [2022-10-07 23:25:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][1000/1251] eta 0:01:21 lr 0.000688 time 0.3230 (0.3267) loss 3.5534 (3.6660) grad_norm 1.3750 (1.3649) [2022-10-07 23:26:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][1100/1251] eta 0:00:49 lr 0.000688 time 0.3273 (0.3264) loss 3.8868 (3.6664) grad_norm 1.8107 (1.3639) [2022-10-07 23:26:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [113/300][1200/1251] eta 0:00:16 lr 
0.000687 time 0.3253 (0.3262) loss 3.7147 (3.6670) grad_norm 1.1884 (1.3629) [2022-10-07 23:26:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 113 training takes 0:06:48 [2022-10-07 23:27:02 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.746 (2.746) Loss 1.0359 (1.0359) Acc@1 74.316 (74.316) Acc@5 93.262 (93.262) [2022-10-07 23:27:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.122 Acc@5 92.390 [2022-10-07 23:27:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-10-07 23:27:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.27% [2022-10-07 23:27:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][0/1251] eta 1:00:07 lr 0.000687 time 2.8835 (2.8835) loss 3.7152 (3.7152) grad_norm 1.3000 (1.3000) [2022-10-07 23:27:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][100/1251] eta 0:06:44 lr 0.000687 time 0.3232 (0.3513) loss 3.7931 (3.6327) grad_norm 1.4278 (1.3564) [2022-10-07 23:28:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][200/1251] eta 0:05:55 lr 0.000686 time 0.3261 (0.3380) loss 3.8324 (3.6595) grad_norm 1.2031 (1.3518) [2022-10-07 23:28:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][300/1251] eta 0:05:17 lr 0.000686 time 0.3234 (0.3336) loss 3.5144 (3.6621) grad_norm 1.3555 (1.3531) [2022-10-07 23:29:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][400/1251] eta 0:04:42 lr 0.000686 time 0.3207 (0.3317) loss 3.4777 (3.6657) grad_norm 1.1942 (1.3441) [2022-10-07 23:29:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][500/1251] eta 0:04:08 lr 0.000685 time 0.3243 (0.3306) loss 3.3577 (3.6629) grad_norm 1.1379 (1.3487) [2022-10-07 23:30:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][600/1251] eta 0:03:34 lr 0.000685 time 0.3222 (0.3300) loss 3.8524 (3.6642) grad_norm 1.3074 (1.3473) 
[2022-10-07 23:31:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][700/1251] eta 0:03:01 lr 0.000685 time 0.3345 (0.3297) loss 3.4267 (3.6666) grad_norm 1.3703 (1.3464) [2022-10-07 23:31:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][800/1251] eta 0:02:28 lr 0.000684 time 0.3273 (0.3294) loss 3.8962 (3.6672) grad_norm 1.3818 (1.3439) [2022-10-07 23:32:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][900/1251] eta 0:01:55 lr 0.000684 time 0.3255 (0.3295) loss 3.6222 (3.6643) grad_norm 1.3497 (1.3510) [2022-10-07 23:32:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][1000/1251] eta 0:01:22 lr 0.000683 time 0.3235 (0.3295) loss 3.7620 (3.6664) grad_norm 1.5527 (1.3561) [2022-10-07 23:33:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][1100/1251] eta 0:00:49 lr 0.000683 time 0.3317 (0.3295) loss 3.6657 (3.6684) grad_norm 1.4482 (1.3558) [2022-10-07 23:33:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [114/300][1200/1251] eta 0:00:16 lr 0.000683 time 0.3278 (0.3296) loss 3.5491 (3.6677) grad_norm 1.1513 (1.3551) [2022-10-07 23:34:05 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 114 training takes 0:06:52 [2022-10-07 23:34:08 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.622 (2.622) Loss 1.1186 (1.1186) Acc@1 73.340 (73.340) Acc@5 91.992 (91.992) [2022-10-07 23:34:19 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.272 Acc@5 92.388 [2022-10-07 23:34:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-07 23:34:19 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.27% [2022-10-07 23:34:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][0/1251] eta 1:02:41 lr 0.000682 time 3.0071 (3.0071) loss 3.8097 (3.8097) grad_norm 1.7288 (1.7288) [2022-10-07 23:34:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[115/300][100/1251] eta 0:06:45 lr 0.000682 time 0.3244 (0.3519) loss 3.3832 (3.6395) grad_norm 1.1706 (1.3782) [2022-10-07 23:35:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][200/1251] eta 0:05:56 lr 0.000682 time 0.3340 (0.3389) loss 3.6770 (3.6601) grad_norm 1.2048 (1.3712) [2022-10-07 23:36:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][300/1251] eta 0:05:18 lr 0.000681 time 0.3357 (0.3345) loss 3.7150 (3.6655) grad_norm 1.2952 (1.3752) [2022-10-07 23:36:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][400/1251] eta 0:04:42 lr 0.000681 time 0.3282 (0.3322) loss 3.6789 (3.6614) grad_norm 1.1610 (1.3611) [2022-10-07 23:37:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][500/1251] eta 0:04:08 lr 0.000680 time 0.3275 (0.3309) loss 3.5606 (3.6605) grad_norm 1.3240 (1.3676) [2022-10-07 23:37:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][600/1251] eta 0:03:34 lr 0.000680 time 0.3242 (0.3300) loss 3.4823 (3.6558) grad_norm 1.6008 (1.3593) [2022-10-07 23:38:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][700/1251] eta 0:03:01 lr 0.000680 time 0.3259 (0.3293) loss 3.6219 (3.6567) grad_norm 1.3020 (1.3583) [2022-10-07 23:38:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][800/1251] eta 0:02:28 lr 0.000679 time 0.3259 (0.3288) loss 3.5931 (3.6565) grad_norm 1.4182 (1.3638) [2022-10-07 23:39:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][900/1251] eta 0:01:55 lr 0.000679 time 0.3246 (0.3284) loss 3.8244 (3.6544) grad_norm 1.2972 (1.3615) [2022-10-07 23:39:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][1000/1251] eta 0:01:22 lr 0.000679 time 0.3247 (0.3281) loss 3.7394 (3.6541) grad_norm 1.3759 (1.3624) [2022-10-07 23:40:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][1100/1251] eta 0:00:49 lr 0.000678 time 0.3278 (0.3277) loss 3.7393 (3.6576) grad_norm 
1.7458 (1.3645) [2022-10-07 23:40:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [115/300][1200/1251] eta 0:00:16 lr 0.000678 time 0.3222 (0.3276) loss 3.6162 (3.6566) grad_norm 1.3178 (1.3634) [2022-10-07 23:41:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 115 training takes 0:06:50 [2022-10-07 23:41:12 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.931 (2.931) Loss 1.1578 (1.1578) Acc@1 73.242 (73.242) Acc@5 92.383 (92.383) [2022-10-07 23:41:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.380 Acc@5 92.462 [2022-10-07 23:41:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-10-07 23:41:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.38% [2022-10-07 23:41:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][0/1251] eta 0:47:38 lr 0.000678 time 2.2847 (2.2847) loss 3.5022 (3.5022) grad_norm 1.5337 (1.5337) [2022-10-07 23:41:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][100/1251] eta 0:06:43 lr 0.000677 time 0.3253 (0.3505) loss 3.7277 (3.6412) grad_norm 1.4668 (1.3374) [2022-10-07 23:42:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][200/1251] eta 0:05:55 lr 0.000677 time 0.3208 (0.3379) loss 3.7220 (3.6497) grad_norm 1.3696 (1.3630) [2022-10-07 23:43:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][300/1251] eta 0:05:17 lr 0.000676 time 0.3277 (0.3338) loss 3.8011 (3.6545) grad_norm 1.6974 (1.3582) [2022-10-07 23:43:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][400/1251] eta 0:04:42 lr 0.000676 time 0.3275 (0.3317) loss 3.2531 (3.6509) grad_norm 1.2682 (1.3613) [2022-10-07 23:44:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][500/1251] eta 0:04:08 lr 0.000676 time 0.3234 (0.3303) loss 3.8054 (3.6603) grad_norm 1.3059 (1.3665) [2022-10-07 23:44:41 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [116/300][600/1251] eta 0:03:34 lr 0.000675 time 0.3249 (0.3299) loss 3.8331 (3.6601) grad_norm 1.3808 (1.3614) [2022-10-07 23:45:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][700/1251] eta 0:03:01 lr 0.000675 time 0.3284 (0.3291) loss 3.6421 (3.6580) grad_norm 1.4954 (1.3653) [2022-10-07 23:45:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][800/1251] eta 0:02:28 lr 0.000674 time 0.3252 (0.3285) loss 3.8808 (3.6630) grad_norm 1.4674 (1.3619) [2022-10-07 23:46:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][900/1251] eta 0:01:55 lr 0.000674 time 0.3287 (0.3280) loss 3.7430 (3.6641) grad_norm 1.5140 (1.3627) [2022-10-07 23:46:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][1000/1251] eta 0:01:22 lr 0.000674 time 0.3257 (0.3278) loss 3.5767 (3.6604) grad_norm 1.2585 (1.3611) [2022-10-07 23:47:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][1100/1251] eta 0:00:49 lr 0.000673 time 0.3254 (0.3275) loss 3.6810 (3.6587) grad_norm 1.3681 (1.3640) [2022-10-07 23:47:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [116/300][1200/1251] eta 0:00:16 lr 0.000673 time 0.3267 (0.3274) loss 3.7862 (3.6600) grad_norm 1.2705 (1.3619) [2022-10-07 23:48:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 116 training takes 0:06:49 [2022-10-07 23:48:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.375 (2.375) Loss 1.1730 (1.1730) Acc@1 72.266 (72.266) Acc@5 91.406 (91.406) [2022-10-07 23:48:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.142 Acc@5 92.452 [2022-10-07 23:48:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-10-07 23:48:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.38% [2022-10-07 23:48:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][0/1251] eta 0:58:14 lr 0.000673 time 2.7932 
(2.7932) loss 3.8300 (3.8300) grad_norm 1.2163 (1.2163) [2022-10-07 23:49:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][100/1251] eta 0:06:45 lr 0.000672 time 0.3265 (0.3522) loss 3.7900 (3.6242) grad_norm 1.4045 (1.4045) [2022-10-07 23:49:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][200/1251] eta 0:05:56 lr 0.000672 time 0.3263 (0.3395) loss 3.4876 (3.6534) grad_norm 1.2980 (1.3749) [2022-10-07 23:50:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][300/1251] eta 0:05:18 lr 0.000672 time 0.3303 (0.3351) loss 3.4588 (3.6447) grad_norm 1.3375 (1.3811) [2022-10-07 23:50:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][400/1251] eta 0:04:43 lr 0.000671 time 0.3264 (0.3329) loss 3.5692 (3.6505) grad_norm 1.4087 (1.3764) [2022-10-07 23:51:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][500/1251] eta 0:04:09 lr 0.000671 time 0.3413 (0.3321) loss 3.4421 (3.6497) grad_norm 1.3016 (1.3726) [2022-10-07 23:51:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][600/1251] eta 0:03:35 lr 0.000670 time 0.3230 (0.3316) loss 3.6810 (3.6491) grad_norm 1.1383 (1.3703) [2022-10-07 23:52:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][700/1251] eta 0:03:02 lr 0.000670 time 0.3265 (0.3314) loss 3.5152 (3.6483) grad_norm 1.5676 (1.3657) [2022-10-07 23:52:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][800/1251] eta 0:02:29 lr 0.000670 time 0.3228 (0.3313) loss 3.3874 (3.6499) grad_norm 1.4626 (1.3679) [2022-10-07 23:53:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][900/1251] eta 0:01:56 lr 0.000669 time 0.3335 (0.3314) loss 3.6743 (3.6532) grad_norm 1.3915 (1.3665) [2022-10-07 23:53:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][1000/1251] eta 0:01:23 lr 0.000669 time 0.3355 (0.3315) loss 3.6329 (3.6563) grad_norm 1.3657 (1.3625) [2022-10-07 23:54:31 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][1100/1251] eta 0:00:50 lr 0.000668 time 0.3335 (0.3315) loss 3.8191 (3.6578) grad_norm 1.7172 (1.3616) [2022-10-07 23:55:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [117/300][1200/1251] eta 0:00:16 lr 0.000668 time 0.3326 (0.3316) loss 3.7013 (3.6560) grad_norm 1.3483 (1.3647) [2022-10-07 23:55:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 117 training takes 0:06:55 [2022-10-07 23:55:24 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.521 (2.521) Loss 1.0606 (1.0606) Acc@1 75.977 (75.977) Acc@5 93.164 (93.164) [2022-10-07 23:55:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.396 Acc@5 92.444 [2022-10-07 23:55:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-10-07 23:55:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.40% [2022-10-07 23:55:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][0/1251] eta 0:59:34 lr 0.000668 time 2.8574 (2.8574) loss 3.5393 (3.5393) grad_norm 1.4228 (1.4228) [2022-10-07 23:56:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][100/1251] eta 0:06:44 lr 0.000667 time 0.3277 (0.3512) loss 3.6455 (3.6382) grad_norm 1.5280 (1.3529) [2022-10-07 23:56:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][200/1251] eta 0:05:55 lr 0.000667 time 0.3208 (0.3379) loss 3.5961 (3.6628) grad_norm 1.2864 (1.3547) [2022-10-07 23:57:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][300/1251] eta 0:05:17 lr 0.000667 time 0.3238 (0.3335) loss 3.8555 (3.6519) grad_norm 1.4181 (1.3496) [2022-10-07 23:57:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][400/1251] eta 0:04:41 lr 0.000666 time 0.3189 (0.3312) loss 3.5922 (3.6512) grad_norm 1.4375 (1.3479) [2022-10-07 23:58:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][500/1251] 
eta 0:04:07 lr 0.000666 time 0.3258 (0.3299) loss 3.6548 (3.6494) grad_norm 1.3850 (1.3516) [2022-10-07 23:58:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][600/1251] eta 0:03:34 lr 0.000665 time 0.3232 (0.3291) loss 3.6495 (3.6513) grad_norm 1.3398 (1.3583) [2022-10-07 23:59:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][700/1251] eta 0:03:01 lr 0.000665 time 0.3247 (0.3285) loss 3.2934 (3.6539) grad_norm 1.4465 (1.3567) [2022-10-07 23:59:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][800/1251] eta 0:02:27 lr 0.000665 time 0.3217 (0.3281) loss 3.6720 (3.6565) grad_norm 1.3386 (1.3595) [2022-10-08 00:00:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][900/1251] eta 0:01:54 lr 0.000664 time 0.3269 (0.3276) loss 3.6277 (3.6546) grad_norm 1.2472 (1.3590) [2022-10-08 00:01:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][1000/1251] eta 0:01:22 lr 0.000664 time 0.3268 (0.3273) loss 3.7763 (3.6488) grad_norm 1.6635 (1.3626) [2022-10-08 00:01:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][1100/1251] eta 0:00:49 lr 0.000663 time 0.3233 (0.3271) loss 3.6883 (3.6495) grad_norm 1.2965 (1.3629) [2022-10-08 00:02:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [118/300][1200/1251] eta 0:00:16 lr 0.000663 time 0.3265 (0.3271) loss 3.9317 (3.6528) grad_norm 1.1974 (1.3626) [2022-10-08 00:02:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 118 training takes 0:06:49 [2022-10-08 00:02:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.153 (2.153) Loss 1.1037 (1.1037) Acc@1 72.559 (72.559) Acc@5 92.969 (92.969) [2022-10-08 00:02:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.490 Acc@5 92.466 [2022-10-08 00:02:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-08 00:02:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 74.49% [2022-10-08 00:02:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][0/1251] eta 0:59:24 lr 0.000663 time 2.8497 (2.8497) loss 3.7709 (3.7709) grad_norm 1.2058 (1.2058) [2022-10-08 00:03:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][100/1251] eta 0:06:44 lr 0.000662 time 0.3221 (0.3511) loss 3.7367 (3.6627) grad_norm 1.2797 (1.3826) [2022-10-08 00:03:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][200/1251] eta 0:05:55 lr 0.000662 time 0.3279 (0.3380) loss 3.7840 (3.6469) grad_norm 1.3749 (1.3706) [2022-10-08 00:04:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][300/1251] eta 0:05:17 lr 0.000662 time 0.3297 (0.3335) loss 3.8369 (3.6437) grad_norm 1.2783 (1.3699) [2022-10-08 00:04:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][400/1251] eta 0:04:42 lr 0.000661 time 0.3254 (0.3314) loss 3.5004 (3.6447) grad_norm 1.2538 (1.3747) [2022-10-08 00:05:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][500/1251] eta 0:04:08 lr 0.000661 time 0.3240 (0.3306) loss 3.4035 (3.6416) grad_norm 1.3221 (1.3728) [2022-10-08 00:05:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][600/1251] eta 0:03:34 lr 0.000661 time 0.3237 (0.3295) loss 3.9241 (3.6376) grad_norm 1.3916 (1.3779) [2022-10-08 00:06:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][700/1251] eta 0:03:01 lr 0.000660 time 0.3222 (0.3288) loss 3.6825 (3.6409) grad_norm 1.3094 (1.3806) [2022-10-08 00:07:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][800/1251] eta 0:02:28 lr 0.000660 time 0.3242 (0.3283) loss 3.6562 (3.6419) grad_norm 1.3045 (1.3791) [2022-10-08 00:07:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][900/1251] eta 0:01:55 lr 0.000659 time 0.3285 (0.3279) loss 3.5801 (3.6495) grad_norm 1.1990 (1.3788) [2022-10-08 00:08:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[119/300][1000/1251] eta 0:01:22 lr 0.000659 time 0.3279 (0.3276) loss 3.5625 (3.6498) grad_norm 1.4121 (1.3810) [2022-10-08 00:08:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][1100/1251] eta 0:00:49 lr 0.000659 time 0.3242 (0.3274) loss 3.5479 (3.6522) grad_norm 1.3897 (1.3780) [2022-10-08 00:09:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [119/300][1200/1251] eta 0:00:16 lr 0.000658 time 0.3291 (0.3272) loss 3.8180 (3.6535) grad_norm 1.2118 (1.3788) [2022-10-08 00:09:27 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 119 training takes 0:06:49 [2022-10-08 00:09:30 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.670 (2.670) Loss 1.1333 (1.1333) Acc@1 73.633 (73.633) Acc@5 92.480 (92.480) [2022-10-08 00:09:41 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.522 Acc@5 92.538 [2022-10-08 00:09:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-08 00:09:41 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.52% [2022-10-08 00:09:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][0/1251] eta 1:04:35 lr 0.000658 time 3.0976 (3.0976) loss 3.4234 (3.4234) grad_norm 1.3093 (1.3093) [2022-10-08 00:10:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][100/1251] eta 0:06:45 lr 0.000658 time 0.3245 (0.3526) loss 3.6574 (3.6200) grad_norm 1.3239 (1.3791) [2022-10-08 00:10:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][200/1251] eta 0:05:56 lr 0.000657 time 0.3254 (0.3389) loss 3.6907 (3.6260) grad_norm 1.2468 (1.3689) [2022-10-08 00:11:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][300/1251] eta 0:05:17 lr 0.000657 time 0.3264 (0.3342) loss 3.6495 (3.6244) grad_norm 1.2436 (1.3732) [2022-10-08 00:11:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][400/1251] eta 0:04:42 lr 0.000656 time 0.3218 (0.3317) loss 3.4504 
(3.6313) grad_norm 1.1141 (1.3730) [2022-10-08 00:12:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][500/1251] eta 0:04:08 lr 0.000656 time 0.3235 (0.3305) loss 3.7324 (3.6308) grad_norm 1.3653 (1.3761) [2022-10-08 00:12:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][600/1251] eta 0:03:34 lr 0.000656 time 0.3296 (0.3298) loss 3.2988 (3.6353) grad_norm 1.3033 (1.3758) [2022-10-08 00:13:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][700/1251] eta 0:03:01 lr 0.000655 time 0.3298 (0.3295) loss 3.7135 (3.6335) grad_norm 1.2906 (1.3772) [2022-10-08 00:14:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][800/1251] eta 0:02:28 lr 0.000655 time 0.3293 (0.3293) loss 3.8371 (3.6338) grad_norm 1.3261 (1.3792) [2022-10-08 00:14:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][900/1251] eta 0:01:55 lr 0.000654 time 0.3270 (0.3293) loss 3.4586 (3.6349) grad_norm 1.1572 (1.3810) [2022-10-08 00:15:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][1000/1251] eta 0:01:22 lr 0.000654 time 0.3231 (0.3294) loss 3.7348 (3.6335) grad_norm 1.5252 (1.3804) [2022-10-08 00:15:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][1100/1251] eta 0:00:49 lr 0.000654 time 0.3291 (0.3293) loss 3.9519 (3.6354) grad_norm 1.4123 (1.3807) [2022-10-08 00:16:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [120/300][1200/1251] eta 0:00:16 lr 0.000653 time 0.3361 (0.3294) loss 3.8371 (3.6352) grad_norm 1.1733 (1.3797) [2022-10-08 00:16:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 120 training takes 0:06:52 [2022-10-08 00:16:33 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_120 saving...... [2022-10-08 00:16:34 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_120 saved !!! 
[2022-10-08 00:16:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.635 (2.635) Loss 1.0881 (1.0881) Acc@1 74.609 (74.609) Acc@5 91.992 (91.992) [2022-10-08 00:16:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.488 Acc@5 92.426 [2022-10-08 00:16:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-08 00:16:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.52% [2022-10-08 00:16:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][0/1251] eta 0:56:30 lr 0.000653 time 2.7099 (2.7099) loss 3.6383 (3.6383) grad_norm 1.3755 (1.3755) [2022-10-08 00:17:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][100/1251] eta 0:06:43 lr 0.000653 time 0.3252 (0.3505) loss 3.6383 (3.6185) grad_norm 1.3599 (1.4035) [2022-10-08 00:17:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][200/1251] eta 0:05:54 lr 0.000652 time 0.3201 (0.3373) loss 3.4645 (3.6186) grad_norm 1.4674 (1.3970) [2022-10-08 00:18:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][300/1251] eta 0:05:16 lr 0.000652 time 0.3240 (0.3330) loss 3.8184 (3.6238) grad_norm 1.5084 (1.4027) [2022-10-08 00:19:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][400/1251] eta 0:04:41 lr 0.000651 time 0.3234 (0.3308) loss 3.5157 (3.6250) grad_norm 1.6540 (1.4020) [2022-10-08 00:19:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][500/1251] eta 0:04:07 lr 0.000651 time 0.3229 (0.3296) loss 4.0166 (3.6289) grad_norm 1.3765 (1.4008) [2022-10-08 00:20:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][600/1251] eta 0:03:33 lr 0.000651 time 0.3237 (0.3287) loss 3.4653 (3.6321) grad_norm 1.2442 (1.4011) [2022-10-08 00:20:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][700/1251] eta 0:03:00 lr 0.000650 time 0.3251 (0.3280) loss 3.7640 (3.6340) grad_norm 1.5570 
(1.3976) [2022-10-08 00:21:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][800/1251] eta 0:02:27 lr 0.000650 time 0.3211 (0.3275) loss 3.7645 (3.6344) grad_norm 1.5304 (1.3939) [2022-10-08 00:21:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][900/1251] eta 0:01:54 lr 0.000649 time 0.3252 (0.3271) loss 3.9923 (3.6357) grad_norm 1.7228 (1.3929) [2022-10-08 00:22:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][1000/1251] eta 0:01:22 lr 0.000649 time 0.3271 (0.3270) loss 3.7465 (3.6373) grad_norm 1.2987 (1.3917) [2022-10-08 00:22:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][1100/1251] eta 0:00:49 lr 0.000649 time 0.3257 (0.3269) loss 3.7699 (3.6393) grad_norm 1.7070 (1.3895) [2022-10-08 00:23:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [121/300][1200/1251] eta 0:00:16 lr 0.000648 time 0.3227 (0.3269) loss 3.3204 (3.6384) grad_norm 1.5626 (1.3888) [2022-10-08 00:23:36 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 121 training takes 0:06:49 [2022-10-08 00:23:39 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.852 (2.852) Loss 1.0707 (1.0707) Acc@1 75.098 (75.098) Acc@5 93.359 (93.359) [2022-10-08 00:23:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.730 Acc@5 92.648 [2022-10-08 00:23:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-08 00:23:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.73% [2022-10-08 00:23:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][0/1251] eta 1:07:49 lr 0.000648 time 3.2531 (3.2531) loss 3.7174 (3.7174) grad_norm 1.3942 (1.3942) [2022-10-08 00:24:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][100/1251] eta 0:06:48 lr 0.000648 time 0.3306 (0.3552) loss 3.8547 (3.6298) grad_norm 1.3892 (1.4020) [2022-10-08 00:24:58 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [122/300][200/1251] eta 0:05:57 lr 0.000647 time 0.3310 (0.3405) loss 3.7327 (3.6135) grad_norm 1.6266 (1.3964) [2022-10-08 00:25:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][300/1251] eta 0:05:20 lr 0.000647 time 0.3300 (0.3365) loss 3.2929 (3.6322) grad_norm 1.7068 (1.4059) [2022-10-08 00:26:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][400/1251] eta 0:04:43 lr 0.000646 time 0.3256 (0.3336) loss 3.4642 (3.6233) grad_norm 1.2604 (1.3995) [2022-10-08 00:26:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][500/1251] eta 0:04:09 lr 0.000646 time 0.3285 (0.3319) loss 3.8313 (3.6272) grad_norm 1.2534 (1.3967) [2022-10-08 00:27:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][600/1251] eta 0:03:35 lr 0.000646 time 0.3228 (0.3308) loss 3.3615 (3.6247) grad_norm 1.2644 (1.3971) [2022-10-08 00:27:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][700/1251] eta 0:03:01 lr 0.000645 time 0.3303 (0.3301) loss 3.6908 (3.6313) grad_norm 1.2327 (1.3976) [2022-10-08 00:28:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][800/1251] eta 0:02:28 lr 0.000645 time 0.3260 (0.3295) loss 3.5663 (3.6325) grad_norm 1.2486 (1.3971) [2022-10-08 00:28:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][900/1251] eta 0:01:55 lr 0.000644 time 0.3249 (0.3290) loss 3.9115 (3.6374) grad_norm 1.2690 (1.3989) [2022-10-08 00:29:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][1000/1251] eta 0:01:22 lr 0.000644 time 0.3225 (0.3286) loss 3.6576 (3.6368) grad_norm 1.3651 (1.3971) [2022-10-08 00:29:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][1100/1251] eta 0:00:49 lr 0.000644 time 0.3254 (0.3282) loss 3.7768 (3.6345) grad_norm 1.3836 (1.3975) [2022-10-08 00:30:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [122/300][1200/1251] eta 0:00:16 lr 0.000643 time 0.3246 (0.3278) loss 3.6173 
(3.6362) grad_norm 1.3955 (1.3990) [2022-10-08 00:30:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 122 training takes 0:06:50 [2022-10-08 00:30:43 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.242 (3.242) Loss 0.9851 (0.9851) Acc@1 76.270 (76.270) Acc@5 93.945 (93.945) [2022-10-08 00:30:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.540 Acc@5 92.560 [2022-10-08 00:30:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-08 00:30:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.73% [2022-10-08 00:30:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][0/1251] eta 0:59:50 lr 0.000643 time 2.8704 (2.8704) loss 3.5015 (3.5015) grad_norm 1.4326 (1.4326) [2022-10-08 00:31:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][100/1251] eta 0:06:45 lr 0.000643 time 0.3226 (0.3523) loss 3.6414 (3.6281) grad_norm 1.1744 (1.4118) [2022-10-08 00:32:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][200/1251] eta 0:05:55 lr 0.000642 time 0.3263 (0.3386) loss 3.6400 (3.6304) grad_norm 1.6576 (1.3929) [2022-10-08 00:32:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][300/1251] eta 0:05:17 lr 0.000642 time 0.3253 (0.3341) loss 3.7508 (3.6217) grad_norm 1.3664 (1.4006) [2022-10-08 00:33:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][400/1251] eta 0:04:42 lr 0.000642 time 0.3256 (0.3320) loss 3.7041 (3.6272) grad_norm 1.5609 (1.3892) [2022-10-08 00:33:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][500/1251] eta 0:04:08 lr 0.000641 time 0.3252 (0.3308) loss 3.8296 (3.6356) grad_norm 1.7073 (1.3862) [2022-10-08 00:34:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][600/1251] eta 0:03:35 lr 0.000641 time 0.3261 (0.3303) loss 3.8392 (3.6355) grad_norm 1.3780 (1.3891) [2022-10-08 00:34:45 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][700/1251] eta 0:03:01 lr 0.000640 time 0.3228 (0.3301) loss 3.3653 (3.6379) grad_norm 1.3883 (1.3875) [2022-10-08 00:35:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][800/1251] eta 0:02:28 lr 0.000640 time 0.3378 (0.3298) loss 3.9229 (3.6399) grad_norm 1.2906 (1.3884) [2022-10-08 00:35:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][900/1251] eta 0:01:55 lr 0.000640 time 0.3260 (0.3300) loss 3.3255 (3.6393) grad_norm 1.3158 (1.3925) [2022-10-08 00:36:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][1000/1251] eta 0:01:22 lr 0.000639 time 0.3214 (0.3305) loss 3.7932 (3.6414) grad_norm 1.4511 (1.3906) [2022-10-08 00:36:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][1100/1251] eta 0:00:49 lr 0.000639 time 0.3313 (0.3305) loss 3.7447 (3.6426) grad_norm 1.2529 (1.3917) [2022-10-08 00:37:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [123/300][1200/1251] eta 0:00:16 lr 0.000638 time 0.3245 (0.3309) loss 3.4738 (3.6441) grad_norm 1.5287 (1.3935) [2022-10-08 00:37:48 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 123 training takes 0:06:54 [2022-10-08 00:37:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.767 (2.767) Loss 1.0270 (1.0270) Acc@1 76.660 (76.660) Acc@5 93.652 (93.652) [2022-10-08 00:38:02 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.602 Acc@5 92.608 [2022-10-08 00:38:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-10-08 00:38:02 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.73% [2022-10-08 00:38:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][0/1251] eta 0:53:13 lr 0.000638 time 2.5524 (2.5524) loss 3.5422 (3.5422) grad_norm 1.3406 (1.3406) [2022-10-08 00:38:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][100/1251] 
eta 0:06:46 lr 0.000638 time 0.3250 (0.3534) loss 3.6756 (3.5938) grad_norm 1.1496 (1.3505) [2022-10-08 00:39:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][200/1251] eta 0:05:57 lr 0.000637 time 0.3262 (0.3402) loss 3.8588 (3.6108) grad_norm 1.4706 (1.3759) [2022-10-08 00:39:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][300/1251] eta 0:05:19 lr 0.000637 time 0.3293 (0.3360) loss 3.6182 (3.6175) grad_norm 1.2179 (1.3898) [2022-10-08 00:40:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][400/1251] eta 0:04:44 lr 0.000637 time 0.3317 (0.3342) loss 3.5742 (3.6262) grad_norm 1.3839 (1.3927) [2022-10-08 00:40:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][500/1251] eta 0:04:10 lr 0.000636 time 0.3262 (0.3331) loss 3.4478 (3.6310) grad_norm 1.4761 (1.3940) [2022-10-08 00:41:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][600/1251] eta 0:03:36 lr 0.000636 time 0.3270 (0.3324) loss 3.6999 (3.6273) grad_norm 1.2193 (1.3952) [2022-10-08 00:41:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][700/1251] eta 0:03:02 lr 0.000635 time 0.3409 (0.3320) loss 3.3559 (3.6240) grad_norm 1.2743 (1.4000) [2022-10-08 00:42:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][800/1251] eta 0:02:29 lr 0.000635 time 0.3391 (0.3318) loss 3.8556 (3.6232) grad_norm 1.5147 (1.3986) [2022-10-08 00:43:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][900/1251] eta 0:01:56 lr 0.000635 time 0.3270 (0.3316) loss 3.8367 (3.6248) grad_norm 1.4789 (1.3971) [2022-10-08 00:43:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][1000/1251] eta 0:01:23 lr 0.000634 time 0.3330 (0.3316) loss 3.6792 (3.6279) grad_norm 1.7973 (1.3979) [2022-10-08 00:44:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][1100/1251] eta 0:00:50 lr 0.000634 time 0.3282 (0.3316) loss 3.8748 (3.6295) grad_norm 1.3521 (1.4004) 
[2022-10-08 00:44:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [124/300][1200/1251] eta 0:00:16 lr 0.000633 time 0.3435 (0.3316) loss 3.5327 (3.6313) grad_norm 1.5495 (1.4035) [2022-10-08 00:44:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 124 training takes 0:06:55 [2022-10-08 00:44:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.715 (2.715) Loss 1.0537 (1.0537) Acc@1 75.488 (75.488) Acc@5 92.773 (92.773) [2022-10-08 00:45:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.596 Acc@5 92.518 [2022-10-08 00:45:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-10-08 00:45:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.73% [2022-10-08 00:45:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][0/1251] eta 1:02:59 lr 0.000633 time 3.0211 (3.0211) loss 3.2014 (3.2014) grad_norm 1.3849 (1.3849) [2022-10-08 00:45:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][100/1251] eta 0:06:47 lr 0.000633 time 0.3248 (0.3539) loss 3.5899 (3.6022) grad_norm 1.2726 (1.3868) [2022-10-08 00:46:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][200/1251] eta 0:05:58 lr 0.000632 time 0.3310 (0.3411) loss 3.8750 (3.5956) grad_norm 1.6695 (1.3975) [2022-10-08 00:46:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][300/1251] eta 0:05:19 lr 0.000632 time 0.3270 (0.3363) loss 3.6021 (3.6108) grad_norm 1.7943 (1.4029) [2022-10-08 00:47:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][400/1251] eta 0:04:44 lr 0.000632 time 0.3277 (0.3339) loss 3.7595 (3.6152) grad_norm 1.4245 (1.4013) [2022-10-08 00:47:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][500/1251] eta 0:04:09 lr 0.000631 time 0.3305 (0.3325) loss 3.6892 (3.6076) grad_norm 1.5298 (1.4032) [2022-10-08 00:48:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[125/300][600/1251] eta 0:03:35 lr 0.000631 time 0.3391 (0.3316) loss 3.6565 (3.6111) grad_norm 1.4618 (1.4029) [2022-10-08 00:49:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][700/1251] eta 0:03:02 lr 0.000630 time 0.3312 (0.3310) loss 3.4404 (3.6116) grad_norm 1.4314 (1.4066) [2022-10-08 00:49:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][800/1251] eta 0:02:29 lr 0.000630 time 0.3277 (0.3305) loss 3.8069 (3.6105) grad_norm 1.3215 (1.4004) [2022-10-08 00:50:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][900/1251] eta 0:01:55 lr 0.000630 time 0.3242 (0.3302) loss 3.8495 (3.6134) grad_norm 1.3761 (1.4006) [2022-10-08 00:50:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][1000/1251] eta 0:01:22 lr 0.000629 time 0.3257 (0.3302) loss 3.5574 (3.6147) grad_norm 1.2749 (1.4036) [2022-10-08 00:51:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][1100/1251] eta 0:00:49 lr 0.000629 time 0.3329 (0.3302) loss 3.7334 (3.6176) grad_norm 1.7461 (1.4052) [2022-10-08 00:51:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [125/300][1200/1251] eta 0:00:16 lr 0.000628 time 0.3262 (0.3302) loss 3.5696 (3.6196) grad_norm 1.3734 (1.4057) [2022-10-08 00:52:04 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 125 training takes 0:06:53 [2022-10-08 00:52:06 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.362 (2.362) Loss 1.0119 (1.0119) Acc@1 75.391 (75.391) Acc@5 92.676 (92.676) [2022-10-08 00:52:17 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.716 Acc@5 92.732 [2022-10-08 00:52:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-08 00:52:17 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 74.73% [2022-10-08 00:52:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][0/1251] eta 0:45:16 lr 0.000628 time 2.1712 (2.1712) loss 3.8562 
(3.8562) grad_norm 1.3839 (1.3839) [2022-10-08 00:52:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][100/1251] eta 0:06:43 lr 0.000628 time 0.3237 (0.3504) loss 3.9678 (3.6424) grad_norm 1.3103 (1.4404) [2022-10-08 00:53:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][200/1251] eta 0:05:55 lr 0.000627 time 0.3249 (0.3386) loss 3.8337 (3.6234) grad_norm 1.4942 (1.4234) [2022-10-08 00:53:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][300/1251] eta 0:05:18 lr 0.000627 time 0.3277 (0.3347) loss 3.5944 (3.6248) grad_norm 1.6414 (1.4197) [2022-10-08 00:54:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][400/1251] eta 0:04:43 lr 0.000626 time 0.3277 (0.3331) loss 3.4468 (3.6259) grad_norm 1.2493 (1.4199) [2022-10-08 00:55:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][500/1251] eta 0:04:09 lr 0.000626 time 0.3273 (0.3320) loss 3.4721 (3.6207) grad_norm 1.5261 (1.4159) [2022-10-08 00:55:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][600/1251] eta 0:03:35 lr 0.000626 time 0.3236 (0.3315) loss 3.8338 (3.6203) grad_norm 1.5567 (1.4174) [2022-10-08 00:56:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][700/1251] eta 0:03:02 lr 0.000625 time 0.3305 (0.3312) loss 3.4722 (3.6157) grad_norm 1.4815 (1.4142) [2022-10-08 00:56:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][800/1251] eta 0:02:29 lr 0.000625 time 0.3288 (0.3311) loss 3.6424 (3.6142) grad_norm 1.4934 (1.4099) [2022-10-08 00:57:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][900/1251] eta 0:01:56 lr 0.000624 time 0.3287 (0.3311) loss 3.7799 (3.6186) grad_norm 1.3826 (1.4120) [2022-10-08 00:57:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][1000/1251] eta 0:01:23 lr 0.000624 time 0.3263 (0.3311) loss 3.3637 (3.6209) grad_norm 1.3906 (1.4137) [2022-10-08 00:58:22 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [126/300][1100/1251] eta 0:00:50 lr 0.000624 time 0.3337 (0.3312) loss 3.5916 (3.6198) grad_norm 1.4997 (1.4156) [2022-10-08 00:58:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [126/300][1200/1251] eta 0:00:16 lr 0.000623 time 0.3256 (0.3312) loss 3.6368 (3.6209) grad_norm 1.2878 (1.4175) [2022-10-08 00:59:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 126 training takes 0:06:54 [2022-10-08 00:59:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.857 (2.857) Loss 1.1311 (1.1311) Acc@1 73.242 (73.242) Acc@5 91.797 (91.797) [2022-10-08 00:59:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.038 Acc@5 92.784 [2022-10-08 00:59:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-08 00:59:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 00:59:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][0/1251] eta 1:05:59 lr 0.000623 time 3.1648 (3.1648) loss 3.7791 (3.7791) grad_norm 1.3268 (1.3268) [2022-10-08 01:00:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][100/1251] eta 0:06:46 lr 0.000623 time 0.3251 (0.3528) loss 3.6203 (3.5977) grad_norm 1.3031 (1.4047) [2022-10-08 01:00:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][200/1251] eta 0:05:56 lr 0.000622 time 0.3289 (0.3392) loss 3.4619 (3.6082) grad_norm 1.2232 (1.4098) [2022-10-08 01:01:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][300/1251] eta 0:05:18 lr 0.000622 time 0.3244 (0.3346) loss 3.5079 (3.6070) grad_norm 1.5858 (1.4080) [2022-10-08 01:01:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][400/1251] eta 0:04:42 lr 0.000621 time 0.3211 (0.3323) loss 3.6631 (3.6072) grad_norm 1.2229 (1.4060) [2022-10-08 01:02:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][500/1251] eta 0:04:08 lr 0.000621 time 0.3244 
(0.3310) loss 3.8059 (3.6140) grad_norm 1.2008 (1.4086) [2022-10-08 01:02:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][600/1251] eta 0:03:34 lr 0.000621 time 0.3219 (0.3301) loss 3.5438 (3.6155) grad_norm 1.4095 (1.4067) [2022-10-08 01:03:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][700/1251] eta 0:03:01 lr 0.000620 time 0.3252 (0.3294) loss 3.5796 (3.6143) grad_norm 1.7109 (1.4114) [2022-10-08 01:03:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][800/1251] eta 0:02:28 lr 0.000620 time 0.3301 (0.3291) loss 3.4792 (3.6140) grad_norm 1.2674 (1.4135) [2022-10-08 01:04:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][900/1251] eta 0:01:55 lr 0.000619 time 0.3249 (0.3290) loss 3.5930 (3.6127) grad_norm 1.2606 (1.4093) [2022-10-08 01:04:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][1000/1251] eta 0:01:22 lr 0.000619 time 0.3271 (0.3290) loss 3.4260 (3.6112) grad_norm 1.6112 (1.4098) [2022-10-08 01:05:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][1100/1251] eta 0:00:49 lr 0.000619 time 0.3383 (0.3290) loss 3.6464 (3.6115) grad_norm 1.4693 (1.4103) [2022-10-08 01:06:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [127/300][1200/1251] eta 0:00:16 lr 0.000618 time 0.3345 (0.3291) loss 3.6941 (3.6127) grad_norm 1.4168 (1.4100) [2022-10-08 01:06:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 127 training takes 0:06:52 [2022-10-08 01:06:21 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.998 (2.998) Loss 1.0315 (1.0315) Acc@1 75.195 (75.195) Acc@5 93.359 (93.359) [2022-10-08 01:06:32 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.746 Acc@5 92.618 [2022-10-08 01:06:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-08 01:06:32 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 01:06:35 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][0/1251] eta 1:07:41 lr 0.000618 time 3.2464 (3.2464) loss 3.4778 (3.4778) grad_norm 1.4117 (1.4117) [2022-10-08 01:07:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][100/1251] eta 0:06:49 lr 0.000618 time 0.3259 (0.3559) loss 3.5325 (3.6048) grad_norm 1.3611 (1.4199) [2022-10-08 01:07:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][200/1251] eta 0:05:58 lr 0.000617 time 0.3249 (0.3409) loss 3.7720 (3.6023) grad_norm 1.3995 (1.4275) [2022-10-08 01:08:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][300/1251] eta 0:05:19 lr 0.000617 time 0.3232 (0.3356) loss 3.6802 (3.6029) grad_norm 1.3711 (1.4287) [2022-10-08 01:08:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][400/1251] eta 0:04:43 lr 0.000616 time 0.3256 (0.3330) loss 3.4198 (3.6114) grad_norm 1.3076 (1.4216) [2022-10-08 01:09:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][500/1251] eta 0:04:08 lr 0.000616 time 0.3247 (0.3315) loss 3.6335 (3.6115) grad_norm 1.5063 (1.4248) [2022-10-08 01:09:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][600/1251] eta 0:03:35 lr 0.000616 time 0.3277 (0.3304) loss 3.5118 (3.6134) grad_norm 1.5729 (1.4248) [2022-10-08 01:10:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][700/1251] eta 0:03:01 lr 0.000615 time 0.3276 (0.3296) loss 3.5355 (3.6166) grad_norm 1.4351 (1.4252) [2022-10-08 01:10:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][800/1251] eta 0:02:28 lr 0.000615 time 0.3200 (0.3290) loss 3.4866 (3.6143) grad_norm 1.5071 (1.4224) [2022-10-08 01:11:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][900/1251] eta 0:01:55 lr 0.000614 time 0.3205 (0.3285) loss 3.4609 (3.6172) grad_norm 1.8146 (1.4206) [2022-10-08 01:12:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][1000/1251] eta 0:01:22 lr 0.000614 
time 0.3252 (0.3283) loss 3.7800 (3.6182) grad_norm 1.2760 (1.4234) [2022-10-08 01:12:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][1100/1251] eta 0:00:49 lr 0.000614 time 0.3320 (0.3281) loss 3.6312 (3.6189) grad_norm 1.2973 (1.4238) [2022-10-08 01:13:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [128/300][1200/1251] eta 0:00:16 lr 0.000613 time 0.3299 (0.3280) loss 3.2549 (3.6216) grad_norm 1.3475 (1.4227) [2022-10-08 01:13:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 128 training takes 0:06:50 [2022-10-08 01:13:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.184 (3.184) Loss 1.0988 (1.0988) Acc@1 73.535 (73.535) Acc@5 92.871 (92.871) [2022-10-08 01:13:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.838 Acc@5 92.642 [2022-10-08 01:13:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-08 01:13:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 01:13:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][0/1251] eta 0:58:01 lr 0.000613 time 2.7828 (2.7828) loss 3.4868 (3.4868) grad_norm 1.3124 (1.3124) [2022-10-08 01:14:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][100/1251] eta 0:06:47 lr 0.000613 time 0.3244 (0.3538) loss 3.7501 (3.5738) grad_norm 1.3188 (1.4157) [2022-10-08 01:14:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][200/1251] eta 0:05:58 lr 0.000612 time 0.3322 (0.3406) loss 3.7882 (3.5988) grad_norm 1.5171 (1.4504) [2022-10-08 01:15:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][300/1251] eta 0:05:20 lr 0.000612 time 0.3308 (0.3366) loss 3.5659 (3.6067) grad_norm 1.4632 (1.4429) [2022-10-08 01:15:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][400/1251] eta 0:04:44 lr 0.000611 time 0.3287 (0.3346) loss 3.8097 (3.6094) grad_norm 1.3054 (1.4412) [2022-10-08 
01:16:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][500/1251] eta 0:04:10 lr 0.000611 time 0.3236 (0.3336) loss 3.7284 (3.6106) grad_norm 1.2652 (1.4372) [2022-10-08 01:16:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][600/1251] eta 0:03:36 lr 0.000611 time 0.3394 (0.3330) loss 3.9190 (3.6128) grad_norm 1.6544 (1.4326) [2022-10-08 01:17:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][700/1251] eta 0:03:03 lr 0.000610 time 0.3263 (0.3326) loss 3.8285 (3.6118) grad_norm 1.7863 (1.4353) [2022-10-08 01:18:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][800/1251] eta 0:02:29 lr 0.000610 time 0.3297 (0.3325) loss 3.6025 (3.6101) grad_norm 1.2192 (1.4305) [2022-10-08 01:18:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][900/1251] eta 0:01:56 lr 0.000609 time 0.3319 (0.3324) loss 3.5808 (3.6099) grad_norm 1.3452 (1.4291) [2022-10-08 01:19:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][1000/1251] eta 0:01:23 lr 0.000609 time 0.3295 (0.3325) loss 3.6571 (3.6082) grad_norm 1.5482 (1.4319) [2022-10-08 01:19:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][1100/1251] eta 0:00:50 lr 0.000609 time 0.3282 (0.3325) loss 3.8896 (3.6128) grad_norm 1.3516 (1.4331) [2022-10-08 01:20:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [129/300][1200/1251] eta 0:00:16 lr 0.000608 time 0.3305 (0.3325) loss 3.5949 (3.6129) grad_norm 1.4013 (1.4343) [2022-10-08 01:20:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 129 training takes 0:06:56 [2022-10-08 01:20:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.090 (3.090) Loss 1.0526 (1.0526) Acc@1 73.633 (73.633) Acc@5 93.066 (93.066) [2022-10-08 01:20:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.886 Acc@5 92.800 [2022-10-08 01:20:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 74.9% [2022-10-08 01:20:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 01:20:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][0/1251] eta 1:05:21 lr 0.000608 time 3.1347 (3.1347) loss 3.6760 (3.6760) grad_norm 1.4368 (1.4368) [2022-10-08 01:21:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][100/1251] eta 0:06:46 lr 0.000608 time 0.3290 (0.3535) loss 3.7236 (3.6003) grad_norm 1.4031 (1.4317) [2022-10-08 01:21:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][200/1251] eta 0:05:57 lr 0.000607 time 0.3273 (0.3397) loss 3.7590 (3.6027) grad_norm 1.5676 (1.4522) [2022-10-08 01:22:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][300/1251] eta 0:05:18 lr 0.000607 time 0.3246 (0.3350) loss 3.6925 (3.5985) grad_norm 1.2890 (1.4601) [2022-10-08 01:23:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][400/1251] eta 0:04:43 lr 0.000606 time 0.3261 (0.3328) loss 3.5745 (3.5993) grad_norm 1.3251 (1.4528) [2022-10-08 01:23:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][500/1251] eta 0:04:08 lr 0.000606 time 0.3241 (0.3311) loss 3.8654 (3.6059) grad_norm 1.5693 (1.4516) [2022-10-08 01:24:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][600/1251] eta 0:03:34 lr 0.000605 time 0.3287 (0.3302) loss 3.6411 (3.6111) grad_norm 1.4206 (1.4498) [2022-10-08 01:24:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][700/1251] eta 0:03:01 lr 0.000605 time 0.3215 (0.3295) loss 3.6689 (3.6118) grad_norm 1.3526 (1.4476) [2022-10-08 01:25:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][800/1251] eta 0:02:28 lr 0.000605 time 0.3235 (0.3289) loss 3.7597 (3.6131) grad_norm 1.3313 (1.4497) [2022-10-08 01:25:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][900/1251] eta 0:01:55 lr 0.000604 time 0.3334 (0.3286) loss 3.5171 (3.6098) grad_norm 1.3478 
(1.4516) [2022-10-08 01:26:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][1000/1251] eta 0:01:22 lr 0.000604 time 0.3237 (0.3284) loss 3.1182 (3.6121) grad_norm 1.4545 (1.4496) [2022-10-08 01:26:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][1100/1251] eta 0:00:49 lr 0.000603 time 0.3307 (0.3284) loss 3.6532 (3.6130) grad_norm 1.7682 (1.4473) [2022-10-08 01:27:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [130/300][1200/1251] eta 0:00:16 lr 0.000603 time 0.3247 (0.3284) loss 3.3796 (3.6146) grad_norm 1.5116 (1.4516) [2022-10-08 01:27:38 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 130 training takes 0:06:51 [2022-10-08 01:27:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_130 saving...... [2022-10-08 01:27:39 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_130 saved !!! [2022-10-08 01:27:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.738 (2.738) Loss 1.0743 (1.0743) Acc@1 74.902 (74.902) Acc@5 92.578 (92.578) [2022-10-08 01:27:52 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.902 Acc@5 92.780 [2022-10-08 01:27:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-10-08 01:27:52 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 01:27:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][0/1251] eta 0:59:15 lr 0.000603 time 2.8423 (2.8423) loss 3.5609 (3.5609) grad_norm 1.3145 (1.3145) [2022-10-08 01:28:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][100/1251] eta 0:06:44 lr 0.000602 time 0.3226 (0.3512) loss 3.3431 (3.5914) grad_norm 1.3068 (1.4410) [2022-10-08 01:29:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][200/1251] eta 0:05:55 lr 0.000602 time 0.3306 (0.3384) loss 3.3666 
(3.5862) grad_norm 1.5587 (1.4543) [2022-10-08 01:29:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][300/1251] eta 0:05:18 lr 0.000602 time 0.3222 (0.3347) loss 3.8106 (3.5980) grad_norm 1.4174 (1.4516) [2022-10-08 01:30:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][400/1251] eta 0:04:43 lr 0.000601 time 0.3293 (0.3329) loss 3.7759 (3.5966) grad_norm 1.4581 (1.4401) [2022-10-08 01:30:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][500/1251] eta 0:04:09 lr 0.000601 time 0.3391 (0.3319) loss 3.4072 (3.5963) grad_norm 1.2488 (1.4393) [2022-10-08 01:31:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][600/1251] eta 0:03:35 lr 0.000600 time 0.3259 (0.3313) loss 3.7325 (3.5987) grad_norm 1.5260 (1.4417) [2022-10-08 01:31:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][700/1251] eta 0:03:02 lr 0.000600 time 0.3278 (0.3310) loss 3.6631 (3.6004) grad_norm 1.7822 (1.4425) [2022-10-08 01:32:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][800/1251] eta 0:02:29 lr 0.000600 time 0.3248 (0.3309) loss 3.7747 (3.5985) grad_norm 2.0649 (1.4429) [2022-10-08 01:32:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][900/1251] eta 0:01:56 lr 0.000599 time 0.3422 (0.3309) loss 3.4201 (3.5989) grad_norm 1.3447 (1.4478) [2022-10-08 01:33:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][1000/1251] eta 0:01:23 lr 0.000599 time 0.3257 (0.3309) loss 3.7723 (3.5994) grad_norm 1.4014 (1.4474) [2022-10-08 01:33:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][1100/1251] eta 0:00:49 lr 0.000598 time 0.3370 (0.3309) loss 3.6445 (3.6014) grad_norm 1.4741 (1.4477) [2022-10-08 01:34:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [131/300][1200/1251] eta 0:00:16 lr 0.000598 time 0.3311 (0.3308) loss 3.0436 (3.6012) grad_norm 1.3669 (1.4464) [2022-10-08 01:34:46 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 131 training takes 0:06:54 [2022-10-08 01:34:49 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.874 (2.874) Loss 1.0412 (1.0412) Acc@1 76.367 (76.367) Acc@5 92.578 (92.578) [2022-10-08 01:35:00 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.974 Acc@5 92.688 [2022-10-08 01:35:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-08 01:35:00 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.04% [2022-10-08 01:35:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][0/1251] eta 0:53:41 lr 0.000598 time 2.5748 (2.5748) loss 3.5960 (3.5960) grad_norm 1.3757 (1.3757) [2022-10-08 01:35:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][100/1251] eta 0:06:42 lr 0.000597 time 0.3277 (0.3493) loss 3.6460 (3.5893) grad_norm 1.7303 (1.4780) [2022-10-08 01:36:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][200/1251] eta 0:05:54 lr 0.000597 time 0.3214 (0.3369) loss 3.5131 (3.6056) grad_norm 1.2489 (1.4591) [2022-10-08 01:36:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][300/1251] eta 0:05:16 lr 0.000597 time 0.3297 (0.3328) loss 3.8290 (3.5995) grad_norm 1.2847 (1.4542) [2022-10-08 01:37:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][400/1251] eta 0:04:41 lr 0.000596 time 0.3238 (0.3310) loss 3.7194 (3.6003) grad_norm 1.5318 (1.4533) [2022-10-08 01:37:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][500/1251] eta 0:04:07 lr 0.000596 time 0.3261 (0.3301) loss 3.4909 (3.5964) grad_norm 1.2216 (1.4515) [2022-10-08 01:38:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][600/1251] eta 0:03:34 lr 0.000595 time 0.3297 (0.3296) loss 3.6153 (3.5996) grad_norm 1.5679 (1.4504) [2022-10-08 01:38:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][700/1251] eta 0:03:01 lr 0.000595 time 0.3263 
(0.3293) loss 3.4814 (3.5981) grad_norm 1.6001 (1.4517) [2022-10-08 01:39:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][800/1251] eta 0:02:28 lr 0.000594 time 0.3248 (0.3291) loss 3.4027 (3.5961) grad_norm 1.3625 (1.4512) [2022-10-08 01:39:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][900/1251] eta 0:01:55 lr 0.000594 time 0.3345 (0.3291) loss 3.5208 (3.5994) grad_norm 1.2382 (1.4535) [2022-10-08 01:40:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][1000/1251] eta 0:01:22 lr 0.000594 time 0.3292 (0.3292) loss 3.5463 (3.6004) grad_norm 1.3927 (1.4524) [2022-10-08 01:41:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][1100/1251] eta 0:00:49 lr 0.000593 time 0.3320 (0.3293) loss 3.6791 (3.6009) grad_norm 1.2486 (1.4530) [2022-10-08 01:41:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [132/300][1200/1251] eta 0:00:16 lr 0.000593 time 0.3315 (0.3295) loss 3.6341 (3.6045) grad_norm 1.3830 (1.4554) [2022-10-08 01:41:52 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 132 training takes 0:06:52 [2022-10-08 01:41:55 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.734 (2.734) Loss 1.0876 (1.0876) Acc@1 74.219 (74.219) Acc@5 93.066 (93.066) [2022-10-08 01:42:06 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.138 Acc@5 92.782 [2022-10-08 01:42:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-10-08 01:42:06 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.14% [2022-10-08 01:42:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][0/1251] eta 0:51:28 lr 0.000593 time 2.4685 (2.4685) loss 3.7219 (3.7219) grad_norm 1.6285 (1.6285) [2022-10-08 01:42:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][100/1251] eta 0:06:40 lr 0.000592 time 0.3265 (0.3479) loss 3.8049 (3.5894) grad_norm 1.3596 (1.4990) [2022-10-08 01:43:14 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][200/1251] eta 0:05:53 lr 0.000592 time 0.3253 (0.3367) loss 3.3863 (3.6008) grad_norm 1.3891 (1.4863) [2022-10-08 01:43:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][300/1251] eta 0:05:16 lr 0.000591 time 0.3269 (0.3330) loss 3.4782 (3.5926) grad_norm 1.3110 (1.4884) [2022-10-08 01:44:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][400/1251] eta 0:04:41 lr 0.000591 time 0.3211 (0.3312) loss 3.6467 (3.5929) grad_norm 1.4448 (1.4772) [2022-10-08 01:44:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][500/1251] eta 0:04:07 lr 0.000591 time 0.3237 (0.3301) loss 3.5219 (3.5904) grad_norm 1.3953 (1.4718) [2022-10-08 01:45:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][600/1251] eta 0:03:34 lr 0.000590 time 0.3245 (0.3292) loss 3.7476 (3.5936) grad_norm 1.4403 (1.4674) [2022-10-08 01:45:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][700/1251] eta 0:03:01 lr 0.000590 time 0.3248 (0.3286) loss 3.6000 (3.5897) grad_norm 1.4457 (1.4698) [2022-10-08 01:46:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][800/1251] eta 0:02:27 lr 0.000589 time 0.3273 (0.3282) loss 3.7423 (3.5941) grad_norm 1.2564 (1.4675) [2022-10-08 01:47:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][900/1251] eta 0:01:55 lr 0.000589 time 0.3281 (0.3278) loss 3.4780 (3.5955) grad_norm 1.3631 (1.4638) [2022-10-08 01:47:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][1000/1251] eta 0:01:22 lr 0.000589 time 0.3218 (0.3277) loss 3.7616 (3.5969) grad_norm 1.5086 (1.4661) [2022-10-08 01:48:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][1100/1251] eta 0:00:49 lr 0.000588 time 0.3259 (0.3276) loss 3.9385 (3.5985) grad_norm 1.4393 (1.4647) [2022-10-08 01:48:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [133/300][1200/1251] eta 0:00:16 lr 
0.000588 time 0.3233 (0.3274) loss 3.1491 (3.5992) grad_norm 1.2593 (1.4632) [2022-10-08 01:48:56 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 133 training takes 0:06:49 [2022-10-08 01:48:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.776 (2.776) Loss 1.1423 (1.1423) Acc@1 73.926 (73.926) Acc@5 91.504 (91.504) [2022-10-08 01:49:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.024 Acc@5 92.920 [2022-10-08 01:49:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-08 01:49:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.14% [2022-10-08 01:49:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][0/1251] eta 0:52:51 lr 0.000588 time 2.5354 (2.5354) loss 3.7042 (3.7042) grad_norm 1.3354 (1.3354) [2022-10-08 01:49:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][100/1251] eta 0:06:40 lr 0.000587 time 0.3239 (0.3478) loss 3.6022 (3.5862) grad_norm 1.4844 (1.4597) [2022-10-08 01:50:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][200/1251] eta 0:05:53 lr 0.000587 time 0.3256 (0.3360) loss 3.7131 (3.5877) grad_norm 1.4717 (1.4741) [2022-10-08 01:50:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][300/1251] eta 0:05:15 lr 0.000586 time 0.3269 (0.3320) loss 3.9271 (3.5892) grad_norm 1.5306 (1.4815) [2022-10-08 01:51:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][400/1251] eta 0:04:41 lr 0.000586 time 0.3251 (0.3303) loss 3.6727 (3.5916) grad_norm 1.7006 (1.4753) [2022-10-08 01:51:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][500/1251] eta 0:04:07 lr 0.000586 time 0.3301 (0.3295) loss 3.5227 (3.5850) grad_norm 1.4323 (1.4771) [2022-10-08 01:52:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][600/1251] eta 0:03:34 lr 0.000585 time 0.3276 (0.3293) loss 3.4614 (3.5856) grad_norm 2.6334 (1.4781) 
[2022-10-08 01:53:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][700/1251] eta 0:03:01 lr 0.000585 time 0.3251 (0.3292) loss 3.5713 (3.5893) grad_norm 1.5240 (1.4735) [2022-10-08 01:53:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][800/1251] eta 0:02:28 lr 0.000584 time 0.3256 (0.3292) loss 3.5542 (3.5935) grad_norm 1.2663 (1.4765) [2022-10-08 01:54:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][900/1251] eta 0:01:55 lr 0.000584 time 0.3362 (0.3292) loss 3.7149 (3.5954) grad_norm 1.2784 (1.4718) [2022-10-08 01:54:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][1000/1251] eta 0:01:22 lr 0.000583 time 0.3326 (0.3293) loss 3.8135 (3.5942) grad_norm 1.2858 (1.4701) [2022-10-08 01:55:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][1100/1251] eta 0:00:49 lr 0.000583 time 0.3344 (0.3294) loss 3.7299 (3.5976) grad_norm 1.4495 (1.4670) [2022-10-08 01:55:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [134/300][1200/1251] eta 0:00:16 lr 0.000583 time 0.3343 (0.3296) loss 3.4511 (3.5968) grad_norm 1.3509 (1.4655) [2022-10-08 01:56:02 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 134 training takes 0:06:52 [2022-10-08 01:56:06 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.195 (3.195) Loss 1.0434 (1.0434) Acc@1 75.977 (75.977) Acc@5 93.164 (93.164) [2022-10-08 01:56:16 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.134 Acc@5 92.862 [2022-10-08 01:56:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-10-08 01:56:16 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.14% [2022-10-08 01:56:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][0/1251] eta 0:51:56 lr 0.000582 time 2.4910 (2.4910) loss 3.6088 (3.6088) grad_norm 1.6597 (1.6597) [2022-10-08 01:56:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[135/300][100/1251] eta 0:06:45 lr 0.000582 time 0.3312 (0.3523) loss 3.3220 (3.5873) grad_norm 1.2920 (1.4470) [2022-10-08 01:57:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][200/1251] eta 0:05:57 lr 0.000582 time 0.3258 (0.3405) loss 3.6017 (3.5904) grad_norm 1.6848 (1.4482) [2022-10-08 01:57:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][300/1251] eta 0:05:19 lr 0.000581 time 0.3236 (0.3364) loss 3.7041 (3.5821) grad_norm 1.3117 (1.4461) [2022-10-08 01:58:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][400/1251] eta 0:04:44 lr 0.000581 time 0.3245 (0.3342) loss 3.3611 (3.5819) grad_norm 1.2099 (1.4442) [2022-10-08 01:59:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][500/1251] eta 0:04:10 lr 0.000580 time 0.3261 (0.3329) loss 3.3022 (3.5859) grad_norm 1.5269 (1.4446) [2022-10-08 01:59:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][600/1251] eta 0:03:36 lr 0.000580 time 0.3261 (0.3319) loss 3.7024 (3.5827) grad_norm 1.5232 (1.4483) [2022-10-08 02:00:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][700/1251] eta 0:03:02 lr 0.000580 time 0.3284 (0.3314) loss 3.5665 (3.5838) grad_norm 1.3818 (1.4520) [2022-10-08 02:00:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][800/1251] eta 0:02:29 lr 0.000579 time 0.3270 (0.3309) loss 3.5287 (3.5832) grad_norm 1.4373 (1.4558) [2022-10-08 02:01:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][900/1251] eta 0:01:56 lr 0.000579 time 0.3242 (0.3307) loss 3.6134 (3.5841) grad_norm 1.3451 (1.4582) [2022-10-08 02:01:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][1000/1251] eta 0:01:22 lr 0.000578 time 0.3320 (0.3305) loss 3.3597 (3.5838) grad_norm 1.5190 (1.4590) [2022-10-08 02:02:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][1100/1251] eta 0:00:49 lr 0.000578 time 0.3272 (0.3304) loss 3.9535 (3.5885) grad_norm 
1.4961 (1.4587) [2022-10-08 02:02:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [135/300][1200/1251] eta 0:00:16 lr 0.000578 time 0.3269 (0.3303) loss 3.4583 (3.5881) grad_norm 1.3112 (1.4627) [2022-10-08 02:03:10 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 135 training takes 0:06:53 [2022-10-08 02:03:12 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.464 (2.464) Loss 1.0762 (1.0762) Acc@1 74.902 (74.902) Acc@5 93.164 (93.164) [2022-10-08 02:03:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.198 Acc@5 92.750 [2022-10-08 02:03:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-08 02:03:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.20% [2022-10-08 02:03:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][0/1251] eta 0:55:36 lr 0.000577 time 2.6672 (2.6672) loss 3.7055 (3.7055) grad_norm 1.3408 (1.3408) [2022-10-08 02:03:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][100/1251] eta 0:06:44 lr 0.000577 time 0.3235 (0.3514) loss 3.5803 (3.5827) grad_norm 1.5414 (1.4408) [2022-10-08 02:04:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][200/1251] eta 0:05:56 lr 0.000576 time 0.3245 (0.3389) loss 3.6911 (3.5779) grad_norm 1.3463 (1.4538) [2022-10-08 02:05:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][300/1251] eta 0:05:18 lr 0.000576 time 0.3247 (0.3344) loss 3.8919 (3.5873) grad_norm 1.5008 (1.4665) [2022-10-08 02:05:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][400/1251] eta 0:04:42 lr 0.000576 time 0.3226 (0.3321) loss 3.6497 (3.5858) grad_norm 1.4581 (1.4723) [2022-10-08 02:06:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][500/1251] eta 0:04:08 lr 0.000575 time 0.3217 (0.3308) loss 3.7751 (3.5959) grad_norm 1.3791 (1.4764) [2022-10-08 02:06:42 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [136/300][600/1251] eta 0:03:34 lr 0.000575 time 0.3339 (0.3298) loss 3.6699 (3.5918) grad_norm 1.4965 (1.4725) [2022-10-08 02:07:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][700/1251] eta 0:03:01 lr 0.000574 time 0.3231 (0.3291) loss 3.8707 (3.5894) grad_norm 1.5021 (1.4731) [2022-10-08 02:07:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][800/1251] eta 0:02:28 lr 0.000574 time 0.3288 (0.3286) loss 3.5591 (3.5904) grad_norm 1.4183 (1.4767) [2022-10-08 02:08:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][900/1251] eta 0:01:55 lr 0.000574 time 0.3224 (0.3281) loss 3.7307 (3.5900) grad_norm 1.4391 (1.4739) [2022-10-08 02:08:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][1000/1251] eta 0:01:22 lr 0.000573 time 0.3243 (0.3279) loss 3.6045 (3.5882) grad_norm 1.6284 (1.4734) [2022-10-08 02:09:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][1100/1251] eta 0:00:49 lr 0.000573 time 0.3234 (0.3277) loss 3.7374 (3.5871) grad_norm 1.3384 (1.4703) [2022-10-08 02:09:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [136/300][1200/1251] eta 0:00:16 lr 0.000572 time 0.3306 (0.3276) loss 3.6175 (3.5874) grad_norm 1.8628 (1.4689) [2022-10-08 02:10:14 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 136 training takes 0:06:50 [2022-10-08 02:10:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.533 (3.533) Loss 1.0636 (1.0636) Acc@1 75.391 (75.391) Acc@5 92.090 (92.090) [2022-10-08 02:10:28 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.416 Acc@5 92.902 [2022-10-08 02:10:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-08 02:10:28 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.42% [2022-10-08 02:10:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][0/1251] eta 0:54:37 lr 0.000572 time 2.6200 
(2.6200) loss 3.4552 (3.4552) grad_norm 1.3616 (1.3616) [2022-10-08 02:11:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][100/1251] eta 0:06:41 lr 0.000572 time 0.3225 (0.3487) loss 3.7339 (3.5880) grad_norm 1.8841 (1.4419) [2022-10-08 02:11:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][200/1251] eta 0:05:53 lr 0.000571 time 0.3229 (0.3366) loss 3.6321 (3.5921) grad_norm 1.3416 (1.4664) [2022-10-08 02:12:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][300/1251] eta 0:05:16 lr 0.000571 time 0.3235 (0.3328) loss 3.6164 (3.5916) grad_norm 1.4520 (1.4770) [2022-10-08 02:12:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][400/1251] eta 0:04:41 lr 0.000571 time 0.3331 (0.3310) loss 3.6107 (3.5936) grad_norm 1.4551 (1.4738) [2022-10-08 02:13:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][500/1251] eta 0:04:08 lr 0.000570 time 0.3259 (0.3303) loss 3.7351 (3.5904) grad_norm 1.3531 (1.4765) [2022-10-08 02:13:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][600/1251] eta 0:03:34 lr 0.000570 time 0.3361 (0.3300) loss 3.8052 (3.5897) grad_norm 1.8040 (1.4757) [2022-10-08 02:14:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][700/1251] eta 0:03:01 lr 0.000569 time 0.3273 (0.3300) loss 3.3912 (3.5889) grad_norm 1.4960 (1.4814) [2022-10-08 02:14:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][800/1251] eta 0:02:28 lr 0.000569 time 0.3386 (0.3299) loss 3.3407 (3.5901) grad_norm 1.2388 (1.4811) [2022-10-08 02:15:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][900/1251] eta 0:01:55 lr 0.000568 time 0.3263 (0.3299) loss 3.7983 (3.5864) grad_norm 1.4489 (1.4809) [2022-10-08 02:15:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][1000/1251] eta 0:01:22 lr 0.000568 time 0.3295 (0.3299) loss 3.4777 (3.5894) grad_norm 1.5133 (1.4789) [2022-10-08 02:16:31 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][1100/1251] eta 0:00:49 lr 0.000568 time 0.3245 (0.3299) loss 3.4408 (3.5883) grad_norm 1.4189 (1.4758) [2022-10-08 02:17:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [137/300][1200/1251] eta 0:00:16 lr 0.000567 time 0.3210 (0.3300) loss 3.1489 (3.5879) grad_norm 1.5168 (1.4753) [2022-10-08 02:17:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 137 training takes 0:06:53 [2022-10-08 02:17:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.384 (2.384) Loss 1.0084 (1.0084) Acc@1 76.270 (76.270) Acc@5 93.555 (93.555) [2022-10-08 02:17:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.430 Acc@5 92.828 [2022-10-08 02:17:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-08 02:17:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.43% [2022-10-08 02:17:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][0/1251] eta 0:56:51 lr 0.000567 time 2.7273 (2.7273) loss 3.7663 (3.7663) grad_norm 1.8127 (1.8127) [2022-10-08 02:18:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][100/1251] eta 0:06:45 lr 0.000567 time 0.3255 (0.3520) loss 3.6113 (3.5203) grad_norm 1.5354 (1.4754) [2022-10-08 02:18:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][200/1251] eta 0:05:56 lr 0.000566 time 0.3283 (0.3395) loss 3.5911 (3.5565) grad_norm 1.6189 (1.4872) [2022-10-08 02:19:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][300/1251] eta 0:05:18 lr 0.000566 time 0.3263 (0.3350) loss 3.1646 (3.5643) grad_norm 1.4947 (1.4781) [2022-10-08 02:19:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][400/1251] eta 0:04:43 lr 0.000565 time 0.3252 (0.3329) loss 3.2890 (3.5707) grad_norm 1.7038 (1.4789) [2022-10-08 02:20:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][500/1251] 
eta 0:04:08 lr 0.000565 time 0.3270 (0.3315) loss 3.6994 (3.5710) grad_norm 1.3619 (1.4763) [2022-10-08 02:20:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][600/1251] eta 0:03:35 lr 0.000565 time 0.3235 (0.3307) loss 3.7323 (3.5734) grad_norm 1.5755 (1.4816) [2022-10-08 02:21:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][700/1251] eta 0:03:01 lr 0.000564 time 0.3271 (0.3301) loss 3.4455 (3.5756) grad_norm 1.4260 (1.4810) [2022-10-08 02:21:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][800/1251] eta 0:02:28 lr 0.000564 time 0.3288 (0.3298) loss 3.5155 (3.5773) grad_norm 1.7416 (1.4783) [2022-10-08 02:22:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][900/1251] eta 0:01:55 lr 0.000563 time 0.3204 (0.3297) loss 3.1320 (3.5790) grad_norm 1.4260 (1.4828) [2022-10-08 02:23:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][1000/1251] eta 0:01:22 lr 0.000563 time 0.3364 (0.3296) loss 3.8113 (3.5807) grad_norm 1.5473 (1.4826) [2022-10-08 02:23:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][1100/1251] eta 0:00:49 lr 0.000563 time 0.3435 (0.3297) loss 3.6053 (3.5829) grad_norm 1.3071 (1.4816) [2022-10-08 02:24:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [138/300][1200/1251] eta 0:00:16 lr 0.000562 time 0.3231 (0.3296) loss 3.6687 (3.5842) grad_norm 1.3863 (1.4846) [2022-10-08 02:24:27 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 138 training takes 0:06:52 [2022-10-08 02:24:30 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.402 (2.402) Loss 1.0682 (1.0682) Acc@1 73.926 (73.926) Acc@5 92.969 (92.969) [2022-10-08 02:24:41 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.342 Acc@5 92.838 [2022-10-08 02:24:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-08 02:24:41 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 75.43% [2022-10-08 02:24:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][0/1251] eta 1:05:54 lr 0.000562 time 3.1615 (3.1615) loss 3.1515 (3.1515) grad_norm 1.5069 (1.5069) [2022-10-08 02:25:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][100/1251] eta 0:06:47 lr 0.000561 time 0.3312 (0.3537) loss 3.7418 (3.5525) grad_norm 1.2462 (1.4881) [2022-10-08 02:25:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][200/1251] eta 0:05:57 lr 0.000561 time 0.3257 (0.3397) loss 3.5328 (3.5544) grad_norm 1.5809 (1.4956) [2022-10-08 02:26:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][300/1251] eta 0:05:18 lr 0.000561 time 0.3248 (0.3347) loss 3.6822 (3.5748) grad_norm 2.0587 (1.4945) [2022-10-08 02:26:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][400/1251] eta 0:04:42 lr 0.000560 time 0.3226 (0.3323) loss 3.5120 (3.5713) grad_norm 1.6371 (1.5083) [2022-10-08 02:27:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][500/1251] eta 0:04:08 lr 0.000560 time 0.3251 (0.3308) loss 3.6293 (3.5689) grad_norm 1.1538 (1.4987) [2022-10-08 02:27:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][600/1251] eta 0:03:34 lr 0.000559 time 0.3232 (0.3298) loss 3.5888 (3.5713) grad_norm 1.3519 (1.4950) [2022-10-08 02:28:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][700/1251] eta 0:03:01 lr 0.000559 time 0.3211 (0.3290) loss 3.5311 (3.5704) grad_norm 1.4045 (1.4918) [2022-10-08 02:29:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][800/1251] eta 0:02:28 lr 0.000559 time 0.3257 (0.3284) loss 3.3955 (3.5706) grad_norm 1.4971 (1.4856) [2022-10-08 02:29:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][900/1251] eta 0:01:55 lr 0.000558 time 0.3198 (0.3281) loss 3.9307 (3.5701) grad_norm 1.4964 (1.4887) [2022-10-08 02:30:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[139/300][1000/1251] eta 0:01:22 lr 0.000558 time 0.3228 (0.3277) loss 3.6316 (3.5723) grad_norm 1.2481 (1.4857) [2022-10-08 02:30:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][1100/1251] eta 0:00:49 lr 0.000557 time 0.3253 (0.3274) loss 3.5794 (3.5754) grad_norm 1.5522 (1.4822) [2022-10-08 02:31:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [139/300][1200/1251] eta 0:00:16 lr 0.000557 time 0.3220 (0.3273) loss 3.6834 (3.5746) grad_norm 2.1164 (1.4840) [2022-10-08 02:31:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 139 training takes 0:06:49 [2022-10-08 02:31:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.896 (2.896) Loss 1.0385 (1.0385) Acc@1 75.586 (75.586) Acc@5 92.969 (92.969) [2022-10-08 02:31:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.256 Acc@5 92.954 [2022-10-08 02:31:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-08 02:31:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.43% [2022-10-08 02:31:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][0/1251] eta 1:04:20 lr 0.000557 time 3.0858 (3.0858) loss 3.5208 (3.5208) grad_norm 1.4088 (1.4088) [2022-10-08 02:32:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][100/1251] eta 0:06:46 lr 0.000556 time 0.3259 (0.3533) loss 3.2749 (3.5524) grad_norm 1.4116 (1.5095) [2022-10-08 02:32:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][200/1251] eta 0:05:57 lr 0.000556 time 0.3278 (0.3398) loss 3.4867 (3.5654) grad_norm 1.4422 (1.5000) [2022-10-08 02:33:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][300/1251] eta 0:05:19 lr 0.000556 time 0.3294 (0.3356) loss 3.5697 (3.5612) grad_norm 1.3556 (1.4929) [2022-10-08 02:33:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][400/1251] eta 0:04:43 lr 0.000555 time 0.3368 (0.3337) loss 3.5998 
(3.5661) grad_norm 1.2889 (1.5018) [2022-10-08 02:34:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][500/1251] eta 0:04:09 lr 0.000555 time 0.3329 (0.3327) loss 3.4147 (3.5672) grad_norm 1.2485 (1.4986) [2022-10-08 02:35:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][600/1251] eta 0:03:36 lr 0.000554 time 0.3290 (0.3321) loss 3.4154 (3.5691) grad_norm 1.3294 (1.5005) [2022-10-08 02:35:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][700/1251] eta 0:03:02 lr 0.000554 time 0.3270 (0.3318) loss 3.6961 (3.5746) grad_norm 1.4371 (1.5023) [2022-10-08 02:36:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][800/1251] eta 0:02:29 lr 0.000553 time 0.3285 (0.3316) loss 3.9136 (3.5735) grad_norm 1.5056 (1.4990) [2022-10-08 02:36:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][900/1251] eta 0:01:56 lr 0.000553 time 0.3263 (0.3314) loss 3.4864 (3.5743) grad_norm 1.3146 (1.4986) [2022-10-08 02:37:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][1000/1251] eta 0:01:23 lr 0.000553 time 0.3277 (0.3313) loss 3.6264 (3.5760) grad_norm 1.5656 (1.4996) [2022-10-08 02:37:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][1100/1251] eta 0:00:50 lr 0.000552 time 0.3268 (0.3313) loss 3.7578 (3.5767) grad_norm 1.3027 (1.4992) [2022-10-08 02:38:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [140/300][1200/1251] eta 0:00:16 lr 0.000552 time 0.3256 (0.3313) loss 3.3578 (3.5757) grad_norm 1.3267 (1.5002) [2022-10-08 02:38:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 140 training takes 0:06:54 [2022-10-08 02:38:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_140 saving...... [2022-10-08 02:38:40 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_140 saved !!! 
[2022-10-08 02:38:43 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.552 (2.552) Loss 1.1060 (1.1060) Acc@1 75.293 (75.293) Acc@5 91.504 (91.504) [2022-10-08 02:38:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.460 Acc@5 92.958 [2022-10-08 02:38:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-08 02:38:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.46% [2022-10-08 02:38:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][0/1251] eta 1:07:29 lr 0.000552 time 3.2369 (3.2369) loss 3.5169 (3.5169) grad_norm 1.3098 (1.3098) [2022-10-08 02:39:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][100/1251] eta 0:06:51 lr 0.000551 time 0.3252 (0.3575) loss 3.3817 (3.5287) grad_norm 1.4813 (1.5447) [2022-10-08 02:40:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][200/1251] eta 0:05:59 lr 0.000551 time 0.3299 (0.3420) loss 3.5967 (3.5357) grad_norm 1.2789 (1.4954) [2022-10-08 02:40:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][300/1251] eta 0:05:20 lr 0.000550 time 0.3212 (0.3365) loss 3.4588 (3.5454) grad_norm 1.6751 (1.4980) [2022-10-08 02:41:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][400/1251] eta 0:04:43 lr 0.000550 time 0.3232 (0.3336) loss 3.5579 (3.5551) grad_norm 1.3657 (1.4934) [2022-10-08 02:41:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][500/1251] eta 0:04:09 lr 0.000550 time 0.3259 (0.3319) loss 3.1613 (3.5597) grad_norm 1.6420 (1.4979) [2022-10-08 02:42:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][600/1251] eta 0:03:35 lr 0.000549 time 0.3255 (0.3310) loss 3.4054 (3.5644) grad_norm 1.4024 (1.4972) [2022-10-08 02:42:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][700/1251] eta 0:03:02 lr 0.000549 time 0.3277 (0.3303) loss 3.9490 (3.5711) grad_norm 1.5093 
(1.4990) [2022-10-08 02:43:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][800/1251] eta 0:02:28 lr 0.000548 time 0.3220 (0.3299) loss 3.7902 (3.5728) grad_norm 1.4291 (1.5030) [2022-10-08 02:43:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][900/1251] eta 0:01:55 lr 0.000548 time 0.3263 (0.3296) loss 3.1939 (3.5708) grad_norm 1.4203 (1.5008) [2022-10-08 02:44:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][1000/1251] eta 0:01:22 lr 0.000547 time 0.3314 (0.3295) loss 3.4821 (3.5747) grad_norm 1.2908 (1.5002) [2022-10-08 02:44:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][1100/1251] eta 0:00:49 lr 0.000547 time 0.3294 (0.3295) loss 3.7017 (3.5767) grad_norm 1.4934 (1.5000) [2022-10-08 02:45:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [141/300][1200/1251] eta 0:00:16 lr 0.000547 time 0.3244 (0.3296) loss 3.6833 (3.5785) grad_norm 1.3141 (1.4995) [2022-10-08 02:45:46 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 141 training takes 0:06:52 [2022-10-08 02:45:49 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.645 (2.645) Loss 1.0249 (1.0249) Acc@1 75.684 (75.684) Acc@5 93.750 (93.750) [2022-10-08 02:45:59 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 74.890 Acc@5 92.824 [2022-10-08 02:45:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-10-08 02:45:59 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.46% [2022-10-08 02:46:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][0/1251] eta 0:52:51 lr 0.000546 time 2.5354 (2.5354) loss 3.2258 (3.2258) grad_norm 1.4834 (1.4834) [2022-10-08 02:46:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][100/1251] eta 0:06:41 lr 0.000546 time 0.3281 (0.3488) loss 3.8849 (3.5502) grad_norm 1.4714 (1.4684) [2022-10-08 02:47:07 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [142/300][200/1251] eta 0:05:54 lr 0.000546 time 0.3289 (0.3375) loss 3.4344 (3.5558) grad_norm 1.4033 (1.4891) [2022-10-08 02:47:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][300/1251] eta 0:05:17 lr 0.000545 time 0.3239 (0.3337) loss 3.6527 (3.5662) grad_norm 1.5381 (1.4948) [2022-10-08 02:48:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][400/1251] eta 0:04:42 lr 0.000545 time 0.3249 (0.3317) loss 3.3991 (3.5594) grad_norm 1.2545 (1.5037) [2022-10-08 02:48:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][500/1251] eta 0:04:08 lr 0.000544 time 0.3213 (0.3304) loss 3.7329 (3.5603) grad_norm 1.5565 (1.5030) [2022-10-08 02:49:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][600/1251] eta 0:03:34 lr 0.000544 time 0.3259 (0.3297) loss 3.8991 (3.5610) grad_norm 1.4809 (1.5048) [2022-10-08 02:49:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][700/1251] eta 0:03:01 lr 0.000544 time 0.3256 (0.3291) loss 3.5586 (3.5617) grad_norm 1.3895 (1.5093) [2022-10-08 02:50:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][800/1251] eta 0:02:28 lr 0.000543 time 0.3238 (0.3287) loss 3.7069 (3.5658) grad_norm 1.4146 (1.5100) [2022-10-08 02:50:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][900/1251] eta 0:01:55 lr 0.000543 time 0.3241 (0.3282) loss 3.5256 (3.5653) grad_norm 1.9967 (1.5136) [2022-10-08 02:51:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][1000/1251] eta 0:01:22 lr 0.000542 time 0.3259 (0.3279) loss 3.7648 (3.5657) grad_norm 1.7707 (1.5155) [2022-10-08 02:52:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][1100/1251] eta 0:00:49 lr 0.000542 time 0.3226 (0.3278) loss 3.7082 (3.5691) grad_norm 1.6783 (1.5173) [2022-10-08 02:52:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [142/300][1200/1251] eta 0:00:16 lr 0.000541 time 0.3311 (0.3277) loss 3.5646 
(3.5706) grad_norm 1.5671 (1.5205) [2022-10-08 02:52:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 142 training takes 0:06:50 [2022-10-08 02:52:52 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.591 (2.591) Loss 1.0593 (1.0593) Acc@1 73.730 (73.730) Acc@5 92.969 (92.969) [2022-10-08 02:53:03 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.482 Acc@5 93.040 [2022-10-08 02:53:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-08 02:53:03 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.48% [2022-10-08 02:53:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][0/1251] eta 1:04:19 lr 0.000541 time 3.0852 (3.0852) loss 3.6115 (3.6115) grad_norm 1.4810 (1.4810) [2022-10-08 02:53:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][100/1251] eta 0:06:46 lr 0.000541 time 0.3241 (0.3530) loss 3.4781 (3.5516) grad_norm 1.6355 (1.5173) [2022-10-08 02:54:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][200/1251] eta 0:05:56 lr 0.000540 time 0.3222 (0.3395) loss 3.3071 (3.5495) grad_norm 1.3979 (1.5161) [2022-10-08 02:54:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][300/1251] eta 0:05:18 lr 0.000540 time 0.3198 (0.3353) loss 3.2816 (3.5501) grad_norm 1.4437 (1.5124) [2022-10-08 02:55:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][400/1251] eta 0:04:43 lr 0.000540 time 0.3346 (0.3337) loss 3.7298 (3.5515) grad_norm 1.5028 (1.5117) [2022-10-08 02:55:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][500/1251] eta 0:04:09 lr 0.000539 time 0.3336 (0.3328) loss 3.4958 (3.5530) grad_norm 1.5034 (1.5186) [2022-10-08 02:56:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][600/1251] eta 0:03:36 lr 0.000539 time 0.3319 (0.3322) loss 3.5628 (3.5558) grad_norm 1.4733 (1.5172) [2022-10-08 02:56:56 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][700/1251] eta 0:03:02 lr 0.000538 time 0.3304 (0.3318) loss 3.6473 (3.5558) grad_norm 1.6358 (1.5130) [2022-10-08 02:57:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][800/1251] eta 0:02:29 lr 0.000538 time 0.3294 (0.3317) loss 3.8908 (3.5553) grad_norm 1.5210 (1.5107) [2022-10-08 02:58:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][900/1251] eta 0:01:56 lr 0.000538 time 0.3338 (0.3317) loss 3.5275 (3.5540) grad_norm 1.5991 (1.5115) [2022-10-08 02:58:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][1000/1251] eta 0:01:23 lr 0.000537 time 0.3357 (0.3318) loss 3.3845 (3.5568) grad_norm 1.3525 (1.5159) [2022-10-08 02:59:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][1100/1251] eta 0:00:50 lr 0.000537 time 0.3426 (0.3318) loss 3.7124 (3.5586) grad_norm 1.3942 (1.5152) [2022-10-08 02:59:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [143/300][1200/1251] eta 0:00:16 lr 0.000536 time 0.3365 (0.3320) loss 3.8496 (3.5605) grad_norm 1.6179 (1.5183) [2022-10-08 02:59:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 143 training takes 0:06:55 [2022-10-08 03:00:02 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.249 (3.249) Loss 1.0820 (1.0820) Acc@1 73.926 (73.926) Acc@5 91.602 (91.602) [2022-10-08 03:00:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.476 Acc@5 93.010 [2022-10-08 03:00:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-08 03:00:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.48% [2022-10-08 03:00:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][0/1251] eta 0:48:36 lr 0.000536 time 2.3317 (2.3317) loss 3.4909 (3.4909) grad_norm 1.2257 (1.2257) [2022-10-08 03:00:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][100/1251] 
eta 0:06:43 lr 0.000536 time 0.3250 (0.3504) loss 3.4722 (3.5531) grad_norm 1.3889 (1.4930) [2022-10-08 03:01:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][200/1251] eta 0:05:55 lr 0.000535 time 0.3243 (0.3387) loss 3.5355 (3.5419) grad_norm 1.3854 (1.5329) [2022-10-08 03:01:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][300/1251] eta 0:05:18 lr 0.000535 time 0.3314 (0.3346) loss 3.4592 (3.5409) grad_norm 1.4890 (1.5208) [2022-10-08 03:02:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][400/1251] eta 0:04:43 lr 0.000534 time 0.3322 (0.3326) loss 3.6063 (3.5389) grad_norm 1.2845 (1.5154) [2022-10-08 03:02:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][500/1251] eta 0:04:08 lr 0.000534 time 0.3296 (0.3315) loss 3.3883 (3.5378) grad_norm 1.3981 (1.5142) [2022-10-08 03:03:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][600/1251] eta 0:03:35 lr 0.000534 time 0.3247 (0.3308) loss 3.4705 (3.5427) grad_norm 1.6980 (1.5128) [2022-10-08 03:04:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][700/1251] eta 0:03:01 lr 0.000533 time 0.3272 (0.3302) loss 3.3840 (3.5468) grad_norm 1.4038 (1.5107) [2022-10-08 03:04:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][800/1251] eta 0:02:28 lr 0.000533 time 0.3309 (0.3300) loss 3.5419 (3.5453) grad_norm 1.2400 (1.5116) [2022-10-08 03:05:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][900/1251] eta 0:01:55 lr 0.000532 time 0.3233 (0.3297) loss 3.7631 (3.5470) grad_norm 1.5420 (1.5105) [2022-10-08 03:05:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][1000/1251] eta 0:01:22 lr 0.000532 time 0.3268 (0.3296) loss 3.6488 (3.5493) grad_norm 1.5599 (1.5126) [2022-10-08 03:06:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][1100/1251] eta 0:00:49 lr 0.000532 time 0.3296 (0.3296) loss 3.5997 (3.5482) grad_norm 1.3275 (1.5120) 
[2022-10-08 03:06:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [144/300][1200/1251] eta 0:00:16 lr 0.000531 time 0.3305 (0.3295) loss 3.2060 (3.5472) grad_norm 2.1326 (1.5128) [2022-10-08 03:07:06 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 144 training takes 0:06:52 [2022-10-08 03:07:09 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.328 (3.328) Loss 1.0472 (1.0472) Acc@1 75.781 (75.781) Acc@5 93.457 (93.457) [2022-10-08 03:07:19 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.430 Acc@5 93.066 [2022-10-08 03:07:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-08 03:07:19 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.48% [2022-10-08 03:07:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][0/1251] eta 0:54:00 lr 0.000531 time 2.5906 (2.5906) loss 3.5929 (3.5929) grad_norm 1.5174 (1.5174) [2022-10-08 03:07:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][100/1251] eta 0:06:41 lr 0.000530 time 0.3240 (0.3488) loss 3.4305 (3.5345) grad_norm 1.8880 (1.5372) [2022-10-08 03:08:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][200/1251] eta 0:05:54 lr 0.000530 time 0.3328 (0.3372) loss 3.4094 (3.5635) grad_norm 1.7022 (1.5619) [2022-10-08 03:09:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][300/1251] eta 0:05:17 lr 0.000530 time 0.3272 (0.3334) loss 3.4455 (3.5544) grad_norm 1.4194 (1.5461) [2022-10-08 03:09:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][400/1251] eta 0:04:42 lr 0.000529 time 0.3308 (0.3314) loss 3.5088 (3.5471) grad_norm 1.5489 (1.5453) [2022-10-08 03:10:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][500/1251] eta 0:04:08 lr 0.000529 time 0.3265 (0.3303) loss 3.6183 (3.5393) grad_norm 1.4122 (1.5498) [2022-10-08 03:10:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[145/300][600/1251] eta 0:03:34 lr 0.000528 time 0.3255 (0.3294) loss 4.0028 (3.5414) grad_norm 1.4281 (1.5545) [2022-10-08 03:11:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][700/1251] eta 0:03:01 lr 0.000528 time 0.3277 (0.3292) loss 3.6126 (3.5403) grad_norm 1.3537 (1.5542) [2022-10-08 03:11:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][800/1251] eta 0:02:28 lr 0.000528 time 0.3228 (0.3287) loss 3.5000 (3.5418) grad_norm 1.8075 (1.5486) [2022-10-08 03:12:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][900/1251] eta 0:01:55 lr 0.000527 time 0.3286 (0.3283) loss 3.7528 (3.5422) grad_norm 1.4659 (1.5455) [2022-10-08 03:12:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][1000/1251] eta 0:01:22 lr 0.000527 time 0.3351 (0.3279) loss 3.5199 (3.5442) grad_norm 1.3351 (1.5431) [2022-10-08 03:13:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][1100/1251] eta 0:00:49 lr 0.000526 time 0.3241 (0.3277) loss 3.8839 (3.5461) grad_norm 1.4214 (1.5457) [2022-10-08 03:13:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [145/300][1200/1251] eta 0:00:16 lr 0.000526 time 0.3290 (0.3275) loss 3.3515 (3.5474) grad_norm 1.7970 (1.5461) [2022-10-08 03:14:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 145 training takes 0:06:49 [2022-10-08 03:14:12 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.857 (2.857) Loss 0.9826 (0.9826) Acc@1 75.684 (75.684) Acc@5 93.945 (93.945) [2022-10-08 03:14:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.744 Acc@5 93.204 [2022-10-08 03:14:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-10-08 03:14:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.74% [2022-10-08 03:14:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][0/1251] eta 0:59:33 lr 0.000526 time 2.8563 (2.8563) loss 4.0654 
(4.0654) grad_norm 1.6402 (1.6402) [2022-10-08 03:14:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][100/1251] eta 0:06:45 lr 0.000525 time 0.3277 (0.3526) loss 3.7022 (3.5365) grad_norm 1.4458 (1.5526) [2022-10-08 03:15:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][200/1251] eta 0:05:56 lr 0.000525 time 0.3307 (0.3397) loss 3.6167 (3.5318) grad_norm 1.6207 (1.5389) [2022-10-08 03:16:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][300/1251] eta 0:05:19 lr 0.000524 time 0.3329 (0.3356) loss 3.7167 (3.5344) grad_norm 1.7130 (1.5283) [2022-10-08 03:16:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][400/1251] eta 0:04:44 lr 0.000524 time 0.3289 (0.3340) loss 3.3447 (3.5383) grad_norm 1.4697 (1.5363) [2022-10-08 03:17:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][500/1251] eta 0:04:10 lr 0.000524 time 0.3284 (0.3330) loss 3.3336 (3.5390) grad_norm 1.4003 (1.5363) [2022-10-08 03:17:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][600/1251] eta 0:03:36 lr 0.000523 time 0.3350 (0.3325) loss 3.7720 (3.5448) grad_norm 1.4167 (1.5359) [2022-10-08 03:18:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][700/1251] eta 0:03:03 lr 0.000523 time 0.3328 (0.3322) loss 3.5968 (3.5435) grad_norm 1.5799 (1.5324) [2022-10-08 03:18:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][800/1251] eta 0:02:29 lr 0.000522 time 0.3351 (0.3320) loss 3.6929 (3.5428) grad_norm 1.7925 (1.5275) [2022-10-08 03:19:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][900/1251] eta 0:01:56 lr 0.000522 time 0.3357 (0.3320) loss 3.4501 (3.5426) grad_norm 1.3331 (1.5303) [2022-10-08 03:19:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][1000/1251] eta 0:01:23 lr 0.000522 time 0.3261 (0.3320) loss 3.7798 (3.5477) grad_norm 1.3015 (1.5272) [2022-10-08 03:20:29 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [146/300][1100/1251] eta 0:00:50 lr 0.000521 time 0.3281 (0.3320) loss 3.7047 (3.5468) grad_norm 1.3820 (1.5254) [2022-10-08 03:21:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [146/300][1200/1251] eta 0:00:16 lr 0.000521 time 0.3381 (0.3321) loss 3.5959 (3.5486) grad_norm 1.3182 (1.5242) [2022-10-08 03:21:19 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 146 training takes 0:06:55 [2022-10-08 03:21:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.831 (2.831) Loss 1.0697 (1.0697) Acc@1 76.367 (76.367) Acc@5 93.066 (93.066) [2022-10-08 03:21:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.748 Acc@5 93.188 [2022-10-08 03:21:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-10-08 03:21:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.75% [2022-10-08 03:21:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][0/1251] eta 1:04:01 lr 0.000521 time 3.0709 (3.0709) loss 3.6753 (3.6753) grad_norm 1.2806 (1.2806) [2022-10-08 03:22:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][100/1251] eta 0:06:47 lr 0.000520 time 0.3230 (0.3543) loss 3.5834 (3.5359) grad_norm 1.4262 (1.5436) [2022-10-08 03:22:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][200/1251] eta 0:05:58 lr 0.000520 time 0.3242 (0.3407) loss 3.5830 (3.5352) grad_norm 1.4402 (1.5303) [2022-10-08 03:23:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][300/1251] eta 0:05:19 lr 0.000519 time 0.3219 (0.3359) loss 3.4670 (3.5351) grad_norm 1.2779 (1.5170) [2022-10-08 03:23:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][400/1251] eta 0:04:43 lr 0.000519 time 0.3258 (0.3335) loss 3.6501 (3.5370) grad_norm 1.5856 (1.5289) [2022-10-08 03:24:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][500/1251] eta 0:04:09 lr 0.000518 time 0.3269 
(0.3321) loss 3.6411 (3.5376) grad_norm 1.3560 (1.5305) [2022-10-08 03:24:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][600/1251] eta 0:03:35 lr 0.000518 time 0.3248 (0.3311) loss 3.4991 (3.5399) grad_norm 1.4020 (1.5269) [2022-10-08 03:25:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][700/1251] eta 0:03:02 lr 0.000518 time 0.3226 (0.3304) loss 3.7779 (3.5437) grad_norm 1.3931 (1.5310) [2022-10-08 03:25:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][800/1251] eta 0:02:28 lr 0.000517 time 0.3261 (0.3299) loss 3.6463 (3.5444) grad_norm 1.3924 (1.5314) [2022-10-08 03:26:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][900/1251] eta 0:01:55 lr 0.000517 time 0.3276 (0.3297) loss 3.6584 (3.5458) grad_norm 1.4893 (1.5359) [2022-10-08 03:27:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][1000/1251] eta 0:01:22 lr 0.000516 time 0.3267 (0.3297) loss 3.3684 (3.5435) grad_norm 1.4884 (1.5373) [2022-10-08 03:27:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][1100/1251] eta 0:00:49 lr 0.000516 time 0.3337 (0.3296) loss 3.3793 (3.5435) grad_norm 1.3322 (1.5351) [2022-10-08 03:28:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [147/300][1200/1251] eta 0:00:16 lr 0.000516 time 0.3259 (0.3296) loss 3.6809 (3.5456) grad_norm 1.9078 (1.5320) [2022-10-08 03:28:26 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 147 training takes 0:06:52 [2022-10-08 03:28:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.874 (2.874) Loss 1.0563 (1.0563) Acc@1 74.219 (74.219) Acc@5 93.652 (93.652) [2022-10-08 03:28:39 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.654 Acc@5 93.116 [2022-10-08 03:28:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-10-08 03:28:39 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.75% [2022-10-08 03:28:42 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][0/1251] eta 0:59:21 lr 0.000515 time 2.8466 (2.8466) loss 3.5046 (3.5046) grad_norm 1.6091 (1.6091) [2022-10-08 03:29:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][100/1251] eta 0:06:45 lr 0.000515 time 0.3202 (0.3522) loss 3.3057 (3.5237) grad_norm 1.3381 (1.5886) [2022-10-08 03:29:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][200/1251] eta 0:05:56 lr 0.000515 time 0.3257 (0.3390) loss 3.5437 (3.5348) grad_norm 1.5867 (1.5549) [2022-10-08 03:30:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][300/1251] eta 0:05:18 lr 0.000514 time 0.3211 (0.3347) loss 3.2359 (3.5401) grad_norm 1.4083 (1.5582) [2022-10-08 03:30:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][400/1251] eta 0:04:43 lr 0.000514 time 0.3271 (0.3326) loss 3.4932 (3.5422) grad_norm 1.5576 (1.5584) [2022-10-08 03:31:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][500/1251] eta 0:04:08 lr 0.000513 time 0.3324 (0.3313) loss 3.7636 (3.5480) grad_norm 1.3797 (1.5506) [2022-10-08 03:31:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][600/1251] eta 0:03:35 lr 0.000513 time 0.3273 (0.3309) loss 3.3826 (3.5498) grad_norm 1.4976 (1.5479) [2022-10-08 03:32:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][700/1251] eta 0:03:01 lr 0.000512 time 0.3275 (0.3301) loss 3.7841 (3.5494) grad_norm 1.6654 (1.5476) [2022-10-08 03:33:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][800/1251] eta 0:02:28 lr 0.000512 time 0.3293 (0.3295) loss 3.4508 (3.5457) grad_norm 1.5336 (1.5441) [2022-10-08 03:33:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][900/1251] eta 0:01:55 lr 0.000512 time 0.3214 (0.3291) loss 3.3082 (3.5477) grad_norm 1.4995 (1.5492) [2022-10-08 03:34:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][1000/1251] eta 0:01:22 lr 0.000511 
time 0.3218 (0.3288) loss 3.6087 (3.5524) grad_norm 1.4204 (1.5491) [2022-10-08 03:34:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][1100/1251] eta 0:00:49 lr 0.000511 time 0.3299 (0.3284) loss 3.8443 (3.5544) grad_norm 1.4028 (1.5509) [2022-10-08 03:35:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [148/300][1200/1251] eta 0:00:16 lr 0.000510 time 0.3234 (0.3282) loss 3.4373 (3.5522) grad_norm 1.3478 (1.5527) [2022-10-08 03:35:30 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 148 training takes 0:06:50 [2022-10-08 03:35:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.460 (2.460) Loss 1.0284 (1.0284) Acc@1 74.219 (74.219) Acc@5 93.555 (93.555) [2022-10-08 03:35:44 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.896 Acc@5 93.208 [2022-10-08 03:35:44 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-08 03:35:44 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 75.90% [2022-10-08 03:35:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][0/1251] eta 0:55:24 lr 0.000510 time 2.6573 (2.6573) loss 3.9248 (3.9248) grad_norm 1.4325 (1.4325) [2022-10-08 03:36:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][100/1251] eta 0:06:44 lr 0.000510 time 0.3229 (0.3510) loss 3.7441 (3.5445) grad_norm 1.4348 (1.5402) [2022-10-08 03:36:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][200/1251] eta 0:05:55 lr 0.000509 time 0.3272 (0.3386) loss 3.3305 (3.5408) grad_norm 1.6757 (1.5453) [2022-10-08 03:37:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][300/1251] eta 0:05:17 lr 0.000509 time 0.3331 (0.3343) loss 3.3399 (3.5542) grad_norm 1.3513 (1.5526) [2022-10-08 03:37:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][400/1251] eta 0:04:42 lr 0.000509 time 0.3275 (0.3321) loss 3.5676 (3.5455) grad_norm 1.4834 (1.5440) [2022-10-08 
03:38:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][500/1251] eta 0:04:08 lr 0.000508 time 0.3272 (0.3310) loss 3.7598 (3.5453) grad_norm 1.5978 (1.5410) [2022-10-08 03:39:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][600/1251] eta 0:03:35 lr 0.000508 time 0.3276 (0.3303) loss 3.6247 (3.5450) grad_norm 1.7812 (1.5449) [2022-10-08 03:39:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][700/1251] eta 0:03:02 lr 0.000507 time 0.3210 (0.3307) loss 3.6320 (3.5451) grad_norm 1.3490 (1.5499) [2022-10-08 03:40:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][800/1251] eta 0:02:29 lr 0.000507 time 0.3221 (0.3305) loss 3.6398 (3.5419) grad_norm 1.5181 (1.5502) [2022-10-08 03:40:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][900/1251] eta 0:01:55 lr 0.000506 time 0.3254 (0.3304) loss 3.7114 (3.5426) grad_norm 1.3690 (1.5514) [2022-10-08 03:41:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][1000/1251] eta 0:01:22 lr 0.000506 time 0.3296 (0.3304) loss 3.3201 (3.5430) grad_norm 1.5102 (1.5567) [2022-10-08 03:41:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][1100/1251] eta 0:00:49 lr 0.000506 time 0.3230 (0.3304) loss 3.5855 (3.5420) grad_norm 1.8006 (1.5593) [2022-10-08 03:42:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [149/300][1200/1251] eta 0:00:16 lr 0.000505 time 0.3351 (0.3304) loss 3.7521 (3.5419) grad_norm 1.7515 (1.5607) [2022-10-08 03:42:37 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 149 training takes 0:06:53 [2022-10-08 03:42:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.158 (3.158) Loss 1.0514 (1.0514) Acc@1 75.195 (75.195) Acc@5 92.969 (92.969) [2022-10-08 03:42:51 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.064 Acc@5 93.280 [2022-10-08 03:42:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 76.1% [2022-10-08 03:42:51 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.06% [2022-10-08 03:42:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][0/1251] eta 0:54:41 lr 0.000505 time 2.6227 (2.6227) loss 3.7748 (3.7748) grad_norm 1.6533 (1.6533) [2022-10-08 03:43:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][100/1251] eta 0:06:42 lr 0.000505 time 0.3281 (0.3496) loss 3.3905 (3.5470) grad_norm 1.5011 (1.5753) [2022-10-08 03:43:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][200/1251] eta 0:05:54 lr 0.000504 time 0.3251 (0.3370) loss 3.3602 (3.5524) grad_norm 1.7041 (1.5549) [2022-10-08 03:44:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][300/1251] eta 0:05:16 lr 0.000504 time 0.3284 (0.3328) loss 3.8338 (3.5557) grad_norm 1.3463 (1.5524) [2022-10-08 03:45:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][400/1251] eta 0:04:41 lr 0.000503 time 0.3229 (0.3307) loss 3.3933 (3.5495) grad_norm 1.5381 (1.5511) [2022-10-08 03:45:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][500/1251] eta 0:04:07 lr 0.000503 time 0.3253 (0.3296) loss 3.5888 (3.5406) grad_norm 1.7628 (1.5563) [2022-10-08 03:46:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][600/1251] eta 0:03:34 lr 0.000503 time 0.3231 (0.3288) loss 3.3926 (3.5430) grad_norm 1.8396 (1.5584) [2022-10-08 03:46:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][700/1251] eta 0:03:00 lr 0.000502 time 0.3217 (0.3281) loss 3.7220 (3.5453) grad_norm 1.7325 (1.5586) [2022-10-08 03:47:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][800/1251] eta 0:02:27 lr 0.000502 time 0.3234 (0.3278) loss 3.6332 (3.5444) grad_norm 1.5893 (1.5564) [2022-10-08 03:47:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][900/1251] eta 0:01:54 lr 0.000501 time 0.3234 (0.3276) loss 3.4567 (3.5420) grad_norm 1.4029 
(1.5585) [2022-10-08 03:48:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][1000/1251] eta 0:01:22 lr 0.000501 time 0.3222 (0.3275) loss 3.8387 (3.5398) grad_norm 2.0466 (1.5588) [2022-10-08 03:48:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][1100/1251] eta 0:00:49 lr 0.000500 time 0.3260 (0.3275) loss 3.7611 (3.5412) grad_norm 1.2867 (1.5568) [2022-10-08 03:49:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [150/300][1200/1251] eta 0:00:16 lr 0.000500 time 0.3247 (0.3276) loss 3.5492 (3.5409) grad_norm 1.7681 (1.5570) [2022-10-08 03:49:41 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 150 training takes 0:06:50 [2022-10-08 03:49:41 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_150 saving...... [2022-10-08 03:49:42 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_150 saved !!! [2022-10-08 03:49:44 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.371 (2.371) Loss 0.9804 (0.9804) Acc@1 75.977 (75.977) Acc@5 94.336 (94.336) [2022-10-08 03:49:55 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.050 Acc@5 93.286 [2022-10-08 03:49:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-08 03:49:55 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.06% [2022-10-08 03:49:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][0/1251] eta 1:03:19 lr 0.000500 time 3.0371 (3.0371) loss 3.5686 (3.5686) grad_norm 1.3726 (1.3726) [2022-10-08 03:50:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][100/1251] eta 0:06:48 lr 0.000499 time 0.3274 (0.3548) loss 3.3565 (3.5383) grad_norm 1.5866 (1.5483) [2022-10-08 03:51:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][200/1251] eta 0:05:58 lr 0.000499 time 0.3306 (0.3409) loss 3.5529 
(3.5408) grad_norm 1.8636 (1.5439) [2022-10-08 03:51:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][300/1251] eta 0:05:19 lr 0.000499 time 0.3210 (0.3357) loss 3.6611 (3.5410) grad_norm 1.7417 (1.5562) [2022-10-08 03:52:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][400/1251] eta 0:04:43 lr 0.000498 time 0.3245 (0.3330) loss 3.4007 (3.5395) grad_norm 1.8474 (1.5689) [2022-10-08 03:52:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][500/1251] eta 0:04:09 lr 0.000498 time 0.3274 (0.3319) loss 3.2618 (3.5348) grad_norm 1.4567 (1.5718) [2022-10-08 03:53:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][600/1251] eta 0:03:35 lr 0.000497 time 0.3258 (0.3306) loss 3.5314 (3.5371) grad_norm 1.6896 (1.5737) [2022-10-08 03:53:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][700/1251] eta 0:03:01 lr 0.000497 time 0.3276 (0.3298) loss 3.1677 (3.5336) grad_norm 1.7167 (1.5727) [2022-10-08 03:54:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][800/1251] eta 0:02:28 lr 0.000497 time 0.3241 (0.3291) loss 3.7246 (3.5347) grad_norm 1.3891 (1.5671) [2022-10-08 03:54:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][900/1251] eta 0:01:55 lr 0.000496 time 0.3221 (0.3285) loss 3.6954 (3.5353) grad_norm 1.5947 (1.5677) [2022-10-08 03:55:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][1000/1251] eta 0:01:22 lr 0.000496 time 0.3233 (0.3280) loss 3.5466 (3.5352) grad_norm 1.4627 (1.5700) [2022-10-08 03:55:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][1100/1251] eta 0:00:49 lr 0.000495 time 0.3237 (0.3276) loss 3.5077 (3.5372) grad_norm 1.5807 (1.5674) [2022-10-08 03:56:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [151/300][1200/1251] eta 0:00:16 lr 0.000495 time 0.3259 (0.3274) loss 3.3022 (3.5363) grad_norm 1.3958 (1.5695) [2022-10-08 03:56:44 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 151 training takes 0:06:49 [2022-10-08 03:56:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.630 (2.630) Loss 0.9629 (0.9629) Acc@1 78.516 (78.516) Acc@5 94.043 (94.043) [2022-10-08 03:56:58 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 75.968 Acc@5 93.266 [2022-10-08 03:56:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-08 03:56:58 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.06% [2022-10-08 03:57:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][0/1251] eta 0:48:35 lr 0.000495 time 2.3306 (2.3306) loss 3.7132 (3.7132) grad_norm 1.4564 (1.4564) [2022-10-08 03:57:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][100/1251] eta 0:06:45 lr 0.000494 time 0.3211 (0.3522) loss 3.7693 (3.4998) grad_norm 1.4228 (1.6408) [2022-10-08 03:58:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][200/1251] eta 0:05:56 lr 0.000494 time 0.3249 (0.3393) loss 3.3578 (3.5321) grad_norm 1.6102 (1.6241) [2022-10-08 03:58:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][300/1251] eta 0:05:18 lr 0.000493 time 0.3303 (0.3351) loss 3.6898 (3.5313) grad_norm 1.8549 (1.6374) [2022-10-08 03:59:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][400/1251] eta 0:04:43 lr 0.000493 time 0.3265 (0.3331) loss 3.5858 (3.5260) grad_norm 2.0880 (1.6216) [2022-10-08 03:59:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][500/1251] eta 0:04:09 lr 0.000493 time 0.3268 (0.3322) loss 3.6588 (3.5195) grad_norm 1.6927 (1.6198) [2022-10-08 04:00:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][600/1251] eta 0:03:35 lr 0.000492 time 0.3294 (0.3314) loss 3.6799 (3.5206) grad_norm 1.6359 (1.6230) [2022-10-08 04:00:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][700/1251] eta 0:03:02 lr 0.000492 time 0.3277 
(0.3311) loss 3.6321 (3.5250) grad_norm 1.4496 (1.6187) [2022-10-08 04:01:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][800/1251] eta 0:02:29 lr 0.000491 time 0.3404 (0.3308) loss 3.6380 (3.5255) grad_norm 1.4473 (1.6121) [2022-10-08 04:01:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][900/1251] eta 0:01:56 lr 0.000491 time 0.3248 (0.3307) loss 3.6168 (3.5310) grad_norm 1.3949 (1.6097) [2022-10-08 04:02:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][1000/1251] eta 0:01:22 lr 0.000490 time 0.3297 (0.3306) loss 3.8865 (3.5341) grad_norm 1.6201 (1.6062) [2022-10-08 04:03:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][1100/1251] eta 0:00:49 lr 0.000490 time 0.3300 (0.3307) loss 3.4371 (3.5351) grad_norm 1.3827 (1.6001) [2022-10-08 04:03:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [152/300][1200/1251] eta 0:00:16 lr 0.000490 time 0.3280 (0.3307) loss 3.6860 (3.5345) grad_norm 1.7158 (1.5992) [2022-10-08 04:03:53 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 152 training takes 0:06:54 [2022-10-08 04:03:55 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.436 (2.436) Loss 1.0337 (1.0337) Acc@1 75.293 (75.293) Acc@5 93.164 (93.164) [2022-10-08 04:04:06 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.178 Acc@5 93.290 [2022-10-08 04:04:06 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-08 04:04:06 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.18% [2022-10-08 04:04:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][0/1251] eta 1:11:03 lr 0.000489 time 3.4083 (3.4083) loss 3.6133 (3.6133) grad_norm 1.6979 (1.6979) [2022-10-08 04:04:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][100/1251] eta 0:06:49 lr 0.000489 time 0.3253 (0.3556) loss 3.6102 (3.5238) grad_norm 1.5662 (1.5633) [2022-10-08 04:05:15 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][200/1251] eta 0:05:58 lr 0.000489 time 0.3269 (0.3409) loss 3.3897 (3.4993) grad_norm 1.5715 (1.5618) [2022-10-08 04:05:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][300/1251] eta 0:05:19 lr 0.000488 time 0.3230 (0.3360) loss 3.6705 (3.5088) grad_norm 1.5635 (1.5812) [2022-10-08 04:06:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][400/1251] eta 0:04:43 lr 0.000488 time 0.3229 (0.3333) loss 3.7030 (3.5165) grad_norm 1.6145 (1.5933) [2022-10-08 04:06:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][500/1251] eta 0:04:09 lr 0.000487 time 0.3234 (0.3317) loss 3.3871 (3.5229) grad_norm 1.4503 (1.5942) [2022-10-08 04:07:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][600/1251] eta 0:03:35 lr 0.000487 time 0.3250 (0.3306) loss 3.3650 (3.5190) grad_norm 1.8318 (1.5953) [2022-10-08 04:07:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][700/1251] eta 0:03:01 lr 0.000487 time 0.3223 (0.3298) loss 3.7846 (3.5221) grad_norm 1.7902 (1.5938) [2022-10-08 04:08:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][800/1251] eta 0:02:28 lr 0.000486 time 0.3227 (0.3292) loss 3.4154 (3.5239) grad_norm 1.9863 (1.5927) [2022-10-08 04:09:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][900/1251] eta 0:01:55 lr 0.000486 time 0.3312 (0.3288) loss 3.4214 (3.5233) grad_norm 1.4761 (1.5939) [2022-10-08 04:09:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][1000/1251] eta 0:01:22 lr 0.000485 time 0.3252 (0.3286) loss 3.3084 (3.5243) grad_norm 1.3402 (1.5940) [2022-10-08 04:10:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][1100/1251] eta 0:00:49 lr 0.000485 time 0.3305 (0.3285) loss 3.7572 (3.5218) grad_norm 1.4784 (1.5923) [2022-10-08 04:10:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [153/300][1200/1251] eta 0:00:16 lr 
0.000484 time 0.3219 (0.3285) loss 3.5026 (3.5214) grad_norm 1.6978 (1.5942) [2022-10-08 04:10:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 153 training takes 0:06:51 [2022-10-08 04:11:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.881 (2.881) Loss 1.0020 (1.0020) Acc@1 76.172 (76.172) Acc@5 92.480 (92.480) [2022-10-08 04:11:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.194 Acc@5 93.488 [2022-10-08 04:11:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-08 04:11:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.19% [2022-10-08 04:11:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][0/1251] eta 1:05:53 lr 0.000484 time 3.1600 (3.1600) loss 3.6447 (3.6447) grad_norm 1.5411 (1.5411) [2022-10-08 04:11:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][100/1251] eta 0:06:46 lr 0.000484 time 0.3259 (0.3529) loss 3.6452 (3.4985) grad_norm 1.7328 (1.5683) [2022-10-08 04:12:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][200/1251] eta 0:05:56 lr 0.000483 time 0.3242 (0.3391) loss 3.5941 (3.4993) grad_norm 1.6109 (1.5911) [2022-10-08 04:12:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][300/1251] eta 0:05:17 lr 0.000483 time 0.3257 (0.3343) loss 3.5634 (3.5013) grad_norm 1.5433 (1.5958) [2022-10-08 04:13:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][400/1251] eta 0:04:42 lr 0.000483 time 0.3264 (0.3323) loss 3.5430 (3.5040) grad_norm 1.5898 (1.6040) [2022-10-08 04:13:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][500/1251] eta 0:04:08 lr 0.000482 time 0.3240 (0.3307) loss 3.4552 (3.5071) grad_norm 1.7266 (1.5972) [2022-10-08 04:14:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][600/1251] eta 0:03:34 lr 0.000482 time 0.3247 (0.3296) loss 3.2875 (3.5064) grad_norm 1.5309 (1.5992) 
[2022-10-08 04:15:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][700/1251] eta 0:03:01 lr 0.000481 time 0.3232 (0.3289) loss 3.5179 (3.5120) grad_norm 1.6259 (1.6003) [2022-10-08 04:15:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][800/1251] eta 0:02:28 lr 0.000481 time 0.3291 (0.3283) loss 3.7800 (3.5151) grad_norm 1.8970 (1.6081) [2022-10-08 04:16:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][900/1251] eta 0:01:55 lr 0.000481 time 0.3237 (0.3279) loss 3.2520 (3.5120) grad_norm 1.8307 (1.6088) [2022-10-08 04:16:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][1000/1251] eta 0:01:22 lr 0.000480 time 0.3233 (0.3276) loss 3.6389 (3.5150) grad_norm 1.7364 (1.6054) [2022-10-08 04:17:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][1100/1251] eta 0:00:49 lr 0.000480 time 0.3246 (0.3273) loss 3.7101 (3.5161) grad_norm 1.4324 (1.6033) [2022-10-08 04:17:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [154/300][1200/1251] eta 0:00:16 lr 0.000479 time 0.3205 (0.3271) loss 3.7668 (3.5196) grad_norm 1.8404 (1.6039) [2022-10-08 04:18:00 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 154 training takes 0:06:49 [2022-10-08 04:18:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.595 (2.595) Loss 0.9606 (0.9606) Acc@1 77.441 (77.441) Acc@5 93.652 (93.652) [2022-10-08 04:18:14 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.084 Acc@5 93.334 [2022-10-08 04:18:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-08 04:18:14 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.19% [2022-10-08 04:18:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][0/1251] eta 0:47:24 lr 0.000479 time 2.2735 (2.2735) loss 3.5664 (3.5664) grad_norm 1.7667 (1.7667) [2022-10-08 04:18:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[155/300][100/1251] eta 0:06:42 lr 0.000479 time 0.3291 (0.3494) loss 3.4766 (3.5110) grad_norm 1.6988 (1.5868) [2022-10-08 04:19:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][200/1251] eta 0:05:54 lr 0.000478 time 0.3249 (0.3376) loss 3.4957 (3.4948) grad_norm 1.6743 (1.5845) [2022-10-08 04:19:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][300/1251] eta 0:05:17 lr 0.000478 time 0.3346 (0.3340) loss 3.2638 (3.5077) grad_norm 1.3117 (1.5791) [2022-10-08 04:20:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][400/1251] eta 0:04:42 lr 0.000477 time 0.3243 (0.3325) loss 3.5495 (3.5108) grad_norm 1.4502 (1.5711) [2022-10-08 04:21:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][500/1251] eta 0:04:09 lr 0.000477 time 0.3294 (0.3318) loss 3.3838 (3.5114) grad_norm 1.8913 (1.5751) [2022-10-08 04:21:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][600/1251] eta 0:03:35 lr 0.000477 time 0.3211 (0.3314) loss 3.7809 (3.5143) grad_norm 1.6420 (1.5778) [2022-10-08 04:22:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][700/1251] eta 0:03:02 lr 0.000476 time 0.3230 (0.3312) loss 3.2446 (3.5134) grad_norm 1.5374 (1.5819) [2022-10-08 04:22:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][800/1251] eta 0:02:29 lr 0.000476 time 0.3253 (0.3311) loss 3.5315 (3.5133) grad_norm 1.4617 (1.5808) [2022-10-08 04:23:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][900/1251] eta 0:01:56 lr 0.000475 time 0.3314 (0.3311) loss 3.2496 (3.5125) grad_norm 1.5811 (1.5841) [2022-10-08 04:23:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][1000/1251] eta 0:01:23 lr 0.000475 time 0.3257 (0.3311) loss 3.5272 (3.5123) grad_norm 1.5697 (1.5852) [2022-10-08 04:24:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][1100/1251] eta 0:00:49 lr 0.000475 time 0.3358 (0.3311) loss 3.5408 (3.5120) grad_norm 
1.5352 (1.5903) [2022-10-08 04:24:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [155/300][1200/1251] eta 0:00:16 lr 0.000474 time 0.3372 (0.3312) loss 3.4401 (3.5127) grad_norm 1.9919 (1.5908) [2022-10-08 04:25:08 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 155 training takes 0:06:54 [2022-10-08 04:25:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.835 (2.835) Loss 0.9612 (0.9612) Acc@1 77.344 (77.344) Acc@5 93.945 (93.945) [2022-10-08 04:25:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.092 Acc@5 93.230 [2022-10-08 04:25:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-08 04:25:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.19% [2022-10-08 04:25:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][0/1251] eta 0:53:28 lr 0.000474 time 2.5646 (2.5646) loss 3.7912 (3.7912) grad_norm 2.4059 (2.4059) [2022-10-08 04:25:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][100/1251] eta 0:06:46 lr 0.000474 time 0.3306 (0.3534) loss 3.7888 (3.5144) grad_norm 1.5491 (1.5991) [2022-10-08 04:26:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][200/1251] eta 0:05:57 lr 0.000473 time 0.3253 (0.3397) loss 3.3936 (3.5155) grad_norm 1.7072 (1.6127) [2022-10-08 04:27:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][300/1251] eta 0:05:18 lr 0.000473 time 0.3322 (0.3350) loss 3.3815 (3.5025) grad_norm 1.4719 (1.6087) [2022-10-08 04:27:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][400/1251] eta 0:04:43 lr 0.000472 time 0.3221 (0.3326) loss 3.5773 (3.5066) grad_norm 1.4955 (1.6089) [2022-10-08 04:28:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][500/1251] eta 0:04:08 lr 0.000472 time 0.3224 (0.3313) loss 3.4531 (3.5073) grad_norm 1.6036 (1.6063) [2022-10-08 04:28:40 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [156/300][600/1251] eta 0:03:34 lr 0.000471 time 0.3233 (0.3302) loss 3.6434 (3.5051) grad_norm 1.5628 (1.6051) [2022-10-08 04:29:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][700/1251] eta 0:03:01 lr 0.000471 time 0.3246 (0.3295) loss 3.3429 (3.5065) grad_norm 1.6155 (1.5968) [2022-10-08 04:29:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][800/1251] eta 0:02:28 lr 0.000471 time 0.3185 (0.3290) loss 3.5102 (3.5053) grad_norm 1.5378 (1.5979) [2022-10-08 04:30:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][900/1251] eta 0:01:55 lr 0.000470 time 0.3212 (0.3286) loss 3.6544 (3.5081) grad_norm 1.9880 (1.5973) [2022-10-08 04:30:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][1000/1251] eta 0:01:22 lr 0.000470 time 0.3239 (0.3283) loss 3.7488 (3.5058) grad_norm 1.3860 (1.5957) [2022-10-08 04:31:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][1100/1251] eta 0:00:49 lr 0.000469 time 0.3345 (0.3282) loss 3.6822 (3.5088) grad_norm 2.0026 (1.5971) [2022-10-08 04:31:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [156/300][1200/1251] eta 0:00:16 lr 0.000469 time 0.3256 (0.3280) loss 3.2298 (3.5130) grad_norm 1.4211 (1.5972) [2022-10-08 04:32:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 156 training takes 0:06:50 [2022-10-08 04:32:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.769 (2.769) Loss 0.9905 (0.9905) Acc@1 77.344 (77.344) Acc@5 93.555 (93.555) [2022-10-08 04:32:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.360 Acc@5 93.444 [2022-10-08 04:32:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-08 04:32:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.36% [2022-10-08 04:32:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][0/1251] eta 1:07:28 lr 0.000469 time 3.2361 
(3.2361) loss 3.4625 (3.4625) grad_norm 1.5122 (1.5122) [2022-10-08 04:33:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][100/1251] eta 0:06:49 lr 0.000468 time 0.3276 (0.3554) loss 3.5024 (3.5013) grad_norm 1.4888 (1.6297) [2022-10-08 04:33:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][200/1251] eta 0:06:00 lr 0.000468 time 0.5133 (0.3429) loss 3.4627 (3.4978) grad_norm 1.5976 (1.6434) [2022-10-08 04:34:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][300/1251] eta 0:05:21 lr 0.000468 time 0.3229 (0.3377) loss 3.1485 (3.4986) grad_norm 1.8164 (1.6166) [2022-10-08 04:34:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][400/1251] eta 0:04:45 lr 0.000467 time 0.3304 (0.3350) loss 3.7375 (3.4943) grad_norm 1.6427 (1.6122) [2022-10-08 04:35:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][500/1251] eta 0:04:10 lr 0.000467 time 0.3259 (0.3332) loss 3.6142 (3.5025) grad_norm 1.7713 (1.6126) [2022-10-08 04:35:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][600/1251] eta 0:03:36 lr 0.000466 time 0.3264 (0.3321) loss 3.4209 (3.5039) grad_norm 1.7612 (1.6141) [2022-10-08 04:36:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][700/1251] eta 0:03:02 lr 0.000466 time 0.3299 (0.3313) loss 3.5647 (3.5061) grad_norm 1.6269 (1.6197) [2022-10-08 04:36:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][800/1251] eta 0:02:29 lr 0.000465 time 0.3258 (0.3306) loss 3.2635 (3.5037) grad_norm 1.6026 (1.6189) [2022-10-08 04:37:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][900/1251] eta 0:01:55 lr 0.000465 time 0.3254 (0.3302) loss 3.6433 (3.5065) grad_norm 1.5982 (1.6257) [2022-10-08 04:37:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][1000/1251] eta 0:01:22 lr 0.000465 time 0.3269 (0.3299) loss 3.5646 (3.5080) grad_norm 1.5379 (1.6289) [2022-10-08 04:38:29 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][1100/1251] eta 0:00:49 lr 0.000464 time 0.3232 (0.3296) loss 3.6932 (3.5096) grad_norm 1.5187 (1.6272) [2022-10-08 04:39:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [157/300][1200/1251] eta 0:00:16 lr 0.000464 time 0.3297 (0.3293) loss 3.4619 (3.5088) grad_norm 1.6124 (1.6265) [2022-10-08 04:39:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 157 training takes 0:06:52 [2022-10-08 04:39:21 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.536 (2.536) Loss 0.9811 (0.9811) Acc@1 75.977 (75.977) Acc@5 93.750 (93.750) [2022-10-08 04:39:32 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.206 Acc@5 93.374 [2022-10-08 04:39:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-08 04:39:32 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.36% [2022-10-08 04:39:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][0/1251] eta 1:07:05 lr 0.000464 time 3.2180 (3.2180) loss 3.5630 (3.5630) grad_norm 1.5848 (1.5848) [2022-10-08 04:40:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][100/1251] eta 0:06:49 lr 0.000463 time 0.3209 (0.3556) loss 3.6996 (3.5012) grad_norm 1.5956 (1.6408) [2022-10-08 04:40:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][200/1251] eta 0:05:58 lr 0.000463 time 0.3205 (0.3408) loss 3.4726 (3.5095) grad_norm 1.4081 (1.6285) [2022-10-08 04:41:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][300/1251] eta 0:05:19 lr 0.000462 time 0.3231 (0.3361) loss 3.4041 (3.5130) grad_norm 1.8687 (1.6234) [2022-10-08 04:41:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][400/1251] eta 0:04:44 lr 0.000462 time 0.3229 (0.3337) loss 3.2855 (3.5086) grad_norm 1.6134 (1.6358) [2022-10-08 04:42:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][500/1251] 
eta 0:04:09 lr 0.000462 time 0.3270 (0.3324) loss 3.6043 (3.5074) grad_norm 1.7187 (1.6312) [2022-10-08 04:42:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][600/1251] eta 0:03:35 lr 0.000461 time 0.3255 (0.3315) loss 3.6265 (3.5077) grad_norm 1.6658 (1.6293) [2022-10-08 04:43:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][700/1251] eta 0:03:02 lr 0.000461 time 0.3369 (0.3309) loss 3.6575 (3.5064) grad_norm 1.6731 (1.6336) [2022-10-08 04:43:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][800/1251] eta 0:02:29 lr 0.000460 time 0.3276 (0.3306) loss 3.4965 (3.5084) grad_norm 1.3127 (1.6391) [2022-10-08 04:44:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][900/1251] eta 0:01:55 lr 0.000460 time 0.3331 (0.3305) loss 3.4456 (3.5085) grad_norm 1.6144 (1.6345) [2022-10-08 04:45:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][1000/1251] eta 0:01:22 lr 0.000459 time 0.3220 (0.3304) loss 3.6703 (3.5098) grad_norm 1.4650 (1.6322) [2022-10-08 04:45:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][1100/1251] eta 0:00:49 lr 0.000459 time 0.3295 (0.3304) loss 3.6745 (3.5081) grad_norm 1.8663 (1.6324) [2022-10-08 04:46:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [158/300][1200/1251] eta 0:00:16 lr 0.000459 time 0.3262 (0.3304) loss 3.3940 (3.5082) grad_norm 1.6305 (1.6377) [2022-10-08 04:46:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 158 training takes 0:06:53 [2022-10-08 04:46:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.985 (2.985) Loss 0.9691 (0.9691) Acc@1 76.465 (76.465) Acc@5 94.141 (94.141) [2022-10-08 04:46:39 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.220 Acc@5 93.418 [2022-10-08 04:46:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-08 04:46:39 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 76.36% [2022-10-08 04:46:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][0/1251] eta 0:45:38 lr 0.000458 time 2.1889 (2.1889) loss 3.5157 (3.5157) grad_norm 1.5876 (1.5876) [2022-10-08 04:47:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][100/1251] eta 0:06:42 lr 0.000458 time 0.3250 (0.3494) loss 3.6272 (3.4921) grad_norm 1.7093 (1.5952) [2022-10-08 04:47:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][200/1251] eta 0:05:54 lr 0.000458 time 0.3256 (0.3375) loss 3.4057 (3.4927) grad_norm 1.4983 (1.6083) [2022-10-08 04:48:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][300/1251] eta 0:05:17 lr 0.000457 time 0.3266 (0.3335) loss 3.4792 (3.4851) grad_norm 1.5428 (1.6147) [2022-10-08 04:48:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][400/1251] eta 0:04:42 lr 0.000457 time 0.3269 (0.3316) loss 3.6061 (3.4834) grad_norm 1.6829 (1.6244) [2022-10-08 04:49:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][500/1251] eta 0:04:08 lr 0.000456 time 0.3301 (0.3304) loss 3.4099 (3.4835) grad_norm 1.7968 (1.6255) [2022-10-08 04:49:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][600/1251] eta 0:03:34 lr 0.000456 time 0.3255 (0.3295) loss 3.5049 (3.4895) grad_norm 1.4149 (1.6230) [2022-10-08 04:50:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][700/1251] eta 0:03:01 lr 0.000456 time 0.3230 (0.3287) loss 3.6538 (3.4948) grad_norm 1.3477 (1.6256) [2022-10-08 04:51:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][800/1251] eta 0:02:28 lr 0.000455 time 0.3220 (0.3282) loss 3.8289 (3.4986) grad_norm 1.4597 (1.6209) [2022-10-08 04:51:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][900/1251] eta 0:01:55 lr 0.000455 time 0.3223 (0.3278) loss 3.7472 (3.4989) grad_norm 2.0333 (1.6208) [2022-10-08 04:52:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[159/300][1000/1251] eta 0:01:22 lr 0.000454 time 0.3238 (0.3275) loss 3.4321 (3.5013) grad_norm 1.5848 (1.6211) [2022-10-08 04:52:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][1100/1251] eta 0:00:49 lr 0.000454 time 0.3265 (0.3273) loss 3.2709 (3.5047) grad_norm 1.5811 (1.6193) [2022-10-08 04:53:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [159/300][1200/1251] eta 0:00:16 lr 0.000453 time 0.3271 (0.3272) loss 3.2901 (3.5039) grad_norm 1.3729 (1.6205) [2022-10-08 04:53:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 159 training takes 0:06:49 [2022-10-08 04:53:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.919 (2.919) Loss 0.9535 (0.9535) Acc@1 78.613 (78.613) Acc@5 94.238 (94.238) [2022-10-08 04:53:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.118 Acc@5 93.486 [2022-10-08 04:53:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-08 04:53:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.36% [2022-10-08 04:53:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][0/1251] eta 0:57:32 lr 0.000453 time 2.7597 (2.7597) loss 3.4159 (3.4159) grad_norm 1.6442 (1.6442) [2022-10-08 04:54:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][100/1251] eta 0:06:46 lr 0.000453 time 0.3244 (0.3529) loss 3.6896 (3.4771) grad_norm 1.7467 (1.6440) [2022-10-08 04:54:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][200/1251] eta 0:05:56 lr 0.000452 time 0.3248 (0.3390) loss 3.7151 (3.4803) grad_norm 1.9250 (1.6317) [2022-10-08 04:55:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][300/1251] eta 0:05:17 lr 0.000452 time 0.3232 (0.3342) loss 3.3203 (3.4919) grad_norm 1.4441 (1.6428) [2022-10-08 04:55:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][400/1251] eta 0:04:42 lr 0.000452 time 0.3222 (0.3317) loss 3.3925 
(3.4928) grad_norm 1.9078 (1.6464) [2022-10-08 04:56:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][500/1251] eta 0:04:07 lr 0.000451 time 0.3232 (0.3302) loss 3.4672 (3.4929) grad_norm 1.4977 (1.6406) [2022-10-08 04:57:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][600/1251] eta 0:03:34 lr 0.000451 time 0.3242 (0.3293) loss 3.6636 (3.4907) grad_norm 1.7060 (1.6400) [2022-10-08 04:57:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][700/1251] eta 0:03:01 lr 0.000450 time 0.3273 (0.3287) loss 3.2529 (3.4913) grad_norm 1.6363 (1.6379) [2022-10-08 04:58:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][800/1251] eta 0:02:28 lr 0.000450 time 0.3224 (0.3282) loss 3.6805 (3.4925) grad_norm 1.4716 (1.6411) [2022-10-08 04:58:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][900/1251] eta 0:01:55 lr 0.000450 time 0.3223 (0.3277) loss 3.4119 (3.4940) grad_norm 1.5659 (1.6438) [2022-10-08 04:59:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][1000/1251] eta 0:01:22 lr 0.000449 time 0.3271 (0.3275) loss 3.4462 (3.4951) grad_norm 1.9420 (1.6441) [2022-10-08 04:59:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][1100/1251] eta 0:00:49 lr 0.000449 time 0.3209 (0.3272) loss 3.5607 (3.4902) grad_norm 1.4780 (1.6434) [2022-10-08 05:00:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [160/300][1200/1251] eta 0:00:16 lr 0.000448 time 0.3271 (0.3270) loss 3.8372 (3.4931) grad_norm 1.6165 (1.6466) [2022-10-08 05:00:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 160 training takes 0:06:49 [2022-10-08 05:00:31 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_160 saving...... [2022-10-08 05:00:32 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_160 saved !!! 
[2022-10-08 05:00:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.384 (2.384) Loss 0.8911 (0.8911) Acc@1 79.102 (79.102) Acc@5 94.629 (94.629) [2022-10-08 05:00:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.306 Acc@5 93.542 [2022-10-08 05:00:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-08 05:00:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.36% [2022-10-08 05:00:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][0/1251] eta 0:45:15 lr 0.000448 time 2.1708 (2.1708) loss 3.1214 (3.1214) grad_norm 1.5321 (1.5321) [2022-10-08 05:01:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][100/1251] eta 0:06:43 lr 0.000448 time 0.3248 (0.3505) loss 3.5533 (3.4867) grad_norm 1.7691 (1.6187) [2022-10-08 05:01:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][200/1251] eta 0:05:55 lr 0.000447 time 0.3269 (0.3381) loss 3.4008 (3.4811) grad_norm 1.6212 (1.6470) [2022-10-08 05:02:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][300/1251] eta 0:05:17 lr 0.000447 time 0.3213 (0.3337) loss 3.3113 (3.4719) grad_norm 1.5971 (1.6410) [2022-10-08 05:02:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][400/1251] eta 0:04:42 lr 0.000446 time 0.3308 (0.3316) loss 3.0965 (3.4720) grad_norm 1.5653 (1.6319) [2022-10-08 05:03:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][500/1251] eta 0:04:08 lr 0.000446 time 0.3318 (0.3306) loss 3.4380 (3.4751) grad_norm 2.1102 (1.6422) [2022-10-08 05:04:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][600/1251] eta 0:03:34 lr 0.000446 time 0.3286 (0.3299) loss 3.3932 (3.4806) grad_norm 1.7079 (1.6379) [2022-10-08 05:04:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][700/1251] eta 0:03:01 lr 0.000445 time 0.3250 (0.3295) loss 3.0944 (3.4816) grad_norm 1.7433 
(1.6335) [2022-10-08 05:05:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][800/1251] eta 0:02:28 lr 0.000445 time 0.3229 (0.3295) loss 3.5511 (3.4867) grad_norm 1.5879 (1.6300) [2022-10-08 05:05:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][900/1251] eta 0:01:55 lr 0.000444 time 0.3260 (0.3296) loss 3.7115 (3.4883) grad_norm 1.4213 (1.6293) [2022-10-08 05:06:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][1000/1251] eta 0:01:22 lr 0.000444 time 0.3310 (0.3297) loss 3.4052 (3.4907) grad_norm 1.3695 (1.6291) [2022-10-08 05:06:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][1100/1251] eta 0:00:49 lr 0.000444 time 0.3338 (0.3299) loss 3.7768 (3.4900) grad_norm 2.2621 (1.6294) [2022-10-08 05:07:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [161/300][1200/1251] eta 0:00:16 lr 0.000443 time 0.3391 (0.3301) loss 3.5117 (3.4930) grad_norm 1.8364 (1.6396) [2022-10-08 05:07:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 161 training takes 0:06:53 [2022-10-08 05:07:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.692 (2.692) Loss 1.0236 (1.0236) Acc@1 76.660 (76.660) Acc@5 93.359 (93.359) [2022-10-08 05:07:52 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.166 Acc@5 93.494 [2022-10-08 05:07:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-08 05:07:52 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.36% [2022-10-08 05:07:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][0/1251] eta 1:08:20 lr 0.000443 time 3.2777 (3.2777) loss 3.6766 (3.6766) grad_norm 1.6606 (1.6606) [2022-10-08 05:08:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][100/1251] eta 0:06:50 lr 0.000443 time 0.3242 (0.3565) loss 3.2411 (3.4689) grad_norm 1.3144 (1.6219) [2022-10-08 05:09:01 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [162/300][200/1251] eta 0:05:59 lr 0.000442 time 0.3271 (0.3419) loss 3.4167 (3.4566) grad_norm 1.7238 (1.6328) [2022-10-08 05:09:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][300/1251] eta 0:05:20 lr 0.000442 time 0.3261 (0.3370) loss 3.6282 (3.4521) grad_norm 1.5383 (1.6352) [2022-10-08 05:10:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][400/1251] eta 0:04:44 lr 0.000441 time 0.3267 (0.3344) loss 3.2229 (3.4620) grad_norm 1.6492 (1.6409) [2022-10-08 05:10:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][500/1251] eta 0:04:10 lr 0.000441 time 0.3254 (0.3330) loss 3.5880 (3.4668) grad_norm 1.5657 (1.6452) [2022-10-08 05:11:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][600/1251] eta 0:03:36 lr 0.000440 time 0.3275 (0.3320) loss 3.4366 (3.4670) grad_norm 1.6319 (1.6511) [2022-10-08 05:11:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][700/1251] eta 0:03:02 lr 0.000440 time 0.3236 (0.3313) loss 3.3957 (3.4728) grad_norm 1.7125 (1.6623) [2022-10-08 05:12:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][800/1251] eta 0:02:29 lr 0.000440 time 0.3258 (0.3307) loss 3.5667 (3.4736) grad_norm 1.4317 (1.6668) [2022-10-08 05:12:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][900/1251] eta 0:01:55 lr 0.000439 time 0.3251 (0.3302) loss 3.6608 (3.4751) grad_norm 1.4629 (1.6634) [2022-10-08 05:13:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][1000/1251] eta 0:01:22 lr 0.000439 time 0.3309 (0.3298) loss 3.3515 (3.4765) grad_norm 1.5844 (1.6679) [2022-10-08 05:13:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][1100/1251] eta 0:00:49 lr 0.000438 time 0.3261 (0.3295) loss 3.5392 (3.4766) grad_norm 1.7514 (1.6648) [2022-10-08 05:14:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [162/300][1200/1251] eta 0:00:16 lr 0.000438 time 0.3228 (0.3292) loss 3.8074 
(3.4761) grad_norm 1.6041 (1.6591) [2022-10-08 05:14:44 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 162 training takes 0:06:52 [2022-10-08 05:14:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.589 (2.589) Loss 1.0121 (1.0121) Acc@1 75.195 (75.195) Acc@5 92.188 (92.188) [2022-10-08 05:14:58 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.382 Acc@5 93.472 [2022-10-08 05:14:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-08 05:14:58 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.38% [2022-10-08 05:15:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][0/1251] eta 1:05:48 lr 0.000438 time 3.1564 (3.1564) loss 3.6323 (3.6323) grad_norm 1.6772 (1.6772) [2022-10-08 05:15:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][100/1251] eta 0:06:46 lr 0.000437 time 0.3251 (0.3534) loss 2.8973 (3.4656) grad_norm 1.7524 (1.6342) [2022-10-08 05:16:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][200/1251] eta 0:05:57 lr 0.000437 time 0.3313 (0.3400) loss 3.5165 (3.4656) grad_norm 1.4465 (1.6658) [2022-10-08 05:16:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][300/1251] eta 0:05:19 lr 0.000437 time 0.3300 (0.3360) loss 3.4098 (3.4776) grad_norm 1.5734 (1.6823) [2022-10-08 05:17:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][400/1251] eta 0:04:44 lr 0.000436 time 0.3320 (0.3341) loss 3.5543 (3.4780) grad_norm 1.5793 (1.6705) [2022-10-08 05:17:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][500/1251] eta 0:04:10 lr 0.000436 time 0.3295 (0.3333) loss 3.3597 (3.4759) grad_norm 1.5573 (1.6691) [2022-10-08 05:18:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][600/1251] eta 0:03:36 lr 0.000435 time 0.3325 (0.3329) loss 3.4617 (3.4818) grad_norm 1.7648 (1.6738) [2022-10-08 05:18:51 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][700/1251] eta 0:03:03 lr 0.000435 time 0.3286 (0.3327) loss 3.8717 (3.4857) grad_norm 1.5481 (1.6716) [2022-10-08 05:19:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][800/1251] eta 0:02:30 lr 0.000435 time 0.3377 (0.3327) loss 3.9216 (3.4854) grad_norm 2.1812 (1.6667) [2022-10-08 05:19:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][900/1251] eta 0:01:56 lr 0.000434 time 0.3364 (0.3327) loss 3.7293 (3.4874) grad_norm 1.4225 (1.6655) [2022-10-08 05:20:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][1000/1251] eta 0:01:23 lr 0.000434 time 0.3298 (0.3328) loss 3.4715 (3.4865) grad_norm 1.6228 (1.6640) [2022-10-08 05:21:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][1100/1251] eta 0:00:50 lr 0.000433 time 0.3282 (0.3328) loss 3.5113 (3.4855) grad_norm 1.5503 (1.6660) [2022-10-08 05:21:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [163/300][1200/1251] eta 0:00:16 lr 0.000433 time 0.3276 (0.3327) loss 3.2064 (3.4865) grad_norm 1.7401 (1.6655) [2022-10-08 05:21:55 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 163 training takes 0:06:56 [2022-10-08 05:21:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.862 (2.862) Loss 0.9741 (0.9741) Acc@1 76.953 (76.953) Acc@5 93.262 (93.262) [2022-10-08 05:22:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.566 Acc@5 93.662 [2022-10-08 05:22:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-08 05:22:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.57% [2022-10-08 05:22:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][0/1251] eta 0:57:17 lr 0.000433 time 2.7482 (2.7482) loss 3.2076 (3.2076) grad_norm 1.5755 (1.5755) [2022-10-08 05:22:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][100/1251] 
eta 0:06:42 lr 0.000432 time 0.3253 (0.3501) loss 3.4137 (3.4811) grad_norm 1.5473 (1.6626) [2022-10-08 05:23:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][200/1251] eta 0:05:54 lr 0.000432 time 0.3270 (0.3377) loss 3.3898 (3.4844) grad_norm 1.5967 (1.6599) [2022-10-08 05:23:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][300/1251] eta 0:05:16 lr 0.000431 time 0.3257 (0.3333) loss 3.1105 (3.4790) grad_norm 1.4069 (1.6560) [2022-10-08 05:24:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][400/1251] eta 0:04:41 lr 0.000431 time 0.3233 (0.3311) loss 3.4174 (3.4773) grad_norm 1.5467 (1.6606) [2022-10-08 05:24:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][500/1251] eta 0:04:07 lr 0.000431 time 0.3252 (0.3298) loss 3.3726 (3.4775) grad_norm 1.6607 (1.6646) [2022-10-08 05:25:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][600/1251] eta 0:03:34 lr 0.000430 time 0.3248 (0.3290) loss 3.7369 (3.4796) grad_norm 1.5964 (1.6635) [2022-10-08 05:25:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][700/1251] eta 0:03:01 lr 0.000430 time 0.3245 (0.3285) loss 3.5862 (3.4788) grad_norm 1.6435 (1.6629) [2022-10-08 05:26:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][800/1251] eta 0:02:28 lr 0.000429 time 0.3281 (0.3283) loss 3.5045 (3.4806) grad_norm 1.8242 (1.6595) [2022-10-08 05:27:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][900/1251] eta 0:01:55 lr 0.000429 time 0.3264 (0.3281) loss 3.3841 (3.4831) grad_norm 1.7027 (1.6617) [2022-10-08 05:27:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][1000/1251] eta 0:01:22 lr 0.000429 time 0.3228 (0.3280) loss 3.3038 (3.4847) grad_norm 1.6647 (1.6623) [2022-10-08 05:28:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][1100/1251] eta 0:00:49 lr 0.000428 time 0.3262 (0.3280) loss 3.6847 (3.4842) grad_norm 1.7573 (1.6652) 
[2022-10-08 05:28:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [164/300][1200/1251] eta 0:00:16 lr 0.000428 time 0.3247 (0.3282) loss 3.7892 (3.4845) grad_norm 1.4675 (1.6656) [2022-10-08 05:28:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 164 training takes 0:06:51 [2022-10-08 05:29:02 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.796 (2.796) Loss 0.9847 (0.9847) Acc@1 76.953 (76.953) Acc@5 93.652 (93.652) [2022-10-08 05:29:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.540 Acc@5 93.618 [2022-10-08 05:29:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-08 05:29:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.57% [2022-10-08 05:29:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][0/1251] eta 0:57:00 lr 0.000428 time 2.7346 (2.7346) loss 3.4559 (3.4559) grad_norm 1.9492 (1.9492) [2022-10-08 05:29:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][100/1251] eta 0:06:48 lr 0.000427 time 0.3267 (0.3550) loss 3.7402 (3.5030) grad_norm 1.9192 (1.6885) [2022-10-08 05:30:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][200/1251] eta 0:05:59 lr 0.000427 time 0.3317 (0.3418) loss 3.4839 (3.4890) grad_norm 1.7399 (1.6740) [2022-10-08 05:30:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][300/1251] eta 0:05:20 lr 0.000426 time 0.3298 (0.3370) loss 3.1606 (3.4819) grad_norm 1.5537 (1.6763) [2022-10-08 05:31:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][400/1251] eta 0:04:44 lr 0.000426 time 0.3244 (0.3345) loss 3.2888 (3.4823) grad_norm 1.6659 (1.6891) [2022-10-08 05:32:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][500/1251] eta 0:04:10 lr 0.000426 time 0.3288 (0.3329) loss 3.7823 (3.4817) grad_norm 1.7983 (1.6879) [2022-10-08 05:32:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[165/300][600/1251] eta 0:03:36 lr 0.000425 time 0.3277 (0.3318) loss 3.7128 (3.4804) grad_norm 1.8410 (1.6862) [2022-10-08 05:33:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][700/1251] eta 0:03:02 lr 0.000425 time 0.3221 (0.3308) loss 3.4417 (3.4789) grad_norm 1.4915 (1.6813) [2022-10-08 05:33:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][800/1251] eta 0:02:28 lr 0.000424 time 0.3250 (0.3302) loss 3.4466 (3.4775) grad_norm 1.8358 (1.6841) [2022-10-08 05:34:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][900/1251] eta 0:01:55 lr 0.000424 time 0.3291 (0.3297) loss 3.5253 (3.4768) grad_norm 1.6543 (1.6787) [2022-10-08 05:34:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][1000/1251] eta 0:01:22 lr 0.000423 time 0.3274 (0.3293) loss 3.5702 (3.4763) grad_norm 1.7850 (1.6831) [2022-10-08 05:35:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][1100/1251] eta 0:00:49 lr 0.000423 time 0.3276 (0.3290) loss 3.5201 (3.4749) grad_norm 1.8782 (1.6822) [2022-10-08 05:35:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [165/300][1200/1251] eta 0:00:16 lr 0.000423 time 0.3297 (0.3289) loss 3.3995 (3.4764) grad_norm 1.6668 (1.6825) [2022-10-08 05:36:05 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 165 training takes 0:06:51 [2022-10-08 05:36:08 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.350 (3.350) Loss 0.9293 (0.9293) Acc@1 79.395 (79.395) Acc@5 94.043 (94.043) [2022-10-08 05:36:18 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.624 Acc@5 93.676 [2022-10-08 05:36:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-08 05:36:18 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.62% [2022-10-08 05:36:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][0/1251] eta 0:53:05 lr 0.000422 time 2.5464 (2.5464) loss 3.8484 
(3.8484) grad_norm 1.6461 (1.6461) [2022-10-08 05:36:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][100/1251] eta 0:06:43 lr 0.000422 time 0.3274 (0.3506) loss 3.8280 (3.4786) grad_norm 1.8024 (1.6701) [2022-10-08 05:37:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][200/1251] eta 0:05:56 lr 0.000422 time 0.3275 (0.3390) loss 3.3294 (3.4663) grad_norm 1.7872 (1.7149) [2022-10-08 05:37:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][300/1251] eta 0:05:18 lr 0.000421 time 0.3286 (0.3354) loss 3.6698 (3.4769) grad_norm 2.1357 (1.6976) [2022-10-08 05:38:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][400/1251] eta 0:04:44 lr 0.000421 time 0.3267 (0.3338) loss 3.5964 (3.4804) grad_norm 2.0765 (1.6945) [2022-10-08 05:39:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][500/1251] eta 0:04:10 lr 0.000420 time 0.3269 (0.3330) loss 3.5384 (3.4776) grad_norm 1.4327 (1.6954) [2022-10-08 05:39:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][600/1251] eta 0:03:36 lr 0.000420 time 0.3276 (0.3325) loss 3.6034 (3.4814) grad_norm 1.7979 (1.6846) [2022-10-08 05:40:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][700/1251] eta 0:03:03 lr 0.000420 time 0.3310 (0.3323) loss 3.7319 (3.4795) grad_norm 1.9405 (1.6791) [2022-10-08 05:40:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][800/1251] eta 0:02:29 lr 0.000419 time 0.3268 (0.3323) loss 3.4090 (3.4758) grad_norm 1.3981 (1.6925) [2022-10-08 05:41:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][900/1251] eta 0:01:56 lr 0.000419 time 0.3381 (0.3322) loss 3.8169 (3.4773) grad_norm 1.7421 (1.6946) [2022-10-08 05:41:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][1000/1251] eta 0:01:23 lr 0.000418 time 0.3289 (0.3323) loss 3.5369 (3.4792) grad_norm 1.8454 (1.6893) [2022-10-08 05:42:24 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [166/300][1100/1251] eta 0:00:50 lr 0.000418 time 0.3437 (0.3323) loss 3.7374 (3.4792) grad_norm 1.7710 (1.6912) [2022-10-08 05:42:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [166/300][1200/1251] eta 0:00:16 lr 0.000418 time 0.3372 (0.3324) loss 3.4067 (3.4811) grad_norm 2.0408 (1.6939) [2022-10-08 05:43:15 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 166 training takes 0:06:56 [2022-10-08 05:43:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.877 (2.877) Loss 0.9997 (0.9997) Acc@1 76.562 (76.562) Acc@5 93.066 (93.066) [2022-10-08 05:43:28 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.566 Acc@5 93.824 [2022-10-08 05:43:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-08 05:43:28 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.62% [2022-10-08 05:43:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][0/1251] eta 0:54:00 lr 0.000417 time 2.5903 (2.5903) loss 3.5197 (3.5197) grad_norm 1.5866 (1.5866) [2022-10-08 05:44:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][100/1251] eta 0:06:45 lr 0.000417 time 0.3257 (0.3523) loss 3.3871 (3.4630) grad_norm 1.6175 (1.7075) [2022-10-08 05:44:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][200/1251] eta 0:05:56 lr 0.000417 time 0.3268 (0.3390) loss 3.4305 (3.4652) grad_norm 1.7931 (1.6878) [2022-10-08 05:45:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][300/1251] eta 0:05:18 lr 0.000416 time 0.3309 (0.3346) loss 3.7463 (3.4686) grad_norm 1.7894 (1.7044) [2022-10-08 05:45:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][400/1251] eta 0:04:42 lr 0.000416 time 0.3251 (0.3324) loss 3.8341 (3.4666) grad_norm 1.6348 (1.7088) [2022-10-08 05:46:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][500/1251] eta 0:04:08 lr 0.000415 time 0.3230 
(0.3310) loss 3.6503 (3.4711) grad_norm 1.5559 (1.7103) [2022-10-08 05:46:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][600/1251] eta 0:03:34 lr 0.000415 time 0.3198 (0.3302) loss 3.6120 (3.4678) grad_norm 1.8347 (1.7084) [2022-10-08 05:47:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][700/1251] eta 0:03:01 lr 0.000414 time 0.3307 (0.3296) loss 3.5639 (3.4697) grad_norm 1.7769 (1.7063) [2022-10-08 05:47:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][800/1251] eta 0:02:28 lr 0.000414 time 0.3237 (0.3291) loss 3.2954 (3.4676) grad_norm 1.6927 (1.7012) [2022-10-08 05:48:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][900/1251] eta 0:01:55 lr 0.000414 time 0.3203 (0.3287) loss 3.4066 (3.4673) grad_norm 1.8790 (1.6961) [2022-10-08 05:48:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][1000/1251] eta 0:01:22 lr 0.000413 time 0.3236 (0.3284) loss 3.5129 (3.4678) grad_norm 1.6152 (1.6949) [2022-10-08 05:49:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][1100/1251] eta 0:00:49 lr 0.000413 time 0.3314 (0.3283) loss 3.5573 (3.4660) grad_norm 1.7927 (1.6931) [2022-10-08 05:50:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [167/300][1200/1251] eta 0:00:16 lr 0.000412 time 0.3247 (0.3283) loss 3.9098 (3.4664) grad_norm 1.5906 (1.6901) [2022-10-08 05:50:19 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 167 training takes 0:06:50 [2022-10-08 05:50:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.090 (3.090) Loss 1.0161 (1.0161) Acc@1 75.781 (75.781) Acc@5 92.773 (92.773) [2022-10-08 05:50:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.952 Acc@5 93.740 [2022-10-08 05:50:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-08 05:50:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.95% [2022-10-08 05:50:36 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][0/1251] eta 0:57:21 lr 0.000412 time 2.7512 (2.7512) loss 3.5787 (3.5787) grad_norm 1.9738 (1.9738) [2022-10-08 05:51:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][100/1251] eta 0:06:44 lr 0.000412 time 0.3274 (0.3511) loss 3.2732 (3.4534) grad_norm 1.6665 (1.7002) [2022-10-08 05:51:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][200/1251] eta 0:05:55 lr 0.000411 time 0.3261 (0.3385) loss 3.3539 (3.4535) grad_norm 1.8559 (1.6877) [2022-10-08 05:52:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][300/1251] eta 0:05:18 lr 0.000411 time 0.3262 (0.3345) loss 3.4686 (3.4486) grad_norm 1.9957 (1.6926) [2022-10-08 05:52:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][400/1251] eta 0:04:42 lr 0.000411 time 0.3240 (0.3324) loss 3.5288 (3.4486) grad_norm 1.5571 (1.6944) [2022-10-08 05:53:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][500/1251] eta 0:04:08 lr 0.000410 time 0.3243 (0.3312) loss 3.5265 (3.4521) grad_norm 1.7162 (1.6887) [2022-10-08 05:53:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][600/1251] eta 0:03:35 lr 0.000410 time 0.3232 (0.3304) loss 3.6549 (3.4554) grad_norm 1.7154 (1.6856) [2022-10-08 05:54:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][700/1251] eta 0:03:01 lr 0.000409 time 0.3256 (0.3297) loss 3.5853 (3.4536) grad_norm 1.6701 (1.6880) [2022-10-08 05:54:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][800/1251] eta 0:02:28 lr 0.000409 time 0.3219 (0.3292) loss 3.2841 (3.4525) grad_norm 1.8560 (1.6861) [2022-10-08 05:55:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][900/1251] eta 0:01:55 lr 0.000409 time 0.3288 (0.3288) loss 3.3325 (3.4583) grad_norm 1.8816 (1.6850) [2022-10-08 05:56:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][1000/1251] eta 0:01:22 lr 0.000408 
time 0.3291 (0.3285) loss 3.2424 (3.4589) grad_norm 1.6031 (1.6817) [2022-10-08 05:56:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][1100/1251] eta 0:00:49 lr 0.000408 time 0.3295 (0.3285) loss 3.6022 (3.4582) grad_norm 1.8427 (1.6851) [2022-10-08 05:57:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [168/300][1200/1251] eta 0:00:16 lr 0.000407 time 0.3254 (0.3283) loss 3.2335 (3.4556) grad_norm 1.4040 (1.6846) [2022-10-08 05:57:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 168 training takes 0:06:50 [2022-10-08 05:57:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.098 (3.098) Loss 0.9564 (0.9564) Acc@1 77.051 (77.051) Acc@5 94.434 (94.434) [2022-10-08 05:57:37 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.582 Acc@5 93.718 [2022-10-08 05:57:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-08 05:57:37 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.95% [2022-10-08 05:57:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][0/1251] eta 0:45:07 lr 0.000407 time 2.1644 (2.1644) loss 3.6818 (3.6818) grad_norm 1.6763 (1.6763) [2022-10-08 05:58:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][100/1251] eta 0:06:42 lr 0.000407 time 0.3212 (0.3493) loss 3.7361 (3.4431) grad_norm 1.7970 (1.7429) [2022-10-08 05:58:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][200/1251] eta 0:05:54 lr 0.000406 time 0.3273 (0.3377) loss 3.5369 (3.4482) grad_norm 1.6699 (1.7217) [2022-10-08 05:59:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][300/1251] eta 0:05:17 lr 0.000406 time 0.3319 (0.3342) loss 3.6367 (3.4443) grad_norm 1.6550 (1.7248) [2022-10-08 05:59:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][400/1251] eta 0:04:43 lr 0.000406 time 0.3253 (0.3326) loss 3.5248 (3.4485) grad_norm 1.6870 (1.7340) [2022-10-08 
06:00:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][500/1251] eta 0:04:09 lr 0.000405 time 0.3348 (0.3320) loss 3.5642 (3.4504) grad_norm 1.7414 (1.7299) [2022-10-08 06:00:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][600/1251] eta 0:03:35 lr 0.000405 time 0.3306 (0.3317) loss 3.3421 (3.4512) grad_norm 1.5857 (1.7234) [2022-10-08 06:01:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][700/1251] eta 0:03:02 lr 0.000404 time 0.3329 (0.3315) loss 3.1638 (3.4489) grad_norm 1.5348 (1.7286) [2022-10-08 06:02:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][800/1251] eta 0:02:29 lr 0.000404 time 0.3295 (0.3314) loss 3.2345 (3.4526) grad_norm 1.8044 (1.7304) [2022-10-08 06:02:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][900/1251] eta 0:01:56 lr 0.000404 time 0.3311 (0.3314) loss 3.2882 (3.4523) grad_norm 1.5342 (1.7242) [2022-10-08 06:03:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][1000/1251] eta 0:01:23 lr 0.000403 time 0.3287 (0.3314) loss 3.6302 (3.4572) grad_norm 1.7782 (1.7234) [2022-10-08 06:03:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][1100/1251] eta 0:00:50 lr 0.000403 time 0.3387 (0.3315) loss 3.6631 (3.4594) grad_norm 1.7193 (1.7287) [2022-10-08 06:04:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [169/300][1200/1251] eta 0:00:16 lr 0.000402 time 0.3288 (0.3316) loss 3.4578 (3.4606) grad_norm 1.5763 (1.7263) [2022-10-08 06:04:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 169 training takes 0:06:55 [2022-10-08 06:04:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.075 (3.075) Loss 0.9862 (0.9862) Acc@1 75.781 (75.781) Acc@5 93.848 (93.848) [2022-10-08 06:04:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.864 Acc@5 93.684 [2022-10-08 06:04:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 76.9% [2022-10-08 06:04:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 76.95% [2022-10-08 06:04:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][0/1251] eta 0:54:41 lr 0.000402 time 2.6229 (2.6229) loss 3.5156 (3.5156) grad_norm 1.9113 (1.9113) [2022-10-08 06:05:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][100/1251] eta 0:06:45 lr 0.000402 time 0.3236 (0.3519) loss 3.6286 (3.4352) grad_norm 1.5216 (1.7505) [2022-10-08 06:05:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][200/1251] eta 0:05:57 lr 0.000401 time 0.3324 (0.3397) loss 3.2281 (3.4342) grad_norm 1.6651 (1.7279) [2022-10-08 06:06:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][300/1251] eta 0:05:19 lr 0.000401 time 0.3230 (0.3355) loss 3.5913 (3.4480) grad_norm 1.8289 (1.7269) [2022-10-08 06:07:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][400/1251] eta 0:04:43 lr 0.000400 time 0.3272 (0.3333) loss 3.7337 (3.4533) grad_norm 1.8300 (1.7367) [2022-10-08 06:07:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][500/1251] eta 0:04:09 lr 0.000400 time 0.3260 (0.3319) loss 3.5557 (3.4530) grad_norm 1.9337 (1.7274) [2022-10-08 06:08:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][600/1251] eta 0:03:35 lr 0.000400 time 0.3270 (0.3310) loss 3.4335 (3.4464) grad_norm 1.9224 (1.7323) [2022-10-08 06:08:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][700/1251] eta 0:03:02 lr 0.000399 time 0.3211 (0.3304) loss 3.3250 (3.4509) grad_norm 1.6419 (1.7389) [2022-10-08 06:09:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][800/1251] eta 0:02:28 lr 0.000399 time 0.3283 (0.3299) loss 3.6284 (3.4510) grad_norm 1.9060 (1.7356) [2022-10-08 06:09:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][900/1251] eta 0:01:55 lr 0.000398 time 0.3303 (0.3294) loss 3.1415 (3.4567) grad_norm 1.6977 
(1.7337) [2022-10-08 06:10:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][1000/1251] eta 0:01:22 lr 0.000398 time 0.3336 (0.3293) loss 3.5019 (3.4575) grad_norm 1.5984 (1.7363) [2022-10-08 06:10:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][1100/1251] eta 0:00:49 lr 0.000398 time 0.3243 (0.3293) loss 3.5373 (3.4570) grad_norm 2.0031 (1.7338) [2022-10-08 06:11:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [170/300][1200/1251] eta 0:00:16 lr 0.000397 time 0.3310 (0.3293) loss 3.3824 (3.4567) grad_norm 1.7135 (1.7335) [2022-10-08 06:11:38 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 170 training takes 0:06:52 [2022-10-08 06:11:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_170 saving...... [2022-10-08 06:11:39 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_170 saved !!! [2022-10-08 06:11:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.832 (2.832) Loss 0.9989 (0.9989) Acc@1 76.758 (76.758) Acc@5 93.262 (93.262) [2022-10-08 06:11:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.030 Acc@5 93.856 [2022-10-08 06:11:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-08 06:11:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.03% [2022-10-08 06:11:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][0/1251] eta 1:03:53 lr 0.000397 time 3.0647 (3.0647) loss 3.6505 (3.6505) grad_norm 1.4735 (1.4735) [2022-10-08 06:12:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][100/1251] eta 0:06:48 lr 0.000397 time 0.3305 (0.3549) loss 3.4942 (3.4286) grad_norm 1.6592 (1.7207) [2022-10-08 06:13:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][200/1251] eta 0:05:58 lr 0.000396 time 0.3316 (0.3412) loss 3.1582 
(3.4442) grad_norm 1.7397 (1.7207) [2022-10-08 06:13:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][300/1251] eta 0:05:20 lr 0.000396 time 0.3250 (0.3367) loss 3.6739 (3.4477) grad_norm 1.7950 (1.7338) [2022-10-08 06:14:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][400/1251] eta 0:04:44 lr 0.000395 time 0.3287 (0.3342) loss 3.2846 (3.4516) grad_norm 1.5280 (1.7385) [2022-10-08 06:14:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][500/1251] eta 0:04:09 lr 0.000395 time 0.3243 (0.3326) loss 3.7648 (3.4507) grad_norm 2.0555 (1.7367) [2022-10-08 06:15:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][600/1251] eta 0:03:35 lr 0.000395 time 0.3265 (0.3315) loss 3.2162 (3.4523) grad_norm 1.7812 (1.7383) [2022-10-08 06:15:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][700/1251] eta 0:03:02 lr 0.000394 time 0.3226 (0.3308) loss 3.2624 (3.4544) grad_norm 1.8178 (1.7386) [2022-10-08 06:16:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][800/1251] eta 0:02:28 lr 0.000394 time 0.3236 (0.3302) loss 3.3397 (3.4560) grad_norm 1.7108 (1.7409) [2022-10-08 06:16:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][900/1251] eta 0:01:55 lr 0.000393 time 0.3267 (0.3301) loss 3.4084 (3.4556) grad_norm 1.6333 (1.7394) [2022-10-08 06:17:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][1000/1251] eta 0:01:22 lr 0.000393 time 0.3286 (0.3297) loss 3.1508 (3.4533) grad_norm 1.5302 (1.7366) [2022-10-08 06:17:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][1100/1251] eta 0:00:49 lr 0.000393 time 0.3236 (0.3294) loss 3.5803 (3.4550) grad_norm 2.0553 (1.7353) [2022-10-08 06:18:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [171/300][1200/1251] eta 0:00:16 lr 0.000392 time 0.3305 (0.3291) loss 3.5215 (3.4586) grad_norm 1.4744 (1.7368) [2022-10-08 06:18:45 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 171 training takes 0:06:51 [2022-10-08 06:18:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.442 (2.442) Loss 1.0061 (1.0061) Acc@1 75.391 (75.391) Acc@5 94.043 (94.043) [2022-10-08 06:18:58 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.780 Acc@5 93.670 [2022-10-08 06:18:58 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-08 06:18:58 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.03% [2022-10-08 06:19:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][0/1251] eta 1:01:43 lr 0.000392 time 2.9604 (2.9604) loss 3.5353 (3.5353) grad_norm 1.6997 (1.6997) [2022-10-08 06:19:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][100/1251] eta 0:06:49 lr 0.000392 time 0.3304 (0.3556) loss 3.3898 (3.4134) grad_norm 1.9600 (1.7158) [2022-10-08 06:20:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][200/1251] eta 0:06:00 lr 0.000391 time 0.3285 (0.3426) loss 3.5668 (3.4080) grad_norm 1.8031 (1.7287) [2022-10-08 06:20:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][300/1251] eta 0:05:21 lr 0.000391 time 0.3221 (0.3384) loss 3.5455 (3.4195) grad_norm 1.7071 (1.7300) [2022-10-08 06:21:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][400/1251] eta 0:04:46 lr 0.000390 time 0.3348 (0.3364) loss 3.5278 (3.4215) grad_norm 1.7040 (1.7340) [2022-10-08 06:21:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][500/1251] eta 0:04:11 lr 0.000390 time 0.3259 (0.3352) loss 3.3215 (3.4249) grad_norm 1.6078 (1.7354) [2022-10-08 06:22:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][600/1251] eta 0:03:37 lr 0.000390 time 0.3344 (0.3344) loss 3.6506 (3.4282) grad_norm 1.8826 (1.7383) [2022-10-08 06:22:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][700/1251] eta 0:03:03 lr 0.000389 time 0.3337 
(0.3339) loss 3.7049 (3.4314) grad_norm 1.8396 (1.7359) [2022-10-08 06:23:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][800/1251] eta 0:02:30 lr 0.000389 time 0.3283 (0.3336) loss 3.5532 (3.4353) grad_norm 1.5610 (1.7344) [2022-10-08 06:23:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][900/1251] eta 0:01:56 lr 0.000388 time 0.3341 (0.3333) loss 3.5696 (3.4327) grad_norm 1.6392 (1.7342) [2022-10-08 06:24:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][1000/1251] eta 0:01:23 lr 0.000388 time 0.3265 (0.3330) loss 3.2766 (3.4340) grad_norm 1.4922 (1.7352) [2022-10-08 06:25:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][1100/1251] eta 0:00:50 lr 0.000388 time 0.3364 (0.3328) loss 2.9402 (3.4361) grad_norm 1.7145 (1.7372) [2022-10-08 06:25:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [172/300][1200/1251] eta 0:00:16 lr 0.000387 time 0.3405 (0.3327) loss 3.5748 (3.4381) grad_norm 1.6481 (1.7392) [2022-10-08 06:25:54 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 172 training takes 0:06:56 [2022-10-08 06:25:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.477 (2.477) Loss 0.9605 (0.9605) Acc@1 76.270 (76.270) Acc@5 93.555 (93.555) [2022-10-08 06:26:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.982 Acc@5 93.718 [2022-10-08 06:26:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-08 06:26:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.03% [2022-10-08 06:26:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][0/1251] eta 0:54:40 lr 0.000387 time 2.6221 (2.6221) loss 3.5704 (3.5704) grad_norm 1.8841 (1.8841) [2022-10-08 06:26:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][100/1251] eta 0:06:46 lr 0.000387 time 0.3252 (0.3533) loss 3.3611 (3.4258) grad_norm 1.4183 (1.7174) [2022-10-08 06:27:16 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][200/1251] eta 0:05:56 lr 0.000386 time 0.3230 (0.3396) loss 3.1717 (3.4396) grad_norm 1.4530 (1.7221) [2022-10-08 06:27:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][300/1251] eta 0:05:18 lr 0.000386 time 0.3229 (0.3353) loss 3.3457 (3.4322) grad_norm 2.2395 (1.7450) [2022-10-08 06:28:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][400/1251] eta 0:04:43 lr 0.000385 time 0.3248 (0.3330) loss 3.8164 (3.4368) grad_norm 2.4278 (1.7402) [2022-10-08 06:28:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][500/1251] eta 0:04:09 lr 0.000385 time 0.3242 (0.3316) loss 3.0356 (3.4407) grad_norm 2.0820 (1.7432) [2022-10-08 06:29:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][600/1251] eta 0:03:35 lr 0.000385 time 0.3220 (0.3306) loss 3.3139 (3.4410) grad_norm 1.9015 (1.7455) [2022-10-08 06:29:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][700/1251] eta 0:03:01 lr 0.000384 time 0.3284 (0.3300) loss 3.5735 (3.4441) grad_norm 1.5303 (1.7484) [2022-10-08 06:30:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][800/1251] eta 0:02:28 lr 0.000384 time 0.3278 (0.3295) loss 3.5181 (3.4440) grad_norm 1.5966 (1.7421) [2022-10-08 06:31:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][900/1251] eta 0:01:55 lr 0.000383 time 0.3248 (0.3292) loss 3.2047 (3.4464) grad_norm 1.7226 (1.7435) [2022-10-08 06:31:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][1000/1251] eta 0:01:22 lr 0.000383 time 0.3261 (0.3289) loss 3.5153 (3.4480) grad_norm 1.7506 (1.7458) [2022-10-08 06:32:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][1100/1251] eta 0:00:49 lr 0.000383 time 0.3280 (0.3286) loss 3.5600 (3.4472) grad_norm 1.6388 (1.7472) [2022-10-08 06:32:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [173/300][1200/1251] eta 0:00:16 lr 
0.000382 time 0.3260 (0.3285) loss 3.3446 (3.4469) grad_norm 2.1689 (1.7469) [2022-10-08 06:32:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 173 training takes 0:06:51 [2022-10-08 06:33:02 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.778 (2.778) Loss 1.0178 (1.0178) Acc@1 75.488 (75.488) Acc@5 93.555 (93.555) [2022-10-08 06:33:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.118 Acc@5 93.862 [2022-10-08 06:33:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-08 06:33:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.12% [2022-10-08 06:33:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][0/1251] eta 1:01:18 lr 0.000382 time 2.9401 (2.9401) loss 3.1602 (3.1602) grad_norm 1.8948 (1.8948) [2022-10-08 06:33:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][100/1251] eta 0:06:47 lr 0.000381 time 0.3300 (0.3538) loss 3.4645 (3.4315) grad_norm 1.8115 (1.6987) [2022-10-08 06:34:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][200/1251] eta 0:05:57 lr 0.000381 time 0.3287 (0.3406) loss 3.3654 (3.4358) grad_norm 2.8578 (1.7610) [2022-10-08 06:34:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][300/1251] eta 0:05:19 lr 0.000381 time 0.3287 (0.3361) loss 3.1271 (3.4353) grad_norm 1.5559 (1.7545) [2022-10-08 06:35:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][400/1251] eta 0:04:44 lr 0.000380 time 0.3318 (0.3338) loss 3.4623 (3.4310) grad_norm 1.3457 (1.7590) [2022-10-08 06:35:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][500/1251] eta 0:04:09 lr 0.000380 time 0.3197 (0.3322) loss 3.2535 (3.4349) grad_norm 1.4993 (1.7636) [2022-10-08 06:36:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][600/1251] eta 0:03:35 lr 0.000379 time 0.3290 (0.3312) loss 3.3055 (3.4346) grad_norm 1.8307 (1.7661) 
[2022-10-08 06:37:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][700/1251] eta 0:03:02 lr 0.000379 time 0.3263 (0.3305) loss 3.2993 (3.4319) grad_norm 1.8031 (1.7641) [2022-10-08 06:37:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][800/1251] eta 0:02:28 lr 0.000379 time 0.3289 (0.3302) loss 2.9097 (3.4295) grad_norm 1.7597 (1.7703) [2022-10-08 06:38:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][900/1251] eta 0:01:55 lr 0.000378 time 0.3254 (0.3296) loss 3.4012 (3.4312) grad_norm 1.7683 (1.7668) [2022-10-08 06:38:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][1000/1251] eta 0:01:22 lr 0.000378 time 0.3248 (0.3292) loss 3.7233 (3.4308) grad_norm 2.1662 (1.7738) [2022-10-08 06:39:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][1100/1251] eta 0:00:49 lr 0.000377 time 0.3318 (0.3289) loss 3.3441 (3.4357) grad_norm 1.8489 (1.7721) [2022-10-08 06:39:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [174/300][1200/1251] eta 0:00:16 lr 0.000377 time 0.3216 (0.3286) loss 3.1856 (3.4376) grad_norm 1.5317 (1.7724) [2022-10-08 06:40:04 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 174 training takes 0:06:51 [2022-10-08 06:40:07 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.753 (2.753) Loss 1.0900 (1.0900) Acc@1 74.316 (74.316) Acc@5 92.285 (92.285) [2022-10-08 06:40:18 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.140 Acc@5 93.738 [2022-10-08 06:40:18 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.1% [2022-10-08 06:40:18 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.14% [2022-10-08 06:40:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][0/1251] eta 1:03:16 lr 0.000377 time 3.0351 (3.0351) loss 3.6025 (3.6025) grad_norm 1.7308 (1.7308) [2022-10-08 06:40:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[175/300][100/1251] eta 0:06:49 lr 0.000376 time 0.3289 (0.3561) loss 3.3643 (3.4433) grad_norm 1.6632 (1.7696) [2022-10-08 06:41:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][200/1251] eta 0:05:59 lr 0.000376 time 0.3320 (0.3422) loss 3.1674 (3.4244) grad_norm 1.7346 (1.7656) [2022-10-08 06:41:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][300/1251] eta 0:05:20 lr 0.000376 time 0.3240 (0.3374) loss 3.3322 (3.4220) grad_norm 2.1403 (1.7727) [2022-10-08 06:42:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][400/1251] eta 0:04:45 lr 0.000375 time 0.3317 (0.3351) loss 3.5062 (3.4243) grad_norm 1.6771 (1.7697) [2022-10-08 06:43:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][500/1251] eta 0:04:10 lr 0.000375 time 0.3265 (0.3338) loss 3.3682 (3.4250) grad_norm 1.7280 (1.7665) [2022-10-08 06:43:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][600/1251] eta 0:03:36 lr 0.000374 time 0.3281 (0.3329) loss 3.6106 (3.4267) grad_norm 1.6331 (1.7708) [2022-10-08 06:44:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][700/1251] eta 0:03:03 lr 0.000374 time 0.3236 (0.3324) loss 3.5080 (3.4273) grad_norm 1.8978 (1.7834) [2022-10-08 06:44:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][800/1251] eta 0:02:29 lr 0.000374 time 0.3287 (0.3320) loss 3.5604 (3.4330) grad_norm 1.7610 (1.7798) [2022-10-08 06:45:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][900/1251] eta 0:01:56 lr 0.000373 time 0.3235 (0.3318) loss 3.2276 (3.4335) grad_norm 1.8899 (1.7805) [2022-10-08 06:45:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][1000/1251] eta 0:01:23 lr 0.000373 time 0.3280 (0.3316) loss 3.6456 (3.4340) grad_norm 1.7380 (1.7838) [2022-10-08 06:46:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][1100/1251] eta 0:00:50 lr 0.000372 time 0.3374 (0.3316) loss 3.4659 (3.4364) grad_norm 
1.6974 (1.7842) [2022-10-08 06:46:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [175/300][1200/1251] eta 0:00:16 lr 0.000372 time 0.3266 (0.3315) loss 3.5116 (3.4373) grad_norm 1.8023 (1.7841) [2022-10-08 06:47:13 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 175 training takes 0:06:55 [2022-10-08 06:47:16 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.583 (3.583) Loss 0.8314 (0.8314) Acc@1 80.664 (80.664) Acc@5 95.605 (95.605) [2022-10-08 06:47:27 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 76.928 Acc@5 93.878 [2022-10-08 06:47:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-10-08 06:47:27 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.14% [2022-10-08 06:47:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][0/1251] eta 1:06:50 lr 0.000372 time 3.2055 (3.2055) loss 3.5013 (3.5013) grad_norm 1.7373 (1.7373) [2022-10-08 06:48:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][100/1251] eta 0:06:49 lr 0.000371 time 0.3282 (0.3557) loss 3.4515 (3.4308) grad_norm 1.8317 (1.8321) [2022-10-08 06:48:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][200/1251] eta 0:05:58 lr 0.000371 time 0.3245 (0.3411) loss 3.3936 (3.4282) grad_norm 1.7190 (1.7805) [2022-10-08 06:49:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][300/1251] eta 0:05:19 lr 0.000371 time 0.3247 (0.3358) loss 3.3726 (3.4301) grad_norm 1.7941 (1.7806) [2022-10-08 06:49:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][400/1251] eta 0:04:43 lr 0.000370 time 0.3303 (0.3331) loss 3.1482 (3.4375) grad_norm 2.2019 (1.8080) [2022-10-08 06:50:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][500/1251] eta 0:04:08 lr 0.000370 time 0.3196 (0.3315) loss 3.3897 (3.4384) grad_norm 2.0644 (1.8033) [2022-10-08 06:50:45 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [176/300][600/1251] eta 0:03:34 lr 0.000369 time 0.3217 (0.3302) loss 3.5558 (3.4401) grad_norm 1.6600 (1.7969) [2022-10-08 06:51:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][700/1251] eta 0:03:01 lr 0.000369 time 0.3224 (0.3295) loss 3.5372 (3.4357) grad_norm 1.6317 (1.7909) [2022-10-08 06:51:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][800/1251] eta 0:02:28 lr 0.000369 time 0.3253 (0.3289) loss 3.4658 (3.4354) grad_norm 1.9018 (1.7909) [2022-10-08 06:52:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][900/1251] eta 0:01:55 lr 0.000368 time 0.3215 (0.3284) loss 3.3520 (3.4332) grad_norm 1.9747 (1.7891) [2022-10-08 06:52:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][1000/1251] eta 0:01:22 lr 0.000368 time 0.3219 (0.3280) loss 3.6425 (3.4350) grad_norm 1.8401 (1.7915) [2022-10-08 06:53:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][1100/1251] eta 0:00:49 lr 0.000368 time 0.3224 (0.3277) loss 3.4155 (3.4352) grad_norm 1.8868 (1.7984) [2022-10-08 06:54:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [176/300][1200/1251] eta 0:00:16 lr 0.000367 time 0.3294 (0.3276) loss 3.2501 (3.4378) grad_norm 2.1029 (1.8002) [2022-10-08 06:54:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 176 training takes 0:06:50 [2022-10-08 06:54:20 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.325 (3.325) Loss 0.9920 (0.9920) Acc@1 75.879 (75.879) Acc@5 93.848 (93.848) [2022-10-08 06:54:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.048 Acc@5 93.754 [2022-10-08 06:54:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-08 06:54:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.14% [2022-10-08 06:54:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][0/1251] eta 0:59:28 lr 0.000367 time 2.8524 
(2.8524) loss 3.7633 (3.7633) grad_norm 2.0918 (2.0918) [2022-10-08 06:55:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][100/1251] eta 0:06:44 lr 0.000367 time 0.3250 (0.3512) loss 3.5557 (3.4403) grad_norm 1.7789 (1.8069) [2022-10-08 06:55:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][200/1251] eta 0:05:55 lr 0.000366 time 0.3302 (0.3385) loss 3.3034 (3.4216) grad_norm 1.4861 (1.7961) [2022-10-08 06:56:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][300/1251] eta 0:05:17 lr 0.000366 time 0.3308 (0.3343) loss 3.3641 (3.4190) grad_norm 1.7453 (1.8064) [2022-10-08 06:56:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][400/1251] eta 0:04:42 lr 0.000365 time 0.3242 (0.3321) loss 3.6118 (3.4117) grad_norm 1.8183 (1.7949) [2022-10-08 06:57:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][500/1251] eta 0:04:08 lr 0.000365 time 0.3305 (0.3308) loss 3.2950 (3.4082) grad_norm 2.1786 (1.8037) [2022-10-08 06:57:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][600/1251] eta 0:03:34 lr 0.000365 time 0.3278 (0.3300) loss 3.5563 (3.4152) grad_norm 1.5478 (1.8065) [2022-10-08 06:58:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][700/1251] eta 0:03:01 lr 0.000364 time 0.3248 (0.3298) loss 3.1831 (3.4160) grad_norm 1.7367 (1.8080) [2022-10-08 06:58:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][800/1251] eta 0:02:28 lr 0.000364 time 0.3274 (0.3292) loss 3.2561 (3.4205) grad_norm 1.5693 (1.8025) [2022-10-08 06:59:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][900/1251] eta 0:01:55 lr 0.000363 time 0.3299 (0.3289) loss 3.3209 (3.4204) grad_norm 1.7789 (1.7985) [2022-10-08 06:59:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][1000/1251] eta 0:01:22 lr 0.000363 time 0.3203 (0.3286) loss 3.7280 (3.4235) grad_norm 1.7406 (1.7954) [2022-10-08 07:00:32 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][1100/1251] eta 0:00:49 lr 0.000363 time 0.3225 (0.3283) loss 3.1031 (3.4243) grad_norm 1.7719 (1.7919) [2022-10-08 07:01:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [177/300][1200/1251] eta 0:00:16 lr 0.000362 time 0.3220 (0.3281) loss 3.4825 (3.4252) grad_norm 1.4631 (1.7868) [2022-10-08 07:01:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 177 training takes 0:06:50 [2022-10-08 07:01:24 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.533 (2.533) Loss 0.9405 (0.9405) Acc@1 78.613 (78.613) Acc@5 93.750 (93.750) [2022-10-08 07:01:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.264 Acc@5 93.958 [2022-10-08 07:01:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-08 07:01:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.26% [2022-10-08 07:01:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][0/1251] eta 1:06:36 lr 0.000362 time 3.1947 (3.1947) loss 3.5235 (3.5235) grad_norm 1.7262 (1.7262) [2022-10-08 07:02:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][100/1251] eta 0:06:48 lr 0.000362 time 0.3310 (0.3548) loss 3.3910 (3.3785) grad_norm 2.1807 (1.7876) [2022-10-08 07:02:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][200/1251] eta 0:05:58 lr 0.000361 time 0.3243 (0.3407) loss 3.2831 (3.4143) grad_norm 1.7964 (1.8025) [2022-10-08 07:03:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][300/1251] eta 0:05:19 lr 0.000361 time 0.3243 (0.3360) loss 3.4684 (3.4133) grad_norm 2.0511 (1.7934) [2022-10-08 07:03:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][400/1251] eta 0:04:44 lr 0.000360 time 0.3249 (0.3338) loss 3.2525 (3.4184) grad_norm 2.2961 (1.8035) [2022-10-08 07:04:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][500/1251] 
eta 0:04:09 lr 0.000360 time 0.3286 (0.3326) loss 3.1255 (3.4192) grad_norm 1.4574 (1.8013) [2022-10-08 07:04:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][600/1251] eta 0:03:36 lr 0.000360 time 0.3278 (0.3319) loss 3.2137 (3.4180) grad_norm 1.6335 (1.8033) [2022-10-08 07:05:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][700/1251] eta 0:03:02 lr 0.000359 time 0.3376 (0.3315) loss 3.2066 (3.4209) grad_norm 1.8244 (1.8013) [2022-10-08 07:06:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][800/1251] eta 0:02:29 lr 0.000359 time 0.3272 (0.3312) loss 3.6184 (3.4237) grad_norm 1.8110 (1.8033) [2022-10-08 07:06:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][900/1251] eta 0:01:56 lr 0.000358 time 0.3337 (0.3310) loss 3.3406 (3.4232) grad_norm 1.6222 (1.8058) [2022-10-08 07:07:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][1000/1251] eta 0:01:23 lr 0.000358 time 0.3319 (0.3310) loss 3.0572 (3.4218) grad_norm 1.6809 (1.8152) [2022-10-08 07:07:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][1100/1251] eta 0:00:49 lr 0.000358 time 0.3406 (0.3310) loss 3.8681 (3.4239) grad_norm 1.6879 (1.8154) [2022-10-08 07:08:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [178/300][1200/1251] eta 0:00:16 lr 0.000357 time 0.3252 (0.3310) loss 3.5940 (3.4256) grad_norm 2.0343 (1.8131) [2022-10-08 07:08:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 178 training takes 0:06:54 [2022-10-08 07:08:33 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.283 (3.283) Loss 1.0023 (1.0023) Acc@1 75.781 (75.781) Acc@5 93.359 (93.359) [2022-10-08 07:08:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.184 Acc@5 93.878 [2022-10-08 07:08:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-10-08 07:08:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 77.26% [2022-10-08 07:08:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][0/1251] eta 1:01:03 lr 0.000357 time 2.9282 (2.9282) loss 3.3865 (3.3865) grad_norm 1.7727 (1.7727) [2022-10-08 07:09:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][100/1251] eta 0:06:46 lr 0.000357 time 0.3246 (0.3533) loss 3.3962 (3.4104) grad_norm 1.6499 (1.8145) [2022-10-08 07:09:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][200/1251] eta 0:05:57 lr 0.000356 time 0.3285 (0.3397) loss 3.3441 (3.4213) grad_norm 1.8343 (1.8102) [2022-10-08 07:10:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][300/1251] eta 0:05:18 lr 0.000356 time 0.3226 (0.3348) loss 3.5564 (3.4156) grad_norm 1.5669 (1.8264) [2022-10-08 07:10:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][400/1251] eta 0:04:42 lr 0.000355 time 0.3289 (0.3323) loss 3.3876 (3.4162) grad_norm 1.8065 (1.8122) [2022-10-08 07:11:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][500/1251] eta 0:04:08 lr 0.000355 time 0.3228 (0.3309) loss 3.6850 (3.4165) grad_norm 1.9941 (1.7976) [2022-10-08 07:12:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][600/1251] eta 0:03:34 lr 0.000355 time 0.3242 (0.3299) loss 3.5031 (3.4147) grad_norm 1.6270 (1.7998) [2022-10-08 07:12:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][700/1251] eta 0:03:01 lr 0.000354 time 0.3228 (0.3292) loss 3.6023 (3.4165) grad_norm 2.0422 (1.7935) [2022-10-08 07:13:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][800/1251] eta 0:02:28 lr 0.000354 time 0.3260 (0.3286) loss 3.5995 (3.4154) grad_norm 1.6712 (1.8020) [2022-10-08 07:13:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][900/1251] eta 0:01:55 lr 0.000353 time 0.3218 (0.3281) loss 3.7651 (3.4192) grad_norm 2.2106 (1.8053) [2022-10-08 07:14:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[179/300][1000/1251] eta 0:01:22 lr 0.000353 time 0.3274 (0.3278) loss 3.5814 (3.4188) grad_norm 1.8464 (1.8040) [2022-10-08 07:14:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][1100/1251] eta 0:00:49 lr 0.000353 time 0.3260 (0.3275) loss 3.3168 (3.4182) grad_norm 2.2780 (1.8068) [2022-10-08 07:15:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [179/300][1200/1251] eta 0:00:16 lr 0.000352 time 0.3228 (0.3272) loss 3.2750 (3.4175) grad_norm 1.7567 (1.8072) [2022-10-08 07:15:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 179 training takes 0:06:49 [2022-10-08 07:15:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.242 (3.242) Loss 0.9720 (0.9720) Acc@1 77.539 (77.539) Acc@5 93.848 (93.848) [2022-10-08 07:15:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.404 Acc@5 93.936 [2022-10-08 07:15:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-08 07:15:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.40% [2022-10-08 07:15:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][0/1251] eta 0:47:20 lr 0.000352 time 2.2706 (2.2706) loss 3.1180 (3.1180) grad_norm 1.6296 (1.6296) [2022-10-08 07:16:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][100/1251] eta 0:06:42 lr 0.000352 time 0.3264 (0.3500) loss 3.4524 (3.4081) grad_norm 1.7926 (1.8424) [2022-10-08 07:16:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][200/1251] eta 0:05:54 lr 0.000351 time 0.3271 (0.3376) loss 3.3198 (3.3862) grad_norm 1.7969 (1.7933) [2022-10-08 07:17:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][300/1251] eta 0:05:16 lr 0.000351 time 0.3251 (0.3333) loss 3.3999 (3.3879) grad_norm 1.5829 (1.7969) [2022-10-08 07:17:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][400/1251] eta 0:04:41 lr 0.000350 time 0.3384 (0.3314) loss 3.3611 
(3.3924) grad_norm 1.7971 (1.8090) [2022-10-08 07:18:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][500/1251] eta 0:04:07 lr 0.000350 time 0.3241 (0.3301) loss 3.1351 (3.3966) grad_norm 1.8648 (1.8063) [2022-10-08 07:19:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][600/1251] eta 0:03:34 lr 0.000350 time 0.3211 (0.3298) loss 3.3692 (3.4008) grad_norm 1.5345 (1.7995) [2022-10-08 07:19:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][700/1251] eta 0:03:01 lr 0.000349 time 0.3243 (0.3291) loss 3.7171 (3.4017) grad_norm 1.6744 (1.8150) [2022-10-08 07:20:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][800/1251] eta 0:02:28 lr 0.000349 time 0.3290 (0.3286) loss 3.3766 (3.4052) grad_norm 1.9193 (1.8181) [2022-10-08 07:20:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][900/1251] eta 0:01:55 lr 0.000348 time 0.3276 (0.3282) loss 3.3383 (3.4068) grad_norm 2.1721 (1.8189) [2022-10-08 07:21:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][1000/1251] eta 0:01:22 lr 0.000348 time 0.3273 (0.3278) loss 3.5134 (3.4061) grad_norm 1.9550 (1.8184) [2022-10-08 07:21:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][1100/1251] eta 0:00:49 lr 0.000348 time 0.3233 (0.3275) loss 3.6785 (3.4083) grad_norm 1.5376 (1.8194) [2022-10-08 07:22:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [180/300][1200/1251] eta 0:00:16 lr 0.000347 time 0.3278 (0.3272) loss 3.4178 (3.4084) grad_norm 1.6536 (1.8170) [2022-10-08 07:22:36 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 180 training takes 0:06:49 [2022-10-08 07:22:36 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_180 saving...... [2022-10-08 07:22:36 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_180 saved !!! 
[2022-10-08 07:22:39 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.583 (2.583) Loss 0.9809 (0.9809) Acc@1 75.586 (75.586) Acc@5 93.555 (93.555) [2022-10-08 07:22:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.480 Acc@5 94.004 [2022-10-08 07:22:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-10-08 07:22:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.48% [2022-10-08 07:22:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][0/1251] eta 1:07:27 lr 0.000347 time 3.2356 (3.2356) loss 3.1388 (3.1388) grad_norm 1.7648 (1.7648) [2022-10-08 07:23:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][100/1251] eta 0:06:50 lr 0.000347 time 0.3309 (0.3563) loss 3.5463 (3.3892) grad_norm 2.0245 (1.8256) [2022-10-08 07:23:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][200/1251] eta 0:06:00 lr 0.000346 time 0.3358 (0.3428) loss 3.5489 (3.4068) grad_norm 1.9090 (1.8518) [2022-10-08 07:24:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][300/1251] eta 0:05:21 lr 0.000346 time 0.3329 (0.3380) loss 3.3816 (3.4091) grad_norm 1.6623 (1.8362) [2022-10-08 07:25:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][400/1251] eta 0:04:45 lr 0.000346 time 0.3296 (0.3355) loss 3.2160 (3.4029) grad_norm 1.6278 (1.8404) [2022-10-08 07:25:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][500/1251] eta 0:04:10 lr 0.000345 time 0.3241 (0.3340) loss 3.1517 (3.4033) grad_norm 1.7845 (1.8424) [2022-10-08 07:26:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][600/1251] eta 0:03:36 lr 0.000345 time 0.3304 (0.3330) loss 3.4333 (3.4055) grad_norm 1.9448 (1.8399) [2022-10-08 07:26:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][700/1251] eta 0:03:03 lr 0.000344 time 0.3231 (0.3322) loss 3.5526 (3.4085) grad_norm 2.0850 
(1.8449) [2022-10-08 07:27:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][800/1251] eta 0:02:29 lr 0.000344 time 0.3243 (0.3317) loss 3.8900 (3.4109) grad_norm 1.8836 (1.8427) [2022-10-08 07:27:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][900/1251] eta 0:01:56 lr 0.000344 time 0.3244 (0.3314) loss 3.3539 (3.4102) grad_norm 1.9353 (1.8408) [2022-10-08 07:28:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][1000/1251] eta 0:01:23 lr 0.000343 time 0.3319 (0.3312) loss 3.3181 (3.4118) grad_norm 2.2320 (1.8397) [2022-10-08 07:28:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][1100/1251] eta 0:00:49 lr 0.000343 time 0.3384 (0.3311) loss 3.3969 (3.4135) grad_norm 1.6931 (1.8370) [2022-10-08 07:29:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [181/300][1200/1251] eta 0:00:16 lr 0.000342 time 0.3271 (0.3311) loss 3.6177 (3.4117) grad_norm 1.7644 (1.8368) [2022-10-08 07:29:44 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 181 training takes 0:06:54 [2022-10-08 07:29:46 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.616 (2.616) Loss 0.9443 (0.9443) Acc@1 78.418 (78.418) Acc@5 94.629 (94.629) [2022-10-08 07:29:57 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.390 Acc@5 93.972 [2022-10-08 07:29:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-08 07:29:57 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.48% [2022-10-08 07:30:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][0/1251] eta 0:59:09 lr 0.000342 time 2.8376 (2.8376) loss 3.5189 (3.5189) grad_norm 1.8651 (1.8651) [2022-10-08 07:30:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][100/1251] eta 0:06:45 lr 0.000342 time 0.3305 (0.3520) loss 3.3346 (3.3934) grad_norm 1.7224 (1.8428) [2022-10-08 07:31:06 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [182/300][200/1251] eta 0:05:56 lr 0.000341 time 0.3290 (0.3388) loss 3.4691 (3.3943) grad_norm 1.9334 (1.8089) [2022-10-08 07:31:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][300/1251] eta 0:05:17 lr 0.000341 time 0.3281 (0.3343) loss 3.5916 (3.4005) grad_norm 1.9023 (1.8188) [2022-10-08 07:32:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][400/1251] eta 0:04:42 lr 0.000341 time 0.3256 (0.3320) loss 3.5445 (3.4104) grad_norm 1.8192 (1.8231) [2022-10-08 07:32:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][500/1251] eta 0:04:08 lr 0.000340 time 0.3269 (0.3307) loss 3.4947 (3.4099) grad_norm 1.7231 (1.8201) [2022-10-08 07:33:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][600/1251] eta 0:03:34 lr 0.000340 time 0.3231 (0.3297) loss 3.4855 (3.4083) grad_norm 1.8243 (1.8175) [2022-10-08 07:33:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][700/1251] eta 0:03:01 lr 0.000339 time 0.3260 (0.3289) loss 3.4900 (3.4088) grad_norm 1.9263 (1.8118) [2022-10-08 07:34:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][800/1251] eta 0:02:28 lr 0.000339 time 0.3216 (0.3284) loss 3.2878 (3.4063) grad_norm 2.1255 (1.8177) [2022-10-08 07:34:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][900/1251] eta 0:01:55 lr 0.000339 time 0.3234 (0.3279) loss 3.1336 (3.4050) grad_norm 1.8917 (1.8174) [2022-10-08 07:35:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][1000/1251] eta 0:01:22 lr 0.000338 time 0.3228 (0.3276) loss 3.2238 (3.4076) grad_norm 1.5348 (1.8144) [2022-10-08 07:35:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][1100/1251] eta 0:00:49 lr 0.000338 time 0.3263 (0.3273) loss 3.6794 (3.4100) grad_norm 1.9780 (1.8196) [2022-10-08 07:36:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [182/300][1200/1251] eta 0:00:16 lr 0.000338 time 0.3280 (0.3271) loss 3.6819 
(3.4104) grad_norm 1.8558 (1.8186) [2022-10-08 07:36:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 182 training takes 0:06:49 [2022-10-08 07:36:49 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.476 (2.476) Loss 0.9491 (0.9491) Acc@1 77.246 (77.246) Acc@5 94.629 (94.629) [2022-10-08 07:37:00 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.294 Acc@5 94.056 [2022-10-08 07:37:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-08 07:37:00 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.48% [2022-10-08 07:37:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][0/1251] eta 0:59:53 lr 0.000337 time 2.8724 (2.8724) loss 3.4213 (3.4213) grad_norm 1.8976 (1.8976) [2022-10-08 07:37:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][100/1251] eta 0:06:42 lr 0.000337 time 0.3254 (0.3501) loss 3.0693 (3.3986) grad_norm 1.8941 (1.8834) [2022-10-08 07:38:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][200/1251] eta 0:05:54 lr 0.000337 time 0.3233 (0.3373) loss 3.1305 (3.4001) grad_norm 2.6152 (1.8562) [2022-10-08 07:38:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][300/1251] eta 0:05:16 lr 0.000336 time 0.3217 (0.3330) loss 3.5488 (3.3939) grad_norm 1.5637 (1.8620) [2022-10-08 07:39:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][400/1251] eta 0:04:41 lr 0.000336 time 0.3241 (0.3310) loss 3.5910 (3.3965) grad_norm 1.7585 (1.8597) [2022-10-08 07:39:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][500/1251] eta 0:04:08 lr 0.000335 time 0.3251 (0.3302) loss 3.4707 (3.3970) grad_norm 1.8782 (1.8506) [2022-10-08 07:40:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][600/1251] eta 0:03:34 lr 0.000335 time 0.3328 (0.3293) loss 3.3047 (3.3970) grad_norm 2.2772 (1.8528) [2022-10-08 07:40:51 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][700/1251] eta 0:03:01 lr 0.000335 time 0.3245 (0.3288) loss 3.4842 (3.3972) grad_norm 1.7189 (1.8540) [2022-10-08 07:41:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][800/1251] eta 0:02:28 lr 0.000334 time 0.3278 (0.3283) loss 3.6377 (3.3992) grad_norm 2.1900 (1.8540) [2022-10-08 07:41:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][900/1251] eta 0:01:55 lr 0.000334 time 0.3269 (0.3280) loss 3.4350 (3.3978) grad_norm 1.6421 (1.8596) [2022-10-08 07:42:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][1000/1251] eta 0:01:22 lr 0.000333 time 0.3269 (0.3276) loss 3.2067 (3.3974) grad_norm 1.6684 (1.8570) [2022-10-08 07:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][1100/1251] eta 0:00:49 lr 0.000333 time 0.3258 (0.3274) loss 2.9723 (3.3994) grad_norm 1.7680 (1.8584) [2022-10-08 07:43:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [183/300][1200/1251] eta 0:00:16 lr 0.000333 time 0.3221 (0.3272) loss 3.7296 (3.3970) grad_norm 1.7069 (1.8593) [2022-10-08 07:43:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 183 training takes 0:06:49 [2022-10-08 07:43:53 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.198 (3.198) Loss 0.9766 (0.9766) Acc@1 75.879 (75.879) Acc@5 93.750 (93.750) [2022-10-08 07:44:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.414 Acc@5 93.972 [2022-10-08 07:44:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-08 07:44:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.48% [2022-10-08 07:44:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][0/1251] eta 1:00:02 lr 0.000332 time 2.8794 (2.8794) loss 2.9449 (2.9449) grad_norm 2.0345 (2.0345) [2022-10-08 07:44:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][100/1251] 
eta 0:06:47 lr 0.000332 time 0.3255 (0.3539) loss 2.9827 (3.3725) grad_norm 1.6386 (1.8456) [2022-10-08 07:45:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][200/1251] eta 0:05:58 lr 0.000332 time 0.3272 (0.3410) loss 3.0867 (3.3819) grad_norm 1.6932 (1.8741) [2022-10-08 07:45:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][300/1251] eta 0:05:20 lr 0.000331 time 0.3289 (0.3367) loss 3.3226 (3.3866) grad_norm 1.8040 (1.8694) [2022-10-08 07:46:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][400/1251] eta 0:04:44 lr 0.000331 time 0.3309 (0.3347) loss 3.1592 (3.3848) grad_norm 1.9088 (1.8753) [2022-10-08 07:46:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][500/1251] eta 0:04:10 lr 0.000331 time 0.3399 (0.3335) loss 3.5361 (3.3920) grad_norm 1.7738 (1.8742) [2022-10-08 07:47:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][600/1251] eta 0:03:36 lr 0.000330 time 0.3295 (0.3329) loss 3.1944 (3.3937) grad_norm 1.6056 (1.8765) [2022-10-08 07:47:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][700/1251] eta 0:03:03 lr 0.000330 time 0.3251 (0.3326) loss 3.1146 (3.3936) grad_norm 1.7886 (1.8776) [2022-10-08 07:48:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][800/1251] eta 0:02:29 lr 0.000329 time 0.3274 (0.3323) loss 3.3937 (3.3945) grad_norm 1.8222 (1.8769) [2022-10-08 07:49:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][900/1251] eta 0:01:56 lr 0.000329 time 0.3346 (0.3323) loss 3.5455 (3.3937) grad_norm 2.0262 (1.8796) [2022-10-08 07:49:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][1000/1251] eta 0:01:23 lr 0.000329 time 0.3268 (0.3323) loss 3.2389 (3.3940) grad_norm 1.5778 (1.8781) [2022-10-08 07:50:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][1100/1251] eta 0:00:50 lr 0.000328 time 0.3335 (0.3322) loss 3.4415 (3.3950) grad_norm 1.9048 (1.8741) 
[2022-10-08 07:50:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [184/300][1200/1251] eta 0:00:16 lr 0.000328 time 0.3275 (0.3321) loss 3.3129 (3.3952) grad_norm 2.0097 (1.8734) [2022-10-08 07:50:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 184 training takes 0:06:55 [2022-10-08 07:51:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.213 (3.213) Loss 0.9965 (0.9965) Acc@1 76.758 (76.758) Acc@5 93.652 (93.652) [2022-10-08 07:51:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.704 Acc@5 94.030 [2022-10-08 07:51:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-08 07:51:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.70% [2022-10-08 07:51:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][0/1251] eta 0:54:44 lr 0.000328 time 2.6258 (2.6258) loss 3.4839 (3.4839) grad_norm 1.7967 (1.7967) [2022-10-08 07:51:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][100/1251] eta 0:06:43 lr 0.000327 time 0.3267 (0.3510) loss 3.0770 (3.3853) grad_norm 2.1739 (1.8495) [2022-10-08 07:52:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][200/1251] eta 0:05:57 lr 0.000327 time 0.3266 (0.3399) loss 3.3155 (3.3784) grad_norm 1.5959 (1.8431) [2022-10-08 07:52:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][300/1251] eta 0:05:19 lr 0.000326 time 0.3268 (0.3356) loss 3.4328 (3.3718) grad_norm 1.7298 (1.8340) [2022-10-08 07:53:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][400/1251] eta 0:04:43 lr 0.000326 time 0.3289 (0.3336) loss 3.3305 (3.3790) grad_norm 2.1350 (1.8391) [2022-10-08 07:54:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][500/1251] eta 0:04:09 lr 0.000326 time 0.3303 (0.3323) loss 3.4416 (3.3815) grad_norm 1.7994 (1.8465) [2022-10-08 07:54:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[185/300][600/1251] eta 0:03:35 lr 0.000325 time 0.3276 (0.3314) loss 3.2702 (3.3797) grad_norm 1.7594 (1.8398) [2022-10-08 07:55:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][700/1251] eta 0:03:02 lr 0.000325 time 0.3269 (0.3310) loss 3.2672 (3.3841) grad_norm 1.7457 (1.8447) [2022-10-08 07:55:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][800/1251] eta 0:02:29 lr 0.000325 time 0.3270 (0.3312) loss 3.3514 (3.3860) grad_norm 1.5497 (1.8415) [2022-10-08 07:56:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][900/1251] eta 0:01:56 lr 0.000324 time 0.3275 (0.3306) loss 3.1152 (3.3849) grad_norm 1.5374 (1.8439) [2022-10-08 07:56:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][1000/1251] eta 0:01:22 lr 0.000324 time 0.3224 (0.3302) loss 3.8440 (3.3850) grad_norm 2.0583 (1.8431) [2022-10-08 07:57:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][1100/1251] eta 0:00:49 lr 0.000323 time 0.3290 (0.3303) loss 3.2347 (3.3861) grad_norm 2.1683 (1.8451) [2022-10-08 07:57:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [185/300][1200/1251] eta 0:00:16 lr 0.000323 time 0.3270 (0.3300) loss 3.4258 (3.3867) grad_norm 1.6501 (1.8468) [2022-10-08 07:58:06 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 185 training takes 0:06:52 [2022-10-08 07:58:09 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.735 (2.735) Loss 0.9632 (0.9632) Acc@1 76.953 (76.953) Acc@5 93.848 (93.848) [2022-10-08 07:58:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.626 Acc@5 94.028 [2022-10-08 07:58:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-08 07:58:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.70% [2022-10-08 07:58:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][0/1251] eta 1:00:02 lr 0.000323 time 2.8797 (2.8797) loss 3.2152 
(3.2152) grad_norm 2.2777 (2.2777) [2022-10-08 07:58:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][100/1251] eta 0:06:49 lr 0.000322 time 0.3363 (0.3557) loss 3.4732 (3.3566) grad_norm 1.6829 (1.8678) [2022-10-08 07:59:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][200/1251] eta 0:05:58 lr 0.000322 time 0.3258 (0.3416) loss 3.3607 (3.3669) grad_norm 1.6159 (1.8630) [2022-10-08 08:00:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][300/1251] eta 0:05:20 lr 0.000322 time 0.3277 (0.3370) loss 3.2519 (3.3724) grad_norm 1.8212 (1.8704) [2022-10-08 08:00:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][400/1251] eta 0:04:44 lr 0.000321 time 0.3232 (0.3347) loss 3.4262 (3.3766) grad_norm 2.1493 (1.8770) [2022-10-08 08:01:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][500/1251] eta 0:04:10 lr 0.000321 time 0.3322 (0.3331) loss 3.3205 (3.3765) grad_norm 1.6706 (1.8695) [2022-10-08 08:01:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][600/1251] eta 0:03:36 lr 0.000320 time 0.3266 (0.3323) loss 3.5899 (3.3773) grad_norm 2.2811 (1.8597) [2022-10-08 08:02:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][700/1251] eta 0:03:02 lr 0.000320 time 0.3371 (0.3315) loss 3.0754 (3.3822) grad_norm 1.8415 (1.8583) [2022-10-08 08:02:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][800/1251] eta 0:02:29 lr 0.000320 time 0.3229 (0.3310) loss 3.4052 (3.3798) grad_norm 1.8261 (1.8548) [2022-10-08 08:03:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][900/1251] eta 0:01:55 lr 0.000319 time 0.3280 (0.3305) loss 3.3415 (3.3810) grad_norm 1.5880 (1.8556) [2022-10-08 08:03:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][1000/1251] eta 0:01:22 lr 0.000319 time 0.3200 (0.3301) loss 3.1997 (3.3820) grad_norm 1.7429 (1.8627) [2022-10-08 08:04:23 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [186/300][1100/1251] eta 0:00:49 lr 0.000319 time 0.3260 (0.3298) loss 3.6500 (3.3813) grad_norm 1.8965 (1.8574) [2022-10-08 08:04:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [186/300][1200/1251] eta 0:00:16 lr 0.000318 time 0.3260 (0.3296) loss 3.6176 (3.3830) grad_norm 2.2265 (1.8601) [2022-10-08 08:05:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 186 training takes 0:06:52 [2022-10-08 08:05:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.204 (3.204) Loss 0.9443 (0.9443) Acc@1 76.270 (76.270) Acc@5 94.531 (94.531) [2022-10-08 08:05:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.710 Acc@5 94.046 [2022-10-08 08:05:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-08 08:05:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.71% [2022-10-08 08:05:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][0/1251] eta 1:02:11 lr 0.000318 time 2.9824 (2.9824) loss 3.2992 (3.2992) grad_norm 1.7927 (1.7927) [2022-10-08 08:06:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][100/1251] eta 0:06:48 lr 0.000318 time 0.3242 (0.3547) loss 3.1420 (3.3500) grad_norm 1.9702 (1.8587) [2022-10-08 08:06:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][200/1251] eta 0:05:59 lr 0.000317 time 0.3256 (0.3417) loss 3.4600 (3.3784) grad_norm 1.9539 (1.8528) [2022-10-08 08:07:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][300/1251] eta 0:05:20 lr 0.000317 time 0.3258 (0.3369) loss 3.6096 (3.3913) grad_norm 2.3011 (1.8616) [2022-10-08 08:07:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][400/1251] eta 0:04:44 lr 0.000316 time 0.3232 (0.3344) loss 3.6277 (3.3925) grad_norm 1.8028 (1.8779) [2022-10-08 08:08:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][500/1251] eta 0:04:10 lr 0.000316 time 0.3283 
(0.3329) loss 3.1619 (3.3885) grad_norm 1.8190 (1.8720) [2022-10-08 08:08:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][600/1251] eta 0:03:35 lr 0.000316 time 0.3327 (0.3318) loss 3.6030 (3.3919) grad_norm 2.1741 (1.8683) [2022-10-08 08:09:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][700/1251] eta 0:03:02 lr 0.000315 time 0.3247 (0.3310) loss 3.1358 (3.3890) grad_norm 1.5453 (1.8639) [2022-10-08 08:09:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][800/1251] eta 0:02:29 lr 0.000315 time 0.3236 (0.3304) loss 3.3860 (3.3852) grad_norm 1.7968 (1.8656) [2022-10-08 08:10:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][900/1251] eta 0:01:55 lr 0.000315 time 0.3249 (0.3300) loss 3.3323 (3.3832) grad_norm 2.0136 (1.8620) [2022-10-08 08:10:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][1000/1251] eta 0:01:22 lr 0.000314 time 0.3226 (0.3297) loss 3.3658 (3.3845) grad_norm 2.7157 (1.8630) [2022-10-08 08:11:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][1100/1251] eta 0:00:49 lr 0.000314 time 0.3247 (0.3295) loss 3.6210 (3.3858) grad_norm 2.1401 (1.8683) [2022-10-08 08:12:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [187/300][1200/1251] eta 0:00:16 lr 0.000313 time 0.3220 (0.3293) loss 3.5598 (3.3884) grad_norm 1.5844 (1.8708) [2022-10-08 08:12:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 187 training takes 0:06:52 [2022-10-08 08:12:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.376 (3.376) Loss 0.8964 (0.8964) Acc@1 79.297 (79.297) Acc@5 94.434 (94.434) [2022-10-08 08:12:32 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.690 Acc@5 94.056 [2022-10-08 08:12:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-08 08:12:32 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.71% [2022-10-08 08:12:35 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][0/1251] eta 1:09:15 lr 0.000313 time 3.3216 (3.3216) loss 3.1369 (3.1369) grad_norm 1.7946 (1.7946) [2022-10-08 08:13:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][100/1251] eta 0:06:50 lr 0.000313 time 0.3266 (0.3568) loss 3.6357 (3.3622) grad_norm 2.0625 (1.8647) [2022-10-08 08:13:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][200/1251] eta 0:05:59 lr 0.000312 time 0.3263 (0.3423) loss 3.3936 (3.3778) grad_norm 2.2412 (1.9152) [2022-10-08 08:14:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][300/1251] eta 0:05:20 lr 0.000312 time 0.3223 (0.3372) loss 3.5990 (3.3799) grad_norm 2.0622 (1.9124) [2022-10-08 08:14:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][400/1251] eta 0:04:44 lr 0.000312 time 0.3277 (0.3346) loss 3.2915 (3.3819) grad_norm 1.7883 (1.9067) [2022-10-08 08:15:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][500/1251] eta 0:04:10 lr 0.000311 time 0.3284 (0.3330) loss 3.2298 (3.3795) grad_norm 2.6023 (1.9223) [2022-10-08 08:15:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][600/1251] eta 0:03:36 lr 0.000311 time 0.3259 (0.3319) loss 2.8416 (3.3733) grad_norm 1.9314 (1.9128) [2022-10-08 08:16:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][700/1251] eta 0:03:02 lr 0.000311 time 0.3326 (0.3310) loss 3.3953 (3.3764) grad_norm 2.1538 (1.9050) [2022-10-08 08:16:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][800/1251] eta 0:02:28 lr 0.000310 time 0.3253 (0.3303) loss 3.5824 (3.3738) grad_norm 1.7756 (1.9061) [2022-10-08 08:17:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][900/1251] eta 0:01:55 lr 0.000310 time 0.3281 (0.3299) loss 3.2219 (3.3760) grad_norm 1.8114 (1.9066) [2022-10-08 08:18:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][1000/1251] eta 0:01:22 lr 0.000309 
time 0.3286 (0.3296) loss 3.0718 (3.3741) grad_norm 1.6268 (1.9084) [2022-10-08 08:18:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][1100/1251] eta 0:00:49 lr 0.000309 time 0.3267 (0.3296) loss 3.5489 (3.3735) grad_norm 1.7079 (1.9064) [2022-10-08 08:19:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [188/300][1200/1251] eta 0:00:16 lr 0.000309 time 0.3315 (0.3296) loss 3.5314 (3.3748) grad_norm 1.9806 (1.9065) [2022-10-08 08:19:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 188 training takes 0:06:52 [2022-10-08 08:19:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.351 (3.351) Loss 0.9852 (0.9852) Acc@1 75.977 (75.977) Acc@5 94.531 (94.531) [2022-10-08 08:19:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.704 Acc@5 94.178 [2022-10-08 08:19:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-08 08:19:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.71% [2022-10-08 08:19:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][0/1251] eta 1:00:26 lr 0.000308 time 2.8989 (2.8989) loss 3.1843 (3.1843) grad_norm 1.7975 (1.7975) [2022-10-08 08:20:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][100/1251] eta 0:06:46 lr 0.000308 time 0.3314 (0.3536) loss 3.1325 (3.3679) grad_norm 1.5493 (1.8501) [2022-10-08 08:20:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][200/1251] eta 0:05:57 lr 0.000308 time 0.3231 (0.3397) loss 3.2642 (3.3637) grad_norm 2.0141 (1.8591) [2022-10-08 08:21:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][300/1251] eta 0:05:19 lr 0.000307 time 0.3253 (0.3357) loss 3.5445 (3.3764) grad_norm 2.1463 (1.8969) [2022-10-08 08:21:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][400/1251] eta 0:04:43 lr 0.000307 time 0.3221 (0.3332) loss 3.6020 (3.3676) grad_norm 1.8510 (1.9038) [2022-10-08 
08:22:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][500/1251] eta 0:04:08 lr 0.000307 time 0.3323 (0.3315) loss 3.6426 (3.3669) grad_norm 1.8457 (1.9062) [2022-10-08 08:22:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][600/1251] eta 0:03:35 lr 0.000306 time 0.3283 (0.3304) loss 3.5319 (3.3670) grad_norm 1.8693 (1.9101) [2022-10-08 08:23:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][700/1251] eta 0:03:01 lr 0.000306 time 0.3247 (0.3297) loss 3.1975 (3.3639) grad_norm 1.9135 (1.9063) [2022-10-08 08:24:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][800/1251] eta 0:02:28 lr 0.000305 time 0.3234 (0.3290) loss 3.1837 (3.3644) grad_norm 1.6025 (1.9020) [2022-10-08 08:24:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][900/1251] eta 0:01:55 lr 0.000305 time 0.3244 (0.3284) loss 3.2075 (3.3638) grad_norm 1.8335 (1.9008) [2022-10-08 08:25:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][1000/1251] eta 0:01:22 lr 0.000305 time 0.3218 (0.3279) loss 3.4554 (3.3637) grad_norm 1.7800 (1.9016) [2022-10-08 08:25:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][1100/1251] eta 0:00:49 lr 0.000304 time 0.3206 (0.3276) loss 3.4020 (3.3636) grad_norm 2.0674 (1.9042) [2022-10-08 08:26:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [189/300][1200/1251] eta 0:00:16 lr 0.000304 time 0.3276 (0.3274) loss 3.3299 (3.3657) grad_norm 1.9749 (1.9015) [2022-10-08 08:26:28 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 189 training takes 0:06:49 [2022-10-08 08:26:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.742 (2.742) Loss 0.9636 (0.9636) Acc@1 78.125 (78.125) Acc@5 93.457 (93.457) [2022-10-08 08:26:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.758 Acc@5 93.974 [2022-10-08 08:26:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 77.8% [2022-10-08 08:26:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.76% [2022-10-08 08:26:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][0/1251] eta 1:00:40 lr 0.000304 time 2.9100 (2.9100) loss 3.3671 (3.3671) grad_norm 1.7334 (1.7334) [2022-10-08 08:27:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][100/1251] eta 0:06:44 lr 0.000303 time 0.3235 (0.3516) loss 3.4149 (3.3753) grad_norm 1.9610 (1.9394) [2022-10-08 08:27:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][200/1251] eta 0:05:55 lr 0.000303 time 0.3289 (0.3385) loss 3.3926 (3.3745) grad_norm 1.7981 (1.9119) [2022-10-08 08:28:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][300/1251] eta 0:05:18 lr 0.000303 time 0.3226 (0.3347) loss 3.3743 (3.3697) grad_norm 1.8011 (1.9144) [2022-10-08 08:28:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][400/1251] eta 0:04:43 lr 0.000302 time 0.3269 (0.3331) loss 3.2209 (3.3720) grad_norm 1.5350 (1.9020) [2022-10-08 08:29:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][500/1251] eta 0:04:09 lr 0.000302 time 0.3259 (0.3325) loss 3.2664 (3.3766) grad_norm 2.2016 (1.9005) [2022-10-08 08:30:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][600/1251] eta 0:03:36 lr 0.000301 time 0.3416 (0.3325) loss 3.5232 (3.3742) grad_norm 2.0016 (1.9057) [2022-10-08 08:30:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][700/1251] eta 0:03:03 lr 0.000301 time 0.3392 (0.3324) loss 3.4275 (3.3717) grad_norm 1.8727 (1.9034) [2022-10-08 08:31:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][800/1251] eta 0:02:29 lr 0.000301 time 0.3394 (0.3323) loss 3.6358 (3.3702) grad_norm 1.7456 (1.9071) [2022-10-08 08:31:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][900/1251] eta 0:01:56 lr 0.000300 time 0.3375 (0.3322) loss 3.1191 (3.3694) grad_norm 1.9450 
(1.9039) [2022-10-08 08:32:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][1000/1251] eta 0:01:23 lr 0.000300 time 0.3317 (0.3320) loss 3.2654 (3.3710) grad_norm 1.8465 (1.9034) [2022-10-08 08:32:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][1100/1251] eta 0:00:50 lr 0.000300 time 0.3366 (0.3318) loss 3.5529 (3.3710) grad_norm 1.8190 (1.9049) [2022-10-08 08:33:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [190/300][1200/1251] eta 0:00:16 lr 0.000299 time 0.3243 (0.3317) loss 3.3175 (3.3710) grad_norm 1.6292 (1.9046) [2022-10-08 08:33:37 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 190 training takes 0:06:55 [2022-10-08 08:33:37 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_190 saving...... [2022-10-08 08:33:38 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_190 saved !!! [2022-10-08 08:33:40 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.331 (2.331) Loss 0.9123 (0.9123) Acc@1 79.395 (79.395) Acc@5 94.922 (94.922) [2022-10-08 08:33:51 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.750 Acc@5 94.076 [2022-10-08 08:33:51 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-08 08:33:51 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.76% [2022-10-08 08:33:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][0/1251] eta 1:08:45 lr 0.000299 time 3.2981 (3.2981) loss 3.3457 (3.3457) grad_norm 1.7919 (1.7919) [2022-10-08 08:34:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][100/1251] eta 0:06:52 lr 0.000299 time 0.3288 (0.3586) loss 3.6272 (3.3165) grad_norm 1.7083 (1.8854) [2022-10-08 08:35:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][200/1251] eta 0:06:01 lr 0.000298 time 0.3289 (0.3443) loss 2.9809 
(3.3225) grad_norm 1.9508 (1.8975) [2022-10-08 08:35:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][300/1251] eta 0:05:22 lr 0.000298 time 0.3264 (0.3391) loss 3.3602 (3.3382) grad_norm 2.3992 (1.9104) [2022-10-08 08:36:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][400/1251] eta 0:04:46 lr 0.000297 time 0.3288 (0.3362) loss 3.4902 (3.3490) grad_norm 1.8135 (1.9127) [2022-10-08 08:36:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][500/1251] eta 0:04:11 lr 0.000297 time 0.3281 (0.3345) loss 3.3247 (3.3508) grad_norm 2.1396 (1.9112) [2022-10-08 08:37:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][600/1251] eta 0:03:37 lr 0.000297 time 0.3267 (0.3334) loss 3.4753 (3.3530) grad_norm 2.0977 (1.9286) [2022-10-08 08:37:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][700/1251] eta 0:03:03 lr 0.000296 time 0.3313 (0.3326) loss 3.3684 (3.3569) grad_norm 1.6606 (1.9289) [2022-10-08 08:38:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][800/1251] eta 0:02:29 lr 0.000296 time 0.3321 (0.3321) loss 3.6738 (3.3601) grad_norm 1.7856 (1.9256) [2022-10-08 08:38:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][900/1251] eta 0:01:56 lr 0.000296 time 0.3294 (0.3319) loss 3.3583 (3.3634) grad_norm 1.8997 (1.9234) [2022-10-08 08:39:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][1000/1251] eta 0:01:23 lr 0.000295 time 0.3297 (0.3318) loss 3.7467 (3.3643) grad_norm 1.8715 (1.9236) [2022-10-08 08:39:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][1100/1251] eta 0:00:50 lr 0.000295 time 0.3296 (0.3319) loss 3.3755 (3.3650) grad_norm 2.0180 (1.9229) [2022-10-08 08:40:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [191/300][1200/1251] eta 0:00:16 lr 0.000294 time 0.3391 (0.3320) loss 3.1766 (3.3664) grad_norm 1.7043 (1.9222) [2022-10-08 08:40:47 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 191 training takes 0:06:55 [2022-10-08 08:40:50 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.000 (3.000) Loss 0.8471 (0.8471) Acc@1 78.809 (78.809) Acc@5 95.215 (95.215) [2022-10-08 08:41:01 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.970 Acc@5 94.186 [2022-10-08 08:41:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-08 08:41:01 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 77.97% [2022-10-08 08:41:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][0/1251] eta 0:57:52 lr 0.000294 time 2.7755 (2.7755) loss 3.4123 (3.4123) grad_norm 1.8680 (1.8680) [2022-10-08 08:41:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][100/1251] eta 0:06:46 lr 0.000294 time 0.3302 (0.3534) loss 3.3482 (3.3333) grad_norm 2.2245 (1.9437) [2022-10-08 08:42:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][200/1251] eta 0:05:58 lr 0.000293 time 0.3240 (0.3415) loss 3.4343 (3.3475) grad_norm 1.7306 (1.9345) [2022-10-08 08:42:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][300/1251] eta 0:05:20 lr 0.000293 time 0.3307 (0.3368) loss 3.2062 (3.3568) grad_norm 1.6957 (1.9305) [2022-10-08 08:43:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][400/1251] eta 0:04:44 lr 0.000293 time 0.3236 (0.3345) loss 3.3685 (3.3490) grad_norm 2.2441 (1.9334) [2022-10-08 08:43:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][500/1251] eta 0:04:09 lr 0.000292 time 0.3258 (0.3329) loss 3.5761 (3.3474) grad_norm 2.1586 (1.9406) [2022-10-08 08:44:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][600/1251] eta 0:03:35 lr 0.000292 time 0.3246 (0.3316) loss 3.5819 (3.3513) grad_norm 1.8632 (1.9458) [2022-10-08 08:44:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][700/1251] eta 0:03:02 lr 0.000292 time 0.3246 
(0.3307) loss 3.1537 (3.3539) grad_norm 1.8694 (1.9457) [2022-10-08 08:45:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][800/1251] eta 0:02:28 lr 0.000291 time 0.3221 (0.3301) loss 3.2499 (3.3542) grad_norm 2.0433 (1.9465) [2022-10-08 08:45:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][900/1251] eta 0:01:55 lr 0.000291 time 0.3244 (0.3296) loss 3.2948 (3.3549) grad_norm 1.8272 (1.9375) [2022-10-08 08:46:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][1000/1251] eta 0:01:22 lr 0.000290 time 0.3231 (0.3293) loss 3.2260 (3.3548) grad_norm 1.9512 (1.9365) [2022-10-08 08:47:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][1100/1251] eta 0:00:49 lr 0.000290 time 0.3251 (0.3292) loss 3.0690 (3.3572) grad_norm 1.7443 (1.9359) [2022-10-08 08:47:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [192/300][1200/1251] eta 0:00:16 lr 0.000290 time 0.3239 (0.3292) loss 3.0010 (3.3594) grad_norm 1.8347 (1.9349) [2022-10-08 08:47:53 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 192 training takes 0:06:52 [2022-10-08 08:47:55 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.597 (2.597) Loss 0.9805 (0.9805) Acc@1 76.855 (76.855) Acc@5 94.727 (94.727) [2022-10-08 08:48:07 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.008 Acc@5 94.122 [2022-10-08 08:48:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-08 08:48:07 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.01% [2022-10-08 08:48:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][0/1251] eta 1:07:27 lr 0.000290 time 3.2357 (3.2357) loss 3.3965 (3.3965) grad_norm 1.5676 (1.5676) [2022-10-08 08:48:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][100/1251] eta 0:06:47 lr 0.000289 time 0.3266 (0.3542) loss 3.4873 (3.3550) grad_norm 2.0365 (1.9318) [2022-10-08 08:49:15 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][200/1251] eta 0:05:57 lr 0.000289 time 0.3438 (0.3404) loss 3.3671 (3.3472) grad_norm 1.8083 (1.9366) [2022-10-08 08:49:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][300/1251] eta 0:05:20 lr 0.000288 time 0.3282 (0.3368) loss 3.0038 (3.3555) grad_norm 1.9695 (1.9470) [2022-10-08 08:50:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][400/1251] eta 0:04:45 lr 0.000288 time 0.3207 (0.3354) loss 3.3907 (3.3553) grad_norm 1.9302 (1.9492) [2022-10-08 08:50:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][500/1251] eta 0:04:11 lr 0.000288 time 0.3353 (0.3349) loss 3.4509 (3.3529) grad_norm 2.0287 (1.9430) [2022-10-08 08:51:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][600/1251] eta 0:03:37 lr 0.000287 time 0.3405 (0.3347) loss 3.3630 (3.3559) grad_norm 1.8897 (1.9387) [2022-10-08 08:52:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][700/1251] eta 0:03:04 lr 0.000287 time 0.3257 (0.3344) loss 3.3843 (3.3545) grad_norm 2.4165 (1.9353) [2022-10-08 08:52:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][800/1251] eta 0:02:30 lr 0.000287 time 0.3260 (0.3340) loss 3.2942 (3.3555) grad_norm 2.3819 (1.9324) [2022-10-08 08:53:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][900/1251] eta 0:01:57 lr 0.000286 time 0.3229 (0.3336) loss 3.7775 (3.3578) grad_norm 1.8957 (1.9356) [2022-10-08 08:53:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][1000/1251] eta 0:01:23 lr 0.000286 time 0.3383 (0.3332) loss 3.1705 (3.3587) grad_norm 2.1286 (1.9363) [2022-10-08 08:54:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][1100/1251] eta 0:00:50 lr 0.000285 time 0.3322 (0.3330) loss 3.4688 (3.3602) grad_norm 1.9378 (1.9374) [2022-10-08 08:54:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [193/300][1200/1251] eta 0:00:16 lr 
0.000285 time 0.3299 (0.3327) loss 3.1208 (3.3594) grad_norm 1.6779 (1.9425) [2022-10-08 08:55:03 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 193 training takes 0:06:56 [2022-10-08 08:55:06 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.725 (2.725) Loss 0.8908 (0.8908) Acc@1 79.590 (79.590) Acc@5 94.434 (94.434) [2022-10-08 08:55:17 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.810 Acc@5 94.138 [2022-10-08 08:55:17 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-10-08 08:55:17 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.01% [2022-10-08 08:55:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][0/1251] eta 1:00:05 lr 0.000285 time 2.8824 (2.8824) loss 3.4144 (3.4144) grad_norm 2.1734 (2.1734) [2022-10-08 08:55:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][100/1251] eta 0:06:46 lr 0.000285 time 0.3318 (0.3529) loss 3.2935 (3.3480) grad_norm 1.7155 (1.9539) [2022-10-08 08:56:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][200/1251] eta 0:05:57 lr 0.000284 time 0.3267 (0.3398) loss 3.4636 (3.3560) grad_norm 1.8112 (1.9715) [2022-10-08 08:56:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][300/1251] eta 0:05:18 lr 0.000284 time 0.3273 (0.3353) loss 3.4086 (3.3458) grad_norm 1.7223 (1.9624) [2022-10-08 08:57:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][400/1251] eta 0:04:43 lr 0.000283 time 0.3269 (0.3330) loss 3.4215 (3.3389) grad_norm 2.0201 (1.9656) [2022-10-08 08:58:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][500/1251] eta 0:04:09 lr 0.000283 time 0.3242 (0.3316) loss 3.5712 (3.3433) grad_norm 1.9469 (1.9661) [2022-10-08 08:58:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][600/1251] eta 0:03:35 lr 0.000283 time 0.3267 (0.3306) loss 3.2705 (3.3431) grad_norm 1.8355 (1.9751) 
[2022-10-08 08:59:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][700/1251] eta 0:03:01 lr 0.000282 time 0.3298 (0.3301) loss 3.4938 (3.3389) grad_norm 2.2396 (1.9736) [2022-10-08 08:59:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][800/1251] eta 0:02:28 lr 0.000282 time 0.3240 (0.3298) loss 3.0275 (3.3394) grad_norm 1.7601 (1.9708) [2022-10-08 09:00:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][900/1251] eta 0:01:55 lr 0.000282 time 0.3248 (0.3298) loss 3.6613 (3.3373) grad_norm 1.8935 (1.9697) [2022-10-08 09:00:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][1000/1251] eta 0:01:22 lr 0.000281 time 0.3285 (0.3296) loss 3.2307 (3.3395) grad_norm 1.9474 (1.9653) [2022-10-08 09:01:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][1100/1251] eta 0:00:49 lr 0.000281 time 0.3351 (0.3296) loss 3.0953 (3.3384) grad_norm 1.8767 (1.9680) [2022-10-08 09:01:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [194/300][1200/1251] eta 0:00:16 lr 0.000280 time 0.3231 (0.3297) loss 3.1520 (3.3395) grad_norm 1.8031 (1.9672) [2022-10-08 09:02:10 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 194 training takes 0:06:52 [2022-10-08 09:02:13 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.167 (3.167) Loss 1.0114 (1.0114) Acc@1 76.270 (76.270) Acc@5 94.434 (94.434) [2022-10-08 09:02:24 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.028 Acc@5 94.270 [2022-10-08 09:02:24 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-08 09:02:24 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.03% [2022-10-08 09:02:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][0/1251] eta 0:54:04 lr 0.000280 time 2.5935 (2.5935) loss 3.4147 (3.4147) grad_norm 1.9444 (1.9444) [2022-10-08 09:03:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[195/300][100/1251] eta 0:06:42 lr 0.000280 time 0.3254 (0.3498) loss 3.6236 (3.3332) grad_norm 2.1013 (1.9848) [2022-10-08 09:03:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][200/1251] eta 0:05:55 lr 0.000280 time 0.3330 (0.3384) loss 3.3657 (3.3298) grad_norm 1.9338 (1.9728) [2022-10-08 09:04:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][300/1251] eta 0:05:18 lr 0.000279 time 0.3217 (0.3351) loss 3.3581 (3.3419) grad_norm 1.9233 (1.9792) [2022-10-08 09:04:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][400/1251] eta 0:04:44 lr 0.000279 time 0.3263 (0.3339) loss 3.2283 (3.3474) grad_norm 2.0186 (1.9763) [2022-10-08 09:05:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][500/1251] eta 0:04:10 lr 0.000278 time 0.3302 (0.3334) loss 3.4719 (3.3463) grad_norm 1.9009 (1.9706) [2022-10-08 09:05:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][600/1251] eta 0:03:36 lr 0.000278 time 0.3312 (0.3333) loss 3.3858 (3.3435) grad_norm 1.7810 (1.9646) [2022-10-08 09:06:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][700/1251] eta 0:03:03 lr 0.000278 time 0.3403 (0.3332) loss 3.3076 (3.3462) grad_norm 2.2509 (1.9685) [2022-10-08 09:06:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][800/1251] eta 0:02:30 lr 0.000277 time 0.3274 (0.3326) loss 3.1835 (3.3472) grad_norm 1.9926 (1.9723) [2022-10-08 09:07:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][900/1251] eta 0:01:56 lr 0.000277 time 0.3257 (0.3322) loss 3.1241 (3.3465) grad_norm 1.6698 (1.9740) [2022-10-08 09:07:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][1000/1251] eta 0:01:23 lr 0.000277 time 0.3298 (0.3315) loss 3.5806 (3.3484) grad_norm 1.8217 (1.9750) [2022-10-08 09:08:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][1100/1251] eta 0:00:49 lr 0.000276 time 0.3259 (0.3309) loss 3.4852 (3.3475) grad_norm 
2.0579 (1.9743) [2022-10-08 09:09:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [195/300][1200/1251] eta 0:00:16 lr 0.000276 time 0.3221 (0.3304) loss 3.0234 (3.3469) grad_norm 1.6420 (1.9764) [2022-10-08 09:09:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 195 training takes 0:06:53 [2022-10-08 09:09:21 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.216 (3.216) Loss 0.8140 (0.8140) Acc@1 80.664 (80.664) Acc@5 95.605 (95.605) [2022-10-08 09:09:31 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 77.972 Acc@5 94.338 [2022-10-08 09:09:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-08 09:09:31 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.03% [2022-10-08 09:09:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][0/1251] eta 0:50:36 lr 0.000276 time 2.4276 (2.4276) loss 3.3668 (3.3668) grad_norm 1.7585 (1.7585) [2022-10-08 09:10:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][100/1251] eta 0:06:43 lr 0.000275 time 0.3280 (0.3505) loss 3.2522 (3.3208) grad_norm 1.9689 (1.9757) [2022-10-08 09:10:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][200/1251] eta 0:05:55 lr 0.000275 time 0.3293 (0.3386) loss 2.9652 (3.3252) grad_norm 1.6233 (1.9942) [2022-10-08 09:11:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][300/1251] eta 0:05:18 lr 0.000275 time 0.3263 (0.3345) loss 3.4346 (3.3245) grad_norm 2.0837 (1.9897) [2022-10-08 09:11:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][400/1251] eta 0:04:42 lr 0.000274 time 0.3260 (0.3324) loss 3.5262 (3.3264) grad_norm 2.5850 (2.0021) [2022-10-08 09:12:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][500/1251] eta 0:04:08 lr 0.000274 time 0.3254 (0.3311) loss 3.7018 (3.3322) grad_norm 2.0538 (2.0196) [2022-10-08 09:12:50 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [196/300][600/1251] eta 0:03:35 lr 0.000273 time 0.3396 (0.3309) loss 3.4007 (3.3346) grad_norm 1.8855 (2.0181) [2022-10-08 09:13:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][700/1251] eta 0:03:02 lr 0.000273 time 0.3232 (0.3304) loss 3.1853 (3.3399) grad_norm 1.9868 (2.0153) [2022-10-08 09:13:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][800/1251] eta 0:02:28 lr 0.000273 time 0.3279 (0.3300) loss 3.3227 (3.3397) grad_norm 2.2519 (2.0133) [2022-10-08 09:14:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][900/1251] eta 0:01:55 lr 0.000272 time 0.3279 (0.3298) loss 3.4498 (3.3377) grad_norm 1.8391 (2.0104) [2022-10-08 09:15:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][1000/1251] eta 0:01:22 lr 0.000272 time 0.3292 (0.3297) loss 3.0583 (3.3399) grad_norm 1.6148 (2.0144) [2022-10-08 09:15:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][1100/1251] eta 0:00:49 lr 0.000272 time 0.3287 (0.3296) loss 3.2751 (3.3402) grad_norm 1.5353 (2.0112) [2022-10-08 09:16:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [196/300][1200/1251] eta 0:00:16 lr 0.000271 time 0.3269 (0.3296) loss 3.5763 (3.3429) grad_norm 1.9382 (2.0145) [2022-10-08 09:16:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 196 training takes 0:06:52 [2022-10-08 09:16:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.337 (3.337) Loss 0.8634 (0.8634) Acc@1 80.371 (80.371) Acc@5 94.922 (94.922) [2022-10-08 09:16:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.064 Acc@5 94.272 [2022-10-08 09:16:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-08 09:16:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.06% [2022-10-08 09:16:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][0/1251] eta 1:00:17 lr 0.000271 time 2.8914 
(2.8914) loss 3.4069 (3.4069) grad_norm 1.7547 (1.7547) [2022-10-08 09:17:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][100/1251] eta 0:06:45 lr 0.000271 time 0.3280 (0.3519) loss 3.4542 (3.3249) grad_norm 1.8922 (2.0277) [2022-10-08 09:17:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][200/1251] eta 0:05:56 lr 0.000270 time 0.3276 (0.3390) loss 3.2736 (3.3305) grad_norm 2.1919 (2.0329) [2022-10-08 09:18:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][300/1251] eta 0:05:18 lr 0.000270 time 0.3247 (0.3347) loss 3.2840 (3.3285) grad_norm 1.9358 (2.0331) [2022-10-08 09:18:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][400/1251] eta 0:04:42 lr 0.000270 time 0.3287 (0.3324) loss 3.1655 (3.3310) grad_norm 1.7580 (2.0242) [2022-10-08 09:19:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][500/1251] eta 0:04:08 lr 0.000269 time 0.3274 (0.3312) loss 3.2482 (3.3346) grad_norm 1.8197 (2.0171) [2022-10-08 09:19:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][600/1251] eta 0:03:35 lr 0.000269 time 0.3247 (0.3304) loss 3.3216 (3.3284) grad_norm 1.8517 (2.0124) [2022-10-08 09:20:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][700/1251] eta 0:03:01 lr 0.000269 time 0.3245 (0.3297) loss 3.4848 (3.3298) grad_norm 2.0124 (2.0111) [2022-10-08 09:21:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][800/1251] eta 0:02:28 lr 0.000268 time 0.3283 (0.3292) loss 3.1409 (3.3299) grad_norm 2.1636 (2.0146) [2022-10-08 09:21:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][900/1251] eta 0:01:55 lr 0.000268 time 0.3277 (0.3289) loss 3.6731 (3.3303) grad_norm 2.0736 (2.0243) [2022-10-08 09:22:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][1000/1251] eta 0:01:22 lr 0.000267 time 0.3294 (0.3285) loss 3.1440 (3.3316) grad_norm 1.9050 (2.0242) [2022-10-08 09:22:40 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][1100/1251] eta 0:00:49 lr 0.000267 time 0.3256 (0.3283) loss 3.1407 (3.3333) grad_norm 1.8539 (2.0218) [2022-10-08 09:23:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [197/300][1200/1251] eta 0:00:16 lr 0.000267 time 0.3250 (0.3284) loss 3.0914 (3.3348) grad_norm 1.6324 (2.0222) [2022-10-08 09:23:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 197 training takes 0:06:51 [2022-10-08 09:23:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.683 (2.683) Loss 0.9848 (0.9848) Acc@1 76.855 (76.855) Acc@5 94.141 (94.141) [2022-10-08 09:23:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.118 Acc@5 94.272 [2022-10-08 09:23:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-08 09:23:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.12% [2022-10-08 09:23:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][0/1251] eta 0:52:17 lr 0.000267 time 2.5077 (2.5077) loss 3.1831 (3.1831) grad_norm 1.7659 (1.7659) [2022-10-08 09:24:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][100/1251] eta 0:06:45 lr 0.000266 time 0.3259 (0.3522) loss 3.1615 (3.3135) grad_norm 1.6781 (1.9893) [2022-10-08 09:24:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][200/1251] eta 0:05:57 lr 0.000266 time 0.3262 (0.3397) loss 3.4916 (3.3269) grad_norm 1.8802 (2.0170) [2022-10-08 09:25:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][300/1251] eta 0:05:19 lr 0.000265 time 0.3374 (0.3361) loss 3.4302 (3.3273) grad_norm 1.7635 (2.0226) [2022-10-08 09:25:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][400/1251] eta 0:04:44 lr 0.000265 time 0.3251 (0.3346) loss 3.3610 (3.3261) grad_norm 1.9011 (2.0197) [2022-10-08 09:26:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][500/1251] 
eta 0:04:10 lr 0.000265 time 0.3272 (0.3340) loss 3.3459 (3.3311) grad_norm 2.3307 (2.0287) [2022-10-08 09:27:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][600/1251] eta 0:03:37 lr 0.000264 time 0.3307 (0.3338) loss 3.3908 (3.3330) grad_norm 1.7792 (2.0377) [2022-10-08 09:27:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][700/1251] eta 0:03:03 lr 0.000264 time 0.3247 (0.3334) loss 3.4093 (3.3337) grad_norm 1.8486 (2.0328) [2022-10-08 09:28:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][800/1251] eta 0:02:30 lr 0.000264 time 0.3362 (0.3332) loss 3.2780 (3.3361) grad_norm 2.0007 (2.0358) [2022-10-08 09:28:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][900/1251] eta 0:01:56 lr 0.000263 time 0.3339 (0.3332) loss 3.3001 (3.3373) grad_norm 1.7660 (2.0371) [2022-10-08 09:29:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][1000/1251] eta 0:01:23 lr 0.000263 time 0.3268 (0.3331) loss 3.2235 (3.3355) grad_norm 2.0944 (2.0423) [2022-10-08 09:29:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][1100/1251] eta 0:00:50 lr 0.000263 time 0.3317 (0.3330) loss 3.4254 (3.3348) grad_norm 1.9749 (2.0416) [2022-10-08 09:30:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [198/300][1200/1251] eta 0:00:16 lr 0.000262 time 0.3323 (0.3328) loss 3.3783 (3.3360) grad_norm 1.7681 (2.0410) [2022-10-08 09:30:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 198 training takes 0:06:56 [2022-10-08 09:30:43 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.994 (2.994) Loss 0.9421 (0.9421) Acc@1 79.199 (79.199) Acc@5 93.555 (93.555) [2022-10-08 09:30:54 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.042 Acc@5 94.238 [2022-10-08 09:30:54 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-08 09:30:54 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 78.12% [2022-10-08 09:30:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][0/1251] eta 0:49:37 lr 0.000262 time 2.3800 (2.3800) loss 2.8358 (2.8358) grad_norm 1.9351 (1.9351) [2022-10-08 09:31:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][100/1251] eta 0:06:47 lr 0.000262 time 0.3284 (0.3541) loss 3.2019 (3.2910) grad_norm 1.9015 (1.9934) [2022-10-08 09:32:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][200/1251] eta 0:05:58 lr 0.000261 time 0.3228 (0.3408) loss 2.9171 (3.3189) grad_norm 1.7325 (2.0306) [2022-10-08 09:32:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][300/1251] eta 0:05:19 lr 0.000261 time 0.3315 (0.3362) loss 3.2619 (3.3157) grad_norm 1.8222 (2.0252) [2022-10-08 09:33:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][400/1251] eta 0:04:44 lr 0.000261 time 0.3268 (0.3340) loss 3.2650 (3.3223) grad_norm 2.4121 (2.0352) [2022-10-08 09:33:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][500/1251] eta 0:04:10 lr 0.000260 time 0.3246 (0.3329) loss 3.3863 (3.3215) grad_norm 1.9570 (2.0394) [2022-10-08 09:34:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][600/1251] eta 0:03:36 lr 0.000260 time 0.3261 (0.3320) loss 3.4082 (3.3227) grad_norm 1.7812 (2.0395) [2022-10-08 09:34:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][700/1251] eta 0:03:02 lr 0.000259 time 0.3313 (0.3315) loss 3.4857 (3.3277) grad_norm 1.8059 (2.0481) [2022-10-08 09:35:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][800/1251] eta 0:02:29 lr 0.000259 time 0.3346 (0.3313) loss 3.5394 (3.3272) grad_norm 2.0343 (2.0410) [2022-10-08 09:35:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][900/1251] eta 0:01:56 lr 0.000259 time 0.3274 (0.3311) loss 2.9366 (3.3275) grad_norm 1.9648 (2.0419) [2022-10-08 09:36:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[199/300][1000/1251] eta 0:01:23 lr 0.000258 time 0.3322 (0.3312) loss 3.3889 (3.3269) grad_norm 2.1685 (2.0412) [2022-10-08 09:36:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][1100/1251] eta 0:00:50 lr 0.000258 time 0.3388 (0.3314) loss 3.2645 (3.3277) grad_norm 1.9292 (2.0427) [2022-10-08 09:37:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [199/300][1200/1251] eta 0:00:16 lr 0.000258 time 0.3324 (0.3316) loss 3.2410 (3.3255) grad_norm 1.9060 (2.0397) [2022-10-08 09:37:49 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 199 training takes 0:06:55 [2022-10-08 09:37:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.399 (2.399) Loss 0.8832 (0.8832) Acc@1 78.711 (78.711) Acc@5 95.117 (95.117) [2022-10-08 09:38:03 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.194 Acc@5 94.268 [2022-10-08 09:38:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-08 09:38:03 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.19% [2022-10-08 09:38:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][0/1251] eta 1:01:05 lr 0.000258 time 2.9304 (2.9304) loss 3.2150 (3.2150) grad_norm 1.9101 (1.9101) [2022-10-08 09:38:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][100/1251] eta 0:06:48 lr 0.000257 time 0.3292 (0.3552) loss 3.1604 (3.3241) grad_norm 2.2348 (2.0805) [2022-10-08 09:39:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][200/1251] eta 0:05:59 lr 0.000257 time 0.3261 (0.3417) loss 3.2172 (3.3173) grad_norm 1.8785 (2.0496) [2022-10-08 09:39:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][300/1251] eta 0:05:20 lr 0.000256 time 0.3296 (0.3374) loss 3.3141 (3.3158) grad_norm 1.7861 (2.0275) [2022-10-08 09:40:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][400/1251] eta 0:04:45 lr 0.000256 time 0.3239 (0.3353) loss 3.0282 
(3.3159) grad_norm 1.7471 (2.0292) [2022-10-08 09:40:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][500/1251] eta 0:04:10 lr 0.000256 time 0.3275 (0.3340) loss 3.1683 (3.3168) grad_norm 1.7518 (2.0294) [2022-10-08 09:41:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][600/1251] eta 0:03:36 lr 0.000255 time 0.3260 (0.3329) loss 3.4688 (3.3145) grad_norm 1.8286 (2.0259) [2022-10-08 09:41:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][700/1251] eta 0:03:03 lr 0.000255 time 0.3274 (0.3322) loss 3.3312 (3.3166) grad_norm 1.9210 (2.0249) [2022-10-08 09:42:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][800/1251] eta 0:02:29 lr 0.000255 time 0.3278 (0.3317) loss 3.3862 (3.3183) grad_norm 2.4270 (2.0351) [2022-10-08 09:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][900/1251] eta 0:01:56 lr 0.000254 time 0.3288 (0.3313) loss 3.4639 (3.3165) grad_norm 2.1498 (2.0419) [2022-10-08 09:43:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][1000/1251] eta 0:01:23 lr 0.000254 time 0.3259 (0.3310) loss 3.3312 (3.3173) grad_norm 1.8561 (2.0398) [2022-10-08 09:44:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][1100/1251] eta 0:00:50 lr 0.000254 time 0.3315 (0.3312) loss 3.2599 (3.3183) grad_norm 1.8716 (2.0389) [2022-10-08 09:44:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [200/300][1200/1251] eta 0:00:16 lr 0.000253 time 0.3306 (0.3311) loss 3.2991 (3.3196) grad_norm 2.0238 (2.0386) [2022-10-08 09:44:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 200 training takes 0:06:54 [2022-10-08 09:44:57 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_200 saving...... [2022-10-08 09:44:58 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_200 saved !!! 
[2022-10-08 09:45:01 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.959 (2.959) Loss 0.9273 (0.9273) Acc@1 78.809 (78.809) Acc@5 93.848 (93.848) [2022-10-08 09:45:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.388 Acc@5 94.326 [2022-10-08 09:45:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-08 09:45:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.39% [2022-10-08 09:45:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][0/1251] eta 0:58:24 lr 0.000253 time 2.8016 (2.8016) loss 3.2463 (3.2463) grad_norm 2.1898 (2.1898) [2022-10-08 09:45:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][100/1251] eta 0:06:44 lr 0.000253 time 0.3250 (0.3516) loss 3.4647 (3.3156) grad_norm 1.9589 (2.0908) [2022-10-08 09:46:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][200/1251] eta 0:05:56 lr 0.000252 time 0.3352 (0.3396) loss 3.4643 (3.3243) grad_norm 1.9355 (2.0601) [2022-10-08 09:46:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][300/1251] eta 0:05:19 lr 0.000252 time 0.3355 (0.3361) loss 3.3263 (3.3168) grad_norm 1.7833 (2.0685) [2022-10-08 09:47:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][400/1251] eta 0:04:44 lr 0.000252 time 0.3277 (0.3347) loss 3.7398 (3.3173) grad_norm 2.1433 (2.0561) [2022-10-08 09:47:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][500/1251] eta 0:04:10 lr 0.000251 time 0.3485 (0.3342) loss 3.3933 (3.3217) grad_norm 1.8566 (2.0586) [2022-10-08 09:48:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][600/1251] eta 0:03:37 lr 0.000251 time 0.3321 (0.3341) loss 3.4233 (3.3211) grad_norm 1.8638 (2.0601) [2022-10-08 09:49:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][700/1251] eta 0:03:04 lr 0.000251 time 0.3431 (0.3341) loss 2.9390 (3.3185) grad_norm 2.0854 
(2.0597) [2022-10-08 09:49:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][800/1251] eta 0:02:30 lr 0.000250 time 0.3252 (0.3340) loss 3.5638 (3.3181) grad_norm 2.1399 (2.0610) [2022-10-08 09:50:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][900/1251] eta 0:01:57 lr 0.000250 time 0.3371 (0.3338) loss 3.4141 (3.3182) grad_norm 2.1785 (2.0615) [2022-10-08 09:50:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][1000/1251] eta 0:01:23 lr 0.000249 time 0.3231 (0.3338) loss 3.3892 (3.3192) grad_norm 2.3988 (2.0536) [2022-10-08 09:51:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][1100/1251] eta 0:00:50 lr 0.000249 time 0.3347 (0.3336) loss 3.4904 (3.3161) grad_norm 2.1658 (2.0532) [2022-10-08 09:51:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [201/300][1200/1251] eta 0:00:17 lr 0.000249 time 0.3295 (0.3334) loss 3.4851 (3.3165) grad_norm 1.7831 (2.0563) [2022-10-08 09:52:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 201 training takes 0:06:57 [2022-10-08 09:52:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.390 (2.390) Loss 0.9804 (0.9804) Acc@1 79.102 (79.102) Acc@5 92.676 (92.676) [2022-10-08 09:52:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.200 Acc@5 94.298 [2022-10-08 09:52:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-08 09:52:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.39% [2022-10-08 09:52:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][0/1251] eta 1:08:16 lr 0.000249 time 3.2742 (3.2742) loss 3.5023 (3.5023) grad_norm 1.9683 (1.9683) [2022-10-08 09:52:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][100/1251] eta 0:06:51 lr 0.000248 time 0.3336 (0.3577) loss 3.3988 (3.3324) grad_norm 1.8333 (2.0476) [2022-10-08 09:53:32 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [202/300][200/1251] eta 0:06:00 lr 0.000248 time 0.3305 (0.3435) loss 3.0537 (3.3170) grad_norm 3.0462 (2.0582) [2022-10-08 09:54:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][300/1251] eta 0:05:22 lr 0.000248 time 0.3286 (0.3389) loss 3.2888 (3.3073) grad_norm 2.1190 (2.0532) [2022-10-08 09:54:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][400/1251] eta 0:04:46 lr 0.000247 time 0.3280 (0.3364) loss 3.4725 (3.3018) grad_norm 1.8272 (2.0439) [2022-10-08 09:55:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][500/1251] eta 0:04:11 lr 0.000247 time 0.3329 (0.3348) loss 3.6313 (3.3031) grad_norm 1.9575 (2.0342) [2022-10-08 09:55:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][600/1251] eta 0:03:37 lr 0.000246 time 0.3244 (0.3336) loss 3.3920 (3.3033) grad_norm 1.8449 (2.0405) [2022-10-08 09:56:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][700/1251] eta 0:03:03 lr 0.000246 time 0.3343 (0.3327) loss 3.2169 (3.3029) grad_norm 1.8729 (2.0385) [2022-10-08 09:56:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][800/1251] eta 0:02:29 lr 0.000246 time 0.3282 (0.3320) loss 3.1584 (3.3024) grad_norm 1.9249 (2.0409) [2022-10-08 09:57:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][900/1251] eta 0:01:56 lr 0.000245 time 0.3328 (0.3315) loss 3.4668 (3.3069) grad_norm 1.9110 (2.0439) [2022-10-08 09:57:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][1000/1251] eta 0:01:23 lr 0.000245 time 0.3287 (0.3311) loss 3.3962 (3.3082) grad_norm 2.0693 (2.0460) [2022-10-08 09:58:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][1100/1251] eta 0:00:49 lr 0.000245 time 0.3320 (0.3308) loss 3.3796 (3.3113) grad_norm 1.7762 (2.0442) [2022-10-08 09:59:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [202/300][1200/1251] eta 0:00:16 lr 0.000244 time 0.3334 (0.3307) loss 3.3417 
(3.3127) grad_norm 2.1790 (2.0442) [2022-10-08 09:59:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 202 training takes 0:06:53 [2022-10-08 09:59:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.634 (2.634) Loss 0.9028 (0.9028) Acc@1 77.734 (77.734) Acc@5 94.434 (94.434) [2022-10-08 09:59:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.416 Acc@5 94.570 [2022-10-08 09:59:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-08 09:59:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.42% [2022-10-08 09:59:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][0/1251] eta 0:46:55 lr 0.000244 time 2.2507 (2.2507) loss 3.0973 (3.0973) grad_norm 1.8111 (1.8111) [2022-10-08 10:00:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][100/1251] eta 0:06:46 lr 0.000244 time 0.3281 (0.3532) loss 3.3155 (3.2706) grad_norm 1.8818 (2.0049) [2022-10-08 10:00:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][200/1251] eta 0:05:57 lr 0.000243 time 0.3285 (0.3403) loss 3.2621 (3.2829) grad_norm 2.2379 (2.0675) [2022-10-08 10:01:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][300/1251] eta 0:05:19 lr 0.000243 time 0.3253 (0.3359) loss 3.3040 (3.2887) grad_norm 2.0049 (2.0738) [2022-10-08 10:01:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][400/1251] eta 0:04:43 lr 0.000243 time 0.3257 (0.3336) loss 3.0987 (3.2936) grad_norm 1.8390 (2.0557) [2022-10-08 10:02:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][500/1251] eta 0:04:09 lr 0.000242 time 0.3244 (0.3321) loss 3.2644 (3.2963) grad_norm 1.9148 (2.0596) [2022-10-08 10:02:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][600/1251] eta 0:03:35 lr 0.000242 time 0.3279 (0.3311) loss 3.3602 (3.2990) grad_norm 2.2322 (2.0706) [2022-10-08 10:03:22 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][700/1251] eta 0:03:02 lr 0.000242 time 0.3231 (0.3304) loss 3.4467 (3.3023) grad_norm 1.9883 (2.0685) [2022-10-08 10:03:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][800/1251] eta 0:02:28 lr 0.000241 time 0.3273 (0.3297) loss 2.9955 (3.3024) grad_norm 1.8822 (2.0645) [2022-10-08 10:04:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][900/1251] eta 0:01:55 lr 0.000241 time 0.3226 (0.3292) loss 3.0538 (3.3017) grad_norm 1.7814 (2.0600) [2022-10-08 10:04:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][1000/1251] eta 0:01:22 lr 0.000241 time 0.3256 (0.3290) loss 3.1338 (3.3024) grad_norm 1.9856 (2.0608) [2022-10-08 10:05:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][1100/1251] eta 0:00:49 lr 0.000240 time 0.3243 (0.3286) loss 3.0513 (3.3045) grad_norm 1.8834 (2.0590) [2022-10-08 10:06:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [203/300][1200/1251] eta 0:00:16 lr 0.000240 time 0.3257 (0.3282) loss 3.3158 (3.3054) grad_norm 2.5048 (2.0622) [2022-10-08 10:06:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 203 training takes 0:06:50 [2022-10-08 10:06:24 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.296 (3.296) Loss 0.8317 (0.8317) Acc@1 79.492 (79.492) Acc@5 95.508 (95.508) [2022-10-08 10:06:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.472 Acc@5 94.476 [2022-10-08 10:06:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-08 10:06:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.47% [2022-10-08 10:06:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][0/1251] eta 0:55:16 lr 0.000240 time 2.6511 (2.6511) loss 3.4294 (3.4294) grad_norm 2.5831 (2.5831) [2022-10-08 10:07:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][100/1251] 
eta 0:06:43 lr 0.000239 time 0.3265 (0.3505) loss 3.5272 (3.3011) grad_norm 2.5169 (2.1178) [2022-10-08 10:07:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][200/1251] eta 0:05:55 lr 0.000239 time 0.3281 (0.3383) loss 3.5941 (3.3005) grad_norm 1.9803 (2.0896) [2022-10-08 10:08:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][300/1251] eta 0:05:17 lr 0.000239 time 0.3280 (0.3344) loss 2.5378 (3.2828) grad_norm 1.8650 (2.0864) [2022-10-08 10:08:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][400/1251] eta 0:04:42 lr 0.000238 time 0.3289 (0.3324) loss 3.3492 (3.2915) grad_norm 2.3873 (2.0683) [2022-10-08 10:09:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][500/1251] eta 0:04:08 lr 0.000238 time 0.3296 (0.3316) loss 3.5270 (3.2923) grad_norm 1.8995 (2.0684) [2022-10-08 10:09:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][600/1251] eta 0:03:35 lr 0.000238 time 0.3289 (0.3313) loss 3.2954 (3.2962) grad_norm 2.1545 (2.0679) [2022-10-08 10:10:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][700/1251] eta 0:03:02 lr 0.000237 time 0.3316 (0.3310) loss 3.2621 (3.2996) grad_norm 2.4147 (2.0686) [2022-10-08 10:11:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][800/1251] eta 0:02:29 lr 0.000237 time 0.3312 (0.3309) loss 3.2013 (3.3044) grad_norm 2.2160 (2.0622) [2022-10-08 10:11:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][900/1251] eta 0:01:56 lr 0.000237 time 0.3329 (0.3309) loss 3.1566 (3.3045) grad_norm 2.0182 (2.0605) [2022-10-08 10:12:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][1000/1251] eta 0:01:23 lr 0.000236 time 0.3332 (0.3309) loss 3.2201 (3.3045) grad_norm 2.1200 (2.0533) [2022-10-08 10:12:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][1100/1251] eta 0:00:49 lr 0.000236 time 0.3324 (0.3310) loss 3.2954 (3.3064) grad_norm 2.0439 (2.0554) 
[2022-10-08 10:13:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [204/300][1200/1251] eta 0:00:16 lr 0.000236 time 0.3277 (0.3311) loss 3.1679 (3.3068) grad_norm 2.1145 (2.0610) [2022-10-08 10:13:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 204 training takes 0:06:54 [2022-10-08 10:13:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.916 (2.916) Loss 0.9341 (0.9341) Acc@1 77.637 (77.637) Acc@5 94.238 (94.238) [2022-10-08 10:13:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.304 Acc@5 94.430 [2022-10-08 10:13:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-10-08 10:13:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.47% [2022-10-08 10:13:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][0/1251] eta 1:07:51 lr 0.000235 time 3.2546 (3.2546) loss 3.2997 (3.2997) grad_norm 1.7967 (1.7967) [2022-10-08 10:14:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][100/1251] eta 0:06:48 lr 0.000235 time 0.3268 (0.3551) loss 3.1962 (3.3134) grad_norm 1.7951 (2.1085) [2022-10-08 10:14:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][200/1251] eta 0:05:58 lr 0.000235 time 0.3268 (0.3408) loss 3.3891 (3.3084) grad_norm 2.1161 (2.0686) [2022-10-08 10:15:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][300/1251] eta 0:05:19 lr 0.000234 time 0.3243 (0.3357) loss 3.2693 (3.2898) grad_norm 1.9791 (2.0660) [2022-10-08 10:15:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][400/1251] eta 0:04:43 lr 0.000234 time 0.3238 (0.3330) loss 3.4334 (3.2870) grad_norm 2.1683 (2.0610) [2022-10-08 10:16:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][500/1251] eta 0:04:08 lr 0.000234 time 0.3231 (0.3314) loss 3.3149 (3.2940) grad_norm 2.1949 (2.0655) [2022-10-08 10:17:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[205/300][600/1251] eta 0:03:34 lr 0.000233 time 0.3253 (0.3302) loss 3.3000 (3.2946) grad_norm 2.0545 (2.0620) [2022-10-08 10:17:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][700/1251] eta 0:03:01 lr 0.000233 time 0.3169 (0.3294) loss 3.6538 (3.2936) grad_norm 1.8127 (2.0596) [2022-10-08 10:18:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][800/1251] eta 0:02:28 lr 0.000233 time 0.3263 (0.3288) loss 3.3899 (3.2933) grad_norm 2.0557 (2.0605) [2022-10-08 10:18:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][900/1251] eta 0:01:55 lr 0.000232 time 0.3278 (0.3283) loss 3.2784 (3.2933) grad_norm 1.7279 (2.0651) [2022-10-08 10:19:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][1000/1251] eta 0:01:22 lr 0.000232 time 0.3275 (0.3280) loss 3.0121 (3.2945) grad_norm 1.7067 (2.0759) [2022-10-08 10:19:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][1100/1251] eta 0:00:49 lr 0.000232 time 0.3261 (0.3279) loss 3.2571 (3.2958) grad_norm 2.1645 (2.0775) [2022-10-08 10:20:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [205/300][1200/1251] eta 0:00:16 lr 0.000231 time 0.3244 (0.3279) loss 3.1310 (3.2962) grad_norm 1.8440 (2.0764) [2022-10-08 10:20:34 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 205 training takes 0:06:50 [2022-10-08 10:20:37 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.985 (2.985) Loss 0.8551 (0.8551) Acc@1 80.176 (80.176) Acc@5 94.434 (94.434) [2022-10-08 10:20:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.482 Acc@5 94.378 [2022-10-08 10:20:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-08 10:20:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.48% [2022-10-08 10:20:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][0/1251] eta 0:49:29 lr 0.000231 time 2.3735 (2.3735) loss 3.2214 
(3.2214) grad_norm 1.8179 (1.8179) [2022-10-08 10:21:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][100/1251] eta 0:06:43 lr 0.000231 time 0.3275 (0.3504) loss 3.0431 (3.2647) grad_norm 1.9545 (2.0431) [2022-10-08 10:21:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][200/1251] eta 0:05:55 lr 0.000230 time 0.3264 (0.3384) loss 3.2977 (3.2871) grad_norm 2.2821 (2.0612) [2022-10-08 10:22:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][300/1251] eta 0:05:17 lr 0.000230 time 0.3201 (0.3342) loss 3.2523 (3.2895) grad_norm 1.8246 (2.0903) [2022-10-08 10:23:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][400/1251] eta 0:04:42 lr 0.000230 time 0.3259 (0.3319) loss 3.4571 (3.2855) grad_norm 1.9796 (2.0979) [2022-10-08 10:23:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][500/1251] eta 0:04:08 lr 0.000229 time 0.3243 (0.3305) loss 3.1274 (3.2800) grad_norm 1.8790 (2.1046) [2022-10-08 10:24:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][600/1251] eta 0:03:34 lr 0.000229 time 0.3233 (0.3296) loss 3.3684 (3.2905) grad_norm 2.2429 (2.1081) [2022-10-08 10:24:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][700/1251] eta 0:03:01 lr 0.000229 time 0.3254 (0.3288) loss 3.1002 (3.2912) grad_norm 1.8996 (2.1096) [2022-10-08 10:25:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][800/1251] eta 0:02:28 lr 0.000228 time 0.3282 (0.3286) loss 3.2709 (3.2888) grad_norm 1.9570 (2.1165) [2022-10-08 10:25:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][900/1251] eta 0:01:55 lr 0.000228 time 0.3214 (0.3280) loss 3.6475 (3.2924) grad_norm 2.0614 (2.1239) [2022-10-08 10:26:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][1000/1251] eta 0:01:22 lr 0.000228 time 0.3299 (0.3276) loss 3.2061 (3.2927) grad_norm 2.2809 (2.1205) [2022-10-08 10:26:48 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [206/300][1100/1251] eta 0:00:49 lr 0.000227 time 0.3220 (0.3273) loss 3.2161 (3.2941) grad_norm 1.9196 (2.1253) [2022-10-08 10:27:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [206/300][1200/1251] eta 0:00:16 lr 0.000227 time 0.3301 (0.3270) loss 3.0915 (3.2969) grad_norm 1.9447 (2.1301) [2022-10-08 10:27:37 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 206 training takes 0:06:49 [2022-10-08 10:27:39 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.825 (2.825) Loss 0.9102 (0.9102) Acc@1 78.613 (78.613) Acc@5 93.262 (93.262) [2022-10-08 10:27:50 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.486 Acc@5 94.506 [2022-10-08 10:27:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-08 10:27:50 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.49% [2022-10-08 10:27:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][0/1251] eta 0:49:51 lr 0.000227 time 2.3916 (2.3916) loss 3.2974 (3.2974) grad_norm 2.4455 (2.4455) [2022-10-08 10:28:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][100/1251] eta 0:06:44 lr 0.000226 time 0.3246 (0.3515) loss 3.3985 (3.2755) grad_norm 2.2694 (2.1351) [2022-10-08 10:28:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][200/1251] eta 0:05:55 lr 0.000226 time 0.3273 (0.3387) loss 3.2015 (3.2822) grad_norm 1.7508 (2.1175) [2022-10-08 10:29:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][300/1251] eta 0:05:18 lr 0.000226 time 0.3255 (0.3345) loss 3.4410 (3.2889) grad_norm 2.0614 (2.1187) [2022-10-08 10:30:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][400/1251] eta 0:04:42 lr 0.000225 time 0.3333 (0.3324) loss 3.4425 (3.2934) grad_norm 1.7579 (2.1294) [2022-10-08 10:30:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][500/1251] eta 0:04:08 lr 0.000225 time 0.3249 
(0.3313) loss 3.1699 (3.2889) grad_norm 1.8984 (2.1337) [2022-10-08 10:31:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][600/1251] eta 0:03:35 lr 0.000225 time 0.3262 (0.3305) loss 3.1713 (3.2879) grad_norm 1.9502 (2.1283) [2022-10-08 10:31:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][700/1251] eta 0:03:01 lr 0.000224 time 0.3260 (0.3300) loss 3.6350 (3.2875) grad_norm 2.3440 (2.1262) [2022-10-08 10:32:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][800/1251] eta 0:02:28 lr 0.000224 time 0.3267 (0.3299) loss 3.0915 (3.2877) grad_norm 1.9940 (2.1318) [2022-10-08 10:32:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][900/1251] eta 0:01:55 lr 0.000224 time 0.3288 (0.3297) loss 3.5745 (3.2893) grad_norm 2.7262 (2.1314) [2022-10-08 10:33:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][1000/1251] eta 0:01:22 lr 0.000223 time 0.3333 (0.3295) loss 3.1640 (3.2900) grad_norm 1.8785 (2.1310) [2022-10-08 10:33:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][1100/1251] eta 0:00:49 lr 0.000223 time 0.3293 (0.3296) loss 3.7000 (3.2903) grad_norm 2.4609 (2.1295) [2022-10-08 10:34:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [207/300][1200/1251] eta 0:00:16 lr 0.000223 time 0.3356 (0.3297) loss 3.0366 (3.2890) grad_norm 1.9662 (2.1272) [2022-10-08 10:34:43 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 207 training takes 0:06:52 [2022-10-08 10:34:46 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.432 (2.432) Loss 0.9108 (0.9108) Acc@1 78.613 (78.613) Acc@5 94.141 (94.141) [2022-10-08 10:34:57 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.568 Acc@5 94.608 [2022-10-08 10:34:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-08 10:34:57 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.57% [2022-10-08 10:35:00 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][0/1251] eta 0:59:19 lr 0.000222 time 2.8453 (2.8453) loss 3.5925 (3.5925) grad_norm 1.8772 (1.8772) [2022-10-08 10:35:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][100/1251] eta 0:06:47 lr 0.000222 time 0.3277 (0.3544) loss 3.1900 (3.2809) grad_norm 1.9022 (2.0882) [2022-10-08 10:36:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][200/1251] eta 0:05:58 lr 0.000222 time 0.3310 (0.3410) loss 3.1454 (3.2755) grad_norm 2.1362 (2.1143) [2022-10-08 10:36:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][300/1251] eta 0:05:20 lr 0.000221 time 0.3225 (0.3366) loss 2.8756 (3.2797) grad_norm 2.1823 (2.1237) [2022-10-08 10:37:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][400/1251] eta 0:04:44 lr 0.000221 time 0.3197 (0.3342) loss 3.1912 (3.2805) grad_norm 2.0071 (2.1209) [2022-10-08 10:37:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][500/1251] eta 0:04:09 lr 0.000221 time 0.3265 (0.3327) loss 3.6474 (3.2775) grad_norm 1.8961 (2.1109) [2022-10-08 10:38:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][600/1251] eta 0:03:36 lr 0.000220 time 0.3306 (0.3319) loss 3.1732 (3.2801) grad_norm 2.2902 (2.1213) [2022-10-08 10:38:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][700/1251] eta 0:03:02 lr 0.000220 time 0.3238 (0.3311) loss 3.4099 (3.2815) grad_norm 2.1162 (2.1169) [2022-10-08 10:39:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][800/1251] eta 0:02:29 lr 0.000220 time 0.3271 (0.3307) loss 3.4378 (3.2808) grad_norm 2.0522 (2.1203) [2022-10-08 10:39:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][900/1251] eta 0:01:55 lr 0.000219 time 0.3267 (0.3302) loss 3.3724 (3.2795) grad_norm 2.0725 (2.1299) [2022-10-08 10:40:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][1000/1251] eta 0:01:22 lr 0.000219 
time 0.3303 (0.3299) loss 3.5059 (3.2773) grad_norm 1.8472 (2.1338) [2022-10-08 10:41:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][1100/1251] eta 0:00:49 lr 0.000219 time 0.3277 (0.3296) loss 2.9340 (3.2768) grad_norm 2.3709 (2.1330) [2022-10-08 10:41:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [208/300][1200/1251] eta 0:00:16 lr 0.000218 time 0.3247 (0.3294) loss 3.4302 (3.2788) grad_norm 2.5122 (2.1361) [2022-10-08 10:41:49 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 208 training takes 0:06:52 [2022-10-08 10:41:52 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.762 (2.762) Loss 0.9171 (0.9171) Acc@1 76.367 (76.367) Acc@5 94.531 (94.531) [2022-10-08 10:42:03 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.532 Acc@5 94.480 [2022-10-08 10:42:03 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-08 10:42:03 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.57% [2022-10-08 10:42:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][0/1251] eta 0:46:33 lr 0.000218 time 2.2327 (2.2327) loss 3.2695 (3.2695) grad_norm 2.0693 (2.0693) [2022-10-08 10:42:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][100/1251] eta 0:06:51 lr 0.000218 time 0.3238 (0.3577) loss 3.1233 (3.2691) grad_norm 2.1512 (2.1267) [2022-10-08 10:43:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][200/1251] eta 0:05:59 lr 0.000218 time 0.3254 (0.3422) loss 3.0251 (3.2494) grad_norm 2.0552 (2.1496) [2022-10-08 10:43:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][300/1251] eta 0:05:20 lr 0.000217 time 0.3260 (0.3370) loss 3.3333 (3.2492) grad_norm 2.7180 (2.1320) [2022-10-08 10:44:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][400/1251] eta 0:04:44 lr 0.000217 time 0.3266 (0.3342) loss 3.3027 (3.2498) grad_norm 2.0706 (2.1242) [2022-10-08 
10:44:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][500/1251] eta 0:04:10 lr 0.000217 time 0.3263 (0.3336) loss 3.3296 (3.2566) grad_norm 1.8058 (2.1340) [2022-10-08 10:45:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][600/1251] eta 0:03:36 lr 0.000216 time 0.3289 (0.3327) loss 3.1121 (3.2651) grad_norm 2.6971 (2.1462) [2022-10-08 10:45:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][700/1251] eta 0:03:02 lr 0.000216 time 0.3232 (0.3319) loss 3.0822 (3.2699) grad_norm 1.9984 (2.1408) [2022-10-08 10:46:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][800/1251] eta 0:02:29 lr 0.000216 time 0.3223 (0.3313) loss 3.4127 (3.2696) grad_norm 1.8232 (2.1392) [2022-10-08 10:47:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][900/1251] eta 0:01:56 lr 0.000215 time 0.3257 (0.3306) loss 3.2489 (3.2689) grad_norm 2.4941 (2.1415) [2022-10-08 10:47:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][1000/1251] eta 0:01:22 lr 0.000215 time 0.5121 (0.3302) loss 3.1012 (3.2724) grad_norm 2.0473 (2.1431) [2022-10-08 10:48:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][1100/1251] eta 0:00:49 lr 0.000215 time 0.3234 (0.3297) loss 3.5757 (3.2750) grad_norm 2.0596 (2.1438) [2022-10-08 10:48:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [209/300][1200/1251] eta 0:00:16 lr 0.000214 time 0.3221 (0.3294) loss 3.3166 (3.2765) grad_norm 1.9786 (2.1447) [2022-10-08 10:48:55 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 209 training takes 0:06:52 [2022-10-08 10:48:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.506 (2.506) Loss 0.9336 (0.9336) Acc@1 77.539 (77.539) Acc@5 94.238 (94.238) [2022-10-08 10:49:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.736 Acc@5 94.538 [2022-10-08 10:49:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 78.7% [2022-10-08 10:49:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.74% [2022-10-08 10:49:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][0/1251] eta 0:59:22 lr 0.000214 time 2.8475 (2.8475) loss 3.3489 (3.3489) grad_norm 1.9985 (1.9985) [2022-10-08 10:49:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][100/1251] eta 0:06:44 lr 0.000214 time 0.3292 (0.3518) loss 3.2727 (3.2703) grad_norm 2.0499 (2.2085) [2022-10-08 10:50:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][200/1251] eta 0:05:56 lr 0.000213 time 0.3235 (0.3396) loss 3.4703 (3.2843) grad_norm 2.6142 (2.1816) [2022-10-08 10:50:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][300/1251] eta 0:05:18 lr 0.000213 time 0.3214 (0.3351) loss 3.2315 (3.2772) grad_norm 1.9408 (2.1706) [2022-10-08 10:51:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][400/1251] eta 0:04:43 lr 0.000213 time 0.3203 (0.3327) loss 3.3296 (3.2713) grad_norm 1.8776 (2.1559) [2022-10-08 10:51:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][500/1251] eta 0:04:08 lr 0.000212 time 0.3267 (0.3314) loss 3.5557 (3.2706) grad_norm 1.9165 (2.1548) [2022-10-08 10:52:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][600/1251] eta 0:03:35 lr 0.000212 time 0.3209 (0.3308) loss 3.3183 (3.2729) grad_norm 2.0432 (2.1661) [2022-10-08 10:53:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][700/1251] eta 0:03:02 lr 0.000212 time 0.3362 (0.3304) loss 3.5742 (3.2753) grad_norm 2.4911 (2.1654) [2022-10-08 10:53:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][800/1251] eta 0:02:28 lr 0.000211 time 0.3337 (0.3303) loss 3.2208 (3.2765) grad_norm 1.9730 (2.1580) [2022-10-08 10:54:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][900/1251] eta 0:01:55 lr 0.000211 time 0.3265 (0.3301) loss 3.1377 (3.2787) grad_norm 1.7115 
(2.1561) [2022-10-08 10:54:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][1000/1251] eta 0:01:22 lr 0.000211 time 0.3228 (0.3302) loss 3.3635 (3.2780) grad_norm 2.8201 (2.1566) [2022-10-08 10:55:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][1100/1251] eta 0:00:49 lr 0.000210 time 0.3329 (0.3302) loss 3.3312 (3.2787) grad_norm 2.2105 (2.1495) [2022-10-08 10:55:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [210/300][1200/1251] eta 0:00:16 lr 0.000210 time 0.3369 (0.3302) loss 3.3029 (3.2804) grad_norm 2.3764 (2.1546) [2022-10-08 10:56:02 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 210 training takes 0:06:53 [2022-10-08 10:56:02 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_210 saving...... [2022-10-08 10:56:02 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_210 saved !!! [2022-10-08 10:56:05 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.727 (2.727) Loss 0.9231 (0.9231) Acc@1 77.930 (77.930) Acc@5 93.750 (93.750) [2022-10-08 10:56:16 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.736 Acc@5 94.540 [2022-10-08 10:56:16 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-08 10:56:16 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.74% [2022-10-08 10:56:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][0/1251] eta 0:59:33 lr 0.000210 time 2.8562 (2.8562) loss 3.3626 (3.3626) grad_norm 2.0286 (2.0286) [2022-10-08 10:56:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][100/1251] eta 0:06:48 lr 0.000210 time 0.3340 (0.3546) loss 3.2917 (3.2649) grad_norm 2.1921 (2.1582) [2022-10-08 10:57:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][200/1251] eta 0:05:59 lr 0.000209 time 0.3263 (0.3419) loss 3.2011 
(3.2445) grad_norm 2.0097 (2.1505) [2022-10-08 10:57:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][300/1251] eta 0:05:20 lr 0.000209 time 0.3276 (0.3372) loss 3.4007 (3.2506) grad_norm 2.1319 (2.1346) [2022-10-08 10:58:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][400/1251] eta 0:04:44 lr 0.000209 time 0.3261 (0.3349) loss 3.1498 (3.2584) grad_norm 2.3916 (2.1566) [2022-10-08 10:59:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][500/1251] eta 0:04:10 lr 0.000208 time 0.3236 (0.3336) loss 3.4870 (3.2633) grad_norm 2.1761 (2.1688) [2022-10-08 10:59:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][600/1251] eta 0:03:36 lr 0.000208 time 0.3280 (0.3327) loss 3.0294 (3.2640) grad_norm 2.1456 (2.1699) [2022-10-08 11:00:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][700/1251] eta 0:03:03 lr 0.000208 time 0.3329 (0.3324) loss 3.1092 (3.2679) grad_norm 2.0887 (2.1664) [2022-10-08 11:00:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][800/1251] eta 0:02:29 lr 0.000207 time 0.3318 (0.3320) loss 3.2737 (3.2688) grad_norm 1.8976 (2.1636) [2022-10-08 11:01:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][900/1251] eta 0:01:56 lr 0.000207 time 0.3315 (0.3319) loss 3.0549 (3.2680) grad_norm 2.2012 (2.1657) [2022-10-08 11:01:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][1000/1251] eta 0:01:23 lr 0.000207 time 0.3263 (0.3317) loss 3.3462 (3.2696) grad_norm 1.8400 (2.1687) [2022-10-08 11:02:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][1100/1251] eta 0:00:50 lr 0.000206 time 0.3269 (0.3317) loss 3.2345 (3.2670) grad_norm 2.1931 (2.1791) [2022-10-08 11:02:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [211/300][1200/1251] eta 0:00:16 lr 0.000206 time 0.3289 (0.3316) loss 3.1340 (3.2698) grad_norm 2.1807 (2.1778) [2022-10-08 11:03:11 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 211 training takes 0:06:55 [2022-10-08 11:03:14 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.866 (2.866) Loss 0.9091 (0.9091) Acc@1 78.418 (78.418) Acc@5 95.117 (95.117) [2022-10-08 11:03:25 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.908 Acc@5 94.646 [2022-10-08 11:03:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-08 11:03:25 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:03:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][0/1251] eta 0:50:48 lr 0.000206 time 2.4369 (2.4369) loss 3.1432 (3.1432) grad_norm 2.0923 (2.0923) [2022-10-08 11:04:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][100/1251] eta 0:06:45 lr 0.000205 time 0.3260 (0.3524) loss 3.2248 (3.2508) grad_norm 2.1306 (2.1703) [2022-10-08 11:04:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][200/1251] eta 0:05:56 lr 0.000205 time 0.3202 (0.3394) loss 3.1729 (3.2481) grad_norm 2.0718 (2.1892) [2022-10-08 11:05:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][300/1251] eta 0:05:18 lr 0.000205 time 0.3275 (0.3349) loss 3.3246 (3.2536) grad_norm 1.9986 (2.1837) [2022-10-08 11:05:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][400/1251] eta 0:04:43 lr 0.000204 time 0.3255 (0.3328) loss 3.6625 (3.2529) grad_norm 2.4292 (2.1743) [2022-10-08 11:06:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][500/1251] eta 0:04:08 lr 0.000204 time 0.3283 (0.3314) loss 3.2041 (3.2523) grad_norm 2.0097 (2.1755) [2022-10-08 11:06:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][600/1251] eta 0:03:35 lr 0.000204 time 0.3234 (0.3309) loss 3.3126 (3.2554) grad_norm 2.1758 (2.1764) [2022-10-08 11:07:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][700/1251] eta 0:03:01 lr 0.000203 time 0.3252 
(0.3301) loss 3.2867 (3.2588) grad_norm 2.1799 (2.1735) [2022-10-08 11:07:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][800/1251] eta 0:02:28 lr 0.000203 time 0.3233 (0.3296) loss 3.3670 (3.2595) grad_norm 2.3607 (2.1772) [2022-10-08 11:08:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][900/1251] eta 0:01:55 lr 0.000203 time 0.3260 (0.3293) loss 3.2582 (3.2568) grad_norm 2.1305 (2.1795) [2022-10-08 11:08:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][1000/1251] eta 0:01:22 lr 0.000202 time 0.3280 (0.3291) loss 3.5152 (3.2587) grad_norm 2.0639 (2.1780) [2022-10-08 11:09:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][1100/1251] eta 0:00:49 lr 0.000202 time 0.3278 (0.3290) loss 3.0652 (3.2603) grad_norm 2.7852 (2.1812) [2022-10-08 11:10:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [212/300][1200/1251] eta 0:00:16 lr 0.000202 time 0.3245 (0.3290) loss 3.3168 (3.2611) grad_norm 1.8752 (2.1782) [2022-10-08 11:10:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 212 training takes 0:06:52 [2022-10-08 11:10:20 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.814 (2.814) Loss 0.8590 (0.8590) Acc@1 80.566 (80.566) Acc@5 94.629 (94.629) [2022-10-08 11:10:31 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.742 Acc@5 94.662 [2022-10-08 11:10:31 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-08 11:10:31 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:10:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][0/1251] eta 1:07:00 lr 0.000202 time 3.2141 (3.2141) loss 3.2436 (3.2436) grad_norm 2.2832 (2.2832) [2022-10-08 11:11:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][100/1251] eta 0:06:45 lr 0.000201 time 0.3224 (0.3526) loss 3.6495 (3.2571) grad_norm 1.9038 (2.1840) [2022-10-08 11:11:39 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][200/1251] eta 0:05:55 lr 0.000201 time 0.3220 (0.3386) loss 3.3400 (3.2552) grad_norm 2.2513 (2.2005) [2022-10-08 11:12:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][300/1251] eta 0:05:17 lr 0.000201 time 0.3228 (0.3338) loss 3.5686 (3.2558) grad_norm 2.3750 (2.1964) [2022-10-08 11:12:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][400/1251] eta 0:04:42 lr 0.000200 time 0.3250 (0.3316) loss 3.1977 (3.2607) grad_norm 2.2550 (2.1903) [2022-10-08 11:13:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][500/1251] eta 0:04:08 lr 0.000200 time 0.3238 (0.3304) loss 3.2040 (3.2592) grad_norm 2.3490 (2.2008) [2022-10-08 11:13:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][600/1251] eta 0:03:34 lr 0.000200 time 0.3281 (0.3297) loss 3.5538 (3.2622) grad_norm 2.6333 (2.2042) [2022-10-08 11:14:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][700/1251] eta 0:03:01 lr 0.000199 time 0.3263 (0.3295) loss 3.1975 (3.2640) grad_norm 2.3991 (2.2209) [2022-10-08 11:14:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][800/1251] eta 0:02:28 lr 0.000199 time 0.3244 (0.3293) loss 3.1440 (3.2620) grad_norm 1.9152 (2.2235) [2022-10-08 11:15:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][900/1251] eta 0:01:55 lr 0.000199 time 0.3267 (0.3292) loss 3.0647 (3.2620) grad_norm 2.0131 (2.2158) [2022-10-08 11:16:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][1000/1251] eta 0:01:22 lr 0.000198 time 0.3353 (0.3292) loss 3.4131 (3.2631) grad_norm 2.2174 (2.2128) [2022-10-08 11:16:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][1100/1251] eta 0:00:49 lr 0.000198 time 0.3347 (0.3293) loss 3.2665 (3.2641) grad_norm 2.3456 (2.2135) [2022-10-08 11:17:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [213/300][1200/1251] eta 0:00:16 lr 
0.000198 time 0.3228 (0.3295) loss 3.3433 (3.2643) grad_norm 2.3682 (2.2115) [2022-10-08 11:17:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 213 training takes 0:06:52 [2022-10-08 11:17:26 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.730 (2.730) Loss 0.8546 (0.8546) Acc@1 79.297 (79.297) Acc@5 95.215 (95.215) [2022-10-08 11:17:37 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.722 Acc@5 94.624 [2022-10-08 11:17:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-08 11:17:37 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:17:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][0/1251] eta 1:07:23 lr 0.000198 time 3.2325 (3.2325) loss 3.4323 (3.4323) grad_norm 2.1125 (2.1125) [2022-10-08 11:18:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][100/1251] eta 0:06:49 lr 0.000197 time 0.3217 (0.3559) loss 3.2618 (3.2466) grad_norm 2.5029 (2.1699) [2022-10-08 11:18:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][200/1251] eta 0:05:58 lr 0.000197 time 0.3243 (0.3411) loss 3.4195 (3.2437) grad_norm 1.9848 (2.1987) [2022-10-08 11:19:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][300/1251] eta 0:05:19 lr 0.000197 time 0.3264 (0.3365) loss 3.3392 (3.2554) grad_norm 2.0891 (2.2011) [2022-10-08 11:19:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][400/1251] eta 0:04:44 lr 0.000196 time 0.3305 (0.3344) loss 3.5786 (3.2538) grad_norm 2.1122 (2.2132) [2022-10-08 11:20:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][500/1251] eta 0:04:10 lr 0.000196 time 0.3325 (0.3333) loss 3.3342 (3.2552) grad_norm 2.0567 (2.2080) [2022-10-08 11:20:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][600/1251] eta 0:03:36 lr 0.000196 time 0.3236 (0.3321) loss 3.2237 (3.2553) grad_norm 2.5992 (2.2067) 
[2022-10-08 11:21:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][700/1251] eta 0:03:02 lr 0.000195 time 0.3224 (0.3313) loss 3.4664 (3.2561) grad_norm 2.4051 (2.2105) [2022-10-08 11:22:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][800/1251] eta 0:02:29 lr 0.000195 time 0.3288 (0.3308) loss 3.2724 (3.2538) grad_norm 2.1839 (2.2178) [2022-10-08 11:22:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][900/1251] eta 0:01:55 lr 0.000195 time 0.3275 (0.3304) loss 3.2209 (3.2552) grad_norm 1.8550 (2.2291) [2022-10-08 11:23:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][1000/1251] eta 0:01:22 lr 0.000194 time 0.3319 (0.3304) loss 2.9319 (3.2525) grad_norm 2.1934 (2.2299) [2022-10-08 11:23:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][1100/1251] eta 0:00:49 lr 0.000194 time 0.3230 (0.3303) loss 3.3084 (3.2524) grad_norm 2.1264 (2.2297) [2022-10-08 11:24:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [214/300][1200/1251] eta 0:00:16 lr 0.000194 time 0.3362 (0.3302) loss 3.1397 (3.2548) grad_norm 2.5831 (2.2319) [2022-10-08 11:24:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 214 training takes 0:06:53 [2022-10-08 11:24:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.306 (3.306) Loss 0.9463 (0.9463) Acc@1 76.953 (76.953) Acc@5 94.434 (94.434) [2022-10-08 11:24:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.788 Acc@5 94.676 [2022-10-08 11:24:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-08 11:24:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:24:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][0/1251] eta 0:59:48 lr 0.000193 time 2.8683 (2.8683) loss 3.2782 (3.2782) grad_norm 2.9050 (2.9050) [2022-10-08 11:25:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[215/300][100/1251] eta 0:06:45 lr 0.000193 time 0.3300 (0.3523) loss 3.5832 (3.2479) grad_norm 2.8612 (2.1985) [2022-10-08 11:25:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][200/1251] eta 0:05:57 lr 0.000193 time 0.3239 (0.3398) loss 3.5673 (3.2541) grad_norm 2.0651 (2.2205) [2022-10-08 11:26:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][300/1251] eta 0:05:18 lr 0.000193 time 0.3265 (0.3353) loss 3.5932 (3.2578) grad_norm 2.5601 (2.2285) [2022-10-08 11:26:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][400/1251] eta 0:04:43 lr 0.000192 time 0.3205 (0.3330) loss 3.2251 (3.2516) grad_norm 2.0377 (2.2293) [2022-10-08 11:27:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][500/1251] eta 0:04:09 lr 0.000192 time 0.3263 (0.3318) loss 3.1186 (3.2463) grad_norm 1.9760 (2.2248) [2022-10-08 11:28:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][600/1251] eta 0:03:35 lr 0.000192 time 0.3218 (0.3308) loss 3.0737 (3.2524) grad_norm 2.0474 (2.2340) [2022-10-08 11:28:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][700/1251] eta 0:03:01 lr 0.000191 time 0.3312 (0.3300) loss 3.3778 (3.2518) grad_norm 2.1914 (2.2291) [2022-10-08 11:29:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][800/1251] eta 0:02:28 lr 0.000191 time 0.3198 (0.3293) loss 3.2684 (3.2530) grad_norm 1.9651 (2.2331) [2022-10-08 11:29:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][900/1251] eta 0:01:55 lr 0.000191 time 0.3234 (0.3288) loss 3.0097 (3.2565) grad_norm 1.9432 (2.2306) [2022-10-08 11:30:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][1000/1251] eta 0:01:22 lr 0.000190 time 0.3277 (0.3283) loss 3.4264 (3.2550) grad_norm 2.3860 (2.2326) [2022-10-08 11:30:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][1100/1251] eta 0:00:49 lr 0.000190 time 0.3238 (0.3280) loss 3.0996 (3.2541) grad_norm 
2.0045 (2.2318) [2022-10-08 11:31:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [215/300][1200/1251] eta 0:00:16 lr 0.000190 time 0.3263 (0.3278) loss 3.2251 (3.2530) grad_norm 1.9190 (2.2343) [2022-10-08 11:31:35 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 215 training takes 0:06:50 [2022-10-08 11:31:38 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.843 (2.843) Loss 0.9680 (0.9680) Acc@1 75.684 (75.684) Acc@5 94.434 (94.434) [2022-10-08 11:31:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.716 Acc@5 94.586 [2022-10-08 11:31:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-08 11:31:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:31:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][0/1251] eta 1:08:28 lr 0.000189 time 3.2841 (3.2841) loss 3.0061 (3.0061) grad_norm 2.2991 (2.2991) [2022-10-08 11:32:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][100/1251] eta 0:06:48 lr 0.000189 time 0.3247 (0.3546) loss 3.2214 (3.2386) grad_norm 2.1935 (2.2333) [2022-10-08 11:32:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][200/1251] eta 0:05:58 lr 0.000189 time 0.3240 (0.3412) loss 3.5528 (3.2454) grad_norm 2.4360 (2.2373) [2022-10-08 11:33:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][300/1251] eta 0:05:19 lr 0.000189 time 0.3241 (0.3365) loss 3.1897 (3.2376) grad_norm 2.3113 (2.2329) [2022-10-08 11:34:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][400/1251] eta 0:04:44 lr 0.000188 time 0.3252 (0.3348) loss 3.5194 (3.2422) grad_norm 2.1099 (2.2357) [2022-10-08 11:34:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][500/1251] eta 0:04:10 lr 0.000188 time 0.3296 (0.3336) loss 3.4034 (3.2440) grad_norm 2.0723 (2.2190) [2022-10-08 11:35:09 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [216/300][600/1251] eta 0:03:36 lr 0.000188 time 0.3337 (0.3330) loss 3.1889 (3.2431) grad_norm 2.1095 (2.2206) [2022-10-08 11:35:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][700/1251] eta 0:03:03 lr 0.000187 time 0.3362 (0.3327) loss 3.2570 (3.2416) grad_norm 1.9630 (2.2205) [2022-10-08 11:36:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][800/1251] eta 0:02:29 lr 0.000187 time 0.3265 (0.3324) loss 3.1759 (3.2439) grad_norm 2.2298 (2.2314) [2022-10-08 11:36:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][900/1251] eta 0:01:56 lr 0.000187 time 0.3379 (0.3323) loss 3.3854 (3.2460) grad_norm 2.2651 (2.2324) [2022-10-08 11:37:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][1000/1251] eta 0:01:23 lr 0.000186 time 0.3300 (0.3322) loss 3.1993 (3.2443) grad_norm 2.0720 (2.2296) [2022-10-08 11:37:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][1100/1251] eta 0:00:50 lr 0.000186 time 0.3303 (0.3323) loss 3.1331 (3.2463) grad_norm 2.3780 (2.2269) [2022-10-08 11:38:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [216/300][1200/1251] eta 0:00:16 lr 0.000186 time 0.3335 (0.3324) loss 3.3002 (3.2483) grad_norm 2.0796 (2.2248) [2022-10-08 11:38:45 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 216 training takes 0:06:56 [2022-10-08 11:38:48 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.276 (2.276) Loss 0.8754 (0.8754) Acc@1 79.785 (79.785) Acc@5 95.117 (95.117) [2022-10-08 11:38:59 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.854 Acc@5 94.828 [2022-10-08 11:38:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-08 11:38:59 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 78.91% [2022-10-08 11:39:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][0/1251] eta 0:49:22 lr 0.000185 time 2.3678 
(2.3678) loss 3.0491 (3.0491) grad_norm 2.5385 (2.5385) [2022-10-08 11:39:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][100/1251] eta 0:06:48 lr 0.000185 time 0.3296 (0.3550) loss 3.1373 (3.2215) grad_norm 2.0300 (2.1987) [2022-10-08 11:40:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][200/1251] eta 0:06:00 lr 0.000185 time 0.3370 (0.3426) loss 3.6176 (3.2434) grad_norm 2.4987 (2.2477) [2022-10-08 11:40:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][300/1251] eta 0:05:21 lr 0.000185 time 0.3357 (0.3385) loss 3.3002 (3.2512) grad_norm 1.9855 (2.2359) [2022-10-08 11:41:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][400/1251] eta 0:04:46 lr 0.000184 time 0.3346 (0.3364) loss 3.3674 (3.2478) grad_norm 2.7401 (2.2589) [2022-10-08 11:41:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][500/1251] eta 0:04:11 lr 0.000184 time 0.3257 (0.3345) loss 3.0719 (3.2474) grad_norm 2.1192 (2.2510) [2022-10-08 11:42:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][600/1251] eta 0:03:36 lr 0.000184 time 0.3237 (0.3332) loss 3.1091 (3.2496) grad_norm 2.2697 (2.2573) [2022-10-08 11:42:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][700/1251] eta 0:03:03 lr 0.000183 time 0.3249 (0.3322) loss 3.1454 (3.2494) grad_norm 2.1023 (2.2610) [2022-10-08 11:43:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][800/1251] eta 0:02:29 lr 0.000183 time 0.3287 (0.3316) loss 3.0665 (3.2474) grad_norm 2.1285 (2.2606) [2022-10-08 11:43:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][900/1251] eta 0:01:56 lr 0.000183 time 0.3398 (0.3313) loss 3.0447 (3.2458) grad_norm 2.0185 (2.2698) [2022-10-08 11:44:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][1000/1251] eta 0:01:23 lr 0.000182 time 0.3300 (0.3310) loss 3.3217 (3.2445) grad_norm 2.5116 (2.2806) [2022-10-08 11:45:03 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][1100/1251] eta 0:00:49 lr 0.000182 time 0.3311 (0.3310) loss 3.1419 (3.2459) grad_norm 2.1118 (2.2828) [2022-10-08 11:45:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [217/300][1200/1251] eta 0:00:16 lr 0.000182 time 0.3337 (0.3310) loss 2.9811 (3.2452) grad_norm 1.9972 (2.2855) [2022-10-08 11:45:53 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 217 training takes 0:06:54 [2022-10-08 11:45:56 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.985 (2.985) Loss 0.8462 (0.8462) Acc@1 80.176 (80.176) Acc@5 94.824 (94.824) [2022-10-08 11:46:07 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.270 Acc@5 94.682 [2022-10-08 11:46:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-08 11:46:07 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 11:46:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][0/1251] eta 1:07:28 lr 0.000182 time 3.2362 (3.2362) loss 3.2303 (3.2303) grad_norm 2.2586 (2.2586) [2022-10-08 11:46:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][100/1251] eta 0:06:49 lr 0.000181 time 0.3277 (0.3558) loss 3.2418 (3.2401) grad_norm 2.2048 (2.2662) [2022-10-08 11:47:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][200/1251] eta 0:05:58 lr 0.000181 time 0.3283 (0.3415) loss 3.3061 (3.2469) grad_norm 2.1346 (2.2786) [2022-10-08 11:47:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][300/1251] eta 0:05:20 lr 0.000181 time 0.3270 (0.3368) loss 3.6304 (3.2380) grad_norm 2.5470 (2.2508) [2022-10-08 11:48:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][400/1251] eta 0:04:44 lr 0.000180 time 0.3269 (0.3345) loss 3.3742 (3.2475) grad_norm 2.2568 (2.2677) [2022-10-08 11:48:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][500/1251] 
eta 0:04:09 lr 0.000180 time 0.3238 (0.3326) loss 3.4597 (3.2442) grad_norm 1.9491 (2.2719) [2022-10-08 11:49:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][600/1251] eta 0:03:35 lr 0.000180 time 0.3329 (0.3313) loss 3.3239 (3.2413) grad_norm 2.0377 (2.2718) [2022-10-08 11:49:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][700/1251] eta 0:03:02 lr 0.000179 time 0.3254 (0.3304) loss 3.4541 (3.2405) grad_norm 2.5534 (2.2752) [2022-10-08 11:50:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][800/1251] eta 0:02:28 lr 0.000179 time 0.3223 (0.3297) loss 3.1141 (3.2415) grad_norm 2.2048 (2.2820) [2022-10-08 11:51:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][900/1251] eta 0:01:55 lr 0.000179 time 0.3276 (0.3291) loss 3.4153 (3.2403) grad_norm 2.5493 (2.2881) [2022-10-08 11:51:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][1000/1251] eta 0:01:22 lr 0.000178 time 0.3223 (0.3287) loss 2.9706 (3.2383) grad_norm 2.1604 (2.2872) [2022-10-08 11:52:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][1100/1251] eta 0:00:49 lr 0.000178 time 0.3297 (0.3283) loss 2.9578 (3.2382) grad_norm 2.0731 (2.2881) [2022-10-08 11:52:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [218/300][1200/1251] eta 0:00:16 lr 0.000178 time 0.3256 (0.3280) loss 3.3532 (3.2400) grad_norm 2.2215 (2.2857) [2022-10-08 11:52:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 218 training takes 0:06:50 [2022-10-08 11:53:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.279 (2.279) Loss 0.8849 (0.8849) Acc@1 77.734 (77.734) Acc@5 95.801 (95.801) [2022-10-08 11:53:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 78.950 Acc@5 94.770 [2022-10-08 11:53:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-08 11:53:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 79.27% [2022-10-08 11:53:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][0/1251] eta 1:06:54 lr 0.000178 time 3.2091 (3.2091) loss 3.4115 (3.4115) grad_norm 2.4401 (2.4401) [2022-10-08 11:53:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][100/1251] eta 0:06:48 lr 0.000177 time 0.3252 (0.3550) loss 3.0932 (3.2088) grad_norm 1.9372 (2.3068) [2022-10-08 11:54:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][200/1251] eta 0:05:57 lr 0.000177 time 0.3255 (0.3406) loss 3.4634 (3.2142) grad_norm 3.1331 (2.2804) [2022-10-08 11:54:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][300/1251] eta 0:05:20 lr 0.000177 time 0.3452 (0.3366) loss 3.0843 (3.2190) grad_norm 1.9925 (2.2908) [2022-10-08 11:55:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][400/1251] eta 0:04:44 lr 0.000176 time 0.3258 (0.3341) loss 3.2370 (3.2235) grad_norm 2.5271 (2.2972) [2022-10-08 11:55:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][500/1251] eta 0:04:09 lr 0.000176 time 0.3243 (0.3324) loss 3.3032 (3.2221) grad_norm 2.9184 (2.3007) [2022-10-08 11:56:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][600/1251] eta 0:03:35 lr 0.000176 time 0.3267 (0.3314) loss 3.0761 (3.2256) grad_norm 2.2934 (2.2942) [2022-10-08 11:57:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][700/1251] eta 0:03:02 lr 0.000175 time 0.3282 (0.3308) loss 3.4112 (3.2314) grad_norm 2.9972 (2.2968) [2022-10-08 11:57:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][800/1251] eta 0:02:28 lr 0.000175 time 0.3279 (0.3304) loss 3.2260 (3.2327) grad_norm 2.4945 (2.2965) [2022-10-08 11:58:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][900/1251] eta 0:01:55 lr 0.000175 time 0.3266 (0.3303) loss 3.4106 (3.2314) grad_norm 2.0874 (2.2959) [2022-10-08 11:58:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[219/300][1000/1251] eta 0:01:22 lr 0.000175 time 0.3292 (0.3303) loss 3.1931 (3.2334) grad_norm 3.1881 (2.2979) [2022-10-08 11:59:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][1100/1251] eta 0:00:49 lr 0.000174 time 0.3349 (0.3303) loss 3.2968 (3.2346) grad_norm 2.2366 (2.3136) [2022-10-08 11:59:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [219/300][1200/1251] eta 0:00:16 lr 0.000174 time 0.3453 (0.3305) loss 3.0877 (3.2337) grad_norm 2.3201 (2.3074) [2022-10-08 12:00:05 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 219 training takes 0:06:53 [2022-10-08 12:00:08 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.737 (2.737) Loss 0.9244 (0.9244) Acc@1 78.906 (78.906) Acc@5 94.434 (94.434) [2022-10-08 12:00:19 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.166 Acc@5 94.766 [2022-10-08 12:00:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-08 12:00:19 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 12:00:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][0/1251] eta 1:00:59 lr 0.000174 time 2.9249 (2.9249) loss 3.5778 (3.5778) grad_norm 1.8601 (1.8601) [2022-10-08 12:00:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][100/1251] eta 0:06:49 lr 0.000173 time 0.3401 (0.3558) loss 3.4243 (3.2362) grad_norm 2.1120 (2.3182) [2022-10-08 12:01:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][200/1251] eta 0:05:59 lr 0.000173 time 0.3227 (0.3420) loss 2.9645 (3.2279) grad_norm 2.7685 (2.2872) [2022-10-08 12:02:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][300/1251] eta 0:05:20 lr 0.000173 time 0.3259 (0.3370) loss 3.0477 (3.2192) grad_norm 2.3513 (2.2825) [2022-10-08 12:02:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][400/1251] eta 0:04:44 lr 0.000173 time 0.3216 (0.3344) loss 2.9283 
(3.2155) grad_norm 1.8478 (2.2921) [2022-10-08 12:03:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][500/1251] eta 0:04:09 lr 0.000172 time 0.3249 (0.3329) loss 3.4935 (3.2188) grad_norm 2.4243 (2.3032) [2022-10-08 12:03:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][600/1251] eta 0:03:36 lr 0.000172 time 0.3239 (0.3318) loss 2.9733 (3.2191) grad_norm 2.0406 (2.2943) [2022-10-08 12:04:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][700/1251] eta 0:03:02 lr 0.000172 time 0.3281 (0.3311) loss 3.2992 (3.2241) grad_norm 2.2195 (2.3008) [2022-10-08 12:04:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][800/1251] eta 0:02:29 lr 0.000171 time 0.3188 (0.3308) loss 3.3117 (3.2261) grad_norm 2.5802 (2.3046) [2022-10-08 12:05:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][900/1251] eta 0:01:55 lr 0.000171 time 0.3235 (0.3305) loss 3.3638 (3.2296) grad_norm 2.2333 (2.3053) [2022-10-08 12:05:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][1000/1251] eta 0:01:22 lr 0.000171 time 0.3360 (0.3303) loss 3.1245 (3.2290) grad_norm 2.3295 (2.3017) [2022-10-08 12:06:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][1100/1251] eta 0:00:49 lr 0.000170 time 0.3213 (0.3302) loss 3.1020 (3.2272) grad_norm 2.2781 (2.2996) [2022-10-08 12:06:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [220/300][1200/1251] eta 0:00:16 lr 0.000170 time 0.3254 (0.3302) loss 3.3943 (3.2268) grad_norm 2.9247 (2.3020) [2022-10-08 12:07:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 220 training takes 0:06:53 [2022-10-08 12:07:12 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_220 saving...... [2022-10-08 12:07:13 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_220 saved !!! 
[2022-10-08 12:07:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.484 (2.484) Loss 0.9114 (0.9114) Acc@1 79.102 (79.102) Acc@5 94.922 (94.922) [2022-10-08 12:07:26 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.056 Acc@5 94.740 [2022-10-08 12:07:26 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-08 12:07:26 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 12:07:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][0/1251] eta 0:46:39 lr 0.000170 time 2.2378 (2.2378) loss 3.0914 (3.0914) grad_norm 2.4475 (2.4475) [2022-10-08 12:08:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][100/1251] eta 0:06:43 lr 0.000170 time 0.3258 (0.3502) loss 3.2048 (3.2301) grad_norm 2.6285 (2.3524) [2022-10-08 12:08:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][200/1251] eta 0:05:55 lr 0.000169 time 0.3286 (0.3380) loss 2.9408 (3.2326) grad_norm 2.4020 (2.3691) [2022-10-08 12:09:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][300/1251] eta 0:05:18 lr 0.000169 time 0.3272 (0.3344) loss 3.1014 (3.2170) grad_norm 2.1275 (2.3516) [2022-10-08 12:09:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][400/1251] eta 0:04:42 lr 0.000169 time 0.3275 (0.3321) loss 3.3527 (3.2171) grad_norm 2.4328 (2.3461) [2022-10-08 12:10:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][500/1251] eta 0:04:08 lr 0.000168 time 0.3200 (0.3307) loss 3.0489 (3.2160) grad_norm 2.4761 (2.3456) [2022-10-08 12:10:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][600/1251] eta 0:03:34 lr 0.000168 time 0.3279 (0.3297) loss 3.4734 (3.2181) grad_norm 2.2899 (2.3374) [2022-10-08 12:11:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][700/1251] eta 0:03:01 lr 0.000168 time 0.3217 (0.3291) loss 3.0106 (3.2186) grad_norm 2.1331 
(2.3419) [2022-10-08 12:11:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][800/1251] eta 0:02:28 lr 0.000168 time 0.3243 (0.3285) loss 3.5443 (3.2195) grad_norm 2.4391 (2.3345) [2022-10-08 12:12:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][900/1251] eta 0:01:55 lr 0.000167 time 0.3235 (0.3281) loss 3.0490 (3.2185) grad_norm 2.0237 (2.3320) [2022-10-08 12:12:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][1000/1251] eta 0:01:22 lr 0.000167 time 0.3273 (0.3278) loss 3.3624 (3.2190) grad_norm 2.2413 (2.3289) [2022-10-08 12:13:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][1100/1251] eta 0:00:49 lr 0.000167 time 0.3234 (0.3276) loss 3.1228 (3.2188) grad_norm 2.1994 (2.3254) [2022-10-08 12:14:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [221/300][1200/1251] eta 0:00:16 lr 0.000166 time 0.3328 (0.3275) loss 3.0117 (3.2178) grad_norm 2.1670 (2.3216) [2022-10-08 12:14:16 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 221 training takes 0:06:50 [2022-10-08 12:14:20 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.172 (3.172) Loss 0.8256 (0.8256) Acc@1 81.543 (81.543) Acc@5 94.629 (94.629) [2022-10-08 12:14:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.234 Acc@5 94.780 [2022-10-08 12:14:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-08 12:14:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 12:14:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][0/1251] eta 1:06:31 lr 0.000166 time 3.1905 (3.1905) loss 3.2038 (3.2038) grad_norm 2.3240 (2.3240) [2022-10-08 12:15:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][100/1251] eta 0:06:49 lr 0.000166 time 0.3344 (0.3555) loss 3.2762 (3.1791) grad_norm 2.3194 (2.3493) [2022-10-08 12:15:39 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [222/300][200/1251] eta 0:05:58 lr 0.000166 time 0.3257 (0.3412) loss 3.2289 (3.2068) grad_norm 2.8700 (2.3228) [2022-10-08 12:16:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][300/1251] eta 0:05:19 lr 0.000165 time 0.3218 (0.3363) loss 3.1682 (3.2185) grad_norm 1.9494 (2.3254) [2022-10-08 12:16:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][400/1251] eta 0:04:43 lr 0.000165 time 0.3278 (0.3335) loss 3.2092 (3.2170) grad_norm 2.3242 (2.3335) [2022-10-08 12:17:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][500/1251] eta 0:04:09 lr 0.000165 time 0.3268 (0.3321) loss 3.1071 (3.2183) grad_norm 2.1997 (2.3370) [2022-10-08 12:17:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][600/1251] eta 0:03:35 lr 0.000164 time 0.3246 (0.3312) loss 3.4073 (3.2129) grad_norm 1.9755 (2.3301) [2022-10-08 12:18:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][700/1251] eta 0:03:02 lr 0.000164 time 0.3334 (0.3307) loss 3.2676 (3.2146) grad_norm 2.0771 (2.3276) [2022-10-08 12:18:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][800/1251] eta 0:02:29 lr 0.000164 time 0.3347 (0.3305) loss 3.2865 (3.2187) grad_norm 2.0372 (2.3254) [2022-10-08 12:19:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][900/1251] eta 0:01:55 lr 0.000163 time 0.3319 (0.3304) loss 3.1782 (3.2198) grad_norm 2.4934 (2.3186) [2022-10-08 12:20:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][1000/1251] eta 0:01:22 lr 0.000163 time 0.3341 (0.3303) loss 3.0091 (3.2184) grad_norm 2.0610 (2.3197) [2022-10-08 12:20:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][1100/1251] eta 0:00:49 lr 0.000163 time 0.3332 (0.3303) loss 3.4902 (3.2160) grad_norm 2.3673 (2.3204) [2022-10-08 12:21:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [222/300][1200/1251] eta 0:00:16 lr 0.000163 time 0.3342 (0.3303) loss 2.9865 
(3.2176) grad_norm 2.2940 (2.3219) [2022-10-08 12:21:24 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 222 training takes 0:06:53 [2022-10-08 12:21:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.659 (2.659) Loss 0.8257 (0.8257) Acc@1 79.492 (79.492) Acc@5 95.996 (95.996) [2022-10-08 12:21:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.134 Acc@5 94.708 [2022-10-08 12:21:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-08 12:21:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 12:21:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][0/1251] eta 0:47:10 lr 0.000162 time 2.2623 (2.2623) loss 2.9548 (2.9548) grad_norm 1.9387 (1.9387) [2022-10-08 12:22:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][100/1251] eta 0:06:45 lr 0.000162 time 0.3279 (0.3523) loss 3.2692 (3.1970) grad_norm 2.0480 (2.3323) [2022-10-08 12:22:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][200/1251] eta 0:05:56 lr 0.000162 time 0.3238 (0.3390) loss 3.2936 (3.1984) grad_norm 2.0594 (2.3818) [2022-10-08 12:23:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][300/1251] eta 0:05:18 lr 0.000161 time 0.3261 (0.3345) loss 3.3999 (3.2026) grad_norm 2.1270 (2.3493) [2022-10-08 12:23:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][400/1251] eta 0:04:42 lr 0.000161 time 0.3278 (0.3323) loss 2.9688 (3.2051) grad_norm 2.1969 (2.3500) [2022-10-08 12:24:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][500/1251] eta 0:04:08 lr 0.000161 time 0.3222 (0.3310) loss 3.4610 (3.2082) grad_norm 2.9644 (2.3386) [2022-10-08 12:24:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][600/1251] eta 0:03:34 lr 0.000161 time 0.3244 (0.3302) loss 3.1622 (3.2069) grad_norm 1.7595 (2.3390) [2022-10-08 12:25:29 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][700/1251] eta 0:03:01 lr 0.000160 time 0.3241 (0.3295) loss 3.1798 (3.2101) grad_norm 2.2088 (2.3331) [2022-10-08 12:26:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][800/1251] eta 0:02:28 lr 0.000160 time 0.3224 (0.3289) loss 3.4222 (3.2097) grad_norm 2.3136 (2.3355) [2022-10-08 12:26:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][900/1251] eta 0:01:55 lr 0.000160 time 0.3275 (0.3286) loss 3.4787 (3.2129) grad_norm 2.5212 (2.3338) [2022-10-08 12:27:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][1000/1251] eta 0:01:22 lr 0.000159 time 0.3225 (0.3282) loss 3.2189 (3.2119) grad_norm 2.3097 (2.3319) [2022-10-08 12:27:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][1100/1251] eta 0:00:49 lr 0.000159 time 0.3342 (0.3279) loss 3.0073 (3.2104) grad_norm 2.2659 (2.3281) [2022-10-08 12:28:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [223/300][1200/1251] eta 0:00:16 lr 0.000159 time 0.3219 (0.3278) loss 3.0320 (3.2101) grad_norm 2.6652 (2.3324) [2022-10-08 12:28:28 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 223 training takes 0:06:50 [2022-10-08 12:28:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.298 (3.298) Loss 0.8802 (0.8802) Acc@1 79.395 (79.395) Acc@5 94.336 (94.336) [2022-10-08 12:28:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.246 Acc@5 94.802 [2022-10-08 12:28:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-08 12:28:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.27% [2022-10-08 12:28:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][0/1251] eta 1:08:46 lr 0.000159 time 3.2988 (3.2988) loss 2.9407 (2.9407) grad_norm 1.9818 (1.9818) [2022-10-08 12:29:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][100/1251] 
eta 0:06:50 lr 0.000158 time 0.3292 (0.3567) loss 3.1789 (3.1936) grad_norm 2.5982 (2.3389) [2022-10-08 12:29:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][200/1251] eta 0:06:00 lr 0.000158 time 0.3219 (0.3429) loss 3.2102 (3.1900) grad_norm 2.2367 (2.3398) [2022-10-08 12:30:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][300/1251] eta 0:05:20 lr 0.000158 time 0.3279 (0.3374) loss 3.2609 (3.1953) grad_norm 2.1849 (2.3409) [2022-10-08 12:30:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][400/1251] eta 0:04:44 lr 0.000157 time 0.3243 (0.3346) loss 3.3980 (3.1966) grad_norm 2.2452 (2.3495) [2022-10-08 12:31:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][500/1251] eta 0:04:10 lr 0.000157 time 0.3298 (0.3329) loss 3.2049 (3.1962) grad_norm 2.1990 (2.3640) [2022-10-08 12:32:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][600/1251] eta 0:03:35 lr 0.000157 time 0.3235 (0.3317) loss 3.5147 (3.1981) grad_norm 2.4337 (2.3781) [2022-10-08 12:32:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][700/1251] eta 0:03:02 lr 0.000157 time 0.3321 (0.3308) loss 3.4670 (3.2013) grad_norm 2.5893 (2.3694) [2022-10-08 12:33:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][800/1251] eta 0:02:28 lr 0.000156 time 0.3273 (0.3301) loss 2.7834 (3.2007) grad_norm 2.1725 (2.3910) [2022-10-08 12:33:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][900/1251] eta 0:01:55 lr 0.000156 time 0.3242 (0.3295) loss 3.0022 (3.2026) grad_norm 2.1083 (2.3808) [2022-10-08 12:34:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][1000/1251] eta 0:01:22 lr 0.000156 time 0.3234 (0.3292) loss 3.2342 (3.2042) grad_norm 1.9173 (2.3750) [2022-10-08 12:34:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][1100/1251] eta 0:00:49 lr 0.000155 time 0.3314 (0.3289) loss 3.4003 (3.2032) grad_norm 2.7864 (2.3844) 
[2022-10-08 12:35:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [224/300][1200/1251] eta 0:00:16 lr 0.000155 time 0.3270 (0.3287) loss 3.1313 (3.2052) grad_norm 2.4588 (2.3794) [2022-10-08 12:35:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 224 training takes 0:06:51 [2022-10-08 12:35:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.969 (2.969) Loss 0.8591 (0.8591) Acc@1 80.078 (80.078) Acc@5 95.215 (95.215) [2022-10-08 12:35:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.308 Acc@5 94.846 [2022-10-08 12:35:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-08 12:35:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.31% [2022-10-08 12:35:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][0/1251] eta 0:56:17 lr 0.000155 time 2.6997 (2.6997) loss 3.0312 (3.0312) grad_norm 3.0103 (3.0103) [2022-10-08 12:36:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][100/1251] eta 0:06:44 lr 0.000155 time 0.3230 (0.3516) loss 3.0563 (3.1850) grad_norm 2.1717 (2.3656) [2022-10-08 12:36:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][200/1251] eta 0:05:55 lr 0.000154 time 0.3243 (0.3385) loss 3.3055 (3.1779) grad_norm 1.9176 (2.3346) [2022-10-08 12:37:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][300/1251] eta 0:05:17 lr 0.000154 time 0.3245 (0.3342) loss 2.9618 (3.1818) grad_norm 2.0482 (2.3162) [2022-10-08 12:38:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][400/1251] eta 0:04:42 lr 0.000154 time 0.3249 (0.3319) loss 3.1755 (3.1781) grad_norm 2.1433 (2.3212) [2022-10-08 12:38:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][500/1251] eta 0:04:08 lr 0.000154 time 0.3271 (0.3307) loss 3.3295 (3.1819) grad_norm 2.4011 (2.3257) [2022-10-08 12:39:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[225/300][600/1251] eta 0:03:34 lr 0.000153 time 0.3239 (0.3300) loss 3.2415 (3.1831) grad_norm 2.1026 (2.3304) [2022-10-08 12:39:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][700/1251] eta 0:03:01 lr 0.000153 time 0.3248 (0.3294) loss 2.9051 (3.1853) grad_norm 2.0561 (2.3275) [2022-10-08 12:40:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][800/1251] eta 0:02:28 lr 0.000153 time 0.3296 (0.3291) loss 3.2077 (3.1898) grad_norm 2.2026 (2.3320) [2022-10-08 12:40:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][900/1251] eta 0:01:55 lr 0.000152 time 0.3239 (0.3288) loss 3.2869 (3.1927) grad_norm 2.3198 (2.3380) [2022-10-08 12:41:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][1000/1251] eta 0:01:22 lr 0.000152 time 0.3445 (0.3288) loss 3.3818 (3.1954) grad_norm 2.4717 (2.3416) [2022-10-08 12:41:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][1100/1251] eta 0:00:49 lr 0.000152 time 0.3487 (0.3289) loss 3.1181 (3.2001) grad_norm 1.9600 (2.3523) [2022-10-08 12:42:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [225/300][1200/1251] eta 0:00:16 lr 0.000151 time 0.3320 (0.3291) loss 3.2516 (3.2006) grad_norm 2.6679 (2.3552) [2022-10-08 12:42:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 225 training takes 0:06:52 [2022-10-08 12:42:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.761 (2.761) Loss 0.8715 (0.8715) Acc@1 80.762 (80.762) Acc@5 93.848 (93.848) [2022-10-08 12:42:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.190 Acc@5 94.790 [2022-10-08 12:42:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-08 12:42:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.31% [2022-10-08 12:42:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][0/1251] eta 0:59:02 lr 0.000151 time 2.8313 (2.8313) loss 3.1741 
(3.1741) grad_norm 2.1020 (2.1020) [2022-10-08 12:43:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][100/1251] eta 0:06:44 lr 0.000151 time 0.3242 (0.3511) loss 3.2393 (3.1707) grad_norm 3.2199 (2.4202) [2022-10-08 12:44:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][200/1251] eta 0:05:55 lr 0.000151 time 0.3259 (0.3379) loss 3.5142 (3.1796) grad_norm 2.2459 (2.3837) [2022-10-08 12:44:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][300/1251] eta 0:05:17 lr 0.000150 time 0.3240 (0.3336) loss 3.2331 (3.1847) grad_norm 2.4146 (2.3455) [2022-10-08 12:45:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][400/1251] eta 0:04:41 lr 0.000150 time 0.3288 (0.3313) loss 3.4252 (3.1915) grad_norm 2.4524 (2.3631) [2022-10-08 12:45:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][500/1251] eta 0:04:07 lr 0.000150 time 0.3229 (0.3299) loss 3.2322 (3.1950) grad_norm 2.3618 (2.3695) [2022-10-08 12:46:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][600/1251] eta 0:03:34 lr 0.000150 time 0.3232 (0.3290) loss 3.2207 (3.1969) grad_norm 2.1637 (2.3713) [2022-10-08 12:46:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][700/1251] eta 0:03:00 lr 0.000149 time 0.3227 (0.3283) loss 3.1745 (3.2041) grad_norm 2.1429 (2.3706) [2022-10-08 12:47:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][800/1251] eta 0:02:27 lr 0.000149 time 0.3249 (0.3278) loss 3.1825 (3.2040) grad_norm 2.3777 (2.3632) [2022-10-08 12:47:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][900/1251] eta 0:01:54 lr 0.000149 time 0.3229 (0.3274) loss 3.2987 (3.2071) grad_norm 2.1655 (2.3734) [2022-10-08 12:48:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][1000/1251] eta 0:01:22 lr 0.000148 time 0.3243 (0.3271) loss 3.1855 (3.2041) grad_norm 2.2712 (2.3710) [2022-10-08 12:48:53 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [226/300][1100/1251] eta 0:00:49 lr 0.000148 time 0.3307 (0.3269) loss 3.3391 (3.2046) grad_norm 2.4335 (2.3722) [2022-10-08 12:49:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [226/300][1200/1251] eta 0:00:16 lr 0.000148 time 0.3238 (0.3269) loss 3.1534 (3.2051) grad_norm 2.3123 (2.3789) [2022-10-08 12:49:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 226 training takes 0:06:49 [2022-10-08 12:49:45 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.081 (3.081) Loss 0.8284 (0.8284) Acc@1 80.176 (80.176) Acc@5 95.020 (95.020) [2022-10-08 12:49:56 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.338 Acc@5 94.854 [2022-10-08 12:49:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-08 12:49:56 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.34% [2022-10-08 12:49:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][0/1251] eta 0:47:37 lr 0.000148 time 2.2845 (2.2845) loss 3.2416 (3.2416) grad_norm 2.6041 (2.6041) [2022-10-08 12:50:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][100/1251] eta 0:06:47 lr 0.000147 time 0.3232 (0.3539) loss 3.3401 (3.1863) grad_norm 2.2426 (2.3287) [2022-10-08 12:51:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][200/1251] eta 0:05:57 lr 0.000147 time 0.3269 (0.3399) loss 3.1506 (3.1932) grad_norm 3.2061 (2.3731) [2022-10-08 12:51:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][300/1251] eta 0:05:18 lr 0.000147 time 0.3265 (0.3352) loss 3.3684 (3.1895) grad_norm 2.3949 (2.3885) [2022-10-08 12:52:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][400/1251] eta 0:04:43 lr 0.000147 time 0.3250 (0.3328) loss 3.3623 (3.1831) grad_norm 2.5873 (2.3755) [2022-10-08 12:52:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][500/1251] eta 0:04:08 lr 0.000146 time 0.3254 
(0.3314) loss 3.2482 (3.1898) grad_norm 2.0679 (2.3746) [2022-10-08 12:53:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][600/1251] eta 0:03:35 lr 0.000146 time 0.3219 (0.3304) loss 3.2615 (3.1877) grad_norm 2.4170 (2.3829) [2022-10-08 12:53:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][700/1251] eta 0:03:01 lr 0.000146 time 0.3262 (0.3296) loss 3.1330 (3.1915) grad_norm 2.3200 (2.3802) [2022-10-08 12:54:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][800/1251] eta 0:02:28 lr 0.000145 time 0.3203 (0.3291) loss 3.1130 (3.1934) grad_norm 2.3912 (2.3798) [2022-10-08 12:54:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][900/1251] eta 0:01:55 lr 0.000145 time 0.3253 (0.3287) loss 3.1753 (3.1983) grad_norm 2.0662 (2.3796) [2022-10-08 12:55:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][1000/1251] eta 0:01:22 lr 0.000145 time 0.3230 (0.3283) loss 3.4185 (3.1961) grad_norm 2.2839 (2.3847) [2022-10-08 12:55:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][1100/1251] eta 0:00:49 lr 0.000145 time 0.3270 (0.3281) loss 3.0228 (3.1960) grad_norm 2.4661 (2.3830) [2022-10-08 12:56:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [227/300][1200/1251] eta 0:00:16 lr 0.000144 time 0.3256 (0.3281) loss 3.1314 (3.1957) grad_norm 3.4789 (2.3773) [2022-10-08 12:56:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 227 training takes 0:06:50 [2022-10-08 12:56:50 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.094 (3.094) Loss 0.9208 (0.9208) Acc@1 78.906 (78.906) Acc@5 94.434 (94.434) [2022-10-08 12:57:00 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.488 Acc@5 94.836 [2022-10-08 12:57:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-08 12:57:00 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.49% [2022-10-08 12:57:03 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][0/1251] eta 0:55:49 lr 0.000144 time 2.6773 (2.6773) loss 3.3201 (3.3201) grad_norm 3.0511 (3.0511) [2022-10-08 12:57:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][100/1251] eta 0:06:44 lr 0.000144 time 0.3223 (0.3513) loss 3.2296 (3.1792) grad_norm 2.0862 (2.4108) [2022-10-08 12:58:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][200/1251] eta 0:05:55 lr 0.000144 time 0.3218 (0.3384) loss 3.0498 (3.1716) grad_norm 2.4131 (2.3843) [2022-10-08 12:58:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][300/1251] eta 0:05:17 lr 0.000143 time 0.3231 (0.3340) loss 3.2082 (3.1948) grad_norm 2.2638 (2.3813) [2022-10-08 12:59:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][400/1251] eta 0:04:42 lr 0.000143 time 0.3237 (0.3320) loss 3.2048 (3.1910) grad_norm 2.5625 (2.3844) [2022-10-08 12:59:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][500/1251] eta 0:04:08 lr 0.000143 time 0.3248 (0.3308) loss 3.2542 (3.1870) grad_norm 2.3858 (2.3763) [2022-10-08 13:00:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][600/1251] eta 0:03:34 lr 0.000142 time 0.3225 (0.3299) loss 3.0811 (3.1878) grad_norm 2.2163 (2.3755) [2022-10-08 13:00:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][700/1251] eta 0:03:01 lr 0.000142 time 0.3254 (0.3293) loss 3.0808 (3.1914) grad_norm 2.3846 (2.3844) [2022-10-08 13:01:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][800/1251] eta 0:02:28 lr 0.000142 time 0.3275 (0.3288) loss 3.0902 (3.1895) grad_norm 2.3847 (2.3839) [2022-10-08 13:01:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][900/1251] eta 0:01:55 lr 0.000142 time 0.3246 (0.3284) loss 3.0871 (3.1864) grad_norm 2.8418 (2.3889) [2022-10-08 13:02:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][1000/1251] eta 0:01:22 lr 0.000141 
time 0.3320 (0.3282) loss 2.9189 (3.1849) grad_norm 2.4188 (2.3894) [2022-10-08 13:03:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][1100/1251] eta 0:00:49 lr 0.000141 time 0.3258 (0.3281) loss 3.1544 (3.1869) grad_norm 2.1492 (2.3931) [2022-10-08 13:03:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [228/300][1200/1251] eta 0:00:16 lr 0.000141 time 0.3238 (0.3281) loss 3.2593 (3.1894) grad_norm 2.1720 (2.3988) [2022-10-08 13:03:51 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 228 training takes 0:06:50 [2022-10-08 13:03:55 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.271 (3.271) Loss 0.8720 (0.8720) Acc@1 79.883 (79.883) Acc@5 95.312 (95.312) [2022-10-08 13:04:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.460 Acc@5 94.854 [2022-10-08 13:04:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-08 13:04:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.49% [2022-10-08 13:04:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][0/1251] eta 1:01:07 lr 0.000141 time 2.9313 (2.9313) loss 3.0904 (3.0904) grad_norm 2.2967 (2.2967) [2022-10-08 13:04:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][100/1251] eta 0:06:48 lr 0.000140 time 0.3255 (0.3546) loss 3.3756 (3.1953) grad_norm 2.7421 (2.4239) [2022-10-08 13:05:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][200/1251] eta 0:05:58 lr 0.000140 time 0.3236 (0.3408) loss 3.0127 (3.1730) grad_norm 2.1770 (2.3991) [2022-10-08 13:05:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][300/1251] eta 0:05:19 lr 0.000140 time 0.3313 (0.3364) loss 3.2354 (3.1724) grad_norm 2.6833 (2.4599) [2022-10-08 13:06:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][400/1251] eta 0:04:44 lr 0.000140 time 0.3240 (0.3340) loss 3.5061 (3.1690) grad_norm 2.4176 (2.4577) [2022-10-08 
13:06:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][500/1251] eta 0:04:09 lr 0.000139 time 0.3309 (0.3326) loss 3.3382 (3.1758) grad_norm 2.3626 (2.4540) [2022-10-08 13:07:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][600/1251] eta 0:03:35 lr 0.000139 time 0.3272 (0.3316) loss 2.9065 (3.1795) grad_norm 2.1011 (2.4414) [2022-10-08 13:07:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][700/1251] eta 0:03:02 lr 0.000139 time 0.3265 (0.3309) loss 3.1262 (3.1803) grad_norm 2.2385 (2.4407) [2022-10-08 13:08:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][800/1251] eta 0:02:28 lr 0.000138 time 0.3239 (0.3302) loss 3.3713 (3.1793) grad_norm 2.6493 (2.4385) [2022-10-08 13:09:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][900/1251] eta 0:01:55 lr 0.000138 time 0.3324 (0.3297) loss 3.3989 (3.1797) grad_norm 2.1057 (2.4386) [2022-10-08 13:09:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][1000/1251] eta 0:01:22 lr 0.000138 time 0.3236 (0.3294) loss 3.3170 (3.1798) grad_norm 2.2328 (2.4373) [2022-10-08 13:10:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][1100/1251] eta 0:00:49 lr 0.000138 time 0.3302 (0.3291) loss 3.1862 (3.1803) grad_norm 2.5132 (2.4342) [2022-10-08 13:10:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [229/300][1200/1251] eta 0:00:16 lr 0.000137 time 0.3343 (0.3291) loss 3.3154 (3.1833) grad_norm 2.5074 (2.4340) [2022-10-08 13:10:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 229 training takes 0:06:52 [2022-10-08 13:11:00 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.238 (3.238) Loss 0.9011 (0.9011) Acc@1 78.711 (78.711) Acc@5 94.434 (94.434) [2022-10-08 13:11:11 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.430 Acc@5 94.756 [2022-10-08 13:11:11 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 79.4% [2022-10-08 13:11:11 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.49% [2022-10-08 13:11:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][0/1251] eta 1:01:48 lr 0.000137 time 2.9644 (2.9644) loss 3.1042 (3.1042) grad_norm 2.6776 (2.6776) [2022-10-08 13:11:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][100/1251] eta 0:06:49 lr 0.000137 time 0.3258 (0.3559) loss 3.1516 (3.1581) grad_norm 2.2711 (2.4189) [2022-10-08 13:12:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][200/1251] eta 0:05:59 lr 0.000137 time 0.3258 (0.3420) loss 3.4754 (3.1631) grad_norm 2.7652 (2.4630) [2022-10-08 13:12:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][300/1251] eta 0:05:21 lr 0.000136 time 0.3254 (0.3376) loss 2.9749 (3.1654) grad_norm 2.2879 (2.4480) [2022-10-08 13:13:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][400/1251] eta 0:04:45 lr 0.000136 time 0.3304 (0.3355) loss 3.1004 (3.1656) grad_norm 2.1202 (2.4350) [2022-10-08 13:13:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][500/1251] eta 0:04:11 lr 0.000136 time 0.3295 (0.3344) loss 3.4742 (3.1629) grad_norm 2.2718 (2.4258) [2022-10-08 13:14:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][600/1251] eta 0:03:37 lr 0.000135 time 0.3304 (0.3338) loss 3.3579 (3.1685) grad_norm 2.2636 (2.4545) [2022-10-08 13:15:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][700/1251] eta 0:03:03 lr 0.000135 time 0.3307 (0.3334) loss 3.2740 (3.1669) grad_norm 2.1906 (2.4573) [2022-10-08 13:15:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][800/1251] eta 0:02:30 lr 0.000135 time 0.3311 (0.3332) loss 3.1215 (3.1662) grad_norm 2.5098 (2.4495) [2022-10-08 13:16:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][900/1251] eta 0:01:56 lr 0.000135 time 0.3281 (0.3331) loss 3.2819 (3.1659) grad_norm 2.5612 
(2.4449) [2022-10-08 13:16:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][1000/1251] eta 0:01:23 lr 0.000134 time 0.3318 (0.3331) loss 3.3637 (3.1665) grad_norm 2.2181 (2.4424) [2022-10-08 13:17:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][1100/1251] eta 0:00:50 lr 0.000134 time 0.3305 (0.3333) loss 3.0505 (3.1657) grad_norm 2.2544 (2.4376) [2022-10-08 13:17:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [230/300][1200/1251] eta 0:00:16 lr 0.000134 time 0.3354 (0.3333) loss 3.3876 (3.1677) grad_norm 2.2562 (2.4414) [2022-10-08 13:18:08 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 230 training takes 0:06:57 [2022-10-08 13:18:08 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_230 saving...... [2022-10-08 13:18:09 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_230 saved !!! [2022-10-08 13:18:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.592 (2.592) Loss 0.8664 (0.8664) Acc@1 80.859 (80.859) Acc@5 94.238 (94.238) [2022-10-08 13:18:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.614 Acc@5 94.830 [2022-10-08 13:18:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-08 13:18:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.61% [2022-10-08 13:18:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][0/1251] eta 0:59:24 lr 0.000134 time 2.8496 (2.8496) loss 3.2328 (3.2328) grad_norm 2.3736 (2.3736) [2022-10-08 13:18:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][100/1251] eta 0:06:48 lr 0.000133 time 0.3290 (0.3546) loss 3.3939 (3.1819) grad_norm 2.4122 (2.4499) [2022-10-08 13:19:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][200/1251] eta 0:05:58 lr 0.000133 time 0.3262 (0.3415) loss 3.1366 
(3.1787) grad_norm 2.3638 (2.4768) [2022-10-08 13:20:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][300/1251] eta 0:05:20 lr 0.000133 time 0.3280 (0.3372) loss 2.9587 (3.1761) grad_norm 2.2481 (2.4677) [2022-10-08 13:20:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][400/1251] eta 0:04:45 lr 0.000133 time 0.3252 (0.3350) loss 2.9395 (3.1784) grad_norm 2.2952 (2.4554) [2022-10-08 13:21:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][500/1251] eta 0:04:10 lr 0.000132 time 0.3318 (0.3336) loss 3.3711 (3.1735) grad_norm 2.4234 (2.4739) [2022-10-08 13:21:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][600/1251] eta 0:03:36 lr 0.000132 time 0.3327 (0.3326) loss 3.1579 (3.1756) grad_norm 2.1484 (2.4678) [2022-10-08 13:22:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][700/1251] eta 0:03:02 lr 0.000132 time 0.3250 (0.3319) loss 3.1115 (3.1753) grad_norm 2.1147 (2.4668) [2022-10-08 13:22:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][800/1251] eta 0:02:29 lr 0.000132 time 0.3255 (0.3314) loss 3.4033 (3.1749) grad_norm 2.6486 (2.4657) [2022-10-08 13:23:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][900/1251] eta 0:01:56 lr 0.000131 time 0.3277 (0.3311) loss 3.1054 (3.1736) grad_norm 2.2770 (2.4648) [2022-10-08 13:23:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][1000/1251] eta 0:01:23 lr 0.000131 time 0.3308 (0.3308) loss 3.0418 (3.1722) grad_norm 2.3921 (2.4737) [2022-10-08 13:24:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][1100/1251] eta 0:00:49 lr 0.000131 time 0.3277 (0.3307) loss 2.9688 (3.1711) grad_norm 2.6167 (2.4682) [2022-10-08 13:24:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [231/300][1200/1251] eta 0:00:16 lr 0.000130 time 0.3295 (0.3306) loss 3.0134 (3.1720) grad_norm 1.9871 (2.4630) [2022-10-08 13:25:16 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 231 training takes 0:06:54 [2022-10-08 13:25:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.261 (3.261) Loss 0.8509 (0.8509) Acc@1 78.809 (78.809) Acc@5 95.215 (95.215) [2022-10-08 13:25:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.596 Acc@5 94.842 [2022-10-08 13:25:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-08 13:25:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.61% [2022-10-08 13:25:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][0/1251] eta 1:05:50 lr 0.000130 time 3.1576 (3.1576) loss 2.9531 (2.9531) grad_norm 2.3146 (2.3146) [2022-10-08 13:26:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][100/1251] eta 0:06:48 lr 0.000130 time 0.3295 (0.3546) loss 3.0186 (3.1531) grad_norm 2.2153 (2.4323) [2022-10-08 13:26:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][200/1251] eta 0:05:58 lr 0.000130 time 0.3259 (0.3407) loss 3.4921 (3.1506) grad_norm 3.0837 (2.4204) [2022-10-08 13:27:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][300/1251] eta 0:05:19 lr 0.000129 time 0.3212 (0.3362) loss 2.8767 (3.1535) grad_norm 2.4578 (2.4258) [2022-10-08 13:27:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][400/1251] eta 0:04:43 lr 0.000129 time 0.3282 (0.3336) loss 3.0631 (3.1559) grad_norm 2.6736 (2.4372) [2022-10-08 13:28:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][500/1251] eta 0:04:09 lr 0.000129 time 0.3264 (0.3319) loss 2.9641 (3.1573) grad_norm 2.2645 (2.4436) [2022-10-08 13:28:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][600/1251] eta 0:03:35 lr 0.000129 time 0.3247 (0.3308) loss 3.2457 (3.1618) grad_norm 2.7057 (2.4567) [2022-10-08 13:29:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][700/1251] eta 0:03:01 lr 0.000128 time 0.3309 
(0.3300) loss 3.2526 (3.1640) grad_norm 2.3721 (2.4535) [2022-10-08 13:29:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][800/1251] eta 0:02:28 lr 0.000128 time 0.3258 (0.3295) loss 2.8598 (3.1627) grad_norm 2.1750 (2.4510) [2022-10-08 13:30:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][900/1251] eta 0:01:55 lr 0.000128 time 0.3229 (0.3290) loss 3.1387 (3.1641) grad_norm 2.3619 (2.4532) [2022-10-08 13:30:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][1000/1251] eta 0:01:22 lr 0.000128 time 0.3185 (0.3286) loss 3.2497 (3.1653) grad_norm 2.6667 (2.4518) [2022-10-08 13:31:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][1100/1251] eta 0:00:49 lr 0.000127 time 0.3241 (0.3285) loss 3.0453 (3.1664) grad_norm 2.6234 (2.4655) [2022-10-08 13:32:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [232/300][1200/1251] eta 0:00:16 lr 0.000127 time 0.3216 (0.3282) loss 3.2360 (3.1665) grad_norm 2.3176 (2.4741) [2022-10-08 13:32:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 232 training takes 0:06:50 [2022-10-08 13:32:23 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.627 (2.627) Loss 0.8800 (0.8800) Acc@1 79.199 (79.199) Acc@5 95.605 (95.605) [2022-10-08 13:32:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.766 Acc@5 94.970 [2022-10-08 13:32:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-08 13:32:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.77% [2022-10-08 13:32:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][0/1251] eta 0:49:12 lr 0.000127 time 2.3601 (2.3601) loss 3.1828 (3.1828) grad_norm 2.2147 (2.2147) [2022-10-08 13:33:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][100/1251] eta 0:06:44 lr 0.000127 time 0.3217 (0.3515) loss 3.3796 (3.1557) grad_norm 2.3292 (2.5366) [2022-10-08 13:33:43 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][200/1251] eta 0:05:55 lr 0.000126 time 0.3296 (0.3384) loss 3.1712 (3.1585) grad_norm 2.5687 (2.4919) [2022-10-08 13:34:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][300/1251] eta 0:05:17 lr 0.000126 time 0.3287 (0.3342) loss 3.1798 (3.1707) grad_norm 2.8047 (2.5107) [2022-10-08 13:34:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][400/1251] eta 0:04:42 lr 0.000126 time 0.3238 (0.3320) loss 3.4306 (3.1727) grad_norm 2.6438 (2.5091) [2022-10-08 13:35:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][500/1251] eta 0:04:08 lr 0.000126 time 0.3262 (0.3307) loss 3.1455 (3.1710) grad_norm 2.2942 (2.5052) [2022-10-08 13:35:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][600/1251] eta 0:03:34 lr 0.000125 time 0.3249 (0.3301) loss 3.5072 (3.1661) grad_norm 2.7941 (2.5097) [2022-10-08 13:36:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][700/1251] eta 0:03:01 lr 0.000125 time 0.3309 (0.3299) loss 3.2592 (3.1678) grad_norm 2.4998 (2.5005) [2022-10-08 13:36:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][800/1251] eta 0:02:28 lr 0.000125 time 0.3406 (0.3299) loss 3.2099 (3.1637) grad_norm 2.3654 (2.4985) [2022-10-08 13:37:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][900/1251] eta 0:01:55 lr 0.000125 time 0.3266 (0.3300) loss 2.9693 (3.1634) grad_norm 2.8936 (2.4980) [2022-10-08 13:38:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][1000/1251] eta 0:01:22 lr 0.000124 time 0.3336 (0.3300) loss 2.9141 (3.1663) grad_norm 2.2244 (2.5010) [2022-10-08 13:38:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][1100/1251] eta 0:00:49 lr 0.000124 time 0.3394 (0.3301) loss 3.1811 (3.1661) grad_norm 2.5619 (2.4976) [2022-10-08 13:39:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [233/300][1200/1251] eta 0:00:16 lr 
0.000124 time 0.3331 (0.3302) loss 3.3015 (3.1693) grad_norm 2.8815 (2.5023) [2022-10-08 13:39:28 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 233 training takes 0:06:53 [2022-10-08 13:39:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.107 (3.107) Loss 0.9036 (0.9036) Acc@1 79.395 (79.395) Acc@5 94.336 (94.336) [2022-10-08 13:39:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.874 Acc@5 94.982 [2022-10-08 13:39:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-08 13:39:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.87% [2022-10-08 13:39:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][0/1251] eta 0:55:21 lr 0.000124 time 2.6553 (2.6553) loss 3.0852 (3.0852) grad_norm 2.2300 (2.2300) [2022-10-08 13:40:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][100/1251] eta 0:06:44 lr 0.000123 time 0.3242 (0.3513) loss 2.9817 (3.1539) grad_norm 2.4198 (2.5967) [2022-10-08 13:40:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][200/1251] eta 0:05:57 lr 0.000123 time 0.3360 (0.3398) loss 2.8240 (3.1650) grad_norm 2.2429 (2.5167) [2022-10-08 13:41:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][300/1251] eta 0:05:19 lr 0.000123 time 0.3236 (0.3357) loss 3.1602 (3.1638) grad_norm 2.5895 (2.5308) [2022-10-08 13:41:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][400/1251] eta 0:04:43 lr 0.000123 time 0.3282 (0.3335) loss 3.1119 (3.1633) grad_norm 3.0838 (2.5284) [2022-10-08 13:42:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][500/1251] eta 0:04:09 lr 0.000122 time 0.3234 (0.3322) loss 3.3006 (3.1579) grad_norm 2.1462 (2.5206) [2022-10-08 13:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][600/1251] eta 0:03:35 lr 0.000122 time 0.3325 (0.3312) loss 3.1076 (3.1609) grad_norm 2.4107 (2.5126) 
[2022-10-08 13:43:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][700/1251] eta 0:03:02 lr 0.000122 time 0.3228 (0.3305) loss 2.9452 (3.1645) grad_norm 2.7681 (2.5098) [2022-10-08 13:44:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][800/1251] eta 0:02:28 lr 0.000121 time 0.3272 (0.3299) loss 3.0944 (3.1668) grad_norm 2.6819 (2.5150) [2022-10-08 13:44:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][900/1251] eta 0:01:55 lr 0.000121 time 0.3223 (0.3295) loss 2.9502 (3.1649) grad_norm 2.4430 (2.5124) [2022-10-08 13:45:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][1000/1251] eta 0:01:22 lr 0.000121 time 0.3222 (0.3290) loss 3.2077 (3.1610) grad_norm 2.7107 (2.5193) [2022-10-08 13:45:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][1100/1251] eta 0:00:49 lr 0.000121 time 0.3230 (0.3287) loss 3.1854 (3.1621) grad_norm 2.2044 (2.5203) [2022-10-08 13:46:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [234/300][1200/1251] eta 0:00:16 lr 0.000120 time 0.3262 (0.3285) loss 3.1256 (3.1619) grad_norm 2.9177 (2.5223) [2022-10-08 13:46:34 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 234 training takes 0:06:51 [2022-10-08 13:46:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.684 (2.684) Loss 0.8546 (0.8546) Acc@1 80.371 (80.371) Acc@5 95.117 (95.117) [2022-10-08 13:46:47 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.688 Acc@5 95.038 [2022-10-08 13:46:47 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-08 13:46:47 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.87% [2022-10-08 13:46:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][0/1251] eta 1:07:27 lr 0.000120 time 3.2350 (3.2350) loss 3.3214 (3.3214) grad_norm 2.5596 (2.5596) [2022-10-08 13:47:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[235/300][100/1251] eta 0:06:50 lr 0.000120 time 0.3268 (0.3564) loss 3.0653 (3.1510) grad_norm 2.7807 (2.5047) [2022-10-08 13:47:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][200/1251] eta 0:05:59 lr 0.000120 time 0.3289 (0.3422) loss 3.1513 (3.1500) grad_norm 2.1650 (2.4889) [2022-10-08 13:48:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][300/1251] eta 0:05:20 lr 0.000120 time 0.3266 (0.3372) loss 3.4319 (3.1536) grad_norm 2.6192 (2.4966) [2022-10-08 13:49:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][400/1251] eta 0:04:44 lr 0.000119 time 0.3346 (0.3348) loss 3.3006 (3.1509) grad_norm 2.7439 (2.5040) [2022-10-08 13:49:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][500/1251] eta 0:04:10 lr 0.000119 time 0.3264 (0.3332) loss 2.9243 (3.1448) grad_norm 5.7564 (2.5209) [2022-10-08 13:50:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][600/1251] eta 0:03:36 lr 0.000119 time 0.3282 (0.3321) loss 3.1842 (3.1439) grad_norm 2.7441 (2.5148) [2022-10-08 13:50:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][700/1251] eta 0:03:02 lr 0.000118 time 0.3295 (0.3313) loss 3.1927 (3.1425) grad_norm 3.2685 (2.5166) [2022-10-08 13:51:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][800/1251] eta 0:02:29 lr 0.000118 time 0.3249 (0.3307) loss 2.9657 (3.1402) grad_norm 2.1937 (2.5307) [2022-10-08 13:51:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][900/1251] eta 0:01:55 lr 0.000118 time 0.3263 (0.3301) loss 3.3093 (3.1442) grad_norm 2.7886 (2.5391) [2022-10-08 13:52:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][1000/1251] eta 0:01:22 lr 0.000118 time 0.3257 (0.3298) loss 3.2486 (3.1461) grad_norm 2.4175 (2.5435) [2022-10-08 13:52:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][1100/1251] eta 0:00:49 lr 0.000117 time 0.3260 (0.3293) loss 3.0170 (3.1491) grad_norm 
2.2729 (2.5441) [2022-10-08 13:53:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [235/300][1200/1251] eta 0:00:16 lr 0.000117 time 0.3259 (0.3292) loss 3.1731 (3.1496) grad_norm 2.7699 (2.5430) [2022-10-08 13:53:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 235 training takes 0:06:52 [2022-10-08 13:53:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.109 (3.109) Loss 0.7849 (0.7849) Acc@1 80.566 (80.566) Acc@5 95.801 (95.801) [2022-10-08 13:53:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.752 Acc@5 95.000 [2022-10-08 13:53:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-08 13:53:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.87% [2022-10-08 13:53:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][0/1251] eta 1:02:14 lr 0.000117 time 2.9853 (2.9853) loss 3.3966 (3.3966) grad_norm 2.6426 (2.6426) [2022-10-08 13:54:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][100/1251] eta 0:06:46 lr 0.000117 time 0.3226 (0.3530) loss 3.0984 (3.1305) grad_norm 2.0522 (2.5931) [2022-10-08 13:55:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][200/1251] eta 0:05:57 lr 0.000117 time 0.3344 (0.3398) loss 3.2976 (3.1296) grad_norm 2.2374 (2.5737) [2022-10-08 13:55:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][300/1251] eta 0:05:18 lr 0.000116 time 0.3258 (0.3351) loss 3.3337 (3.1396) grad_norm 2.5021 (2.5814) [2022-10-08 13:56:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][400/1251] eta 0:04:43 lr 0.000116 time 0.3219 (0.3328) loss 3.4706 (3.1483) grad_norm 2.6891 (2.5510) [2022-10-08 13:56:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][500/1251] eta 0:04:08 lr 0.000116 time 0.3243 (0.3314) loss 2.9226 (3.1524) grad_norm 2.5064 (2.5653) [2022-10-08 13:57:12 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [236/300][600/1251] eta 0:03:35 lr 0.000116 time 0.3268 (0.3304) loss 3.4920 (3.1493) grad_norm 2.8566 (2.5606) [2022-10-08 13:57:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][700/1251] eta 0:03:01 lr 0.000115 time 0.3245 (0.3299) loss 2.8249 (3.1462) grad_norm 2.3397 (2.5592) [2022-10-08 13:58:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][800/1251] eta 0:02:28 lr 0.000115 time 0.3260 (0.3296) loss 3.0216 (3.1454) grad_norm 2.1226 (2.5618) [2022-10-08 13:58:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][900/1251] eta 0:01:55 lr 0.000115 time 0.3327 (0.3294) loss 3.1312 (3.1446) grad_norm 3.0947 (2.5622) [2022-10-08 13:59:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][1000/1251] eta 0:01:22 lr 0.000115 time 0.3283 (0.3294) loss 3.4081 (3.1478) grad_norm 2.5846 (2.5658) [2022-10-08 13:59:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][1100/1251] eta 0:00:49 lr 0.000114 time 0.3231 (0.3294) loss 3.2746 (3.1491) grad_norm 2.5411 (2.5695) [2022-10-08 14:00:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [236/300][1200/1251] eta 0:00:16 lr 0.000114 time 0.3307 (0.3295) loss 3.1099 (3.1494) grad_norm 2.2470 (2.5667) [2022-10-08 14:00:46 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 236 training takes 0:06:52 [2022-10-08 14:00:49 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.860 (2.860) Loss 0.8319 (0.8319) Acc@1 79.688 (79.688) Acc@5 94.727 (94.727) [2022-10-08 14:01:00 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.740 Acc@5 95.002 [2022-10-08 14:01:00 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-08 14:01:00 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.87% [2022-10-08 14:01:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][0/1251] eta 0:45:58 lr 0.000114 time 2.2048 
(2.2048) loss 3.3922 (3.3922) grad_norm 2.8203 (2.8203) [2022-10-08 14:01:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][100/1251] eta 0:06:44 lr 0.000114 time 0.3212 (0.3516) loss 2.7729 (3.1182) grad_norm 2.6977 (2.5815) [2022-10-08 14:02:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][200/1251] eta 0:05:56 lr 0.000113 time 0.3307 (0.3389) loss 2.8479 (3.1232) grad_norm 2.6421 (2.5595) [2022-10-08 14:02:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][300/1251] eta 0:05:18 lr 0.000113 time 0.3285 (0.3346) loss 3.2590 (3.1285) grad_norm 2.6546 (2.5483) [2022-10-08 14:03:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][400/1251] eta 0:04:42 lr 0.000113 time 0.3214 (0.3324) loss 3.1274 (3.1296) grad_norm 2.2852 (2.5719) [2022-10-08 14:03:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][500/1251] eta 0:04:08 lr 0.000113 time 0.3258 (0.3310) loss 3.2687 (3.1346) grad_norm 2.6071 (2.5703) [2022-10-08 14:04:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][600/1251] eta 0:03:34 lr 0.000112 time 0.3224 (0.3301) loss 2.7591 (3.1391) grad_norm 2.5025 (2.5987) [2022-10-08 14:04:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][700/1251] eta 0:03:01 lr 0.000112 time 0.3268 (0.3295) loss 3.2565 (3.1392) grad_norm 3.3348 (2.6030) [2022-10-08 14:05:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][800/1251] eta 0:02:28 lr 0.000112 time 0.3268 (0.3289) loss 3.3252 (3.1416) grad_norm 3.3235 (2.6037) [2022-10-08 14:05:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][900/1251] eta 0:01:55 lr 0.000112 time 0.3252 (0.3285) loss 3.0044 (3.1422) grad_norm 2.6995 (2.6025) [2022-10-08 14:06:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][1000/1251] eta 0:01:22 lr 0.000111 time 0.3251 (0.3281) loss 3.0576 (3.1413) grad_norm 2.2839 (2.6007) [2022-10-08 14:07:01 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][1100/1251] eta 0:00:49 lr 0.000111 time 0.3264 (0.3280) loss 3.1095 (3.1410) grad_norm 3.6581 (2.5974) [2022-10-08 14:07:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [237/300][1200/1251] eta 0:00:16 lr 0.000111 time 0.3221 (0.3279) loss 3.1085 (3.1425) grad_norm 2.2249 (2.6001) [2022-10-08 14:07:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 237 training takes 0:06:50 [2022-10-08 14:07:53 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.512 (2.512) Loss 0.8054 (0.8054) Acc@1 80.273 (80.273) Acc@5 95.898 (95.898) [2022-10-08 14:08:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.846 Acc@5 95.010 [2022-10-08 14:08:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-08 14:08:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.87% [2022-10-08 14:08:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][0/1251] eta 1:03:47 lr 0.000111 time 3.0598 (3.0598) loss 3.1703 (3.1703) grad_norm 2.6667 (2.6667) [2022-10-08 14:08:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][100/1251] eta 0:06:47 lr 0.000110 time 0.3285 (0.3538) loss 3.2068 (3.1434) grad_norm 3.0002 (2.6232) [2022-10-08 14:09:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][200/1251] eta 0:05:57 lr 0.000110 time 0.3255 (0.3402) loss 2.9386 (3.1350) grad_norm 2.8239 (2.6282) [2022-10-08 14:09:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][300/1251] eta 0:05:18 lr 0.000110 time 0.3297 (0.3353) loss 3.0044 (3.1385) grad_norm 2.3589 (2.6487) [2022-10-08 14:10:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][400/1251] eta 0:04:43 lr 0.000110 time 0.3222 (0.3329) loss 3.0423 (3.1440) grad_norm 2.6526 (2.6262) [2022-10-08 14:10:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][500/1251] 
eta 0:04:08 lr 0.000109 time 0.3261 (0.3314) loss 3.1984 (3.1382) grad_norm 2.1159 (2.6281) [2022-10-08 14:11:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][600/1251] eta 0:03:35 lr 0.000109 time 0.3236 (0.3303) loss 3.0521 (3.1412) grad_norm 2.2832 (2.6132) [2022-10-08 14:11:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][700/1251] eta 0:03:01 lr 0.000109 time 0.3290 (0.3295) loss 3.1123 (3.1420) grad_norm 2.9021 (2.6080) [2022-10-08 14:12:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][800/1251] eta 0:02:28 lr 0.000109 time 0.3243 (0.3289) loss 3.2500 (3.1422) grad_norm 3.5098 (2.6124) [2022-10-08 14:13:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][900/1251] eta 0:01:55 lr 0.000108 time 0.3248 (0.3286) loss 2.9848 (3.1407) grad_norm 2.7048 (2.6076) [2022-10-08 14:13:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][1000/1251] eta 0:01:22 lr 0.000108 time 0.3246 (0.3282) loss 3.0054 (3.1395) grad_norm 2.8924 (2.6109) [2022-10-08 14:14:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][1100/1251] eta 0:00:49 lr 0.000108 time 0.3258 (0.3281) loss 3.3731 (3.1423) grad_norm 2.5355 (2.6178) [2022-10-08 14:14:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [238/300][1200/1251] eta 0:00:16 lr 0.000108 time 0.3296 (0.3281) loss 3.2386 (3.1466) grad_norm 2.2705 (2.6254) [2022-10-08 14:14:55 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 238 training takes 0:06:50 [2022-10-08 14:14:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.669 (2.669) Loss 0.8259 (0.8259) Acc@1 80.762 (80.762) Acc@5 95.020 (95.020) [2022-10-08 14:15:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.808 Acc@5 94.986 [2022-10-08 14:15:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-08 14:15:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 79.87% [2022-10-08 14:15:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][0/1251] eta 0:59:54 lr 0.000108 time 2.8736 (2.8736) loss 3.0546 (3.0546) grad_norm 2.3716 (2.3716) [2022-10-08 14:15:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][100/1251] eta 0:06:45 lr 0.000107 time 0.3291 (0.3526) loss 3.0145 (3.1103) grad_norm 2.4440 (2.6048) [2022-10-08 14:16:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][200/1251] eta 0:05:56 lr 0.000107 time 0.3241 (0.3392) loss 3.0082 (3.1405) grad_norm 2.2616 (2.6638) [2022-10-08 14:16:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][300/1251] eta 0:05:18 lr 0.000107 time 0.3258 (0.3350) loss 3.2187 (3.1366) grad_norm 2.5418 (2.6453) [2022-10-08 14:17:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][400/1251] eta 0:04:43 lr 0.000107 time 0.3231 (0.3329) loss 3.2065 (3.1386) grad_norm 3.4102 (2.6269) [2022-10-08 14:17:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][500/1251] eta 0:04:09 lr 0.000106 time 0.3305 (0.3318) loss 2.6683 (3.1385) grad_norm 2.2168 (2.6244) [2022-10-08 14:18:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][600/1251] eta 0:03:35 lr 0.000106 time 0.3215 (0.3309) loss 3.3079 (3.1430) grad_norm 2.4755 (2.6187) [2022-10-08 14:19:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][700/1251] eta 0:03:01 lr 0.000106 time 0.3293 (0.3303) loss 3.4057 (3.1408) grad_norm 2.4821 (2.6118) [2022-10-08 14:19:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][800/1251] eta 0:02:28 lr 0.000106 time 0.3245 (0.3298) loss 3.3834 (3.1403) grad_norm 2.5627 (2.6094) [2022-10-08 14:20:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][900/1251] eta 0:01:55 lr 0.000105 time 0.3278 (0.3294) loss 2.9784 (3.1403) grad_norm 2.6605 (2.6072) [2022-10-08 14:20:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[239/300][1000/1251] eta 0:01:22 lr 0.000105 time 0.3274 (0.3293) loss 3.0701 (3.1361) grad_norm 2.2938 (2.5994) [2022-10-08 14:21:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][1100/1251] eta 0:00:49 lr 0.000105 time 0.3332 (0.3292) loss 3.2267 (3.1376) grad_norm 2.5130 (2.5977) [2022-10-08 14:21:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [239/300][1200/1251] eta 0:00:16 lr 0.000105 time 0.3244 (0.3292) loss 3.1549 (3.1394) grad_norm 2.2763 (2.5983) [2022-10-08 14:22:01 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 239 training takes 0:06:52 [2022-10-08 14:22:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.580 (2.580) Loss 0.8716 (0.8716) Acc@1 78.906 (78.906) Acc@5 94.336 (94.336) [2022-10-08 14:22:14 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.982 Acc@5 94.944 [2022-10-08 14:22:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-08 14:22:14 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 79.98% [2022-10-08 14:22:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][0/1251] eta 0:56:18 lr 0.000105 time 2.7004 (2.7004) loss 3.0809 (3.0809) grad_norm 3.4741 (3.4741) [2022-10-08 14:22:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][100/1251] eta 0:06:42 lr 0.000104 time 0.3273 (0.3499) loss 3.1895 (3.1355) grad_norm 3.1820 (2.6845) [2022-10-08 14:23:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][200/1251] eta 0:05:55 lr 0.000104 time 0.3264 (0.3379) loss 2.9354 (3.1253) grad_norm 2.1935 (2.6639) [2022-10-08 14:23:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][300/1251] eta 0:05:17 lr 0.000104 time 0.3226 (0.3339) loss 3.2554 (3.1322) grad_norm 3.2156 (2.6376) [2022-10-08 14:24:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][400/1251] eta 0:04:42 lr 0.000104 time 0.3288 (0.3318) loss 3.3766 
(3.1350) grad_norm 2.5689 (2.6166) [2022-10-08 14:25:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][500/1251] eta 0:04:08 lr 0.000103 time 0.3235 (0.3304) loss 3.1240 (3.1332) grad_norm 3.6553 (2.6201) [2022-10-08 14:25:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][600/1251] eta 0:03:34 lr 0.000103 time 0.3243 (0.3295) loss 3.3551 (3.1346) grad_norm 2.5576 (2.6166) [2022-10-08 14:26:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][700/1251] eta 0:03:01 lr 0.000103 time 0.3230 (0.3289) loss 3.1017 (3.1325) grad_norm 2.7701 (2.6137) [2022-10-08 14:26:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][800/1251] eta 0:02:28 lr 0.000103 time 0.3247 (0.3285) loss 2.8250 (3.1338) grad_norm 2.4760 (2.6144) [2022-10-08 14:27:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][900/1251] eta 0:01:55 lr 0.000102 time 0.3277 (0.3282) loss 3.3154 (3.1392) grad_norm 2.3647 (2.6270) [2022-10-08 14:27:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][1000/1251] eta 0:01:22 lr 0.000102 time 0.3357 (0.3280) loss 2.9676 (3.1368) grad_norm 2.7679 (2.6339) [2022-10-08 14:28:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][1100/1251] eta 0:00:49 lr 0.000102 time 0.3244 (0.3280) loss 3.2401 (3.1364) grad_norm 3.1002 (2.6280) [2022-10-08 14:28:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [240/300][1200/1251] eta 0:00:16 lr 0.000102 time 0.3290 (0.3281) loss 3.0813 (3.1360) grad_norm 2.6295 (2.6304) [2022-10-08 14:29:05 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 240 training takes 0:06:50 [2022-10-08 14:29:05 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_240 saving...... [2022-10-08 14:29:05 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_240 saved !!! 
[2022-10-08 14:29:08 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.666 (2.666) Loss 0.8412 (0.8412) Acc@1 79.980 (79.980) Acc@5 94.922 (94.922) [2022-10-08 14:29:19 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.078 Acc@5 95.100 [2022-10-08 14:29:19 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-08 14:29:19 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.08% [2022-10-08 14:29:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][0/1251] eta 1:01:34 lr 0.000102 time 2.9531 (2.9531) loss 2.9348 (2.9348) grad_norm 2.4449 (2.4449) [2022-10-08 14:29:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][100/1251] eta 0:06:46 lr 0.000101 time 0.3280 (0.3535) loss 2.7398 (3.1113) grad_norm 2.4634 (2.6491) [2022-10-08 14:30:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][200/1251] eta 0:05:57 lr 0.000101 time 0.3280 (0.3399) loss 3.6662 (3.1175) grad_norm 2.9973 (2.6313) [2022-10-08 14:30:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][300/1251] eta 0:05:18 lr 0.000101 time 0.3243 (0.3353) loss 3.1021 (3.1191) grad_norm 2.4751 (2.6329) [2022-10-08 14:31:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][400/1251] eta 0:04:43 lr 0.000101 time 0.3239 (0.3329) loss 2.9295 (3.1223) grad_norm 2.6236 (2.6499) [2022-10-08 14:32:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][500/1251] eta 0:04:08 lr 0.000100 time 0.3269 (0.3314) loss 3.1299 (3.1228) grad_norm 2.9823 (2.6501) [2022-10-08 14:32:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][600/1251] eta 0:03:35 lr 0.000100 time 0.3271 (0.3303) loss 3.2561 (3.1210) grad_norm 2.5114 (2.6337) [2022-10-08 14:33:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][700/1251] eta 0:03:01 lr 0.000100 time 0.4306 (0.3296) loss 3.0329 (3.1231) grad_norm 2.4919 
(2.6424) [2022-10-08 14:33:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][800/1251] eta 0:02:28 lr 0.000100 time 0.3256 (0.3289) loss 3.2211 (3.1230) grad_norm 2.4322 (2.6438) [2022-10-08 14:34:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][900/1251] eta 0:01:55 lr 0.000099 time 0.3251 (0.3284) loss 3.2498 (3.1224) grad_norm 2.8627 (2.6438) [2022-10-08 14:34:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][1000/1251] eta 0:01:22 lr 0.000099 time 0.3216 (0.3281) loss 2.9870 (3.1221) grad_norm 2.4387 (2.6471) [2022-10-08 14:35:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][1100/1251] eta 0:00:49 lr 0.000099 time 0.3279 (0.3279) loss 3.3363 (3.1214) grad_norm 2.5777 (2.6461) [2022-10-08 14:35:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [241/300][1200/1251] eta 0:00:16 lr 0.000099 time 0.3264 (0.3278) loss 3.0762 (3.1223) grad_norm 2.4635 (2.6413) [2022-10-08 14:36:09 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 241 training takes 0:06:50 [2022-10-08 14:36:12 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.955 (2.955) Loss 0.8581 (0.8581) Acc@1 79.102 (79.102) Acc@5 95.215 (95.215) [2022-10-08 14:36:23 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.054 Acc@5 95.166 [2022-10-08 14:36:23 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-08 14:36:23 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.08% [2022-10-08 14:36:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][0/1251] eta 1:10:15 lr 0.000099 time 3.3694 (3.3694) loss 3.2737 (3.2737) grad_norm 2.3091 (2.3091) [2022-10-08 14:36:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][100/1251] eta 0:06:50 lr 0.000098 time 0.3288 (0.3567) loss 3.4640 (3.1184) grad_norm 3.0618 (2.5754) [2022-10-08 14:37:31 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [242/300][200/1251] eta 0:05:59 lr 0.000098 time 0.3365 (0.3420) loss 3.4058 (3.1147) grad_norm 2.8426 (2.6067) [2022-10-08 14:38:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][300/1251] eta 0:05:20 lr 0.000098 time 0.3271 (0.3370) loss 2.9655 (3.1195) grad_norm 3.1889 (2.6112) [2022-10-08 14:38:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][400/1251] eta 0:04:44 lr 0.000098 time 0.3352 (0.3346) loss 2.9785 (3.1177) grad_norm 2.7995 (2.6006) [2022-10-08 14:39:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][500/1251] eta 0:04:10 lr 0.000097 time 0.3265 (0.3335) loss 3.3764 (3.1182) grad_norm 3.2828 (2.6136) [2022-10-08 14:39:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][600/1251] eta 0:03:36 lr 0.000097 time 0.3309 (0.3328) loss 3.2316 (3.1206) grad_norm 2.8931 (2.6206) [2022-10-08 14:40:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][700/1251] eta 0:03:03 lr 0.000097 time 0.3289 (0.3323) loss 3.1107 (3.1194) grad_norm 2.5909 (2.6234) [2022-10-08 14:40:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][800/1251] eta 0:02:29 lr 0.000097 time 0.3348 (0.3322) loss 2.9239 (3.1190) grad_norm 2.7418 (2.6214) [2022-10-08 14:41:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][900/1251] eta 0:01:56 lr 0.000096 time 0.3333 (0.3319) loss 2.9357 (3.1185) grad_norm 2.2982 (2.6217) [2022-10-08 14:41:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][1000/1251] eta 0:01:23 lr 0.000096 time 0.3374 (0.3319) loss 3.2349 (3.1203) grad_norm 2.7056 (2.6232) [2022-10-08 14:42:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][1100/1251] eta 0:00:50 lr 0.000096 time 0.3366 (0.3320) loss 2.6324 (3.1188) grad_norm 2.2763 (2.6279) [2022-10-08 14:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [242/300][1200/1251] eta 0:00:16 lr 0.000096 time 0.3357 (0.3320) loss 2.9910 
(3.1169) grad_norm 2.4161 (2.6237) [2022-10-08 14:43:18 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 242 training takes 0:06:55 [2022-10-08 14:43:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.232 (3.232) Loss 0.8573 (0.8573) Acc@1 79.395 (79.395) Acc@5 94.629 (94.629) [2022-10-08 14:43:32 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.002 Acc@5 95.174 [2022-10-08 14:43:32 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-08 14:43:32 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.08% [2022-10-08 14:43:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][0/1251] eta 0:59:57 lr 0.000096 time 2.8753 (2.8753) loss 3.1806 (3.1806) grad_norm 2.9711 (2.9711) [2022-10-08 14:44:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][100/1251] eta 0:06:45 lr 0.000095 time 0.3262 (0.3524) loss 3.3361 (3.1322) grad_norm 2.1367 (2.5931) [2022-10-08 14:44:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][200/1251] eta 0:05:56 lr 0.000095 time 0.3220 (0.3392) loss 3.1036 (3.1120) grad_norm 3.6396 (2.6349) [2022-10-08 14:45:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][300/1251] eta 0:05:18 lr 0.000095 time 0.3290 (0.3349) loss 3.2282 (3.1052) grad_norm 2.5710 (2.6716) [2022-10-08 14:45:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][400/1251] eta 0:04:43 lr 0.000095 time 0.3305 (0.3327) loss 3.1026 (3.1095) grad_norm 2.9220 (2.6603) [2022-10-08 14:46:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][500/1251] eta 0:04:08 lr 0.000094 time 0.3303 (0.3313) loss 3.1727 (3.1124) grad_norm 2.5131 (2.6591) [2022-10-08 14:46:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][600/1251] eta 0:03:35 lr 0.000094 time 0.3293 (0.3305) loss 3.0753 (3.1121) grad_norm 2.3338 (2.6525) [2022-10-08 14:47:23 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][700/1251] eta 0:03:01 lr 0.000094 time 0.3304 (0.3297) loss 2.9879 (3.1089) grad_norm 2.9243 (2.6511) [2022-10-08 14:47:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][800/1251] eta 0:02:28 lr 0.000094 time 0.3260 (0.3293) loss 3.2207 (3.1114) grad_norm 2.4313 (2.6550) [2022-10-08 14:48:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][900/1251] eta 0:01:55 lr 0.000094 time 0.3414 (0.3292) loss 2.9597 (3.1084) grad_norm 2.7206 (2.6505) [2022-10-08 14:49:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][1000/1251] eta 0:01:22 lr 0.000093 time 0.3262 (0.3291) loss 2.9169 (3.1103) grad_norm 2.7592 (2.6530) [2022-10-08 14:49:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][1100/1251] eta 0:00:49 lr 0.000093 time 0.3333 (0.3291) loss 3.3136 (3.1099) grad_norm 2.8025 (2.6498) [2022-10-08 14:50:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [243/300][1200/1251] eta 0:00:16 lr 0.000093 time 0.3351 (0.3292) loss 3.0390 (3.1102) grad_norm 2.9747 (2.6528) [2022-10-08 14:50:25 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 243 training takes 0:06:52 [2022-10-08 14:50:27 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.759 (2.759) Loss 0.8904 (0.8904) Acc@1 78.223 (78.223) Acc@5 95.117 (95.117) [2022-10-08 14:50:38 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 79.996 Acc@5 95.144 [2022-10-08 14:50:38 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-08 14:50:38 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.08% [2022-10-08 14:50:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][0/1251] eta 1:09:07 lr 0.000093 time 3.3153 (3.3153) loss 2.8259 (2.8259) grad_norm 2.4924 (2.4924) [2022-10-08 14:51:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][100/1251] 
eta 0:06:49 lr 0.000092 time 0.3300 (0.3558) loss 2.9355 (3.1162) grad_norm 2.9234 (2.7207) [2022-10-08 14:51:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][200/1251] eta 0:05:58 lr 0.000092 time 0.3311 (0.3414) loss 2.8575 (3.1103) grad_norm 3.2020 (2.6968) [2022-10-08 14:52:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][300/1251] eta 0:05:19 lr 0.000092 time 0.3253 (0.3364) loss 3.1702 (3.1124) grad_norm 2.6257 (2.7202) [2022-10-08 14:52:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][400/1251] eta 0:04:44 lr 0.000092 time 0.3260 (0.3338) loss 3.1181 (3.1087) grad_norm 2.8662 (2.7247) [2022-10-08 14:53:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][500/1251] eta 0:04:09 lr 0.000092 time 0.3226 (0.3322) loss 3.0310 (3.1068) grad_norm 2.5288 (2.7100) [2022-10-08 14:53:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][600/1251] eta 0:03:35 lr 0.000091 time 0.3257 (0.3313) loss 3.3855 (3.1078) grad_norm 2.8295 (2.7025) [2022-10-08 14:54:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][700/1251] eta 0:03:01 lr 0.000091 time 0.3241 (0.3302) loss 3.0316 (3.1051) grad_norm 2.4885 (2.7020) [2022-10-08 14:55:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][800/1251] eta 0:02:28 lr 0.000091 time 0.3274 (0.3294) loss 2.9405 (3.1036) grad_norm 2.4247 (2.6892) [2022-10-08 14:55:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][900/1251] eta 0:01:55 lr 0.000091 time 0.3215 (0.3289) loss 3.3877 (3.1060) grad_norm 2.9791 (2.6883) [2022-10-08 14:56:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][1000/1251] eta 0:01:22 lr 0.000090 time 0.3246 (0.3285) loss 2.9621 (3.1109) grad_norm 3.0129 (2.6875) [2022-10-08 14:56:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][1100/1251] eta 0:00:49 lr 0.000090 time 0.3290 (0.3282) loss 3.4404 (3.1085) grad_norm 2.4840 (2.6869) 
[2022-10-08 14:57:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [244/300][1200/1251] eta 0:00:16 lr 0.000090 time 0.3279 (0.3280) loss 2.9591 (3.1096) grad_norm 2.5790 (2.6889) [2022-10-08 14:57:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 244 training takes 0:06:50 [2022-10-08 14:57:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.966 (2.966) Loss 0.8188 (0.8188) Acc@1 78.613 (78.613) Acc@5 95.117 (95.117) [2022-10-08 14:57:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.104 Acc@5 95.138 [2022-10-08 14:57:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-08 14:57:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.10% [2022-10-08 14:57:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][0/1251] eta 1:02:33 lr 0.000090 time 3.0003 (3.0003) loss 3.0939 (3.0939) grad_norm 2.5822 (2.5822) [2022-10-08 14:58:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][100/1251] eta 0:06:46 lr 0.000090 time 0.3270 (0.3533) loss 3.0367 (3.0902) grad_norm 2.3543 (2.7720) [2022-10-08 14:58:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][200/1251] eta 0:05:57 lr 0.000089 time 0.3258 (0.3400) loss 2.9836 (3.0995) grad_norm 2.5125 (2.6986) [2022-10-08 14:59:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][300/1251] eta 0:05:19 lr 0.000089 time 0.3309 (0.3359) loss 3.0976 (3.1022) grad_norm 2.8487 (2.7252) [2022-10-08 14:59:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][400/1251] eta 0:04:44 lr 0.000089 time 0.3299 (0.3338) loss 3.1536 (3.1017) grad_norm 2.7112 (2.7255) [2022-10-08 15:00:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][500/1251] eta 0:04:09 lr 0.000089 time 0.3364 (0.3324) loss 2.8394 (3.1011) grad_norm 3.0112 (2.7076) [2022-10-08 15:01:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[245/300][600/1251] eta 0:03:35 lr 0.000089 time 0.3282 (0.3315) loss 3.0991 (3.0979) grad_norm 2.6568 (2.7036) [2022-10-08 15:01:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][700/1251] eta 0:03:02 lr 0.000088 time 0.3262 (0.3309) loss 3.2791 (3.0998) grad_norm 3.3643 (2.7028) [2022-10-08 15:02:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][800/1251] eta 0:02:28 lr 0.000088 time 0.3270 (0.3303) loss 3.0558 (3.0980) grad_norm 2.6137 (2.6988) [2022-10-08 15:02:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][900/1251] eta 0:01:55 lr 0.000088 time 0.3252 (0.3298) loss 2.9651 (3.0976) grad_norm 2.5566 (2.6924) [2022-10-08 15:03:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][1000/1251] eta 0:01:22 lr 0.000088 time 0.3301 (0.3294) loss 3.1929 (3.0974) grad_norm 3.0845 (2.6934) [2022-10-08 15:03:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][1100/1251] eta 0:00:49 lr 0.000087 time 0.3207 (0.3290) loss 3.0288 (3.0962) grad_norm 2.8821 (2.7004) [2022-10-08 15:04:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [245/300][1200/1251] eta 0:00:16 lr 0.000087 time 0.3254 (0.3288) loss 3.3187 (3.0996) grad_norm 2.8366 (2.6989) [2022-10-08 15:04:35 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 245 training takes 0:06:51 [2022-10-08 15:04:38 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.791 (2.791) Loss 0.8886 (0.8886) Acc@1 80.176 (80.176) Acc@5 94.922 (94.922) [2022-10-08 15:04:49 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.104 Acc@5 95.104 [2022-10-08 15:04:49 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-08 15:04:49 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.10% [2022-10-08 15:04:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][0/1251] eta 1:02:41 lr 0.000087 time 3.0071 (3.0071) loss 3.2312 
(3.2312) grad_norm 3.2011 (3.2011) [2022-10-08 15:05:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][100/1251] eta 0:06:45 lr 0.000087 time 0.3232 (0.3522) loss 2.9304 (3.0940) grad_norm 2.6538 (2.7813) [2022-10-08 15:05:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][200/1251] eta 0:05:56 lr 0.000087 time 0.3228 (0.3387) loss 3.1382 (3.0926) grad_norm 2.9638 (2.7410) [2022-10-08 15:06:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][300/1251] eta 0:05:17 lr 0.000086 time 0.3251 (0.3342) loss 3.2291 (3.1027) grad_norm 2.2128 (2.7260) [2022-10-08 15:07:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][400/1251] eta 0:04:42 lr 0.000086 time 0.3228 (0.3320) loss 3.0401 (3.0947) grad_norm 2.7437 (2.7072) [2022-10-08 15:07:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][500/1251] eta 0:04:08 lr 0.000086 time 0.3226 (0.3305) loss 2.9333 (3.0990) grad_norm 2.7424 (2.7334) [2022-10-08 15:08:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][600/1251] eta 0:03:34 lr 0.000086 time 0.3214 (0.3294) loss 3.2610 (3.0958) grad_norm 2.3990 (2.7363) [2022-10-08 15:08:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][700/1251] eta 0:03:01 lr 0.000086 time 0.3230 (0.3287) loss 3.1055 (3.0946) grad_norm 2.2346 (2.7482) [2022-10-08 15:09:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][800/1251] eta 0:02:28 lr 0.000085 time 0.3272 (0.3283) loss 3.2051 (3.0983) grad_norm 2.7829 (2.7501) [2022-10-08 15:09:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][900/1251] eta 0:01:55 lr 0.000085 time 0.3321 (0.3280) loss 3.1693 (3.0981) grad_norm 2.6758 (2.7540) [2022-10-08 15:10:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][1000/1251] eta 0:01:22 lr 0.000085 time 0.3286 (0.3279) loss 3.0749 (3.0981) grad_norm 2.8022 (2.7553) [2022-10-08 15:10:50 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [246/300][1100/1251] eta 0:00:49 lr 0.000085 time 0.3295 (0.3279) loss 3.1963 (3.0976) grad_norm 3.0106 (2.7527) [2022-10-08 15:11:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [246/300][1200/1251] eta 0:00:16 lr 0.000084 time 0.3259 (0.3281) loss 3.3963 (3.0990) grad_norm 2.5391 (2.7577) [2022-10-08 15:11:40 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 246 training takes 0:06:50 [2022-10-08 15:11:43 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.234 (3.234) Loss 0.8700 (0.8700) Acc@1 78.809 (78.809) Acc@5 94.629 (94.629) [2022-10-08 15:11:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.238 Acc@5 95.122 [2022-10-08 15:11:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-08 15:11:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.24% [2022-10-08 15:11:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][0/1251] eta 1:07:07 lr 0.000084 time 3.2197 (3.2197) loss 2.9229 (2.9229) grad_norm 2.6382 (2.6382) [2022-10-08 15:12:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][100/1251] eta 0:06:48 lr 0.000084 time 0.3214 (0.3547) loss 3.3774 (3.0756) grad_norm 2.5446 (2.6967) [2022-10-08 15:13:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][200/1251] eta 0:05:59 lr 0.000084 time 0.3238 (0.3418) loss 3.3452 (3.0935) grad_norm 2.2782 (2.7591) [2022-10-08 15:13:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][300/1251] eta 0:05:20 lr 0.000084 time 0.3281 (0.3372) loss 3.2088 (3.0960) grad_norm 2.9250 (2.7641) [2022-10-08 15:14:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][400/1251] eta 0:04:44 lr 0.000083 time 0.3252 (0.3349) loss 3.1922 (3.0942) grad_norm 2.2757 (2.7435) [2022-10-08 15:14:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][500/1251] eta 0:04:10 lr 0.000083 time 0.3218 
(0.3334) loss 3.0864 (3.0896) grad_norm 2.9300 (2.7323) [2022-10-08 15:15:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][600/1251] eta 0:03:36 lr 0.000083 time 0.3232 (0.3319) loss 3.1005 (3.0903) grad_norm 2.7094 (2.7315) [2022-10-08 15:15:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][700/1251] eta 0:03:02 lr 0.000083 time 0.3296 (0.3309) loss 2.9572 (3.0907) grad_norm 2.7280 (2.7241) [2022-10-08 15:16:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][800/1251] eta 0:02:29 lr 0.000083 time 0.3259 (0.3305) loss 2.9081 (3.0923) grad_norm 2.7650 (2.7236) [2022-10-08 15:16:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][900/1251] eta 0:01:55 lr 0.000082 time 0.3226 (0.3298) loss 2.9055 (3.0928) grad_norm 2.5762 (2.7210) [2022-10-08 15:17:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][1000/1251] eta 0:01:22 lr 0.000082 time 0.3239 (0.3293) loss 3.1908 (3.0948) grad_norm 3.0088 (2.7274) [2022-10-08 15:17:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][1100/1251] eta 0:00:49 lr 0.000082 time 0.3362 (0.3291) loss 3.1478 (3.0937) grad_norm 2.6434 (2.7261) [2022-10-08 15:18:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [247/300][1200/1251] eta 0:00:16 lr 0.000082 time 0.3339 (0.3292) loss 2.9684 (3.0918) grad_norm 2.8449 (2.7271) [2022-10-08 15:18:45 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 247 training takes 0:06:52 [2022-10-08 15:18:48 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.919 (2.919) Loss 0.7947 (0.7947) Acc@1 82.812 (82.812) Acc@5 95.215 (95.215) [2022-10-08 15:18:59 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.194 Acc@5 95.146 [2022-10-08 15:18:59 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-08 15:18:59 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.24% [2022-10-08 15:19:02 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][0/1251] eta 0:58:16 lr 0.000082 time 2.7950 (2.7950) loss 3.2202 (3.2202) grad_norm 2.8478 (2.8478) [2022-10-08 15:19:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][100/1251] eta 0:06:44 lr 0.000081 time 0.3302 (0.3516) loss 2.9409 (3.0806) grad_norm 2.9148 (2.6619) [2022-10-08 15:20:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][200/1251] eta 0:05:56 lr 0.000081 time 0.3239 (0.3389) loss 3.3145 (3.0961) grad_norm 2.6984 (2.7642) [2022-10-08 15:20:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][300/1251] eta 0:05:18 lr 0.000081 time 0.3258 (0.3344) loss 2.7792 (3.0927) grad_norm 2.5723 (2.7533) [2022-10-08 15:21:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][400/1251] eta 0:04:42 lr 0.000081 time 0.3250 (0.3321) loss 3.2255 (3.0941) grad_norm 2.7011 (2.7492) [2022-10-08 15:21:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][500/1251] eta 0:04:08 lr 0.000081 time 0.3270 (0.3307) loss 3.3902 (3.0989) grad_norm 2.5286 (2.7425) [2022-10-08 15:22:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][600/1251] eta 0:03:34 lr 0.000080 time 0.3260 (0.3300) loss 3.2399 (3.0983) grad_norm 2.8263 (2.7390) [2022-10-08 15:22:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][700/1251] eta 0:03:01 lr 0.000080 time 0.3270 (0.3293) loss 3.1106 (3.0989) grad_norm 2.6127 (2.7470) [2022-10-08 15:23:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][800/1251] eta 0:02:28 lr 0.000080 time 0.3355 (0.3288) loss 3.1647 (3.0958) grad_norm 2.6413 (2.7465) [2022-10-08 15:23:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][900/1251] eta 0:01:55 lr 0.000080 time 0.3207 (0.3285) loss 3.2089 (3.0973) grad_norm 2.3937 (2.7391) [2022-10-08 15:24:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][1000/1251] eta 0:01:22 lr 0.000079 
time 0.3300 (0.3284) loss 3.0655 (3.0994) grad_norm 2.7334 (2.7326) [2022-10-08 15:25:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][1100/1251] eta 0:00:49 lr 0.000079 time 0.3274 (0.3284) loss 3.2087 (3.0999) grad_norm 2.9195 (2.7276) [2022-10-08 15:25:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [248/300][1200/1251] eta 0:00:16 lr 0.000079 time 0.3231 (0.3285) loss 3.3374 (3.1001) grad_norm 2.7147 (2.7355) [2022-10-08 15:25:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 248 training takes 0:06:51 [2022-10-08 15:25:53 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.918 (2.918) Loss 0.8967 (0.8967) Acc@1 79.492 (79.492) Acc@5 94.434 (94.434) [2022-10-08 15:26:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.292 Acc@5 95.202 [2022-10-08 15:26:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-08 15:26:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.29% [2022-10-08 15:26:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][0/1251] eta 1:10:20 lr 0.000079 time 3.3735 (3.3735) loss 2.9878 (2.9878) grad_norm 2.3223 (2.3223) [2022-10-08 15:26:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][100/1251] eta 0:06:51 lr 0.000079 time 0.3275 (0.3579) loss 3.3383 (3.0648) grad_norm 3.0917 (2.7010) [2022-10-08 15:27:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][200/1251] eta 0:06:00 lr 0.000079 time 0.3290 (0.3431) loss 3.1532 (3.0730) grad_norm 2.5338 (2.7537) [2022-10-08 15:27:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][300/1251] eta 0:05:21 lr 0.000078 time 0.3258 (0.3380) loss 2.9745 (3.0848) grad_norm 2.4239 (2.7347) [2022-10-08 15:28:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][400/1251] eta 0:04:45 lr 0.000078 time 0.3242 (0.3355) loss 3.0738 (3.0791) grad_norm 2.7371 (2.7247) [2022-10-08 
15:28:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][500/1251] eta 0:04:10 lr 0.000078 time 0.3271 (0.3337) loss 3.2348 (3.0786) grad_norm 2.7700 (2.7406) [2022-10-08 15:29:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][600/1251] eta 0:03:36 lr 0.000078 time 0.3277 (0.3325) loss 3.0793 (3.0816) grad_norm 2.7276 (2.7416) [2022-10-08 15:29:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][700/1251] eta 0:03:02 lr 0.000077 time 0.3272 (0.3315) loss 3.2592 (3.0901) grad_norm 3.0683 (2.7434) [2022-10-08 15:30:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][800/1251] eta 0:02:29 lr 0.000077 time 0.3263 (0.3308) loss 2.9335 (3.0889) grad_norm 2.8967 (2.7449) [2022-10-08 15:31:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][900/1251] eta 0:01:55 lr 0.000077 time 0.3232 (0.3301) loss 3.1383 (3.0900) grad_norm 2.3081 (2.7440) [2022-10-08 15:31:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][1000/1251] eta 0:01:22 lr 0.000077 time 0.3196 (0.3298) loss 3.2019 (3.0907) grad_norm 2.6671 (2.7469) [2022-10-08 15:32:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][1100/1251] eta 0:00:49 lr 0.000077 time 0.3261 (0.3296) loss 3.0220 (3.0907) grad_norm 2.6308 (2.7428) [2022-10-08 15:32:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [249/300][1200/1251] eta 0:00:16 lr 0.000076 time 0.3344 (0.3296) loss 3.1031 (3.0901) grad_norm 2.3046 (2.7413) [2022-10-08 15:32:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 249 training takes 0:06:52 [2022-10-08 15:32:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.583 (2.583) Loss 0.8547 (0.8547) Acc@1 80.176 (80.176) Acc@5 95.020 (95.020) [2022-10-08 15:33:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.310 Acc@5 95.200 [2022-10-08 15:33:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 80.3% [2022-10-08 15:33:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.31% [2022-10-08 15:33:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][0/1251] eta 0:59:37 lr 0.000076 time 2.8596 (2.8596) loss 3.1543 (3.1543) grad_norm 2.2347 (2.2347) [2022-10-08 15:33:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][100/1251] eta 0:06:44 lr 0.000076 time 0.3211 (0.3513) loss 3.4015 (3.0882) grad_norm 2.7539 (2.7190) [2022-10-08 15:34:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][200/1251] eta 0:05:55 lr 0.000076 time 0.3235 (0.3382) loss 3.1523 (3.0836) grad_norm 2.3297 (2.7620) [2022-10-08 15:34:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][300/1251] eta 0:05:17 lr 0.000076 time 0.3248 (0.3341) loss 3.2112 (3.0734) grad_norm 3.1179 (2.7417) [2022-10-08 15:35:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][400/1251] eta 0:04:42 lr 0.000075 time 0.3205 (0.3322) loss 3.2486 (3.0786) grad_norm 2.4858 (2.7350) [2022-10-08 15:35:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][500/1251] eta 0:04:08 lr 0.000075 time 0.3242 (0.3306) loss 3.1566 (3.0816) grad_norm 2.5868 (2.7341) [2022-10-08 15:36:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][600/1251] eta 0:03:34 lr 0.000075 time 0.3230 (0.3294) loss 3.1220 (3.0855) grad_norm 2.5610 (2.7398) [2022-10-08 15:37:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][700/1251] eta 0:03:01 lr 0.000075 time 0.3215 (0.3286) loss 3.0729 (3.0867) grad_norm 6.1250 (2.7554) [2022-10-08 15:37:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][800/1251] eta 0:02:27 lr 0.000075 time 0.3226 (0.3279) loss 2.9325 (3.0862) grad_norm 2.4416 (2.7633) [2022-10-08 15:38:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][900/1251] eta 0:01:54 lr 0.000074 time 0.3244 (0.3275) loss 3.4250 (3.0904) grad_norm 2.5388 
(2.7579) [2022-10-08 15:38:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][1000/1251] eta 0:01:22 lr 0.000074 time 0.3317 (0.3274) loss 3.0481 (3.0897) grad_norm 2.9169 (2.7570) [2022-10-08 15:39:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][1100/1251] eta 0:00:49 lr 0.000074 time 0.3206 (0.3275) loss 2.9749 (3.0887) grad_norm 2.4595 (2.7652) [2022-10-08 15:39:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [250/300][1200/1251] eta 0:00:16 lr 0.000074 time 0.3279 (0.3276) loss 3.2326 (3.0877) grad_norm 3.0103 (2.7630) [2022-10-08 15:40:00 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 250 training takes 0:06:50 [2022-10-08 15:40:00 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_250 saving...... [2022-10-08 15:40:01 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_250 saved !!! [2022-10-08 15:40:04 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.118 (3.118) Loss 0.8611 (0.8611) Acc@1 80.469 (80.469) Acc@5 94.629 (94.629) [2022-10-08 15:40:14 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.252 Acc@5 95.164 [2022-10-08 15:40:14 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-08 15:40:14 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.31% [2022-10-08 15:40:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][0/1251] eta 0:56:47 lr 0.000074 time 2.7242 (2.7242) loss 3.0698 (3.0698) grad_norm 2.4320 (2.4320) [2022-10-08 15:40:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][100/1251] eta 0:06:45 lr 0.000074 time 0.3332 (0.3520) loss 3.0272 (3.0892) grad_norm 3.0279 (2.7188) [2022-10-08 15:41:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][200/1251] eta 0:05:56 lr 0.000073 time 0.3256 (0.3396) loss 3.0822 
(3.0895) grad_norm 2.6928 (2.7764) [2022-10-08 15:41:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][300/1251] eta 0:05:18 lr 0.000073 time 0.3310 (0.3354) loss 2.4202 (3.0784) grad_norm 3.1153 (2.7698) [2022-10-08 15:42:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][400/1251] eta 0:04:43 lr 0.000073 time 0.3287 (0.3333) loss 3.0440 (3.0729) grad_norm 2.8073 (2.7612) [2022-10-08 15:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][500/1251] eta 0:04:09 lr 0.000073 time 0.3282 (0.3319) loss 3.1697 (3.0746) grad_norm 2.5044 (2.7660) [2022-10-08 15:43:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][600/1251] eta 0:03:35 lr 0.000073 time 0.3249 (0.3310) loss 2.9411 (3.0718) grad_norm 2.4175 (2.7581) [2022-10-08 15:44:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][700/1251] eta 0:03:02 lr 0.000072 time 0.3253 (0.3304) loss 3.1558 (3.0738) grad_norm 2.8076 (2.7570) [2022-10-08 15:44:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][800/1251] eta 0:02:28 lr 0.000072 time 0.3319 (0.3301) loss 3.1070 (3.0761) grad_norm 3.0695 (2.7597) [2022-10-08 15:45:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][900/1251] eta 0:01:55 lr 0.000072 time 0.3355 (0.3299) loss 2.9201 (3.0745) grad_norm 3.6334 (2.7695) [2022-10-08 15:45:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][1000/1251] eta 0:01:22 lr 0.000072 time 0.3299 (0.3299) loss 2.8571 (3.0745) grad_norm 2.4542 (2.7697) [2022-10-08 15:46:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][1100/1251] eta 0:00:49 lr 0.000072 time 0.3332 (0.3300) loss 2.7301 (3.0741) grad_norm 2.9903 (2.7684) [2022-10-08 15:46:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [251/300][1200/1251] eta 0:00:16 lr 0.000071 time 0.3348 (0.3301) loss 3.2312 (3.0755) grad_norm 2.5997 (2.7638) [2022-10-08 15:47:08 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 251 training takes 0:06:53 [2022-10-08 15:47:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.917 (2.917) Loss 0.8033 (0.8033) Acc@1 81.445 (81.445) Acc@5 95.215 (95.215) [2022-10-08 15:47:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.370 Acc@5 95.194 [2022-10-08 15:47:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-08 15:47:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.37% [2022-10-08 15:47:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][0/1251] eta 0:54:46 lr 0.000071 time 2.6272 (2.6272) loss 3.0797 (3.0797) grad_norm 2.3178 (2.3178) [2022-10-08 15:47:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][100/1251] eta 0:06:43 lr 0.000071 time 0.3325 (0.3507) loss 3.2006 (3.0898) grad_norm 3.0674 (2.8120) [2022-10-08 15:48:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][200/1251] eta 0:05:55 lr 0.000071 time 0.3271 (0.3384) loss 3.2378 (3.0777) grad_norm 2.6980 (2.7943) [2022-10-08 15:49:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][300/1251] eta 0:05:18 lr 0.000071 time 0.3360 (0.3344) loss 3.2911 (3.0781) grad_norm 2.5168 (2.7843) [2022-10-08 15:49:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][400/1251] eta 0:04:42 lr 0.000070 time 0.3318 (0.3322) loss 3.0698 (3.0808) grad_norm 2.5762 (2.7860) [2022-10-08 15:50:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][500/1251] eta 0:04:08 lr 0.000070 time 0.3273 (0.3309) loss 3.1821 (3.0748) grad_norm 2.9469 (2.7769) [2022-10-08 15:50:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][600/1251] eta 0:03:34 lr 0.000070 time 0.3275 (0.3301) loss 3.0801 (3.0749) grad_norm 2.8669 (2.7761) [2022-10-08 15:51:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][700/1251] eta 0:03:01 lr 0.000070 time 0.3340 
(0.3295) loss 2.8948 (3.0719) grad_norm 2.7786 (2.7882) [2022-10-08 15:51:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][800/1251] eta 0:02:28 lr 0.000070 time 0.3281 (0.3291) loss 2.8722 (3.0719) grad_norm 3.4509 (2.7810) [2022-10-08 15:52:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][900/1251] eta 0:01:55 lr 0.000069 time 0.3305 (0.3289) loss 2.8210 (3.0730) grad_norm 2.6943 (2.7806) [2022-10-08 15:52:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][1000/1251] eta 0:01:22 lr 0.000069 time 0.3276 (0.3287) loss 3.0552 (3.0715) grad_norm 2.4373 (2.7826) [2022-10-08 15:53:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][1100/1251] eta 0:00:49 lr 0.000069 time 0.3234 (0.3287) loss 2.9794 (3.0705) grad_norm 2.5844 (2.7822) [2022-10-08 15:53:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [252/300][1200/1251] eta 0:00:16 lr 0.000069 time 0.3264 (0.3289) loss 3.0493 (3.0704) grad_norm 3.1636 (2.7816) [2022-10-08 15:54:13 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 252 training takes 0:06:51 [2022-10-08 15:54:16 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.649 (2.649) Loss 0.9228 (0.9228) Acc@1 78.027 (78.027) Acc@5 94.336 (94.336) [2022-10-08 15:54:27 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.332 Acc@5 95.272 [2022-10-08 15:54:27 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-08 15:54:27 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.37% [2022-10-08 15:54:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][0/1251] eta 1:03:44 lr 0.000069 time 3.0571 (3.0571) loss 3.2499 (3.2499) grad_norm 3.0988 (3.0988) [2022-10-08 15:55:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][100/1251] eta 0:06:45 lr 0.000069 time 0.3203 (0.3527) loss 3.0377 (3.0676) grad_norm 2.5239 (2.7410) [2022-10-08 15:55:35 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][200/1251] eta 0:05:56 lr 0.000068 time 0.3233 (0.3393) loss 2.9089 (3.0560) grad_norm 2.8684 (2.7630) [2022-10-08 15:56:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][300/1251] eta 0:05:19 lr 0.000068 time 0.3242 (0.3357) loss 3.3127 (3.0647) grad_norm 3.0765 (2.7693) [2022-10-08 15:56:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][400/1251] eta 0:04:43 lr 0.000068 time 0.3280 (0.3331) loss 3.2058 (3.0656) grad_norm 3.1848 (2.7897) [2022-10-08 15:57:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][500/1251] eta 0:04:08 lr 0.000068 time 0.3267 (0.3315) loss 2.9680 (3.0655) grad_norm 2.4777 (2.7890) [2022-10-08 15:57:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][600/1251] eta 0:03:35 lr 0.000068 time 0.3271 (0.3304) loss 3.2313 (3.0690) grad_norm 2.5191 (2.7838) [2022-10-08 15:58:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][700/1251] eta 0:03:01 lr 0.000067 time 0.3255 (0.3297) loss 3.2523 (3.0720) grad_norm 2.8502 (2.7845) [2022-10-08 15:58:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][800/1251] eta 0:02:28 lr 0.000067 time 0.3258 (0.3292) loss 2.9469 (3.0693) grad_norm 4.0176 (2.7791) [2022-10-08 15:59:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][900/1251] eta 0:01:55 lr 0.000067 time 0.3211 (0.3289) loss 2.8689 (3.0703) grad_norm 2.6226 (2.7847) [2022-10-08 15:59:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][1000/1251] eta 0:01:22 lr 0.000067 time 0.3313 (0.3287) loss 2.9658 (3.0693) grad_norm 2.5119 (2.7856) [2022-10-08 16:00:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][1100/1251] eta 0:00:49 lr 0.000067 time 0.3248 (0.3286) loss 2.9950 (3.0676) grad_norm 2.7211 (2.7903) [2022-10-08 16:01:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [253/300][1200/1251] eta 0:00:16 lr 
0.000066 time 0.3332 (0.3286) loss 2.9232 (3.0670) grad_norm 2.9001 (2.7914) [2022-10-08 16:01:19 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 253 training takes 0:06:51 [2022-10-08 16:01:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.079 (3.079) Loss 0.8345 (0.8345) Acc@1 79.980 (79.980) Acc@5 94.727 (94.727) [2022-10-08 16:01:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.404 Acc@5 95.202 [2022-10-08 16:01:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-08 16:01:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.40% [2022-10-08 16:01:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][0/1251] eta 1:02:20 lr 0.000066 time 2.9900 (2.9900) loss 3.4113 (3.4113) grad_norm 2.7581 (2.7581) [2022-10-08 16:02:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][100/1251] eta 0:06:44 lr 0.000066 time 0.3223 (0.3517) loss 2.8468 (3.0811) grad_norm 3.1558 (2.8137) [2022-10-08 16:02:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][200/1251] eta 0:05:55 lr 0.000066 time 0.3284 (0.3384) loss 3.2486 (3.0815) grad_norm 2.7242 (2.8484) [2022-10-08 16:03:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][300/1251] eta 0:05:17 lr 0.000066 time 0.3261 (0.3339) loss 2.5418 (3.0696) grad_norm 3.2760 (2.8674) [2022-10-08 16:03:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][400/1251] eta 0:04:42 lr 0.000066 time 0.3254 (0.3316) loss 3.0458 (3.0695) grad_norm 3.0217 (2.8544) [2022-10-08 16:04:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][500/1251] eta 0:04:07 lr 0.000065 time 0.3278 (0.3302) loss 2.8237 (3.0664) grad_norm 2.2322 (2.8392) [2022-10-08 16:04:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][600/1251] eta 0:03:34 lr 0.000065 time 0.3322 (0.3293) loss 3.0245 (3.0607) grad_norm 3.0717 (2.8385) 
[2022-10-08 16:05:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][700/1251] eta 0:03:01 lr 0.000065 time 0.3259 (0.3286) loss 3.2196 (3.0618) grad_norm 2.9528 (2.8529) [2022-10-08 16:05:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][800/1251] eta 0:02:28 lr 0.000065 time 0.3218 (0.3283) loss 2.8944 (3.0566) grad_norm 2.6179 (2.8505) [2022-10-08 16:06:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][900/1251] eta 0:01:55 lr 0.000065 time 0.3257 (0.3281) loss 3.1550 (3.0592) grad_norm 2.8319 (2.8515) [2022-10-08 16:07:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][1000/1251] eta 0:01:22 lr 0.000064 time 0.3219 (0.3280) loss 3.0442 (3.0619) grad_norm 3.3982 (2.8479) [2022-10-08 16:07:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][1100/1251] eta 0:00:49 lr 0.000064 time 0.3354 (0.3278) loss 2.8902 (3.0590) grad_norm 2.6399 (2.8492) [2022-10-08 16:08:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [254/300][1200/1251] eta 0:00:16 lr 0.000064 time 0.3304 (0.3279) loss 3.1083 (3.0597) grad_norm 2.3952 (2.8499) [2022-10-08 16:08:23 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 254 training takes 0:06:50 [2022-10-08 16:08:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.230 (2.230) Loss 0.7667 (0.7667) Acc@1 81.152 (81.152) Acc@5 96.191 (96.191) [2022-10-08 16:08:37 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.506 Acc@5 95.256 [2022-10-08 16:08:37 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-08 16:08:37 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.51% [2022-10-08 16:08:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][0/1251] eta 0:55:35 lr 0.000064 time 2.6659 (2.6659) loss 3.0751 (3.0751) grad_norm 2.8269 (2.8269) [2022-10-08 16:09:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[255/300][100/1251] eta 0:06:45 lr 0.000064 time 0.3227 (0.3520) loss 3.0992 (3.1051) grad_norm 2.6243 (2.8428) [2022-10-08 16:09:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][200/1251] eta 0:05:56 lr 0.000064 time 0.3289 (0.3391) loss 2.9660 (3.0785) grad_norm 3.2199 (2.8372) [2022-10-08 16:10:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][300/1251] eta 0:05:18 lr 0.000063 time 0.3194 (0.3346) loss 2.9707 (3.0730) grad_norm 2.5378 (2.8466) [2022-10-08 16:10:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][400/1251] eta 0:04:42 lr 0.000063 time 0.3267 (0.3322) loss 3.2462 (3.0718) grad_norm 2.6170 (2.8388) [2022-10-08 16:11:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][500/1251] eta 0:04:08 lr 0.000063 time 0.3237 (0.3310) loss 2.9884 (3.0690) grad_norm 2.4933 (2.8542) [2022-10-08 16:11:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][600/1251] eta 0:03:34 lr 0.000063 time 0.3252 (0.3301) loss 3.1016 (3.0674) grad_norm 2.4475 (2.8374) [2022-10-08 16:12:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][700/1251] eta 0:03:01 lr 0.000063 time 0.3231 (0.3295) loss 3.1190 (3.0693) grad_norm 2.8225 (2.8417) [2022-10-08 16:13:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][800/1251] eta 0:02:28 lr 0.000062 time 0.3219 (0.3291) loss 3.0615 (3.0658) grad_norm 2.6884 (2.8360) [2022-10-08 16:13:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][900/1251] eta 0:01:55 lr 0.000062 time 0.3268 (0.3289) loss 2.8535 (3.0645) grad_norm 2.7357 (2.8413) [2022-10-08 16:14:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][1000/1251] eta 0:01:22 lr 0.000062 time 0.3327 (0.3289) loss 3.0057 (3.0666) grad_norm 3.3530 (2.8398) [2022-10-08 16:14:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][1100/1251] eta 0:00:49 lr 0.000062 time 0.3232 (0.3288) loss 3.3205 (3.0669) grad_norm 
2.6289 (2.8360) [2022-10-08 16:15:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [255/300][1200/1251] eta 0:00:16 lr 0.000062 time 0.3284 (0.3288) loss 3.1335 (3.0681) grad_norm 2.9754 (2.8369) [2022-10-08 16:15:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 255 training takes 0:06:51 [2022-10-08 16:15:32 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.876 (2.876) Loss 0.8432 (0.8432) Acc@1 79.980 (79.980) Acc@5 95.605 (95.605) [2022-10-08 16:15:42 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.548 Acc@5 95.214 [2022-10-08 16:15:42 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-08 16:15:42 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.55% [2022-10-08 16:15:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][0/1251] eta 0:49:46 lr 0.000062 time 2.3871 (2.3871) loss 3.3321 (3.3321) grad_norm 2.8718 (2.8718) [2022-10-08 16:16:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][100/1251] eta 0:06:47 lr 0.000061 time 0.3295 (0.3536) loss 3.0373 (3.0757) grad_norm 2.7884 (2.8688) [2022-10-08 16:16:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][200/1251] eta 0:05:59 lr 0.000061 time 0.3285 (0.3418) loss 3.2270 (3.0724) grad_norm 2.8534 (2.8656) [2022-10-08 16:17:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][300/1251] eta 0:05:20 lr 0.000061 time 0.3278 (0.3366) loss 3.1318 (3.0707) grad_norm 2.5321 (2.8691) [2022-10-08 16:17:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][400/1251] eta 0:04:44 lr 0.000061 time 0.3216 (0.3339) loss 2.7334 (3.0685) grad_norm 2.6539 (2.8626) [2022-10-08 16:18:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][500/1251] eta 0:04:09 lr 0.000061 time 0.3258 (0.3323) loss 3.2411 (3.0661) grad_norm 3.3589 (2.8693) [2022-10-08 16:19:01 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [256/300][600/1251] eta 0:03:35 lr 0.000061 time 0.3223 (0.3311) loss 3.1106 (3.0644) grad_norm 2.9121 (2.8784) [2022-10-08 16:19:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][700/1251] eta 0:03:02 lr 0.000060 time 0.3262 (0.3305) loss 3.0235 (3.0631) grad_norm 3.6960 (2.8750) [2022-10-08 16:20:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][800/1251] eta 0:02:28 lr 0.000060 time 0.3197 (0.3301) loss 2.8472 (3.0654) grad_norm 2.6804 (2.8808) [2022-10-08 16:20:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][900/1251] eta 0:01:55 lr 0.000060 time 0.3266 (0.3300) loss 3.1952 (3.0665) grad_norm 2.6379 (2.8943) [2022-10-08 16:21:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][1000/1251] eta 0:01:22 lr 0.000060 time 0.3293 (0.3300) loss 3.2645 (3.0642) grad_norm 2.5990 (2.8922) [2022-10-08 16:21:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][1100/1251] eta 0:00:49 lr 0.000060 time 0.3257 (0.3301) loss 3.0621 (3.0646) grad_norm 2.6431 (2.8861) [2022-10-08 16:22:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [256/300][1200/1251] eta 0:00:16 lr 0.000059 time 0.3337 (0.3302) loss 3.1982 (3.0637) grad_norm 3.2944 (2.8893) [2022-10-08 16:22:36 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 256 training takes 0:06:53 [2022-10-08 16:22:39 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.967 (2.967) Loss 0.8609 (0.8609) Acc@1 80.078 (80.078) Acc@5 95.508 (95.508) [2022-10-08 16:22:50 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.502 Acc@5 95.246 [2022-10-08 16:22:50 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-08 16:22:50 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.55% [2022-10-08 16:22:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][0/1251] eta 0:55:21 lr 0.000059 time 2.6555 
(2.6555) loss 2.9710 (2.9710) grad_norm 3.2060 (3.2060) [2022-10-08 16:23:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][100/1251] eta 0:06:41 lr 0.000059 time 0.3248 (0.3489) loss 2.9572 (3.0547) grad_norm 2.9471 (2.8385) [2022-10-08 16:23:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][200/1251] eta 0:05:53 lr 0.000059 time 0.3226 (0.3368) loss 2.8473 (3.0551) grad_norm 3.1980 (2.8712) [2022-10-08 16:24:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][300/1251] eta 0:05:16 lr 0.000059 time 0.3322 (0.3329) loss 3.1579 (3.0574) grad_norm 2.7932 (2.8867) [2022-10-08 16:25:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][400/1251] eta 0:04:41 lr 0.000059 time 0.3247 (0.3308) loss 3.1403 (3.0526) grad_norm 3.1931 (2.9192) [2022-10-08 16:25:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][500/1251] eta 0:04:07 lr 0.000058 time 0.3253 (0.3295) loss 2.8957 (3.0554) grad_norm 2.8221 (2.9065) [2022-10-08 16:26:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][600/1251] eta 0:03:34 lr 0.000058 time 0.3255 (0.3289) loss 2.9420 (3.0539) grad_norm 2.7599 (2.9027) [2022-10-08 16:26:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][700/1251] eta 0:03:01 lr 0.000058 time 0.3279 (0.3288) loss 3.3633 (3.0545) grad_norm 2.9399 (2.9139) [2022-10-08 16:27:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][800/1251] eta 0:02:28 lr 0.000058 time 0.3308 (0.3289) loss 3.0202 (3.0573) grad_norm 2.6441 (2.9242) [2022-10-08 16:27:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][900/1251] eta 0:01:55 lr 0.000058 time 0.3308 (0.3289) loss 3.3930 (3.0590) grad_norm 2.5506 (2.9152) [2022-10-08 16:28:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][1000/1251] eta 0:01:22 lr 0.000058 time 0.3257 (0.3289) loss 3.1016 (3.0589) grad_norm 2.4509 (2.9143) [2022-10-08 16:28:52 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][1100/1251] eta 0:00:49 lr 0.000057 time 0.3271 (0.3289) loss 3.0625 (3.0587) grad_norm 3.0647 (2.9079) [2022-10-08 16:29:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [257/300][1200/1251] eta 0:00:16 lr 0.000057 time 0.3326 (0.3290) loss 3.3018 (3.0605) grad_norm 3.5339 (2.9041) [2022-10-08 16:29:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 257 training takes 0:06:52 [2022-10-08 16:29:44 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.852 (2.852) Loss 0.8080 (0.8080) Acc@1 81.250 (81.250) Acc@5 95.703 (95.703) [2022-10-08 16:29:55 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.662 Acc@5 95.276 [2022-10-08 16:29:55 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-08 16:29:55 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.66% [2022-10-08 16:29:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][0/1251] eta 0:58:56 lr 0.000057 time 2.8273 (2.8273) loss 3.1608 (3.1608) grad_norm 2.8150 (2.8150) [2022-10-08 16:30:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][100/1251] eta 0:06:45 lr 0.000057 time 0.3277 (0.3523) loss 3.0254 (3.0622) grad_norm 3.3498 (2.8445) [2022-10-08 16:31:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][200/1251] eta 0:05:56 lr 0.000057 time 0.3252 (0.3394) loss 3.3285 (3.0574) grad_norm 2.9512 (2.8633) [2022-10-08 16:31:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][300/1251] eta 0:05:18 lr 0.000057 time 0.3239 (0.3350) loss 3.1074 (3.0613) grad_norm 2.6771 (2.8647) [2022-10-08 16:32:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][400/1251] eta 0:04:43 lr 0.000056 time 0.3246 (0.3326) loss 2.9349 (3.0552) grad_norm 2.4937 (2.8633) [2022-10-08 16:32:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][500/1251] 
eta 0:04:08 lr 0.000056 time 0.3292 (0.3313) loss 2.6034 (3.0500) grad_norm 3.4672 (2.8767) [2022-10-08 16:33:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][600/1251] eta 0:03:35 lr 0.000056 time 0.3251 (0.3305) loss 3.0006 (3.0501) grad_norm 2.5423 (2.9054) [2022-10-08 16:33:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][700/1251] eta 0:03:01 lr 0.000056 time 0.3282 (0.3302) loss 3.1745 (3.0507) grad_norm 2.9233 (2.9016) [2022-10-08 16:34:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][800/1251] eta 0:02:28 lr 0.000056 time 0.3250 (0.3299) loss 3.0725 (3.0502) grad_norm 2.6338 (2.8984) [2022-10-08 16:34:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][900/1251] eta 0:01:55 lr 0.000056 time 0.3261 (0.3298) loss 2.9228 (3.0498) grad_norm 2.5389 (2.8947) [2022-10-08 16:35:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][1000/1251] eta 0:01:22 lr 0.000055 time 0.3255 (0.3297) loss 2.9812 (3.0489) grad_norm 3.1290 (2.8918) [2022-10-08 16:35:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][1100/1251] eta 0:00:49 lr 0.000055 time 0.3290 (0.3297) loss 3.1390 (3.0503) grad_norm 2.6047 (2.8969) [2022-10-08 16:36:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [258/300][1200/1251] eta 0:00:16 lr 0.000055 time 0.3362 (0.3298) loss 3.0859 (3.0489) grad_norm 3.0427 (2.9029) [2022-10-08 16:36:48 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 258 training takes 0:06:52 [2022-10-08 16:36:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.626 (2.626) Loss 0.8559 (0.8559) Acc@1 80.078 (80.078) Acc@5 93.945 (93.945) [2022-10-08 16:37:02 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.550 Acc@5 95.206 [2022-10-08 16:37:02 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 16:37:02 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 80.66% [2022-10-08 16:37:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][0/1251] eta 1:06:22 lr 0.000055 time 3.1831 (3.1831) loss 2.7659 (2.7659) grad_norm 2.7208 (2.7208) [2022-10-08 16:37:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][100/1251] eta 0:06:48 lr 0.000055 time 0.3210 (0.3549) loss 3.0681 (3.0399) grad_norm 2.5361 (2.9285) [2022-10-08 16:38:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][200/1251] eta 0:05:57 lr 0.000055 time 0.3245 (0.3400) loss 2.9804 (3.0375) grad_norm 2.7706 (2.8892) [2022-10-08 16:38:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][300/1251] eta 0:05:18 lr 0.000054 time 0.3259 (0.3350) loss 3.3773 (3.0404) grad_norm 3.0023 (2.8693) [2022-10-08 16:39:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][400/1251] eta 0:04:42 lr 0.000054 time 0.3234 (0.3325) loss 2.8025 (3.0446) grad_norm 2.6955 (2.8838) [2022-10-08 16:39:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][500/1251] eta 0:04:08 lr 0.000054 time 0.3259 (0.3308) loss 2.9625 (3.0443) grad_norm 3.5043 (2.8859) [2022-10-08 16:40:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][600/1251] eta 0:03:34 lr 0.000054 time 0.3249 (0.3297) loss 3.1433 (3.0436) grad_norm 2.9021 (2.9074) [2022-10-08 16:40:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][700/1251] eta 0:03:01 lr 0.000054 time 0.3231 (0.3290) loss 3.0352 (3.0395) grad_norm 3.1904 (2.9104) [2022-10-08 16:41:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][800/1251] eta 0:02:28 lr 0.000054 time 0.3310 (0.3286) loss 3.0030 (3.0439) grad_norm 3.3053 (2.9145) [2022-10-08 16:41:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][900/1251] eta 0:01:55 lr 0.000053 time 0.3241 (0.3284) loss 3.1710 (3.0415) grad_norm 2.6055 (2.9170) [2022-10-08 16:42:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[259/300][1000/1251] eta 0:01:22 lr 0.000053 time 0.3248 (0.3283) loss 2.8787 (3.0424) grad_norm 2.8474 (2.9116) [2022-10-08 16:43:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][1100/1251] eta 0:00:49 lr 0.000053 time 0.3261 (0.3283) loss 3.1487 (3.0430) grad_norm 2.8233 (2.9030) [2022-10-08 16:43:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [259/300][1200/1251] eta 0:00:16 lr 0.000053 time 0.3400 (0.3284) loss 2.7194 (3.0445) grad_norm 2.5913 (2.8977) [2022-10-08 16:43:53 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 259 training takes 0:06:51 [2022-10-08 16:43:56 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.912 (2.912) Loss 0.8323 (0.8323) Acc@1 80.176 (80.176) Acc@5 95.312 (95.312) [2022-10-08 16:44:07 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.598 Acc@5 95.254 [2022-10-08 16:44:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 16:44:07 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.66% [2022-10-08 16:44:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][0/1251] eta 0:50:39 lr 0.000053 time 2.4296 (2.4296) loss 2.8825 (2.8825) grad_norm 3.0353 (3.0353) [2022-10-08 16:44:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][100/1251] eta 0:06:45 lr 0.000053 time 0.3214 (0.3526) loss 2.8313 (3.0436) grad_norm 2.4042 (3.1091) [2022-10-08 16:45:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][200/1251] eta 0:05:56 lr 0.000052 time 0.3270 (0.3393) loss 3.0009 (3.0414) grad_norm 2.7308 (3.0432) [2022-10-08 16:45:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][300/1251] eta 0:05:18 lr 0.000052 time 0.3217 (0.3346) loss 2.7332 (3.0321) grad_norm 2.6884 (2.9879) [2022-10-08 16:46:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][400/1251] eta 0:04:42 lr 0.000052 time 0.3254 (0.3324) loss 3.0702 
(3.0335) grad_norm 3.1266 (2.9809) [2022-10-08 16:46:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][500/1251] eta 0:04:08 lr 0.000052 time 0.3269 (0.3313) loss 2.7791 (3.0381) grad_norm 2.4108 (2.9671) [2022-10-08 16:47:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][600/1251] eta 0:03:35 lr 0.000052 time 0.3306 (0.3310) loss 2.9087 (3.0391) grad_norm 3.1291 (2.9621) [2022-10-08 16:47:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][700/1251] eta 0:03:02 lr 0.000052 time 0.3310 (0.3305) loss 2.9432 (3.0396) grad_norm 3.0488 (2.9817) [2022-10-08 16:48:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][800/1251] eta 0:02:28 lr 0.000051 time 0.3285 (0.3302) loss 2.6852 (3.0397) grad_norm 3.1082 (2.9862) [2022-10-08 16:49:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][900/1251] eta 0:01:55 lr 0.000051 time 0.3271 (0.3298) loss 2.9754 (3.0410) grad_norm 2.8102 (2.9815) [2022-10-08 16:49:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][1000/1251] eta 0:01:22 lr 0.000051 time 0.3331 (0.3297) loss 2.9031 (3.0413) grad_norm 2.7126 (2.9857) [2022-10-08 16:50:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][1100/1251] eta 0:00:49 lr 0.000051 time 0.3312 (0.3296) loss 3.0067 (3.0423) grad_norm 2.6666 (2.9902) [2022-10-08 16:50:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [260/300][1200/1251] eta 0:00:16 lr 0.000051 time 0.3262 (0.3295) loss 2.9903 (3.0434) grad_norm 3.0540 (2.9851) [2022-10-08 16:50:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 260 training takes 0:06:52 [2022-10-08 16:50:59 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_260 saving...... [2022-10-08 16:51:00 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_260 saved !!! 
[2022-10-08 16:51:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.045 (3.045) Loss 0.8381 (0.8381) Acc@1 80.371 (80.371) Acc@5 94.922 (94.922) [2022-10-08 16:51:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.566 Acc@5 95.278 [2022-10-08 16:51:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 16:51:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.66% [2022-10-08 16:51:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][0/1251] eta 1:02:31 lr 0.000051 time 2.9988 (2.9988) loss 2.7914 (2.7914) grad_norm 2.7223 (2.7223) [2022-10-08 16:51:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][100/1251] eta 0:06:46 lr 0.000051 time 0.3245 (0.3533) loss 2.8369 (3.0119) grad_norm 3.1212 (2.9542) [2022-10-08 16:52:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][200/1251] eta 0:05:57 lr 0.000050 time 0.3254 (0.3402) loss 3.0686 (3.0250) grad_norm 2.6603 (3.0100) [2022-10-08 16:52:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][300/1251] eta 0:05:19 lr 0.000050 time 0.3266 (0.3359) loss 2.9118 (3.0412) grad_norm 2.9700 (2.9840) [2022-10-08 16:53:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][400/1251] eta 0:04:44 lr 0.000050 time 0.3198 (0.3338) loss 2.8077 (3.0420) grad_norm 3.2379 (2.9767) [2022-10-08 16:54:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][500/1251] eta 0:04:09 lr 0.000050 time 0.3388 (0.3325) loss 3.0359 (3.0373) grad_norm 2.8246 (2.9774) [2022-10-08 16:54:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][600/1251] eta 0:03:35 lr 0.000050 time 0.3245 (0.3317) loss 3.0324 (3.0372) grad_norm 2.7512 (2.9639) [2022-10-08 16:55:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][700/1251] eta 0:03:02 lr 0.000050 time 0.3328 (0.3311) loss 3.1646 (3.0399) grad_norm 2.7068 
(2.9565) [2022-10-08 16:55:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][800/1251] eta 0:02:29 lr 0.000049 time 0.3271 (0.3309) loss 2.9375 (3.0388) grad_norm 2.9591 (2.9572) [2022-10-08 16:56:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][900/1251] eta 0:01:56 lr 0.000049 time 0.3350 (0.3307) loss 3.0641 (3.0432) grad_norm 2.3512 (2.9578) [2022-10-08 16:56:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][1000/1251] eta 0:01:22 lr 0.000049 time 0.3282 (0.3306) loss 3.1622 (3.0448) grad_norm 2.6871 (2.9505) [2022-10-08 16:57:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][1100/1251] eta 0:00:49 lr 0.000049 time 0.3246 (0.3306) loss 3.0272 (3.0482) grad_norm 2.6853 (2.9539) [2022-10-08 16:57:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [261/300][1200/1251] eta 0:00:16 lr 0.000049 time 0.3249 (0.3307) loss 2.9614 (3.0488) grad_norm 2.8548 (2.9581) [2022-10-08 16:58:07 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 261 training takes 0:06:54 [2022-10-08 16:58:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.186 (3.186) Loss 0.8150 (0.8150) Acc@1 81.445 (81.445) Acc@5 95.410 (95.410) [2022-10-08 16:58:21 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.726 Acc@5 95.320 [2022-10-08 16:58:21 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-08 16:58:21 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 16:58:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][0/1251] eta 1:09:50 lr 0.000049 time 3.3496 (3.3496) loss 2.9681 (2.9681) grad_norm 2.9243 (2.9243) [2022-10-08 16:58:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][100/1251] eta 0:06:49 lr 0.000049 time 0.3290 (0.3554) loss 2.9770 (3.0399) grad_norm 3.0326 (2.9985) [2022-10-08 16:59:30 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [262/300][200/1251] eta 0:05:58 lr 0.000048 time 0.3250 (0.3407) loss 3.3797 (3.0544) grad_norm 2.6562 (2.9961) [2022-10-08 17:00:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][300/1251] eta 0:05:19 lr 0.000048 time 0.3272 (0.3358) loss 3.0132 (3.0455) grad_norm 2.7609 (2.9842) [2022-10-08 17:00:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][400/1251] eta 0:04:43 lr 0.000048 time 0.3246 (0.3333) loss 2.8213 (3.0428) grad_norm 2.7866 (2.9691) [2022-10-08 17:01:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][500/1251] eta 0:04:09 lr 0.000048 time 0.3255 (0.3319) loss 3.1537 (3.0389) grad_norm 3.3351 (2.9659) [2022-10-08 17:01:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][600/1251] eta 0:03:35 lr 0.000048 time 0.3216 (0.3310) loss 3.1611 (3.0383) grad_norm 2.9007 (2.9711) [2022-10-08 17:02:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][700/1251] eta 0:03:02 lr 0.000048 time 0.3291 (0.3305) loss 2.9301 (3.0353) grad_norm 2.6227 (2.9653) [2022-10-08 17:02:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][800/1251] eta 0:02:28 lr 0.000047 time 0.3265 (0.3303) loss 3.1871 (3.0356) grad_norm 2.8492 (2.9739) [2022-10-08 17:03:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][900/1251] eta 0:01:55 lr 0.000047 time 0.3276 (0.3300) loss 2.8538 (3.0357) grad_norm 2.6849 (2.9789) [2022-10-08 17:03:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][1000/1251] eta 0:01:22 lr 0.000047 time 0.3295 (0.3299) loss 3.0193 (3.0378) grad_norm 3.1011 (2.9795) [2022-10-08 17:04:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][1100/1251] eta 0:00:49 lr 0.000047 time 0.3292 (0.3298) loss 2.9294 (3.0369) grad_norm 2.9894 (2.9811) [2022-10-08 17:04:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [262/300][1200/1251] eta 0:00:16 lr 0.000047 time 0.3258 (0.3297) loss 2.8565 
(3.0362) grad_norm 2.6627 (2.9861) [2022-10-08 17:05:14 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 262 training takes 0:06:52 [2022-10-08 17:05:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.898 (2.898) Loss 0.7704 (0.7704) Acc@1 81.738 (81.738) Acc@5 95.996 (95.996) [2022-10-08 17:05:28 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.606 Acc@5 95.310 [2022-10-08 17:05:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 17:05:28 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:05:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][0/1251] eta 0:49:12 lr 0.000047 time 2.3603 (2.3603) loss 2.9958 (2.9958) grad_norm 3.0683 (3.0683) [2022-10-08 17:06:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][100/1251] eta 0:06:48 lr 0.000047 time 0.3250 (0.3547) loss 3.1718 (3.0115) grad_norm 2.8176 (2.9807) [2022-10-08 17:06:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][200/1251] eta 0:05:58 lr 0.000046 time 0.3314 (0.3414) loss 2.7346 (3.0129) grad_norm 2.3971 (2.9772) [2022-10-08 17:07:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][300/1251] eta 0:05:20 lr 0.000046 time 0.3265 (0.3372) loss 3.1740 (3.0199) grad_norm 2.6400 (2.9758) [2022-10-08 17:07:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][400/1251] eta 0:04:45 lr 0.000046 time 0.3322 (0.3353) loss 3.1200 (3.0235) grad_norm 3.3607 (2.9884) [2022-10-08 17:08:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][500/1251] eta 0:04:10 lr 0.000046 time 0.3260 (0.3340) loss 3.1938 (3.0255) grad_norm 4.8436 (2.9915) [2022-10-08 17:08:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][600/1251] eta 0:03:36 lr 0.000046 time 0.3333 (0.3330) loss 2.5753 (3.0282) grad_norm 3.2741 (3.0010) [2022-10-08 17:09:21 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][700/1251] eta 0:03:02 lr 0.000046 time 0.3357 (0.3321) loss 3.0955 (3.0354) grad_norm 3.0156 (3.0195) [2022-10-08 17:09:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][800/1251] eta 0:02:29 lr 0.000045 time 0.3315 (0.3314) loss 2.9514 (3.0363) grad_norm 3.2481 (3.0049) [2022-10-08 17:10:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][900/1251] eta 0:01:56 lr 0.000045 time 0.3300 (0.3307) loss 2.9769 (3.0349) grad_norm 3.0941 (3.0118) [2022-10-08 17:10:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][1000/1251] eta 0:01:22 lr 0.000045 time 0.3260 (0.3303) loss 2.9283 (3.0353) grad_norm 2.5256 (3.0133) [2022-10-08 17:11:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][1100/1251] eta 0:00:49 lr 0.000045 time 0.3278 (0.3300) loss 2.8959 (3.0363) grad_norm 2.9029 (3.0163) [2022-10-08 17:12:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [263/300][1200/1251] eta 0:00:16 lr 0.000045 time 0.3241 (0.3297) loss 3.0168 (3.0364) grad_norm 2.6756 (3.0217) [2022-10-08 17:12:21 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 263 training takes 0:06:52 [2022-10-08 17:12:24 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.049 (3.049) Loss 0.8832 (0.8832) Acc@1 78.223 (78.223) Acc@5 95.410 (95.410) [2022-10-08 17:12:35 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.662 Acc@5 95.372 [2022-10-08 17:12:35 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-08 17:12:35 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:12:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][0/1251] eta 1:04:05 lr 0.000045 time 3.0743 (3.0743) loss 3.1906 (3.1906) grad_norm 2.8864 (2.8864) [2022-10-08 17:13:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][100/1251] 
eta 0:06:47 lr 0.000045 time 0.3257 (0.3541) loss 2.6837 (3.0301) grad_norm 3.0376 (3.0246) [2022-10-08 17:13:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][200/1251] eta 0:05:57 lr 0.000044 time 0.3281 (0.3403) loss 3.3392 (3.0203) grad_norm 2.7689 (3.0251) [2022-10-08 17:14:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][300/1251] eta 0:05:19 lr 0.000044 time 0.3301 (0.3355) loss 3.1976 (3.0275) grad_norm 2.9035 (3.0157) [2022-10-08 17:14:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][400/1251] eta 0:04:43 lr 0.000044 time 0.3319 (0.3330) loss 2.6729 (3.0189) grad_norm 2.9664 (2.9999) [2022-10-08 17:15:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][500/1251] eta 0:04:09 lr 0.000044 time 0.3255 (0.3317) loss 3.1795 (3.0215) grad_norm 2.6467 (3.0120) [2022-10-08 17:15:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][600/1251] eta 0:03:35 lr 0.000044 time 0.3268 (0.3307) loss 2.9411 (3.0217) grad_norm 2.7309 (3.0101) [2022-10-08 17:16:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][700/1251] eta 0:03:01 lr 0.000044 time 0.3235 (0.3301) loss 3.3479 (3.0214) grad_norm 3.5896 (3.0035) [2022-10-08 17:16:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][800/1251] eta 0:02:28 lr 0.000044 time 0.3249 (0.3296) loss 2.8591 (3.0201) grad_norm 2.9164 (3.0179) [2022-10-08 17:17:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][900/1251] eta 0:01:55 lr 0.000043 time 0.3267 (0.3292) loss 3.0196 (3.0217) grad_norm 3.8206 (3.0213) [2022-10-08 17:18:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][1000/1251] eta 0:01:22 lr 0.000043 time 0.3264 (0.3289) loss 3.0698 (3.0261) grad_norm 2.8349 (3.0267) [2022-10-08 17:18:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][1100/1251] eta 0:00:49 lr 0.000043 time 0.3277 (0.3290) loss 2.9376 (3.0246) grad_norm 2.7123 (3.0220) 
[2022-10-08 17:19:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [264/300][1200/1251] eta 0:00:16 lr 0.000043 time 0.3213 (0.3288) loss 2.8654 (3.0247) grad_norm 3.0483 (3.0237) [2022-10-08 17:19:26 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 264 training takes 0:06:51 [2022-10-08 17:19:29 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.447 (2.447) Loss 0.8996 (0.8996) Acc@1 78.906 (78.906) Acc@5 94.727 (94.727) [2022-10-08 17:19:40 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.642 Acc@5 95.312 [2022-10-08 17:19:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 17:19:40 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:19:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][0/1251] eta 1:05:27 lr 0.000043 time 3.1392 (3.1392) loss 3.2524 (3.2524) grad_norm 3.1597 (3.1597) [2022-10-08 17:20:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][100/1251] eta 0:06:47 lr 0.000043 time 0.3270 (0.3544) loss 3.1784 (3.0359) grad_norm 2.8926 (3.0635) [2022-10-08 17:20:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][200/1251] eta 0:05:57 lr 0.000043 time 0.3206 (0.3402) loss 3.0724 (3.0283) grad_norm 2.7556 (3.0702) [2022-10-08 17:21:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][300/1251] eta 0:05:19 lr 0.000042 time 0.3256 (0.3356) loss 2.8432 (3.0213) grad_norm 3.3266 (3.0692) [2022-10-08 17:21:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][400/1251] eta 0:04:43 lr 0.000042 time 0.3223 (0.3333) loss 3.0900 (3.0222) grad_norm 2.8383 (3.0562) [2022-10-08 17:22:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][500/1251] eta 0:04:09 lr 0.000042 time 0.3243 (0.3319) loss 2.8307 (3.0228) grad_norm 2.8787 (3.0279) [2022-10-08 17:22:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[265/300][600/1251] eta 0:03:35 lr 0.000042 time 0.3258 (0.3310) loss 3.0980 (3.0211) grad_norm 2.5369 (3.0171) [2022-10-08 17:23:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][700/1251] eta 0:03:02 lr 0.000042 time 0.3284 (0.3304) loss 3.1174 (3.0224) grad_norm 2.7174 (3.0172) [2022-10-08 17:24:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][800/1251] eta 0:02:28 lr 0.000042 time 0.3266 (0.3299) loss 3.0077 (3.0252) grad_norm 3.1107 (3.0202) [2022-10-08 17:24:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][900/1251] eta 0:01:55 lr 0.000042 time 0.3273 (0.3295) loss 3.0857 (3.0274) grad_norm 2.8724 (3.0187) [2022-10-08 17:25:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][1000/1251] eta 0:01:22 lr 0.000041 time 0.3261 (0.3291) loss 3.1019 (3.0266) grad_norm 3.3193 (3.0218) [2022-10-08 17:25:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][1100/1251] eta 0:00:49 lr 0.000041 time 0.3285 (0.3288) loss 3.1850 (3.0280) grad_norm 3.0560 (3.0177) [2022-10-08 17:26:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [265/300][1200/1251] eta 0:00:16 lr 0.000041 time 0.3215 (0.3285) loss 3.1989 (3.0286) grad_norm 3.1619 (3.0224) [2022-10-08 17:26:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 265 training takes 0:06:51 [2022-10-08 17:26:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.890 (2.890) Loss 0.8360 (0.8360) Acc@1 79.590 (79.590) Acc@5 95.215 (95.215) [2022-10-08 17:26:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.630 Acc@5 95.378 [2022-10-08 17:26:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-08 17:26:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:26:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][0/1251] eta 0:54:25 lr 0.000041 time 2.6101 (2.6101) loss 3.1012 
(3.1012) grad_norm 3.0471 (3.0471) [2022-10-08 17:27:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][100/1251] eta 0:06:48 lr 0.000041 time 0.3267 (0.3546) loss 2.8825 (3.0208) grad_norm 3.4413 (3.0374) [2022-10-08 17:27:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][200/1251] eta 0:05:58 lr 0.000041 time 0.3262 (0.3414) loss 2.7716 (3.0296) grad_norm 3.1534 (3.0487) [2022-10-08 17:28:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][300/1251] eta 0:05:20 lr 0.000041 time 0.3227 (0.3365) loss 3.1085 (3.0301) grad_norm 2.9839 (3.0457) [2022-10-08 17:28:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][400/1251] eta 0:04:45 lr 0.000040 time 0.3269 (0.3354) loss 2.9783 (3.0306) grad_norm 2.5658 (3.0500) [2022-10-08 17:29:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][500/1251] eta 0:04:10 lr 0.000040 time 0.3222 (0.3337) loss 3.1974 (3.0324) grad_norm 2.6701 (3.0477) [2022-10-08 17:30:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][600/1251] eta 0:03:36 lr 0.000040 time 0.3252 (0.3326) loss 3.0492 (3.0319) grad_norm 2.6786 (3.0450) [2022-10-08 17:30:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][700/1251] eta 0:03:02 lr 0.000040 time 0.3294 (0.3320) loss 3.1598 (3.0313) grad_norm 3.0842 (3.0507) [2022-10-08 17:31:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][800/1251] eta 0:02:29 lr 0.000040 time 0.3278 (0.3316) loss 3.1483 (3.0261) grad_norm 3.5034 (3.0409) [2022-10-08 17:31:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][900/1251] eta 0:01:56 lr 0.000040 time 0.3388 (0.3312) loss 3.2962 (3.0278) grad_norm 3.2039 (3.0412) [2022-10-08 17:32:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][1000/1251] eta 0:01:23 lr 0.000040 time 0.3316 (0.3311) loss 2.9156 (3.0249) grad_norm 2.7128 (3.0428) [2022-10-08 17:32:49 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [266/300][1100/1251] eta 0:00:49 lr 0.000039 time 0.3323 (0.3310) loss 2.8720 (3.0271) grad_norm 3.6801 (3.0489) [2022-10-08 17:33:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [266/300][1200/1251] eta 0:00:16 lr 0.000039 time 0.3309 (0.3309) loss 2.9910 (3.0252) grad_norm 3.2093 (3.0529) [2022-10-08 17:33:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 266 training takes 0:06:54 [2022-10-08 17:33:41 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.333 (2.333) Loss 0.8972 (0.8972) Acc@1 78.809 (78.809) Acc@5 94.824 (94.824) [2022-10-08 17:33:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.706 Acc@5 95.392 [2022-10-08 17:33:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-08 17:33:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:33:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][0/1251] eta 1:03:13 lr 0.000039 time 3.0325 (3.0325) loss 3.0728 (3.0728) grad_norm 3.6030 (3.6030) [2022-10-08 17:34:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][100/1251] eta 0:06:51 lr 0.000039 time 0.3295 (0.3578) loss 2.9884 (3.0111) grad_norm 3.2071 (3.0883) [2022-10-08 17:35:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][200/1251] eta 0:06:01 lr 0.000039 time 0.3259 (0.3442) loss 3.1172 (3.0263) grad_norm 3.4793 (3.1112) [2022-10-08 17:35:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][300/1251] eta 0:05:22 lr 0.000039 time 0.3263 (0.3395) loss 2.8067 (3.0205) grad_norm 2.7670 (3.1363) [2022-10-08 17:36:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][400/1251] eta 0:04:46 lr 0.000039 time 0.3266 (0.3370) loss 2.9400 (3.0155) grad_norm 2.9932 (3.1050) [2022-10-08 17:36:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][500/1251] eta 0:04:11 lr 0.000039 time 0.3240 
(0.3352) loss 2.7830 (3.0145) grad_norm 2.9014 (3.0997) [2022-10-08 17:37:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][600/1251] eta 0:03:37 lr 0.000038 time 0.3260 (0.3339) loss 3.0307 (3.0128) grad_norm 3.2854 (3.1139) [2022-10-08 17:37:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][700/1251] eta 0:03:03 lr 0.000038 time 0.3260 (0.3336) loss 3.0830 (3.0150) grad_norm 3.0479 (3.1178) [2022-10-08 17:38:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][800/1251] eta 0:02:30 lr 0.000038 time 0.3263 (0.3333) loss 3.0403 (3.0140) grad_norm 3.2578 (3.1151) [2022-10-08 17:38:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][900/1251] eta 0:01:56 lr 0.000038 time 0.3242 (0.3326) loss 2.8817 (3.0169) grad_norm 2.8601 (3.1083) [2022-10-08 17:39:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][1000/1251] eta 0:01:23 lr 0.000038 time 0.3226 (0.3321) loss 3.0260 (3.0191) grad_norm 2.9754 (3.1157) [2022-10-08 17:39:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][1100/1251] eta 0:00:50 lr 0.000038 time 0.3230 (0.3315) loss 2.9685 (3.0196) grad_norm 2.7859 (3.1093) [2022-10-08 17:40:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [267/300][1200/1251] eta 0:00:16 lr 0.000038 time 0.3232 (0.3310) loss 2.9152 (3.0190) grad_norm 2.8452 (3.1070) [2022-10-08 17:40:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 267 training takes 0:06:54 [2022-10-08 17:40:50 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.737 (2.737) Loss 0.7802 (0.7802) Acc@1 81.934 (81.934) Acc@5 96.289 (96.289) [2022-10-08 17:41:01 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.678 Acc@5 95.338 [2022-10-08 17:41:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-08 17:41:01 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.73% [2022-10-08 17:41:03 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][0/1251] eta 0:55:19 lr 0.000038 time 2.6536 (2.6536) loss 3.0543 (3.0543) grad_norm 3.0945 (3.0945) [2022-10-08 17:41:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][100/1251] eta 0:06:47 lr 0.000037 time 0.3334 (0.3540) loss 2.8910 (3.0115) grad_norm 2.9150 (3.1596) [2022-10-08 17:42:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][200/1251] eta 0:05:56 lr 0.000037 time 0.3232 (0.3391) loss 2.3908 (3.0131) grad_norm 2.8099 (3.0847) [2022-10-08 17:42:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][300/1251] eta 0:05:17 lr 0.000037 time 0.3232 (0.3343) loss 3.0859 (3.0160) grad_norm 3.2816 (3.1017) [2022-10-08 17:43:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][400/1251] eta 0:04:42 lr 0.000037 time 0.3272 (0.3322) loss 3.2978 (3.0177) grad_norm 3.1896 (3.1041) [2022-10-08 17:43:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][500/1251] eta 0:04:08 lr 0.000037 time 0.3223 (0.3307) loss 3.1021 (3.0176) grad_norm 3.3884 (3.1076) [2022-10-08 17:44:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][600/1251] eta 0:03:34 lr 0.000037 time 0.3231 (0.3296) loss 3.2100 (3.0180) grad_norm 3.8404 (3.0885) [2022-10-08 17:44:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][700/1251] eta 0:03:01 lr 0.000037 time 0.3342 (0.3288) loss 2.9683 (3.0185) grad_norm 3.1042 (3.0955) [2022-10-08 17:45:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][800/1251] eta 0:02:28 lr 0.000036 time 0.3241 (0.3283) loss 3.1085 (3.0181) grad_norm 2.8546 (3.1034) [2022-10-08 17:45:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][900/1251] eta 0:01:55 lr 0.000036 time 0.3232 (0.3279) loss 3.1162 (3.0188) grad_norm 3.2974 (3.1134) [2022-10-08 17:46:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][1000/1251] eta 0:01:22 lr 0.000036 
time 0.3253 (0.3276) loss 3.0222 (3.0188) grad_norm 2.6738 (3.1120) [2022-10-08 17:47:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][1100/1251] eta 0:00:49 lr 0.000036 time 0.3216 (0.3274) loss 3.2313 (3.0198) grad_norm 2.9207 (3.1127) [2022-10-08 17:47:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [268/300][1200/1251] eta 0:00:16 lr 0.000036 time 0.3234 (0.3273) loss 2.8899 (3.0198) grad_norm 3.0272 (3.1107) [2022-10-08 17:47:50 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 268 training takes 0:06:49 [2022-10-08 17:47:53 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.785 (2.785) Loss 0.7661 (0.7661) Acc@1 83.008 (83.008) Acc@5 95.410 (95.410) [2022-10-08 17:48:04 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.798 Acc@5 95.356 [2022-10-08 17:48:04 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 17:48:04 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.80% [2022-10-08 17:48:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][0/1251] eta 1:01:41 lr 0.000036 time 2.9587 (2.9587) loss 3.2740 (3.2740) grad_norm 3.5993 (3.5993) [2022-10-08 17:48:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][100/1251] eta 0:06:46 lr 0.000036 time 0.3201 (0.3534) loss 3.1256 (2.9733) grad_norm 3.0593 (2.9984) [2022-10-08 17:49:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][200/1251] eta 0:05:57 lr 0.000036 time 0.3243 (0.3401) loss 2.7872 (2.9896) grad_norm 2.8507 (3.0620) [2022-10-08 17:49:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][300/1251] eta 0:05:18 lr 0.000035 time 0.3225 (0.3352) loss 3.1910 (2.9950) grad_norm 2.4672 (3.0895) [2022-10-08 17:50:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][400/1251] eta 0:04:43 lr 0.000035 time 0.3254 (0.3329) loss 2.9916 (2.9888) grad_norm 3.6029 (3.0934) [2022-10-08 
17:50:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][500/1251] eta 0:04:08 lr 0.000035 time 0.3260 (0.3315) loss 2.9090 (2.9964) grad_norm 3.3670 (3.0920) [2022-10-08 17:51:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][600/1251] eta 0:03:35 lr 0.000035 time 0.3283 (0.3306) loss 2.9661 (2.9985) grad_norm 3.2078 (3.0983) [2022-10-08 17:51:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][700/1251] eta 0:03:01 lr 0.000035 time 0.3226 (0.3299) loss 3.1328 (3.0026) grad_norm 2.8315 (3.1029) [2022-10-08 17:52:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][800/1251] eta 0:02:28 lr 0.000035 time 0.3307 (0.3293) loss 3.0375 (3.0064) grad_norm 3.6740 (3.0903) [2022-10-08 17:53:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][900/1251] eta 0:01:55 lr 0.000035 time 0.3240 (0.3289) loss 2.9363 (3.0079) grad_norm 2.8862 (3.0952) [2022-10-08 17:53:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][1000/1251] eta 0:01:22 lr 0.000035 time 0.3250 (0.3286) loss 2.6843 (3.0087) grad_norm 2.6831 (3.0984) [2022-10-08 17:54:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][1100/1251] eta 0:00:49 lr 0.000034 time 0.3216 (0.3283) loss 2.7356 (3.0108) grad_norm 3.0928 (3.1010) [2022-10-08 17:54:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [269/300][1200/1251] eta 0:00:16 lr 0.000034 time 0.3244 (0.3281) loss 2.8895 (3.0113) grad_norm 2.6744 (3.0986) [2022-10-08 17:54:55 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 269 training takes 0:06:50 [2022-10-08 17:54:58 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.967 (2.967) Loss 0.7309 (0.7309) Acc@1 81.836 (81.836) Acc@5 96.875 (96.875) [2022-10-08 17:55:08 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.810 Acc@5 95.370 [2022-10-08 17:55:08 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 80.8% [2022-10-08 17:55:08 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.81% [2022-10-08 17:55:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][0/1251] eta 1:08:37 lr 0.000034 time 3.2916 (3.2916) loss 3.1977 (3.1977) grad_norm 2.9523 (2.9523) [2022-10-08 17:55:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][100/1251] eta 0:06:49 lr 0.000034 time 0.3320 (0.3558) loss 2.9100 (3.0010) grad_norm 2.8450 (3.1532) [2022-10-08 17:56:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][200/1251] eta 0:05:59 lr 0.000034 time 0.3298 (0.3417) loss 3.0445 (3.0101) grad_norm 3.3266 (3.1185) [2022-10-08 17:56:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][300/1251] eta 0:05:20 lr 0.000034 time 0.3289 (0.3368) loss 3.0835 (3.0023) grad_norm 3.1362 (3.1073) [2022-10-08 17:57:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][400/1251] eta 0:04:44 lr 0.000034 time 0.3287 (0.3343) loss 2.8695 (3.0030) grad_norm 2.7658 (3.0946) [2022-10-08 17:57:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][500/1251] eta 0:04:09 lr 0.000034 time 0.3226 (0.3327) loss 2.9684 (3.0026) grad_norm 3.6038 (3.1141) [2022-10-08 17:58:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][600/1251] eta 0:03:35 lr 0.000033 time 0.3255 (0.3315) loss 2.8319 (3.0013) grad_norm 2.7447 (3.1094) [2022-10-08 17:59:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][700/1251] eta 0:03:02 lr 0.000033 time 0.3234 (0.3306) loss 3.0720 (3.0020) grad_norm 2.8968 (3.1138) [2022-10-08 17:59:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][800/1251] eta 0:02:28 lr 0.000033 time 0.3300 (0.3299) loss 3.0921 (3.0043) grad_norm 3.2553 (3.1097) [2022-10-08 18:00:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][900/1251] eta 0:01:55 lr 0.000033 time 0.3212 (0.3295) loss 2.9735 (3.0054) grad_norm 6.8208 
(3.1164) [2022-10-08 18:00:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][1000/1251] eta 0:01:22 lr 0.000033 time 0.3230 (0.3290) loss 2.8465 (3.0051) grad_norm 2.7841 (3.1253) [2022-10-08 18:01:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][1100/1251] eta 0:00:49 lr 0.000033 time 0.3241 (0.3285) loss 3.1014 (3.0076) grad_norm 2.8265 (3.1179) [2022-10-08 18:01:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [270/300][1200/1251] eta 0:00:16 lr 0.000033 time 0.3222 (0.3282) loss 3.0474 (3.0103) grad_norm 6.4028 (3.1204) [2022-10-08 18:01:59 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 270 training takes 0:06:50 [2022-10-08 18:01:59 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_270 saving...... [2022-10-08 18:02:00 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_270 saved !!! [2022-10-08 18:02:02 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.736 (2.736) Loss 0.7480 (0.7480) Acc@1 82.617 (82.617) Acc@5 96.777 (96.777) [2022-10-08 18:02:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.804 Acc@5 95.426 [2022-10-08 18:02:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 18:02:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.81% [2022-10-08 18:02:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][0/1251] eta 1:00:46 lr 0.000033 time 2.9149 (2.9149) loss 3.0078 (3.0078) grad_norm 3.1029 (3.1029) [2022-10-08 18:02:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][100/1251] eta 0:06:47 lr 0.000033 time 0.3329 (0.3542) loss 2.9136 (2.9961) grad_norm 2.7468 (3.1209) [2022-10-08 18:03:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][200/1251] eta 0:05:58 lr 0.000032 time 0.3334 (0.3411) loss 3.0354 
(2.9966) grad_norm 2.8570 (3.1188) [2022-10-08 18:03:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][300/1251] eta 0:05:20 lr 0.000032 time 0.3231 (0.3366) loss 3.0590 (2.9995) grad_norm 3.3642 (3.1876) [2022-10-08 18:04:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][400/1251] eta 0:04:44 lr 0.000032 time 0.3291 (0.3343) loss 2.8946 (3.0002) grad_norm 3.3575 (3.1848) [2022-10-08 18:05:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][500/1251] eta 0:04:09 lr 0.000032 time 0.3290 (0.3329) loss 2.7427 (2.9951) grad_norm 3.8216 (3.1826) [2022-10-08 18:05:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][600/1251] eta 0:03:36 lr 0.000032 time 0.3319 (0.3322) loss 3.2981 (2.9975) grad_norm 3.2667 (3.1807) [2022-10-08 18:06:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][700/1251] eta 0:03:02 lr 0.000032 time 0.3246 (0.3317) loss 2.8321 (3.0030) grad_norm 2.6878 (3.1718) [2022-10-08 18:06:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][800/1251] eta 0:02:29 lr 0.000032 time 0.3316 (0.3314) loss 3.1484 (3.0035) grad_norm 3.2550 (3.1853) [2022-10-08 18:07:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][900/1251] eta 0:01:56 lr 0.000032 time 0.3254 (0.3310) loss 3.2009 (3.0047) grad_norm 2.7510 (3.1810) [2022-10-08 18:07:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][1000/1251] eta 0:01:22 lr 0.000031 time 0.3279 (0.3306) loss 2.8563 (3.0020) grad_norm 2.9097 (3.1907) [2022-10-08 18:08:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][1100/1251] eta 0:00:49 lr 0.000031 time 0.3252 (0.3303) loss 3.3235 (3.0012) grad_norm 3.0510 (3.1829) [2022-10-08 18:08:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [271/300][1200/1251] eta 0:00:16 lr 0.000031 time 0.3270 (0.3300) loss 3.0723 (2.9992) grad_norm 2.9683 (3.1887) [2022-10-08 18:09:06 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 271 training takes 0:06:53 [2022-10-08 18:09:10 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.375 (3.375) Loss 0.8631 (0.8631) Acc@1 81.152 (81.152) Acc@5 94.727 (94.727) [2022-10-08 18:09:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.858 Acc@5 95.374 [2022-10-08 18:09:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 18:09:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.86% [2022-10-08 18:09:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][0/1251] eta 1:00:36 lr 0.000031 time 2.9072 (2.9072) loss 2.9189 (2.9189) grad_norm 3.6890 (3.6890) [2022-10-08 18:09:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][100/1251] eta 0:06:42 lr 0.000031 time 0.3250 (0.3499) loss 3.0510 (2.9909) grad_norm 3.7266 (3.1314) [2022-10-08 18:10:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][200/1251] eta 0:05:54 lr 0.000031 time 0.3248 (0.3370) loss 3.1528 (2.9922) grad_norm 3.1028 (3.1814) [2022-10-08 18:11:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][300/1251] eta 0:05:16 lr 0.000031 time 0.3240 (0.3327) loss 2.7641 (2.9963) grad_norm 3.3883 (3.2049) [2022-10-08 18:11:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][400/1251] eta 0:04:41 lr 0.000031 time 0.3214 (0.3305) loss 3.0867 (3.0033) grad_norm 3.1837 (3.1881) [2022-10-08 18:12:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][500/1251] eta 0:04:07 lr 0.000031 time 0.3246 (0.3293) loss 2.7556 (3.0013) grad_norm 2.7432 (3.1930) [2022-10-08 18:12:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][600/1251] eta 0:03:33 lr 0.000030 time 0.3227 (0.3285) loss 2.8771 (3.0059) grad_norm 3.0627 (3.1818) [2022-10-08 18:13:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][700/1251] eta 0:03:00 lr 0.000030 time 0.3227 
(0.3280) loss 2.9439 (3.0047) grad_norm 3.9001 (3.1867) [2022-10-08 18:13:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][800/1251] eta 0:02:27 lr 0.000030 time 0.3229 (0.3274) loss 2.9550 (3.0009) grad_norm 3.3176 (3.1873) [2022-10-08 18:14:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][900/1251] eta 0:01:54 lr 0.000030 time 0.3296 (0.3270) loss 3.1061 (2.9978) grad_norm 3.0726 (3.1807) [2022-10-08 18:14:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][1000/1251] eta 0:01:22 lr 0.000030 time 0.3273 (0.3267) loss 2.9110 (2.9994) grad_norm 3.2252 (3.2007) [2022-10-08 18:15:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][1100/1251] eta 0:00:49 lr 0.000030 time 0.3234 (0.3265) loss 3.0165 (3.0013) grad_norm 3.9170 (3.1958) [2022-10-08 18:15:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [272/300][1200/1251] eta 0:00:16 lr 0.000030 time 0.3238 (0.3263) loss 3.1444 (3.0006) grad_norm 2.9711 (3.1926) [2022-10-08 18:16:08 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 272 training takes 0:06:48 [2022-10-08 18:16:11 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.682 (2.682) Loss 0.7649 (0.7649) Acc@1 82.812 (82.812) Acc@5 95.801 (95.801) [2022-10-08 18:16:22 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.846 Acc@5 95.522 [2022-10-08 18:16:22 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 18:16:22 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.86% [2022-10-08 18:16:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][0/1251] eta 1:05:27 lr 0.000030 time 3.1395 (3.1395) loss 3.0342 (3.0342) grad_norm 3.3115 (3.3115) [2022-10-08 18:16:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][100/1251] eta 0:06:46 lr 0.000030 time 0.3255 (0.3532) loss 3.1475 (3.0017) grad_norm 3.9600 (3.1664) [2022-10-08 18:17:30 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][200/1251] eta 0:05:56 lr 0.000029 time 0.3255 (0.3393) loss 3.3671 (3.0124) grad_norm 2.9878 (3.1630) [2022-10-08 18:18:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][300/1251] eta 0:05:18 lr 0.000029 time 0.3206 (0.3346) loss 3.0323 (3.0165) grad_norm 3.3430 (3.1522) [2022-10-08 18:18:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][400/1251] eta 0:04:42 lr 0.000029 time 0.3265 (0.3322) loss 2.9271 (3.0138) grad_norm 2.8762 (3.1533) [2022-10-08 18:19:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][500/1251] eta 0:04:08 lr 0.000029 time 0.3235 (0.3307) loss 2.8562 (3.0145) grad_norm 3.2608 (3.1722) [2022-10-08 18:19:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][600/1251] eta 0:03:34 lr 0.000029 time 0.3246 (0.3297) loss 2.9858 (3.0141) grad_norm 3.2421 (3.1676) [2022-10-08 18:20:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][700/1251] eta 0:03:01 lr 0.000029 time 0.3239 (0.3290) loss 3.1712 (3.0139) grad_norm 2.7180 (3.1587) [2022-10-08 18:20:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][800/1251] eta 0:02:28 lr 0.000029 time 0.3215 (0.3286) loss 2.8890 (3.0090) grad_norm 3.0802 (3.1534) [2022-10-08 18:21:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][900/1251] eta 0:01:55 lr 0.000029 time 0.3242 (0.3281) loss 2.8976 (3.0073) grad_norm 3.2430 (3.1637) [2022-10-08 18:21:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][1000/1251] eta 0:01:22 lr 0.000029 time 0.3313 (0.3277) loss 3.2298 (3.0058) grad_norm 3.1229 (3.1766) [2022-10-08 18:22:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][1100/1251] eta 0:00:49 lr 0.000028 time 0.3255 (0.3274) loss 2.9811 (3.0055) grad_norm 2.8817 (3.1848) [2022-10-08 18:22:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [273/300][1200/1251] eta 0:00:16 lr 
0.000028 time 0.3222 (0.3272) loss 2.9625 (3.0044) grad_norm 3.1181 (3.1774) [2022-10-08 18:23:12 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 273 training takes 0:06:49 [2022-10-08 18:23:14 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.780 (2.780) Loss 0.7695 (0.7695) Acc@1 82.520 (82.520) Acc@5 96.484 (96.484) [2022-10-08 18:23:25 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.868 Acc@5 95.422 [2022-10-08 18:23:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 18:23:25 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.87% [2022-10-08 18:23:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][0/1251] eta 0:48:40 lr 0.000028 time 2.3348 (2.3348) loss 3.1528 (3.1528) grad_norm 4.7317 (4.7317) [2022-10-08 18:24:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][100/1251] eta 0:06:45 lr 0.000028 time 0.3338 (0.3522) loss 3.0174 (2.9910) grad_norm 3.3289 (3.1921) [2022-10-08 18:24:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][200/1251] eta 0:05:56 lr 0.000028 time 0.3267 (0.3388) loss 2.8590 (2.9846) grad_norm 2.6432 (3.1786) [2022-10-08 18:25:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][300/1251] eta 0:05:18 lr 0.000028 time 0.3223 (0.3345) loss 2.9737 (2.9870) grad_norm 3.6656 (3.1872) [2022-10-08 18:25:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][400/1251] eta 0:04:42 lr 0.000028 time 0.3220 (0.3321) loss 2.6423 (2.9895) grad_norm 2.5475 (3.2105) [2022-10-08 18:26:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][500/1251] eta 0:04:08 lr 0.000028 time 0.3262 (0.3308) loss 2.9893 (2.9968) grad_norm 3.9082 (3.2202) [2022-10-08 18:26:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][600/1251] eta 0:03:34 lr 0.000028 time 0.3245 (0.3301) loss 2.9072 (2.9977) grad_norm 3.0705 (3.2163) 
[2022-10-08 18:27:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][700/1251] eta 0:03:01 lr 0.000027 time 0.3233 (0.3295) loss 2.9442 (3.0003) grad_norm 3.0412 (3.2284) [2022-10-08 18:27:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][800/1251] eta 0:02:28 lr 0.000027 time 0.3260 (0.3289) loss 3.2367 (3.0037) grad_norm 3.1326 (3.2158) [2022-10-08 18:28:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][900/1251] eta 0:01:55 lr 0.000027 time 0.3233 (0.3285) loss 2.9285 (3.0008) grad_norm 2.8370 (3.2201) [2022-10-08 18:28:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][1000/1251] eta 0:01:22 lr 0.000027 time 0.3230 (0.3283) loss 2.9239 (3.0018) grad_norm 2.9640 (3.2237) [2022-10-08 18:29:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][1100/1251] eta 0:00:49 lr 0.000027 time 0.3223 (0.3281) loss 2.7710 (3.0020) grad_norm 2.9529 (3.2207) [2022-10-08 18:29:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [274/300][1200/1251] eta 0:00:16 lr 0.000027 time 0.3244 (0.3278) loss 2.9165 (3.0024) grad_norm 3.4240 (3.2150) [2022-10-08 18:30:16 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 274 training takes 0:06:50 [2022-10-08 18:30:19 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.519 (3.519) Loss 0.7613 (0.7613) Acc@1 82.617 (82.617) Acc@5 96.484 (96.484) [2022-10-08 18:30:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.804 Acc@5 95.498 [2022-10-08 18:30:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 18:30:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.87% [2022-10-08 18:30:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][0/1251] eta 0:57:37 lr 0.000027 time 2.7636 (2.7636) loss 3.1385 (3.1385) grad_norm 2.8220 (2.8220) [2022-10-08 18:31:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[275/300][100/1251] eta 0:06:44 lr 0.000027 time 0.3229 (0.3516) loss 2.9872 (2.9945) grad_norm 2.9741 (3.1476) [2022-10-08 18:31:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][200/1251] eta 0:05:55 lr 0.000027 time 0.3227 (0.3383) loss 2.8697 (2.9894) grad_norm 2.6853 (3.2276) [2022-10-08 18:32:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][300/1251] eta 0:05:17 lr 0.000027 time 0.3295 (0.3339) loss 2.7131 (2.9945) grad_norm 3.5781 (3.2045) [2022-10-08 18:32:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][400/1251] eta 0:04:42 lr 0.000026 time 0.3265 (0.3317) loss 3.1838 (2.9944) grad_norm 2.8687 (3.2033) [2022-10-08 18:33:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][500/1251] eta 0:04:08 lr 0.000026 time 0.3242 (0.3303) loss 2.8289 (2.9957) grad_norm 3.0462 (3.1799) [2022-10-08 18:33:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][600/1251] eta 0:03:34 lr 0.000026 time 0.3255 (0.3294) loss 2.8507 (2.9975) grad_norm 3.3119 (3.1967) [2022-10-08 18:34:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][700/1251] eta 0:03:01 lr 0.000026 time 0.3280 (0.3288) loss 2.6256 (2.9962) grad_norm 3.0317 (3.1943) [2022-10-08 18:34:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][800/1251] eta 0:02:28 lr 0.000026 time 0.3271 (0.3283) loss 2.7619 (2.9944) grad_norm 3.0311 (3.1969) [2022-10-08 18:35:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][900/1251] eta 0:01:55 lr 0.000026 time 0.3264 (0.3279) loss 2.9879 (2.9924) grad_norm 3.1888 (3.1929) [2022-10-08 18:35:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][1000/1251] eta 0:01:22 lr 0.000026 time 0.3225 (0.3275) loss 3.0280 (2.9938) grad_norm 3.2027 (3.2077) [2022-10-08 18:36:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][1100/1251] eta 0:00:49 lr 0.000026 time 0.3268 (0.3273) loss 3.0790 (2.9955) grad_norm 
3.1300 (3.2123) [2022-10-08 18:37:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [275/300][1200/1251] eta 0:00:16 lr 0.000026 time 0.3259 (0.3270) loss 3.1786 (2.9982) grad_norm 3.8194 (3.2177) [2022-10-08 18:37:19 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 275 training takes 0:06:49 [2022-10-08 18:37:22 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.404 (3.404) Loss 0.8030 (0.8030) Acc@1 80.762 (80.762) Acc@5 95.312 (95.312) [2022-10-08 18:37:33 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.774 Acc@5 95.410 [2022-10-08 18:37:33 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 18:37:33 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.87% [2022-10-08 18:37:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][0/1251] eta 0:54:40 lr 0.000026 time 2.6226 (2.6226) loss 2.9873 (2.9873) grad_norm 2.6866 (2.6866) [2022-10-08 18:38:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][100/1251] eta 0:06:41 lr 0.000025 time 0.3281 (0.3492) loss 2.9143 (2.9738) grad_norm 2.7369 (3.2500) [2022-10-08 18:38:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][200/1251] eta 0:05:54 lr 0.000025 time 0.3226 (0.3371) loss 3.0712 (3.0013) grad_norm 3.0147 (3.2346) [2022-10-08 18:39:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][300/1251] eta 0:05:16 lr 0.000025 time 0.3268 (0.3331) loss 2.9897 (2.9966) grad_norm 2.9476 (3.2129) [2022-10-08 18:39:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][400/1251] eta 0:04:41 lr 0.000025 time 0.3273 (0.3311) loss 2.7607 (3.0038) grad_norm 3.1294 (3.2314) [2022-10-08 18:40:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][500/1251] eta 0:04:07 lr 0.000025 time 0.3262 (0.3299) loss 2.8459 (3.0027) grad_norm 3.4833 (3.2418) [2022-10-08 18:40:51 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [276/300][600/1251] eta 0:03:34 lr 0.000025 time 0.3251 (0.3290) loss 3.0676 (3.0039) grad_norm 3.6276 (3.2307) [2022-10-08 18:41:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][700/1251] eta 0:03:01 lr 0.000025 time 0.3247 (0.3286) loss 3.0056 (2.9969) grad_norm 3.8320 (3.2393) [2022-10-08 18:41:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][800/1251] eta 0:02:27 lr 0.000025 time 0.3205 (0.3280) loss 2.8198 (2.9982) grad_norm 3.5427 (3.2470) [2022-10-08 18:42:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][900/1251] eta 0:01:54 lr 0.000025 time 0.3235 (0.3276) loss 3.0278 (2.9965) grad_norm 3.0347 (3.2421) [2022-10-08 18:43:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][1000/1251] eta 0:01:22 lr 0.000025 time 0.3213 (0.3273) loss 2.5731 (2.9950) grad_norm 2.7278 (3.2358) [2022-10-08 18:43:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][1100/1251] eta 0:00:49 lr 0.000024 time 0.3234 (0.3270) loss 3.0309 (2.9976) grad_norm 2.9580 (3.2312) [2022-10-08 18:44:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [276/300][1200/1251] eta 0:00:16 lr 0.000024 time 0.3310 (0.3268) loss 3.1051 (2.9982) grad_norm 4.0116 (3.2295) [2022-10-08 18:44:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 276 training takes 0:06:49 [2022-10-08 18:44:25 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.433 (3.433) Loss 0.8697 (0.8697) Acc@1 80.469 (80.469) Acc@5 94.922 (94.922) [2022-10-08 18:44:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.882 Acc@5 95.476 [2022-10-08 18:44:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 18:44:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.88% [2022-10-08 18:44:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][0/1251] eta 0:49:33 lr 0.000024 time 2.3769 
(2.3769) loss 3.0219 (3.0219) grad_norm 2.9354 (2.9354) [2022-10-08 18:45:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][100/1251] eta 0:06:41 lr 0.000024 time 0.3255 (0.3489) loss 3.2186 (3.0166) grad_norm 3.3249 (3.2023) [2022-10-08 18:45:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][200/1251] eta 0:05:54 lr 0.000024 time 0.3269 (0.3371) loss 3.2495 (3.0145) grad_norm 3.5192 (3.2329) [2022-10-08 18:46:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][300/1251] eta 0:05:16 lr 0.000024 time 0.3234 (0.3333) loss 3.1758 (3.0070) grad_norm 3.7828 (3.2323) [2022-10-08 18:46:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][400/1251] eta 0:04:41 lr 0.000024 time 0.3278 (0.3314) loss 3.1904 (3.0111) grad_norm 3.7349 (3.2397) [2022-10-08 18:47:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][500/1251] eta 0:04:07 lr 0.000024 time 0.3218 (0.3302) loss 2.8397 (3.0021) grad_norm 3.1605 (3.2543) [2022-10-08 18:47:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][600/1251] eta 0:03:34 lr 0.000024 time 0.3200 (0.3294) loss 2.9913 (2.9983) grad_norm 3.1504 (3.2533) [2022-10-08 18:48:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][700/1251] eta 0:03:01 lr 0.000024 time 0.3231 (0.3289) loss 3.0472 (2.9968) grad_norm 3.0419 (3.2465) [2022-10-08 18:48:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][800/1251] eta 0:02:28 lr 0.000024 time 0.3243 (0.3285) loss 3.0958 (2.9954) grad_norm 3.9776 (3.2461) [2022-10-08 18:49:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][900/1251] eta 0:01:55 lr 0.000023 time 0.3239 (0.3281) loss 3.0263 (2.9928) grad_norm 4.1295 (3.2472) [2022-10-08 18:50:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][1000/1251] eta 0:01:22 lr 0.000023 time 0.3207 (0.3279) loss 2.7936 (2.9941) grad_norm 4.0740 (3.2488) [2022-10-08 18:50:37 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][1100/1251] eta 0:00:49 lr 0.000023 time 0.3231 (0.3277) loss 3.0328 (2.9960) grad_norm 3.2467 (3.2566) [2022-10-08 18:51:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [277/300][1200/1251] eta 0:00:16 lr 0.000023 time 0.3226 (0.3275) loss 3.0371 (2.9969) grad_norm 3.3837 (3.2574) [2022-10-08 18:51:26 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 277 training takes 0:06:50 [2022-10-08 18:51:29 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.667 (2.667) Loss 0.8643 (0.8643) Acc@1 80.176 (80.176) Acc@5 94.434 (94.434) [2022-10-08 18:51:40 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.838 Acc@5 95.446 [2022-10-08 18:51:40 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-08 18:51:40 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.88% [2022-10-08 18:51:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][0/1251] eta 0:58:28 lr 0.000023 time 2.8047 (2.8047) loss 3.2062 (3.2062) grad_norm 2.9317 (2.9317) [2022-10-08 18:52:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][100/1251] eta 0:06:43 lr 0.000023 time 0.3282 (0.3508) loss 3.1612 (3.0020) grad_norm 3.3766 (3.2054) [2022-10-08 18:52:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][200/1251] eta 0:05:55 lr 0.000023 time 0.3292 (0.3383) loss 3.0566 (2.9907) grad_norm 3.0688 (3.2239) [2022-10-08 18:53:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][300/1251] eta 0:05:17 lr 0.000023 time 0.3222 (0.3339) loss 3.2128 (2.9905) grad_norm 3.5861 (3.2356) [2022-10-08 18:53:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][400/1251] eta 0:04:42 lr 0.000023 time 0.3184 (0.3316) loss 2.9228 (2.9961) grad_norm 2.7832 (3.2483) [2022-10-08 18:54:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][500/1251] 
eta 0:04:08 lr 0.000023 time 0.3267 (0.3304) loss 2.6676 (2.9919) grad_norm 2.6936 (3.2378) [2022-10-08 18:54:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][600/1251] eta 0:03:34 lr 0.000023 time 0.3225 (0.3296) loss 3.1541 (2.9877) grad_norm 3.1527 (3.2518) [2022-10-08 18:55:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][700/1251] eta 0:03:01 lr 0.000022 time 0.3262 (0.3289) loss 3.2270 (2.9879) grad_norm 4.2287 (3.2556) [2022-10-08 18:56:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][800/1251] eta 0:02:28 lr 0.000022 time 0.3207 (0.3285) loss 3.2318 (2.9893) grad_norm 3.3857 (3.2541) [2022-10-08 18:56:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][900/1251] eta 0:01:55 lr 0.000022 time 0.3231 (0.3280) loss 2.9317 (2.9914) grad_norm 2.9125 (3.2465) [2022-10-08 18:57:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][1000/1251] eta 0:01:22 lr 0.000022 time 0.3261 (0.3277) loss 3.0139 (2.9914) grad_norm 2.6668 (3.2451) [2022-10-08 18:57:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][1100/1251] eta 0:00:49 lr 0.000022 time 0.3311 (0.3274) loss 3.0149 (2.9908) grad_norm 3.0898 (3.2446) [2022-10-08 18:58:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [278/300][1200/1251] eta 0:00:16 lr 0.000022 time 0.3236 (0.3271) loss 3.0616 (2.9918) grad_norm 3.1339 (3.2552) [2022-10-08 18:58:29 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 278 training takes 0:06:49 [2022-10-08 18:58:33 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.152 (3.152) Loss 0.8639 (0.8639) Acc@1 80.371 (80.371) Acc@5 94.824 (94.824) [2022-10-08 18:58:43 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.916 Acc@5 95.452 [2022-10-08 18:58:43 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 18:58:43 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 80.92% [2022-10-08 18:58:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][0/1251] eta 1:02:47 lr 0.000022 time 3.0113 (3.0113) loss 2.8503 (2.8503) grad_norm 2.8866 (2.8866) [2022-10-08 18:59:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][100/1251] eta 0:06:45 lr 0.000022 time 0.3271 (0.3523) loss 2.5622 (2.9873) grad_norm 3.1450 (3.2200) [2022-10-08 18:59:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][200/1251] eta 0:05:56 lr 0.000022 time 0.3227 (0.3388) loss 2.8159 (2.9918) grad_norm 3.1818 (3.2775) [2022-10-08 19:00:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][300/1251] eta 0:05:18 lr 0.000022 time 0.3276 (0.3344) loss 3.0480 (2.9916) grad_norm 2.7862 (3.2852) [2022-10-08 19:00:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][400/1251] eta 0:04:42 lr 0.000022 time 0.3244 (0.3323) loss 2.9180 (2.9906) grad_norm 3.2174 (3.2995) [2022-10-08 19:01:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][500/1251] eta 0:04:08 lr 0.000021 time 0.3241 (0.3312) loss 3.1176 (2.9849) grad_norm 3.2763 (3.2935) [2022-10-08 19:02:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][600/1251] eta 0:03:34 lr 0.000021 time 0.3241 (0.3302) loss 2.9814 (2.9913) grad_norm 3.1261 (3.2956) [2022-10-08 19:02:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][700/1251] eta 0:03:01 lr 0.000021 time 0.3228 (0.3294) loss 3.0379 (2.9926) grad_norm 3.4037 (3.3076) [2022-10-08 19:03:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][800/1251] eta 0:02:28 lr 0.000021 time 0.3233 (0.3288) loss 2.9473 (2.9913) grad_norm 3.4506 (3.2970) [2022-10-08 19:03:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][900/1251] eta 0:01:55 lr 0.000021 time 0.3252 (0.3283) loss 3.0622 (2.9924) grad_norm 3.3809 (3.2944) [2022-10-08 19:04:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[279/300][1000/1251] eta 0:01:22 lr 0.000021 time 0.3221 (0.3279) loss 3.1428 (2.9904) grad_norm 3.2625 (3.3007) [2022-10-08 19:04:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][1100/1251] eta 0:00:49 lr 0.000021 time 0.3212 (0.3276) loss 3.0571 (2.9903) grad_norm 3.3924 (3.3002) [2022-10-08 19:05:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [279/300][1200/1251] eta 0:00:16 lr 0.000021 time 0.3252 (0.3273) loss 2.8397 (2.9916) grad_norm 3.2356 (3.2945) [2022-10-08 19:05:33 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 279 training takes 0:06:49 [2022-10-08 19:05:36 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.873 (2.873) Loss 0.8016 (0.8016) Acc@1 80.762 (80.762) Acc@5 95.605 (95.605) [2022-10-08 19:05:46 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.994 Acc@5 95.494 [2022-10-08 19:05:46 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 19:05:46 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.99% [2022-10-08 19:05:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][0/1251] eta 1:00:59 lr 0.000021 time 2.9251 (2.9251) loss 2.8481 (2.8481) grad_norm 2.9544 (2.9544) [2022-10-08 19:06:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][100/1251] eta 0:06:47 lr 0.000021 time 0.3206 (0.3543) loss 2.8202 (2.9710) grad_norm 3.2629 (3.3711) [2022-10-08 19:06:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][200/1251] eta 0:05:58 lr 0.000021 time 0.3338 (0.3412) loss 2.9026 (2.9662) grad_norm 3.3140 (3.3571) [2022-10-08 19:07:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][300/1251] eta 0:05:20 lr 0.000021 time 0.3242 (0.3367) loss 3.1126 (2.9798) grad_norm 2.9404 (3.3324) [2022-10-08 19:08:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][400/1251] eta 0:04:44 lr 0.000020 time 0.3331 (0.3344) loss 2.7793 
(2.9830) grad_norm 2.8926 (3.3088) [2022-10-08 19:08:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][500/1251] eta 0:04:10 lr 0.000020 time 0.3234 (0.3331) loss 3.0611 (2.9807) grad_norm 2.6899 (3.3083) [2022-10-08 19:09:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][600/1251] eta 0:03:36 lr 0.000020 time 0.3304 (0.3322) loss 2.9020 (2.9822) grad_norm 3.1036 (3.3212) [2022-10-08 19:09:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][700/1251] eta 0:03:02 lr 0.000020 time 0.3274 (0.3314) loss 2.8593 (2.9833) grad_norm 3.1004 (3.3336) [2022-10-08 19:10:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][800/1251] eta 0:02:29 lr 0.000020 time 0.3200 (0.3306) loss 3.0951 (2.9786) grad_norm 3.1181 (3.3352) [2022-10-08 19:10:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][900/1251] eta 0:01:55 lr 0.000020 time 0.3241 (0.3301) loss 2.8858 (2.9785) grad_norm 3.3831 (3.3346) [2022-10-08 19:11:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][1000/1251] eta 0:01:22 lr 0.000020 time 0.3244 (0.3297) loss 2.7732 (2.9774) grad_norm 3.0752 (3.3202) [2022-10-08 19:11:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][1100/1251] eta 0:00:49 lr 0.000020 time 0.3290 (0.3294) loss 3.1104 (2.9765) grad_norm 2.8460 (3.3195) [2022-10-08 19:12:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [280/300][1200/1251] eta 0:00:16 lr 0.000020 time 0.3229 (0.3292) loss 3.1065 (2.9744) grad_norm 3.9084 (3.3132) [2022-10-08 19:12:38 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 280 training takes 0:06:51 [2022-10-08 19:12:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_280 saving...... [2022-10-08 19:12:39 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_280 saved !!! 
[2022-10-08 19:12:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.834 (2.834) Loss 0.8599 (0.8599) Acc@1 79.199 (79.199) Acc@5 95.410 (95.410) [2022-10-08 19:12:52 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.902 Acc@5 95.500 [2022-10-08 19:12:52 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 19:12:52 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 80.99% [2022-10-08 19:12:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][0/1251] eta 1:09:10 lr 0.000020 time 3.3178 (3.3178) loss 2.9625 (2.9625) grad_norm 6.5648 (6.5648) [2022-10-08 19:13:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][100/1251] eta 0:06:47 lr 0.000020 time 0.3220 (0.3540) loss 2.9471 (3.0179) grad_norm 3.0905 (3.3671) [2022-10-08 19:14:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][200/1251] eta 0:05:56 lr 0.000020 time 0.3228 (0.3394) loss 2.7546 (2.9864) grad_norm 2.7844 (3.4153) [2022-10-08 19:14:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][300/1251] eta 0:05:17 lr 0.000020 time 0.3231 (0.3343) loss 3.0119 (2.9918) grad_norm 3.3089 (3.3570) [2022-10-08 19:15:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][400/1251] eta 0:04:42 lr 0.000019 time 0.3266 (0.3321) loss 2.7633 (2.9918) grad_norm 2.7113 (3.3463) [2022-10-08 19:15:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][500/1251] eta 0:04:08 lr 0.000019 time 0.3223 (0.3307) loss 3.2606 (2.9910) grad_norm 3.6043 (3.3712) [2022-10-08 19:16:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][600/1251] eta 0:03:34 lr 0.000019 time 0.3304 (0.3299) loss 2.9651 (2.9886) grad_norm 3.1552 (3.3725) [2022-10-08 19:16:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][700/1251] eta 0:03:01 lr 0.000019 time 0.3246 (0.3291) loss 2.9140 (2.9868) grad_norm 3.4924 
(3.3476) [2022-10-08 19:17:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][800/1251] eta 0:02:28 lr 0.000019 time 0.3256 (0.3286) loss 3.1241 (2.9907) grad_norm 3.7016 (3.3457) [2022-10-08 19:17:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][900/1251] eta 0:01:55 lr 0.000019 time 0.3231 (0.3281) loss 2.8725 (2.9902) grad_norm 2.7606 (3.3414) [2022-10-08 19:18:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][1000/1251] eta 0:01:22 lr 0.000019 time 0.3263 (0.3278) loss 2.9565 (2.9871) grad_norm 3.0485 (3.3333) [2022-10-08 19:18:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][1100/1251] eta 0:00:49 lr 0.000019 time 0.3232 (0.3275) loss 2.9256 (2.9864) grad_norm 3.2186 (3.3351) [2022-10-08 19:19:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [281/300][1200/1251] eta 0:00:16 lr 0.000019 time 0.3258 (0.3273) loss 2.8131 (2.9861) grad_norm 3.8043 (3.3341) [2022-10-08 19:19:42 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 281 training takes 0:06:49 [2022-10-08 19:19:45 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.868 (2.868) Loss 0.8498 (0.8498) Acc@1 81.055 (81.055) Acc@5 95.117 (95.117) [2022-10-08 19:19:56 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.026 Acc@5 95.462 [2022-10-08 19:19:56 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 19:19:56 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.03% [2022-10-08 19:19:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][0/1251] eta 0:51:49 lr 0.000019 time 2.4860 (2.4860) loss 2.9560 (2.9560) grad_norm 3.2721 (3.2721) [2022-10-08 19:20:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][100/1251] eta 0:06:48 lr 0.000019 time 0.3326 (0.3553) loss 2.7858 (2.9689) grad_norm 2.9290 (3.2967) [2022-10-08 19:21:04 swin_tiny_patch4_window7_224] (main.py 188): 
INFO Train: [282/300][200/1251] eta 0:05:58 lr 0.000019 time 0.3243 (0.3413) loss 2.9991 (2.9726) grad_norm 3.0960 (3.2941) [2022-10-08 19:21:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][300/1251] eta 0:05:20 lr 0.000019 time 0.3246 (0.3368) loss 3.1008 (2.9902) grad_norm 2.6249 (3.3166) [2022-10-08 19:22:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][400/1251] eta 0:04:44 lr 0.000018 time 0.3268 (0.3349) loss 2.9018 (2.9862) grad_norm 3.5091 (3.3228) [2022-10-08 19:22:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][500/1251] eta 0:04:10 lr 0.000018 time 0.3275 (0.3332) loss 2.7856 (2.9846) grad_norm 3.3179 (3.3271) [2022-10-08 19:23:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][600/1251] eta 0:03:36 lr 0.000018 time 0.3246 (0.3320) loss 2.9005 (2.9841) grad_norm 3.9449 (3.3259) [2022-10-08 19:23:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][700/1251] eta 0:03:02 lr 0.000018 time 0.3207 (0.3310) loss 2.9881 (2.9836) grad_norm 3.4893 (3.3333) [2022-10-08 19:24:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][800/1251] eta 0:02:29 lr 0.000018 time 0.3249 (0.3304) loss 3.1246 (2.9837) grad_norm 3.8906 (3.3260) [2022-10-08 19:24:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][900/1251] eta 0:01:55 lr 0.000018 time 0.3294 (0.3299) loss 3.0044 (2.9860) grad_norm 3.3533 (3.3213) [2022-10-08 19:25:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][1000/1251] eta 0:01:22 lr 0.000018 time 0.3263 (0.3294) loss 3.0930 (2.9868) grad_norm 3.8158 (3.3212) [2022-10-08 19:25:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][1100/1251] eta 0:00:49 lr 0.000018 time 0.3205 (0.3290) loss 2.9034 (2.9880) grad_norm 3.4671 (3.3190) [2022-10-08 19:26:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [282/300][1200/1251] eta 0:00:16 lr 0.000018 time 0.3241 (0.3286) loss 2.9707 
(2.9878) grad_norm 5.1137 (3.3170) [2022-10-08 19:26:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 282 training takes 0:06:51 [2022-10-08 19:26:50 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.972 (2.972) Loss 0.7765 (0.7765) Acc@1 81.445 (81.445) Acc@5 96.387 (96.387) [2022-10-08 19:27:01 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.884 Acc@5 95.416 [2022-10-08 19:27:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 19:27:01 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.03% [2022-10-08 19:27:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][0/1251] eta 1:07:55 lr 0.000018 time 3.2575 (3.2575) loss 2.6628 (2.6628) grad_norm 2.8586 (2.8586) [2022-10-08 19:27:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][100/1251] eta 0:06:48 lr 0.000018 time 0.3255 (0.3549) loss 2.9884 (2.9820) grad_norm 3.1133 (3.2425) [2022-10-08 19:28:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][200/1251] eta 0:05:58 lr 0.000018 time 0.3267 (0.3409) loss 3.0205 (2.9823) grad_norm 3.4107 (3.2538) [2022-10-08 19:28:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][300/1251] eta 0:05:19 lr 0.000018 time 0.3281 (0.3361) loss 2.7476 (2.9848) grad_norm 3.0201 (3.2803) [2022-10-08 19:29:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][400/1251] eta 0:04:43 lr 0.000018 time 0.3307 (0.3337) loss 3.3278 (2.9859) grad_norm 3.3540 (3.2801) [2022-10-08 19:29:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][500/1251] eta 0:04:09 lr 0.000017 time 0.3241 (0.3323) loss 2.9490 (2.9817) grad_norm 3.1365 (3.2978) [2022-10-08 19:30:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][600/1251] eta 0:03:35 lr 0.000017 time 0.3306 (0.3314) loss 2.8904 (2.9773) grad_norm 2.7259 (3.2936) [2022-10-08 19:30:53 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][700/1251] eta 0:03:02 lr 0.000017 time 0.3300 (0.3308) loss 2.8467 (2.9756) grad_norm 3.6391 (3.3025) [2022-10-08 19:31:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][800/1251] eta 0:02:28 lr 0.000017 time 0.3293 (0.3304) loss 3.0490 (2.9747) grad_norm 3.5325 (3.3030) [2022-10-08 19:31:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][900/1251] eta 0:01:55 lr 0.000017 time 0.3281 (0.3300) loss 2.9702 (2.9723) grad_norm 3.0640 (3.3161) [2022-10-08 19:32:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][1000/1251] eta 0:01:22 lr 0.000017 time 0.3259 (0.3297) loss 2.8783 (2.9742) grad_norm 3.1237 (3.3247) [2022-10-08 19:33:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][1100/1251] eta 0:00:49 lr 0.000017 time 0.3319 (0.3294) loss 2.6031 (2.9746) grad_norm 3.0038 (3.3250) [2022-10-08 19:33:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [283/300][1200/1251] eta 0:00:16 lr 0.000017 time 0.3281 (0.3292) loss 3.0013 (2.9771) grad_norm 3.3759 (3.3251) [2022-10-08 19:33:53 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 283 training takes 0:06:52 [2022-10-08 19:33:57 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.348 (3.348) Loss 0.8211 (0.8211) Acc@1 80.469 (80.469) Acc@5 94.922 (94.922) [2022-10-08 19:34:07 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 80.948 Acc@5 95.494 [2022-10-08 19:34:07 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-08 19:34:07 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.03% [2022-10-08 19:34:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][0/1251] eta 0:52:47 lr 0.000017 time 2.5323 (2.5323) loss 2.9466 (2.9466) grad_norm 3.1818 (3.1818) [2022-10-08 19:34:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][100/1251] 
eta 0:06:41 lr 0.000017 time 0.3261 (0.3489) loss 3.1020 (2.9681) grad_norm 4.1949 (3.3786) [2022-10-08 19:35:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][200/1251] eta 0:05:53 lr 0.000017 time 0.3269 (0.3366) loss 2.9458 (2.9763) grad_norm 2.9451 (3.3828) [2022-10-08 19:35:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][300/1251] eta 0:05:16 lr 0.000017 time 0.3241 (0.3325) loss 3.0839 (2.9824) grad_norm 2.7663 (3.3384) [2022-10-08 19:36:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][400/1251] eta 0:04:41 lr 0.000017 time 0.3247 (0.3307) loss 2.7864 (2.9726) grad_norm 3.3667 (3.3036) [2022-10-08 19:36:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][500/1251] eta 0:04:07 lr 0.000017 time 0.3281 (0.3296) loss 2.8197 (2.9669) grad_norm 2.7868 (3.2985) [2022-10-08 19:37:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][600/1251] eta 0:03:34 lr 0.000017 time 0.3228 (0.3287) loss 3.0855 (2.9691) grad_norm 2.7173 (3.2992) [2022-10-08 19:37:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][700/1251] eta 0:03:00 lr 0.000016 time 0.3230 (0.3282) loss 3.0386 (2.9688) grad_norm 3.0003 (3.3055) [2022-10-08 19:38:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][800/1251] eta 0:02:27 lr 0.000016 time 0.3228 (0.3279) loss 3.0505 (2.9702) grad_norm 3.0926 (3.2969) [2022-10-08 19:39:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][900/1251] eta 0:01:54 lr 0.000016 time 0.3250 (0.3276) loss 2.7744 (2.9710) grad_norm 2.8716 (3.3037) [2022-10-08 19:39:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][1000/1251] eta 0:01:22 lr 0.000016 time 0.3179 (0.3273) loss 2.8586 (2.9729) grad_norm 3.4548 (3.3160) [2022-10-08 19:40:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][1100/1251] eta 0:00:49 lr 0.000016 time 0.3277 (0.3271) loss 2.8542 (2.9727) grad_norm 2.5966 (3.3266) 
[2022-10-08 19:40:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [284/300][1200/1251] eta 0:00:16 lr 0.000016 time 0.3263 (0.3269) loss 3.0207 (2.9718) grad_norm 3.3786 (3.3202) [2022-10-08 19:40:57 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 284 training takes 0:06:49 [2022-10-08 19:40:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.408 (2.408) Loss 0.7829 (0.7829) Acc@1 82.129 (82.129) Acc@5 95.996 (95.996) [2022-10-08 19:41:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.044 Acc@5 95.494 [2022-10-08 19:41:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 19:41:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.04% [2022-10-08 19:41:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][0/1251] eta 1:01:29 lr 0.000016 time 2.9494 (2.9494) loss 3.3649 (3.3649) grad_norm 3.4869 (3.4869) [2022-10-08 19:41:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][100/1251] eta 0:06:44 lr 0.000016 time 0.3239 (0.3512) loss 3.0727 (2.9622) grad_norm 3.1706 (3.2602) [2022-10-08 19:42:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][200/1251] eta 0:05:55 lr 0.000016 time 0.3302 (0.3385) loss 3.0562 (2.9565) grad_norm 2.8879 (3.2688) [2022-10-08 19:42:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][300/1251] eta 0:05:18 lr 0.000016 time 0.3259 (0.3348) loss 2.6749 (2.9549) grad_norm 2.9898 (3.3562) [2022-10-08 19:43:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][400/1251] eta 0:04:42 lr 0.000016 time 0.3265 (0.3323) loss 2.8744 (2.9619) grad_norm 3.1227 (3.3615) [2022-10-08 19:43:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][500/1251] eta 0:04:08 lr 0.000016 time 0.3215 (0.3308) loss 2.9901 (2.9626) grad_norm 2.9096 (3.3530) [2022-10-08 19:44:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[285/300][600/1251] eta 0:03:34 lr 0.000016 time 0.3257 (0.3298) loss 3.2165 (2.9626) grad_norm 3.9483 (3.3282) [2022-10-08 19:45:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][700/1251] eta 0:03:01 lr 0.000016 time 0.3237 (0.3291) loss 3.1878 (2.9625) grad_norm 3.0845 (3.3351) [2022-10-08 19:45:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][800/1251] eta 0:02:28 lr 0.000016 time 0.3245 (0.3286) loss 3.0256 (2.9639) grad_norm 3.4924 (3.3287) [2022-10-08 19:46:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][900/1251] eta 0:01:55 lr 0.000016 time 0.3254 (0.3281) loss 3.1241 (2.9634) grad_norm 3.8791 (3.3293) [2022-10-08 19:46:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][1000/1251] eta 0:01:22 lr 0.000015 time 0.3268 (0.3278) loss 3.1421 (2.9662) grad_norm 4.2857 (3.3321) [2022-10-08 19:47:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][1100/1251] eta 0:00:49 lr 0.000015 time 0.3270 (0.3275) loss 3.1462 (2.9669) grad_norm 3.5073 (3.3370) [2022-10-08 19:47:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [285/300][1200/1251] eta 0:00:16 lr 0.000015 time 0.3206 (0.3272) loss 3.2471 (2.9685) grad_norm 3.2104 (3.3448) [2022-10-08 19:48:00 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 285 training takes 0:06:49 [2022-10-08 19:48:03 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.227 (3.227) Loss 0.8030 (0.8030) Acc@1 80.957 (80.957) Acc@5 95.410 (95.410) [2022-10-08 19:48:13 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.054 Acc@5 95.490 [2022-10-08 19:48:13 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 19:48:13 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.05% [2022-10-08 19:48:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][0/1251] eta 1:06:31 lr 0.000015 time 3.1910 (3.1910) loss 3.1245 
(3.1245) grad_norm 3.3032 (3.3032) [2022-10-08 19:48:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][100/1251] eta 0:06:49 lr 0.000015 time 0.3264 (0.3554) loss 2.8617 (3.0089) grad_norm 3.0832 (3.3022) [2022-10-08 19:49:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][200/1251] eta 0:05:58 lr 0.000015 time 0.3288 (0.3414) loss 2.9777 (2.9784) grad_norm 3.2804 (3.3038) [2022-10-08 19:49:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][300/1251] eta 0:05:20 lr 0.000015 time 0.3339 (0.3367) loss 3.2887 (2.9767) grad_norm 3.4223 (3.3510) [2022-10-08 19:50:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][400/1251] eta 0:04:44 lr 0.000015 time 0.3266 (0.3343) loss 3.2568 (2.9757) grad_norm 3.1415 (3.3670) [2022-10-08 19:51:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][500/1251] eta 0:04:10 lr 0.000015 time 0.3285 (0.3329) loss 3.0428 (2.9740) grad_norm 3.1234 (3.3671) [2022-10-08 19:51:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][600/1251] eta 0:03:36 lr 0.000015 time 0.3350 (0.3319) loss 3.2214 (2.9745) grad_norm 3.2511 (3.3606) [2022-10-08 19:52:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][700/1251] eta 0:03:02 lr 0.000015 time 0.3235 (0.3311) loss 3.2375 (2.9752) grad_norm 3.2244 (3.3682) [2022-10-08 19:52:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][800/1251] eta 0:02:29 lr 0.000015 time 0.3261 (0.3306) loss 3.1205 (2.9715) grad_norm 3.4820 (3.3836) [2022-10-08 19:53:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][900/1251] eta 0:01:55 lr 0.000015 time 0.3298 (0.3304) loss 2.9468 (2.9705) grad_norm 3.4532 (3.3849) [2022-10-08 19:53:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][1000/1251] eta 0:01:22 lr 0.000015 time 0.3260 (0.3300) loss 2.9363 (2.9705) grad_norm 3.0225 (3.3849) [2022-10-08 19:54:16 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [286/300][1100/1251] eta 0:00:49 lr 0.000015 time 0.3218 (0.3297) loss 2.8321 (2.9708) grad_norm 3.3460 (3.3886) [2022-10-08 19:54:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [286/300][1200/1251] eta 0:00:16 lr 0.000015 time 0.3280 (0.3295) loss 2.9841 (2.9703) grad_norm 3.1348 (3.3860) [2022-10-08 19:55:06 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 286 training takes 0:06:52 [2022-10-08 19:55:09 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.073 (3.073) Loss 0.7900 (0.7900) Acc@1 80.078 (80.078) Acc@5 95.703 (95.703) [2022-10-08 19:55:20 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.048 Acc@5 95.480 [2022-10-08 19:55:20 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 19:55:20 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.05% [2022-10-08 19:55:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][0/1251] eta 1:08:28 lr 0.000015 time 3.2845 (3.2845) loss 2.7593 (2.7593) grad_norm 2.8270 (2.8270) [2022-10-08 19:55:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][100/1251] eta 0:06:48 lr 0.000015 time 0.3269 (0.3546) loss 3.1817 (2.9318) grad_norm 3.3630 (3.3331) [2022-10-08 19:56:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][200/1251] eta 0:05:57 lr 0.000014 time 0.3206 (0.3404) loss 2.7813 (2.9380) grad_norm 3.2855 (3.3079) [2022-10-08 19:57:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][300/1251] eta 0:05:19 lr 0.000014 time 0.3301 (0.3356) loss 2.9050 (2.9523) grad_norm 3.2673 (3.3443) [2022-10-08 19:57:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][400/1251] eta 0:04:43 lr 0.000014 time 0.3297 (0.3333) loss 3.1724 (2.9561) grad_norm 3.4658 (3.3392) [2022-10-08 19:58:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][500/1251] eta 0:04:09 lr 0.000014 time 0.3251 
(0.3319) loss 2.9922 (2.9632) grad_norm 4.3171 (3.3720) [2022-10-08 19:58:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][600/1251] eta 0:03:35 lr 0.000014 time 0.3215 (0.3310) loss 3.1362 (2.9617) grad_norm 3.1517 (3.3576) [2022-10-08 19:59:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][700/1251] eta 0:03:01 lr 0.000014 time 0.3234 (0.3302) loss 3.1878 (2.9652) grad_norm 4.0563 (3.3744) [2022-10-08 19:59:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][800/1251] eta 0:02:28 lr 0.000014 time 0.3213 (0.3297) loss 3.0311 (2.9653) grad_norm 3.2716 (3.3731) [2022-10-08 20:00:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][900/1251] eta 0:01:55 lr 0.000014 time 0.3288 (0.3294) loss 2.9311 (2.9677) grad_norm 3.4471 (3.3780) [2022-10-08 20:00:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][1000/1251] eta 0:01:22 lr 0.000014 time 0.3237 (0.3291) loss 2.8156 (2.9683) grad_norm 3.3608 (3.3720) [2022-10-08 20:01:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][1100/1251] eta 0:00:49 lr 0.000014 time 0.3263 (0.3288) loss 2.8686 (2.9662) grad_norm 3.0268 (3.3700) [2022-10-08 20:01:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [287/300][1200/1251] eta 0:00:16 lr 0.000014 time 0.3248 (0.3286) loss 2.7773 (2.9658) grad_norm 2.8991 (3.3719) [2022-10-08 20:02:11 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 287 training takes 0:06:51 [2022-10-08 20:02:15 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.403 (3.403) Loss 0.8105 (0.8105) Acc@1 81.055 (81.055) Acc@5 96.191 (96.191) [2022-10-08 20:02:25 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.012 Acc@5 95.458 [2022-10-08 20:02:25 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 20:02:25 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.05% [2022-10-08 20:02:27 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][0/1251] eta 0:45:08 lr 0.000014 time 2.1654 (2.1654) loss 2.7768 (2.7768) grad_norm 2.8583 (2.8583) [2022-10-08 20:03:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][100/1251] eta 0:06:42 lr 0.000014 time 0.3230 (0.3498) loss 3.0451 (2.9536) grad_norm 2.9524 (3.3225) [2022-10-08 20:03:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][200/1251] eta 0:05:56 lr 0.000014 time 0.3239 (0.3387) loss 2.9886 (2.9611) grad_norm 4.1843 (3.3471) [2022-10-08 20:04:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][300/1251] eta 0:05:18 lr 0.000014 time 0.3233 (0.3345) loss 2.9062 (2.9617) grad_norm 3.1986 (3.3411) [2022-10-08 20:04:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][400/1251] eta 0:04:42 lr 0.000014 time 0.3271 (0.3322) loss 3.0765 (2.9648) grad_norm 3.0614 (3.3394) [2022-10-08 20:05:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][500/1251] eta 0:04:08 lr 0.000014 time 0.3248 (0.3308) loss 3.1253 (2.9690) grad_norm 3.2743 (3.3424) [2022-10-08 20:05:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][600/1251] eta 0:03:34 lr 0.000014 time 0.3295 (0.3298) loss 2.7837 (2.9712) grad_norm 3.5377 (3.3518) [2022-10-08 20:06:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][700/1251] eta 0:03:01 lr 0.000014 time 0.3314 (0.3292) loss 3.1564 (2.9730) grad_norm 2.8797 (3.3531) [2022-10-08 20:06:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][800/1251] eta 0:02:28 lr 0.000013 time 0.3222 (0.3285) loss 2.5213 (2.9710) grad_norm 3.1765 (3.3602) [2022-10-08 20:07:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][900/1251] eta 0:01:55 lr 0.000013 time 0.3277 (0.3280) loss 3.0205 (2.9743) grad_norm 3.7710 (3.3552) [2022-10-08 20:07:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][1000/1251] eta 0:01:22 lr 0.000013 
time 0.3227 (0.3276) loss 2.7123 (2.9727) grad_norm 3.2560 (3.3640) [2022-10-08 20:08:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][1100/1251] eta 0:00:49 lr 0.000013 time 0.3236 (0.3274) loss 3.1502 (2.9712) grad_norm 3.8783 (3.3648) [2022-10-08 20:08:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [288/300][1200/1251] eta 0:00:16 lr 0.000013 time 0.3232 (0.3272) loss 2.9381 (2.9681) grad_norm 4.3954 (3.3655) [2022-10-08 20:09:14 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 288 training takes 0:06:49 [2022-10-08 20:09:17 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.752 (2.752) Loss 0.7841 (0.7841) Acc@1 82.422 (82.422) Acc@5 96.484 (96.484) [2022-10-08 20:09:28 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.084 Acc@5 95.502 [2022-10-08 20:09:28 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 20:09:28 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.08% [2022-10-08 20:09:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][0/1251] eta 0:48:07 lr 0.000013 time 2.3082 (2.3082) loss 2.4915 (2.4915) grad_norm 2.8173 (2.8173) [2022-10-08 20:10:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][100/1251] eta 0:06:43 lr 0.000013 time 0.3226 (0.3503) loss 2.7659 (2.9460) grad_norm 3.4595 (3.4607) [2022-10-08 20:10:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][200/1251] eta 0:05:54 lr 0.000013 time 0.3387 (0.3375) loss 2.7931 (2.9667) grad_norm 3.2839 (3.4063) [2022-10-08 20:11:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][300/1251] eta 0:05:16 lr 0.000013 time 0.3250 (0.3332) loss 3.0392 (2.9664) grad_norm 2.9019 (3.3971) [2022-10-08 20:11:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][400/1251] eta 0:04:41 lr 0.000013 time 0.3231 (0.3310) loss 3.2807 (2.9656) grad_norm 3.4142 (3.3932) [2022-10-08 
20:12:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][500/1251] eta 0:04:07 lr 0.000013 time 0.3281 (0.3296) loss 3.1273 (2.9636) grad_norm 2.9785 (3.3855) [2022-10-08 20:12:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][600/1251] eta 0:03:33 lr 0.000013 time 0.3254 (0.3286) loss 2.8412 (2.9681) grad_norm 3.8684 (3.3965) [2022-10-08 20:13:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][700/1251] eta 0:03:00 lr 0.000013 time 0.3235 (0.3280) loss 3.1549 (2.9663) grad_norm 4.0026 (3.4078) [2022-10-08 20:13:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][800/1251] eta 0:02:27 lr 0.000013 time 0.3265 (0.3276) loss 2.9153 (2.9664) grad_norm 2.6746 (3.4035) [2022-10-08 20:14:23 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][900/1251] eta 0:01:54 lr 0.000013 time 0.3272 (0.3273) loss 2.8431 (2.9684) grad_norm 3.0533 (3.4041) [2022-10-08 20:14:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][1000/1251] eta 0:01:22 lr 0.000013 time 0.3216 (0.3270) loss 2.9552 (2.9682) grad_norm 2.6152 (3.4101) [2022-10-08 20:15:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][1100/1251] eta 0:00:49 lr 0.000013 time 0.3250 (0.3268) loss 2.9295 (2.9694) grad_norm 3.0961 (3.4052) [2022-10-08 20:16:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [289/300][1200/1251] eta 0:00:16 lr 0.000013 time 0.3226 (0.3265) loss 3.0057 (2.9672) grad_norm 3.8732 (3.4157) [2022-10-08 20:16:17 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 289 training takes 0:06:48 [2022-10-08 20:16:20 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.968 (2.968) Loss 0.8007 (0.8007) Acc@1 80.566 (80.566) Acc@5 95.508 (95.508) [2022-10-08 20:16:30 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.066 Acc@5 95.544 [2022-10-08 20:16:30 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test 
images: 81.1% [2022-10-08 20:16:30 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.08% [2022-10-08 20:16:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][0/1251] eta 1:04:03 lr 0.000013 time 3.0726 (3.0726) loss 2.8541 (2.8541) grad_norm 3.7000 (3.7000) [2022-10-08 20:17:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][100/1251] eta 0:06:47 lr 0.000013 time 0.3330 (0.3545) loss 2.9275 (2.9700) grad_norm 4.1042 (3.4138) [2022-10-08 20:17:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][200/1251] eta 0:05:58 lr 0.000013 time 0.3252 (0.3414) loss 2.9152 (2.9814) grad_norm 3.9079 (3.4315) [2022-10-08 20:18:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][300/1251] eta 0:05:20 lr 0.000013 time 0.3254 (0.3366) loss 2.8382 (2.9749) grad_norm 3.1252 (3.4337) [2022-10-08 20:18:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][400/1251] eta 0:04:44 lr 0.000013 time 0.3238 (0.3342) loss 2.8517 (2.9816) grad_norm 3.0599 (3.4318) [2022-10-08 20:19:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][500/1251] eta 0:04:09 lr 0.000012 time 0.3279 (0.3327) loss 3.0856 (2.9818) grad_norm 4.5858 (3.4323) [2022-10-08 20:19:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][600/1251] eta 0:03:35 lr 0.000012 time 0.3228 (0.3316) loss 2.9979 (2.9793) grad_norm 3.4167 (3.4254) [2022-10-08 20:20:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][700/1251] eta 0:03:02 lr 0.000012 time 0.3293 (0.3308) loss 3.0418 (2.9778) grad_norm 3.5484 (3.4180) [2022-10-08 20:20:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][800/1251] eta 0:02:28 lr 0.000012 time 0.3267 (0.3302) loss 2.8324 (2.9744) grad_norm 2.8996 (3.4135) [2022-10-08 20:21:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][900/1251] eta 0:01:55 lr 0.000012 time 0.3374 (0.3297) loss 2.9279 (2.9742) grad_norm 3.2976 
(3.4101) [2022-10-08 20:22:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][1000/1251] eta 0:01:22 lr 0.000012 time 0.3231 (0.3293) loss 3.0122 (2.9704) grad_norm 3.9398 (3.4111) [2022-10-08 20:22:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][1100/1251] eta 0:00:49 lr 0.000012 time 0.3293 (0.3289) loss 2.7493 (2.9696) grad_norm 3.2660 (3.4083) [2022-10-08 20:23:05 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [290/300][1200/1251] eta 0:00:16 lr 0.000012 time 0.3237 (0.3286) loss 3.0098 (2.9688) grad_norm 3.0829 (3.4107) [2022-10-08 20:23:22 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 290 training takes 0:06:51 [2022-10-08 20:23:22 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_290 saving...... [2022-10-08 20:23:22 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_290 saved !!! [2022-10-08 20:23:24 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.345 (2.345) Loss 0.7745 (0.7745) Acc@1 82.227 (82.227) Acc@5 96.094 (96.094) [2022-10-08 20:23:36 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.152 Acc@5 95.474 [2022-10-08 20:23:36 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-08 20:23:36 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:23:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][0/1251] eta 0:58:26 lr 0.000012 time 2.8028 (2.8028) loss 3.1349 (3.1349) grad_norm 4.2548 (4.2548) [2022-10-08 20:24:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][100/1251] eta 0:06:47 lr 0.000012 time 0.3235 (0.3544) loss 3.0105 (2.9758) grad_norm 3.4223 (3.4633) [2022-10-08 20:24:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][200/1251] eta 0:05:57 lr 0.000012 time 0.3271 (0.3400) loss 2.6143 
(2.9634) grad_norm 3.0499 (3.4432) [2022-10-08 20:25:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][300/1251] eta 0:05:18 lr 0.000012 time 0.3240 (0.3351) loss 2.9893 (2.9693) grad_norm 3.4103 (3.4400) [2022-10-08 20:25:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][400/1251] eta 0:04:42 lr 0.000012 time 0.3273 (0.3325) loss 2.9164 (2.9641) grad_norm 3.0692 (3.4503) [2022-10-08 20:26:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][500/1251] eta 0:04:08 lr 0.000012 time 0.3268 (0.3310) loss 3.1262 (2.9637) grad_norm 3.3705 (3.4726) [2022-10-08 20:26:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][600/1251] eta 0:03:34 lr 0.000012 time 0.3255 (0.3301) loss 2.6295 (2.9595) grad_norm 3.2511 (3.4568) [2022-10-08 20:27:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][700/1251] eta 0:03:01 lr 0.000012 time 0.3253 (0.3294) loss 2.9496 (2.9613) grad_norm 3.1715 (3.4383) [2022-10-08 20:27:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][800/1251] eta 0:02:28 lr 0.000012 time 0.3244 (0.3288) loss 2.9448 (2.9641) grad_norm 5.7745 (3.4308) [2022-10-08 20:28:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][900/1251] eta 0:01:55 lr 0.000012 time 0.3224 (0.3283) loss 3.1907 (2.9615) grad_norm 2.9436 (3.4270) [2022-10-08 20:29:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][1000/1251] eta 0:01:22 lr 0.000012 time 0.3275 (0.3278) loss 3.0284 (2.9629) grad_norm 3.5392 (3.4406) [2022-10-08 20:29:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][1100/1251] eta 0:00:49 lr 0.000012 time 0.3239 (0.3274) loss 3.1118 (2.9640) grad_norm 2.9901 (3.4397) [2022-10-08 20:30:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [291/300][1200/1251] eta 0:00:16 lr 0.000012 time 0.3215 (0.3272) loss 3.1112 (2.9616) grad_norm 3.6353 (3.4529) [2022-10-08 20:30:25 swin_tiny_patch4_window7_224] (main.py 
196): INFO EPOCH 291 training takes 0:06:49 [2022-10-08 20:30:28 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.992 (2.992) Loss 0.7772 (0.7772) Acc@1 81.445 (81.445) Acc@5 96.680 (96.680) [2022-10-08 20:30:39 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.080 Acc@5 95.520 [2022-10-08 20:30:39 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 20:30:39 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:30:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][0/1251] eta 0:56:03 lr 0.000012 time 2.6887 (2.6887) loss 3.2090 (3.2090) grad_norm 3.1805 (3.1805) [2022-10-08 20:31:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][100/1251] eta 0:06:42 lr 0.000012 time 0.3226 (0.3498) loss 3.1881 (2.9537) grad_norm 3.1760 (3.3342) [2022-10-08 20:31:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][200/1251] eta 0:05:54 lr 0.000012 time 0.3263 (0.3369) loss 2.7999 (2.9483) grad_norm 4.5061 (3.3765) [2022-10-08 20:32:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][300/1251] eta 0:05:16 lr 0.000012 time 0.3263 (0.3328) loss 2.9257 (2.9470) grad_norm 3.3971 (3.3632) [2022-10-08 20:32:51 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][400/1251] eta 0:04:41 lr 0.000012 time 0.3296 (0.3306) loss 2.9436 (2.9561) grad_norm 2.9701 (3.3912) [2022-10-08 20:33:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][500/1251] eta 0:04:07 lr 0.000012 time 0.3278 (0.3294) loss 3.0176 (2.9577) grad_norm 3.1494 (3.4066) [2022-10-08 20:33:56 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][600/1251] eta 0:03:33 lr 0.000012 time 0.3282 (0.3287) loss 3.0238 (2.9594) grad_norm 3.2829 (3.4208) [2022-10-08 20:34:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][700/1251] eta 0:03:00 lr 0.000012 time 0.3245 
(0.3282) loss 3.1021 (2.9568) grad_norm 3.2638 (3.4260) [2022-10-08 20:35:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][800/1251] eta 0:02:27 lr 0.000011 time 0.3280 (0.3279) loss 2.9672 (2.9535) grad_norm 3.1992 (3.4220) [2022-10-08 20:35:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][900/1251] eta 0:01:54 lr 0.000011 time 0.3219 (0.3275) loss 2.9615 (2.9537) grad_norm 3.4327 (3.4230) [2022-10-08 20:36:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][1000/1251] eta 0:01:22 lr 0.000011 time 0.3243 (0.3272) loss 3.0881 (2.9544) grad_norm 2.8937 (3.4298) [2022-10-08 20:36:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][1100/1251] eta 0:00:49 lr 0.000011 time 0.3193 (0.3270) loss 3.0106 (2.9555) grad_norm 3.5016 (3.4270) [2022-10-08 20:37:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [292/300][1200/1251] eta 0:00:16 lr 0.000011 time 0.3261 (0.3268) loss 3.0203 (2.9574) grad_norm 3.0079 (3.4299) [2022-10-08 20:37:28 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 292 training takes 0:06:49 [2022-10-08 20:37:31 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.865 (2.865) Loss 0.7662 (0.7662) Acc@1 82.617 (82.617) Acc@5 95.605 (95.605) [2022-10-08 20:37:41 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.134 Acc@5 95.542 [2022-10-08 20:37:41 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 20:37:41 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:37:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][0/1251] eta 0:59:25 lr 0.000011 time 2.8500 (2.8500) loss 3.2553 (3.2553) grad_norm 3.8503 (3.8503) [2022-10-08 20:38:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][100/1251] eta 0:06:43 lr 0.000011 time 0.3244 (0.3510) loss 3.1492 (2.9601) grad_norm 3.3383 (3.4211) [2022-10-08 20:38:49 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][200/1251] eta 0:05:55 lr 0.000011 time 0.3250 (0.3378) loss 2.9917 (2.9511) grad_norm 3.0588 (3.4263) [2022-10-08 20:39:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][300/1251] eta 0:05:17 lr 0.000011 time 0.3225 (0.3335) loss 3.0683 (2.9462) grad_norm 3.7119 (3.4513) [2022-10-08 20:39:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][400/1251] eta 0:04:41 lr 0.000011 time 0.3243 (0.3314) loss 2.9136 (2.9486) grad_norm 3.1481 (3.4400) [2022-10-08 20:40:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][500/1251] eta 0:04:07 lr 0.000011 time 0.3228 (0.3301) loss 2.8462 (2.9503) grad_norm 3.9425 (3.4326) [2022-10-08 20:40:59 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][600/1251] eta 0:03:34 lr 0.000011 time 0.3188 (0.3293) loss 3.0663 (2.9586) grad_norm 3.4136 (3.4301) [2022-10-08 20:41:32 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][700/1251] eta 0:03:01 lr 0.000011 time 0.3222 (0.3286) loss 3.0312 (2.9576) grad_norm 3.2083 (3.4174) [2022-10-08 20:42:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][800/1251] eta 0:02:28 lr 0.000011 time 0.3232 (0.3282) loss 3.0437 (2.9576) grad_norm 3.1858 (3.4234) [2022-10-08 20:42:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][900/1251] eta 0:01:55 lr 0.000011 time 0.3213 (0.3278) loss 2.7882 (2.9579) grad_norm 3.8883 (3.4233) [2022-10-08 20:43:09 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][1000/1251] eta 0:01:22 lr 0.000011 time 0.3265 (0.3276) loss 2.8752 (2.9572) grad_norm 3.0932 (3.4279) [2022-10-08 20:43:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][1100/1251] eta 0:00:49 lr 0.000011 time 0.3258 (0.3273) loss 2.9764 (2.9575) grad_norm 3.3795 (3.4218) [2022-10-08 20:44:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [293/300][1200/1251] eta 0:00:16 lr 
0.000011 time 0.3265 (0.3273) loss 3.0386 (2.9580) grad_norm 3.0859 (3.4224) [2022-10-08 20:44:31 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 293 training takes 0:06:49 [2022-10-08 20:44:34 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.843 (2.843) Loss 0.7881 (0.7881) Acc@1 80.273 (80.273) Acc@5 96.387 (96.387) [2022-10-08 20:44:45 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.068 Acc@5 95.476 [2022-10-08 20:44:45 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 20:44:45 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:44:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][0/1251] eta 1:09:55 lr 0.000011 time 3.3536 (3.3536) loss 2.6925 (2.6925) grad_norm 3.6910 (3.6910) [2022-10-08 20:45:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][100/1251] eta 0:06:49 lr 0.000011 time 0.3274 (0.3560) loss 2.8385 (2.9586) grad_norm 3.9027 (3.5304) [2022-10-08 20:45:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][200/1251] eta 0:05:58 lr 0.000011 time 0.3241 (0.3413) loss 3.1244 (2.9657) grad_norm 2.9382 (3.5268) [2022-10-08 20:46:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][300/1251] eta 0:05:19 lr 0.000011 time 0.3282 (0.3360) loss 2.8603 (2.9652) grad_norm 2.9460 (3.5074) [2022-10-08 20:46:58 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][400/1251] eta 0:04:43 lr 0.000011 time 0.3251 (0.3333) loss 2.8211 (2.9691) grad_norm 3.0220 (3.4858) [2022-10-08 20:47:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][500/1251] eta 0:04:09 lr 0.000011 time 0.3228 (0.3317) loss 2.7910 (2.9593) grad_norm 2.9465 (3.5165) [2022-10-08 20:48:03 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][600/1251] eta 0:03:35 lr 0.000011 time 0.3229 (0.3306) loss 2.9377 (2.9575) grad_norm 3.7999 (3.5064) 
[2022-10-08 20:48:36 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][700/1251] eta 0:03:01 lr 0.000011 time 0.3228 (0.3297) loss 3.0688 (2.9557) grad_norm 3.1006 (3.5157) [2022-10-08 20:49:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][800/1251] eta 0:02:28 lr 0.000011 time 0.3222 (0.3290) loss 2.9299 (2.9562) grad_norm 2.8437 (3.5147) [2022-10-08 20:49:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][900/1251] eta 0:01:55 lr 0.000011 time 0.3199 (0.3285) loss 2.8592 (2.9557) grad_norm 3.1270 (3.5084) [2022-10-08 20:50:13 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][1000/1251] eta 0:01:22 lr 0.000011 time 0.3258 (0.3280) loss 3.1373 (2.9552) grad_norm 3.4475 (3.5022) [2022-10-08 20:50:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][1100/1251] eta 0:00:49 lr 0.000011 time 0.3236 (0.3277) loss 2.8387 (2.9542) grad_norm 3.3431 (3.5031) [2022-10-08 20:51:18 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [294/300][1200/1251] eta 0:00:16 lr 0.000011 time 0.3272 (0.3275) loss 3.1601 (2.9567) grad_norm 3.3841 (3.5026) [2022-10-08 20:51:35 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 294 training takes 0:06:49 [2022-10-08 20:51:37 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.747 (2.747) Loss 0.8245 (0.8245) Acc@1 81.543 (81.543) Acc@5 95.117 (95.117) [2022-10-08 20:51:48 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.052 Acc@5 95.502 [2022-10-08 20:51:48 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 20:51:48 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:51:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][0/1251] eta 1:06:57 lr 0.000011 time 3.2111 (3.2111) loss 2.7057 (2.7057) grad_norm 3.5674 (3.5674) [2022-10-08 20:52:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[295/300][100/1251] eta 0:06:47 lr 0.000011 time 0.3244 (0.3538) loss 2.8581 (2.9547) grad_norm 3.4697 (3.4446) [2022-10-08 20:52:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][200/1251] eta 0:05:57 lr 0.000011 time 0.3308 (0.3401) loss 2.9841 (2.9554) grad_norm 4.1059 (3.4668) [2022-10-08 20:53:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][300/1251] eta 0:05:18 lr 0.000011 time 0.3237 (0.3353) loss 3.0525 (2.9567) grad_norm 3.3217 (3.4504) [2022-10-08 20:54:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][400/1251] eta 0:04:43 lr 0.000011 time 0.3298 (0.3330) loss 2.7260 (2.9631) grad_norm 3.2700 (3.4610) [2022-10-08 20:54:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][500/1251] eta 0:04:09 lr 0.000011 time 0.3265 (0.3317) loss 2.9134 (2.9661) grad_norm 3.5980 (3.4594) [2022-10-08 20:55:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][600/1251] eta 0:03:35 lr 0.000011 time 0.3277 (0.3308) loss 2.8210 (2.9655) grad_norm 3.3784 (3.4620) [2022-10-08 20:55:40 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][700/1251] eta 0:03:01 lr 0.000011 time 0.3225 (0.3302) loss 2.8757 (2.9670) grad_norm 2.9658 (3.4497) [2022-10-08 20:56:12 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][800/1251] eta 0:02:28 lr 0.000011 time 0.3322 (0.3295) loss 2.8118 (2.9631) grad_norm 2.9071 (3.4606) [2022-10-08 20:56:45 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][900/1251] eta 0:01:55 lr 0.000010 time 0.3197 (0.3290) loss 2.8567 (2.9630) grad_norm 3.0271 (3.4670) [2022-10-08 20:57:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][1000/1251] eta 0:01:22 lr 0.000010 time 0.3291 (0.3286) loss 2.8119 (2.9607) grad_norm 3.0149 (3.4867) [2022-10-08 20:57:50 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][1100/1251] eta 0:00:49 lr 0.000010 time 0.3263 (0.3282) loss 2.8779 (2.9601) grad_norm 
3.4397 (3.4841) [2022-10-08 20:58:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [295/300][1200/1251] eta 0:00:16 lr 0.000010 time 0.3298 (0.3279) loss 2.9602 (2.9622) grad_norm 3.2213 (3.4956) [2022-10-08 20:58:39 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 295 training takes 0:06:50 [2022-10-08 20:58:42 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.853 (2.853) Loss 0.8001 (0.8001) Acc@1 81.055 (81.055) Acc@5 95.898 (95.898) [2022-10-08 20:58:53 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.010 Acc@5 95.496 [2022-10-08 20:58:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-08 20:58:53 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 20:58:55 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][0/1251] eta 0:59:19 lr 0.000010 time 2.8452 (2.8452) loss 2.8574 (2.8574) grad_norm 3.5319 (3.5319) [2022-10-08 20:59:28 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][100/1251] eta 0:06:44 lr 0.000010 time 0.3314 (0.3515) loss 2.7658 (2.9619) grad_norm 3.4977 (3.3633) [2022-10-08 21:00:01 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][200/1251] eta 0:05:56 lr 0.000010 time 0.3203 (0.3388) loss 2.7306 (2.9542) grad_norm 3.2205 (3.4186) [2022-10-08 21:00:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][300/1251] eta 0:05:18 lr 0.000010 time 0.3217 (0.3347) loss 3.0663 (2.9583) grad_norm 3.4108 (3.4137) [2022-10-08 21:01:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][400/1251] eta 0:04:43 lr 0.000010 time 0.3236 (0.3327) loss 3.0554 (2.9573) grad_norm 3.4275 (3.4313) [2022-10-08 21:01:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][500/1251] eta 0:04:08 lr 0.000010 time 0.3238 (0.3315) loss 2.7952 (2.9627) grad_norm 3.6466 (3.4461) [2022-10-08 21:02:11 swin_tiny_patch4_window7_224] (main.py 
188): INFO Train: [296/300][600/1251] eta 0:03:35 lr 0.000010 time 0.3201 (0.3306) loss 3.0332 (2.9607) grad_norm 3.2900 (3.4562) [2022-10-08 21:02:44 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][700/1251] eta 0:03:01 lr 0.000010 time 0.3248 (0.3300) loss 2.9077 (2.9622) grad_norm 3.8290 (3.4615) [2022-10-08 21:03:17 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][800/1251] eta 0:02:28 lr 0.000010 time 0.3249 (0.3296) loss 2.9797 (2.9595) grad_norm 3.3947 (3.4695) [2022-10-08 21:03:49 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][900/1251] eta 0:01:55 lr 0.000010 time 0.3204 (0.3292) loss 3.0108 (2.9573) grad_norm 3.1762 (3.4754) [2022-10-08 21:04:22 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][1000/1251] eta 0:01:22 lr 0.000010 time 0.3256 (0.3287) loss 2.9340 (2.9586) grad_norm 2.9282 (3.4613) [2022-10-08 21:04:54 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][1100/1251] eta 0:00:49 lr 0.000010 time 0.3239 (0.3286) loss 2.9962 (2.9562) grad_norm 3.1431 (3.4727) [2022-10-08 21:05:27 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [296/300][1200/1251] eta 0:00:16 lr 0.000010 time 0.3257 (0.3283) loss 3.0690 (2.9574) grad_norm 3.7093 (3.4715) [2022-10-08 21:05:43 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 296 training takes 0:06:50 [2022-10-08 21:05:47 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.161 (3.161) Loss 0.8000 (0.8000) Acc@1 80.176 (80.176) Acc@5 95.605 (95.605) [2022-10-08 21:05:57 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.114 Acc@5 95.518 [2022-10-08 21:05:57 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 21:05:57 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 21:06:00 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][0/1251] eta 1:06:23 lr 0.000010 time 3.1844 
(3.1844) loss 2.8250 (2.8250) grad_norm 3.2609 (3.2609) [2022-10-08 21:06:33 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][100/1251] eta 0:06:47 lr 0.000010 time 0.3255 (0.3542) loss 2.9061 (2.9861) grad_norm 2.9559 (3.4610) [2022-10-08 21:07:06 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][200/1251] eta 0:05:57 lr 0.000010 time 0.3226 (0.3398) loss 2.9462 (2.9720) grad_norm 4.0336 (3.5195) [2022-10-08 21:07:38 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][300/1251] eta 0:05:18 lr 0.000010 time 0.3253 (0.3349) loss 2.7637 (2.9711) grad_norm 3.4665 (3.5129) [2022-10-08 21:08:11 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][400/1251] eta 0:04:43 lr 0.000010 time 0.3250 (0.3327) loss 2.8233 (2.9657) grad_norm 2.9514 (3.5039) [2022-10-08 21:08:43 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][500/1251] eta 0:04:08 lr 0.000010 time 0.3280 (0.3312) loss 2.9018 (2.9615) grad_norm 3.7730 (3.5147) [2022-10-08 21:09:16 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][600/1251] eta 0:03:34 lr 0.000010 time 0.3223 (0.3302) loss 3.0497 (2.9627) grad_norm 2.9979 (3.5192) [2022-10-08 21:09:48 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][700/1251] eta 0:03:01 lr 0.000010 time 0.3241 (0.3295) loss 2.4643 (2.9613) grad_norm 3.7501 (3.5107) [2022-10-08 21:10:21 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][800/1251] eta 0:02:28 lr 0.000010 time 0.3233 (0.3290) loss 2.9470 (2.9586) grad_norm 3.7730 (3.4980) [2022-10-08 21:10:53 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][900/1251] eta 0:01:55 lr 0.000010 time 0.3314 (0.3285) loss 2.9258 (2.9588) grad_norm 3.3478 (3.5061) [2022-10-08 21:11:26 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][1000/1251] eta 0:01:22 lr 0.000010 time 0.3234 (0.3282) loss 2.9018 (2.9594) grad_norm 3.2851 (3.5176) [2022-10-08 21:11:58 
swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][1100/1251] eta 0:00:49 lr 0.000010 time 0.3242 (0.3279) loss 2.7855 (2.9593) grad_norm 3.2350 (3.5104) [2022-10-08 21:12:31 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [297/300][1200/1251] eta 0:00:16 lr 0.000010 time 0.3217 (0.3277) loss 3.0506 (2.9610) grad_norm 4.0534 (3.5082) [2022-10-08 21:12:47 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 297 training takes 0:06:50 [2022-10-08 21:12:51 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 3.221 (3.221) Loss 0.8121 (0.8121) Acc@1 80.957 (80.957) Acc@5 94.141 (94.141) [2022-10-08 21:13:01 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.066 Acc@5 95.526 [2022-10-08 21:13:01 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 21:13:01 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 21:13:04 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][0/1251] eta 0:59:57 lr 0.000010 time 2.8753 (2.8753) loss 3.1727 (3.1727) grad_norm 3.5110 (3.5110) [2022-10-08 21:13:37 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][100/1251] eta 0:06:45 lr 0.000010 time 0.3249 (0.3526) loss 2.8079 (2.9614) grad_norm 3.2022 (3.4731) [2022-10-08 21:14:10 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][200/1251] eta 0:05:57 lr 0.000010 time 0.3308 (0.3398) loss 3.2470 (2.9610) grad_norm 3.1644 (3.4970) [2022-10-08 21:14:42 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][300/1251] eta 0:05:18 lr 0.000010 time 0.3297 (0.3354) loss 3.1184 (2.9619) grad_norm 3.3631 (3.4939) [2022-10-08 21:15:15 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][400/1251] eta 0:04:43 lr 0.000010 time 0.3234 (0.3331) loss 3.0113 (2.9637) grad_norm 2.9358 (3.4940) [2022-10-08 21:15:47 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][500/1251] 
eta 0:04:08 lr 0.000010 time 0.3239 (0.3315) loss 2.8060 (2.9603) grad_norm 3.3486 (3.5085) [2022-10-08 21:16:20 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][600/1251] eta 0:03:35 lr 0.000010 time 0.3242 (0.3304) loss 2.9133 (2.9657) grad_norm 3.7392 (3.5064) [2022-10-08 21:16:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][700/1251] eta 0:03:01 lr 0.000010 time 0.3249 (0.3296) loss 3.2671 (2.9611) grad_norm 4.2474 (3.5049) [2022-10-08 21:17:25 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][800/1251] eta 0:02:28 lr 0.000010 time 0.3265 (0.3290) loss 2.8819 (2.9583) grad_norm 3.0818 (3.5133) [2022-10-08 21:17:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][900/1251] eta 0:01:55 lr 0.000010 time 0.3299 (0.3285) loss 3.0115 (2.9565) grad_norm 2.9586 (3.5044) [2022-10-08 21:18:30 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][1000/1251] eta 0:01:22 lr 0.000010 time 0.3179 (0.3280) loss 3.0612 (2.9547) grad_norm 3.9062 (3.5019) [2022-10-08 21:19:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][1100/1251] eta 0:00:49 lr 0.000010 time 0.3253 (0.3277) loss 3.2027 (2.9543) grad_norm 3.1807 (3.4982) [2022-10-08 21:19:35 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [298/300][1200/1251] eta 0:00:16 lr 0.000010 time 0.3260 (0.3274) loss 3.0920 (2.9567) grad_norm 3.5391 (3.4967) [2022-10-08 21:19:51 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 298 training takes 0:06:49 [2022-10-08 21:19:54 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.892 (2.892) Loss 0.8798 (0.8798) Acc@1 79.004 (79.004) Acc@5 95.410 (95.410) [2022-10-08 21:20:05 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.072 Acc@5 95.526 [2022-10-08 21:20:05 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 21:20:05 swin_tiny_patch4_window7_224] (main.py 123): INFO Max 
accuracy: 81.15% [2022-10-08 21:20:08 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][0/1251] eta 1:07:43 lr 0.000010 time 3.2483 (3.2483) loss 3.1059 (3.1059) grad_norm 3.3665 (3.3665) [2022-10-08 21:20:41 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][100/1251] eta 0:06:49 lr 0.000010 time 0.3332 (0.3559) loss 2.8953 (2.9716) grad_norm 3.3549 (3.5133) [2022-10-08 21:21:14 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][200/1251] eta 0:05:59 lr 0.000010 time 0.3285 (0.3419) loss 2.8564 (2.9685) grad_norm 3.0562 (3.5085) [2022-10-08 21:21:46 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][300/1251] eta 0:05:20 lr 0.000010 time 0.3269 (0.3369) loss 2.8058 (2.9587) grad_norm 3.1119 (3.5262) [2022-10-08 21:22:19 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][400/1251] eta 0:04:44 lr 0.000010 time 0.3233 (0.3342) loss 3.0519 (2.9525) grad_norm 3.7732 (3.5229) [2022-10-08 21:22:52 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][500/1251] eta 0:04:09 lr 0.000010 time 0.3229 (0.3325) loss 3.0904 (2.9508) grad_norm 2.7914 (3.5115) [2022-10-08 21:23:24 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][600/1251] eta 0:03:35 lr 0.000010 time 0.3260 (0.3312) loss 2.9979 (2.9502) grad_norm 4.4217 (3.4991) [2022-10-08 21:23:57 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][700/1251] eta 0:03:02 lr 0.000010 time 0.3262 (0.3304) loss 2.9431 (2.9507) grad_norm 3.1740 (3.5013) [2022-10-08 21:24:29 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][800/1251] eta 0:02:28 lr 0.000010 time 0.3234 (0.3297) loss 2.9849 (2.9522) grad_norm 3.8606 (3.4982) [2022-10-08 21:25:02 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][900/1251] eta 0:01:55 lr 0.000010 time 0.3265 (0.3292) loss 3.0903 (2.9531) grad_norm 2.8932 (3.5073) [2022-10-08 21:25:34 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: 
[299/300][1000/1251] eta 0:01:22 lr 0.000010 time 0.3271 (0.3290) loss 3.1374 (2.9518) grad_norm 3.3273 (3.5109) [2022-10-08 21:26:07 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][1100/1251] eta 0:00:49 lr 0.000010 time 0.3289 (0.3286) loss 2.7703 (2.9533) grad_norm 4.0895 (3.5128) [2022-10-08 21:26:39 swin_tiny_patch4_window7_224] (main.py 188): INFO Train: [299/300][1200/1251] eta 0:00:16 lr 0.000010 time 0.3203 (0.3282) loss 2.7464 (2.9525) grad_norm 3.3891 (3.5341) [2022-10-08 21:26:56 swin_tiny_patch4_window7_224] (main.py 196): INFO EPOCH 299 training takes 0:06:50 [2022-10-08 21:26:56 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_299 saving...... [2022-10-08 21:26:56 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_eager_global/model_299 saved !!! [2022-10-08 21:26:59 swin_tiny_patch4_window7_224] (main.py 245): INFO Test: [0/49] Time 2.747 (2.747) Loss 0.8293 (0.8293) Acc@1 80.176 (80.176) Acc@5 95.703 (95.703) [2022-10-08 21:27:10 swin_tiny_patch4_window7_224] (main.py 252): INFO * Acc@1 81.110 Acc@5 95.482 [2022-10-08 21:27:10 swin_tiny_patch4_window7_224] (main.py 121): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-08 21:27:10 swin_tiny_patch4_window7_224] (main.py 123): INFO Max accuracy: 81.15% [2022-10-08 21:27:10 swin_tiny_patch4_window7_224] (main.py 127): INFO Training time 1 day, 11:25:42