[2022-10-01 16:50:48 swin_tiny_patch4_window7_224] (main.py 312): INFO Full config saved to output/swin_tiny_patch4_window7_224/fix_ddp/config.json [2022-10-01 16:50:48 swin_tiny_patch4_window7_224] (main.py 315): INFO AMP_OPT_LEVEL: '' AUG: AUTO_AUGMENT: rand-m9-mstd0.5-inc1 COLOR_JITTER: 0.4 CUTMIX: 1.0 CUTMIX_MINMAX: null MIXUP: 0.8 MIXUP_MODE: batch MIXUP_PROB: 1.0 MIXUP_SWITCH_PROB: 0.5 RECOUNT: 1 REMODE: pixel REPROB: 0.25 BASE: - '' DATA: BATCH_SIZE: 128 CACHE_MODE: part DATASET: imagenet DATA_PATH: /data/ImageNet/extract/ IMG_SIZE: 224 INTERPOLATION: bicubic NUM_WORKERS: 8 PIN_MEMORY: true ZIP_MODE: false EVAL_MODE: false LOCAL_RANK: 0 MODEL: DROP_PATH_RATE: 0.2 DROP_RATE: 0.0 LABEL_SMOOTHING: 0.1 NAME: swin_tiny_patch4_window7_224 NUM_CLASSES: 1000 PRETRAINED: '' RESUME: '' SWIN: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 QKV_BIAS: true QK_SCALE: null WINDOW_SIZE: 7 SWIN_MLP: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 WINDOW_SIZE: 7 TYPE: swin OUTPUT: output/swin_tiny_patch4_window7_224/fix_ddp PRINT_FREQ: 100 SAVE_FREQ: 10 SEED: 0 TAG: fix_ddp TEST: CROP: true SEQUENTIAL: false THROUGHPUT_MODE: false TRAIN: ACCUMULATION_STEPS: 0 AUTO_RESUME: false BASE_LR: 0.001 CLIP_GRAD: 5.0 EPOCHS: 300 LR_SCHEDULER: DECAY_EPOCHS: 30 DECAY_RATE: 0.1 NAME: cosine MIN_LR: 1.0e-05 OPTIMIZER: BETAS: - 0.9 - 0.999 EPS: 1.0e-08 MOMENTUM: 0.9 NAME: adamw START_EPOCH: 0 USE_CHECKPOINT: false WARMUP_EPOCHS: 20 WARMUP_LR: 1.0e-06 WEIGHT_DECAY: 0.05 [2022-10-01 16:50:51 swin_tiny_patch4_window7_224] (main.py 70): INFO Creating model:swin/swin_tiny_patch4_window7_224 [2022-10-01 16:50:53 swin_tiny_patch4_window7_224] (main.py 74): INFO SwinTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((96,), eps=1e-05, elementwise_affine=True) ) (pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=384, out_features=192, bias=False) (norm): LayerNorm((384,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=768, out_features=384, bias=False) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=1536, out_features=768, bias=False) (norm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d() (head): Linear(in_features=768, out_features=1000, bias=True) ) [2022-10-01 16:50:53 swin_tiny_patch4_window7_224] (main.py 81): INFO number of params: 28288354 [2022-10-01 16:50:53 swin_tiny_patch4_window7_224] (main.py 121): INFO Start training [2022-10-01 16:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][0/1251] eta 2:27:26 lr 0.000001 time 7.0719 (7.0719) loss 6.9521 (6.9521) grad_norm 1.3919 (1.3919) [2022-10-01 16:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][100/1251] eta 0:06:51 lr 0.000005 time 0.2873 (0.3573) loss 6.9108 (6.9506) grad_norm 1.2623 (1.3438) [2022-10-01 16:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][200/1251] eta 0:05:40 lr 0.000009 time 0.2915 (0.3241) loss 6.9081 (6.9360) grad_norm 1.2570 (1.2904) [2022-10-01 16:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][300/1251] eta 0:04:57 lr 0.000013 time 0.2917 (0.3129) loss 6.8586 (6.9238) grad_norm 1.0591 (1.2330) [2022-10-01 16:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][400/1251] eta 0:04:21 lr 0.000017 time 0.2892 (0.3075) loss 6.8494 (6.9137) grad_norm 0.9821 (1.1806) [2022-10-01 16:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][500/1251] eta 0:03:48 lr 0.000021 time 0.2874 (0.3041) loss 6.9109 (6.9046) grad_norm 1.0199 (1.1370) [2022-10-01 16:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][600/1251] eta 0:03:16 lr 0.000025 time 0.2892 (0.3020) loss 6.8193 (6.8969) grad_norm 0.9274 (1.1026) [2022-10-01 16:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][700/1251] eta 0:02:45 lr 0.000029 time 0.2908 (0.3004) loss 6.8672 (6.8889) grad_norm 0.9154 (1.0799) [2022-10-01 16:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][800/1251] eta 0:02:14 lr 0.000033 time 0.2916 (0.2993) loss 6.8250 (6.8813) grad_norm 1.0434 (1.0773) [2022-10-01 16:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][900/1251] eta 0:01:44 lr 0.000037 time 0.2873 (0.2984) loss 6.8124 (6.8735) grad_norm 1.2532 (1.0964) [2022-10-01 16:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1000/1251] eta 0:01:14 lr 0.000041 time 0.2911 (0.2978) loss 6.7943 (6.8650) grad_norm 1.3142 (1.1263) [2022-10-01 16:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1100/1251] eta 0:00:44 lr 0.000045 time 0.2869 (0.2972) loss 6.8002 (6.8557) grad_norm 1.5771 (1.1605) [2022-10-01 16:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1200/1251] eta 0:00:15 lr 0.000049 time 0.2883 (0.2967) loss 6.7024 (6.8463) grad_norm 1.6667 (1.1874) [2022-10-01 16:57:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 0 training takes 0:06:11 [2022-10-01 16:57:04 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saving...... [2022-10-01 16:57:05 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saved !!! [2022-10-01 16:57:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.498 (2.498) Loss 6.3352 (6.3352) Acc@1 1.562 (1.562) Acc@5 6.641 (6.641) [2022-10-01 16:57:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 1.886 Acc@5 6.398 [2022-10-01 16:57:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 1.9% [2022-10-01 16:57:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 1.89% [2022-10-01 16:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][0/1251] eta 0:50:22 lr 0.000051 time 2.4164 (2.4164) loss 6.7680 (6.7680) grad_norm 1.0774 (1.0774) [2022-10-01 16:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][100/1251] eta 0:06:02 lr 0.000055 time 0.2908 (0.3146) loss 6.7120 (6.6954) grad_norm 2.5042 (1.8803) [2022-10-01 16:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][200/1251] eta 0:05:16 lr 0.000059 time 0.2873 (0.3015) loss 6.8553 (6.6903) grad_norm 1.6226 (1.9282) [2022-10-01 16:58:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][300/1251] eta 0:04:42 lr 0.000063 time 0.2878 (0.2972) loss 6.4293 (6.6680) grad_norm 1.9854 (1.9255) [2022-10-01 16:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][400/1251] eta 0:04:11 lr 0.000067 time 0.2873 (0.2950) loss 6.5593 (6.6495) grad_norm 2.7330 (1.9587) [2022-10-01 16:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][500/1251] eta 0:03:40 lr 0.000071 time 0.2856 (0.2936) loss 6.5637 (6.6360) grad_norm 1.9812 (1.9958) [2022-10-01 17:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][600/1251] eta 0:03:10 lr 0.000075 time 0.2891 (0.2926) loss 6.4361 (6.6235) grad_norm 2.2315 (2.0012) [2022-10-01 17:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][700/1251] eta 0:02:40 lr 0.000079 time 0.2910 (0.2920) loss 6.5575 (6.6065) grad_norm 1.7390 (2.0091) [2022-10-01 17:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][800/1251] eta 0:02:11 lr 0.000083 time 0.2910 (0.2915) loss 6.6398 (6.5936) grad_norm 2.3869 (2.0239) [2022-10-01 17:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][900/1251] eta 0:01:42 lr 0.000087 time 0.2855 (0.2911) loss 6.6702 (6.5829) grad_norm 1.7056 (2.0293) [2022-10-01 17:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1000/1251] eta 0:01:12 lr 0.000091 time 0.2872 (0.2907) loss 6.6374 (6.5727) grad_norm 1.9076 (2.0405) [2022-10-01 17:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1100/1251] eta 0:00:43 lr 0.000095 time 0.2865 (0.2905) loss 6.3268 (6.5621) grad_norm 1.6923 (2.0557) [2022-10-01 17:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1200/1251] eta 0:00:14 lr 0.000099 time 0.2862 (0.2902) loss 6.5366 (6.5515) grad_norm 1.8167 (2.0723) [2022-10-01 17:03:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 1 training takes 0:06:03 [2022-10-01 17:03:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.831 (2.831) Loss 5.5873 (5.5873) Acc@1 5.762 (5.762) Acc@5 16.797 (16.797) [2022-10-01 17:03:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 6.176 Acc@5 17.902 [2022-10-01 17:03:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 6.2% [2022-10-01 17:03:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 6.18% [2022-10-01 17:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][0/1251] eta 0:58:51 lr 0.000101 time 2.8230 (2.8230) loss 6.3257 (6.3257) grad_norm 1.7081 (1.7081) [2022-10-01 17:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][100/1251] eta 0:06:03 lr 0.000105 time 0.2912 (0.3157) loss 6.1881 (6.4491) grad_norm 1.7168 (2.1014) [2022-10-01 17:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][200/1251] eta 0:05:18 lr 0.000109 time 0.2866 (0.3026) loss 6.5381 (6.4349) grad_norm 2.4747 (2.1695) [2022-10-01 17:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][300/1251] eta 0:04:43 lr 0.000113 time 0.2892 (0.2981) loss 6.6259 (6.4103) grad_norm 3.0322 (2.1626) [2022-10-01 17:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][400/1251] eta 0:04:11 lr 0.000117 time 0.2883 (0.2958) loss 6.0784 (6.3873) grad_norm 1.7447 (2.1844) [2022-10-01 17:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][500/1251] eta 0:03:41 lr 0.000121 time 0.2952 (0.2945) loss 6.5569 (6.3768) grad_norm 2.0325 (2.1941) [2022-10-01 17:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][600/1251] eta 0:03:11 lr 0.000125 time 0.2896 (0.2936) loss 6.1189 (6.3644) grad_norm 2.6536 (2.1955) [2022-10-01 17:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][700/1251] eta 0:02:41 lr 0.000129 time 0.2886 (0.2929) loss 6.5359 (6.3553) grad_norm 2.1367 (2.2175) [2022-10-01 17:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][800/1251] eta 0:02:11 lr 0.000133 time 0.2881 (0.2923) loss 6.3675 (6.3476) grad_norm 1.7895 (2.2221) [2022-10-01 17:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][900/1251] eta 0:01:42 lr 0.000137 time 0.2914 (0.2919) loss 6.3719 (6.3357) grad_norm 2.1984 (2.2248) [2022-10-01 17:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1000/1251] eta 0:01:13 lr 0.000141 time 0.2914 (0.2916) loss 6.3624 (6.3234) grad_norm 1.8637 (2.2368) [2022-10-01 17:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1100/1251] eta 0:00:44 lr 0.000145 time 0.2909 (0.2914) loss 5.6476 (6.3139) grad_norm 2.2052 (2.2420) [2022-10-01 17:09:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1200/1251] eta 0:00:14 lr 0.000149 time 0.2906 (0.2914) loss 6.2213 (6.3088) grad_norm 2.2004 (2.2480) [2022-10-01 17:09:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 2 training takes 0:06:04 [2022-10-01 17:09:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.020 (3.020) Loss 4.9213 (4.9213) Acc@1 11.328 (11.328) Acc@5 28.711 (28.711) [2022-10-01 17:09:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 11.928 Acc@5 28.830 [2022-10-01 17:09:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 11.9% [2022-10-01 17:09:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 11.93% [2022-10-01 17:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][0/1251] eta 0:53:58 lr 0.000151 time 2.5887 (2.5887) loss 6.0880 (6.0880) grad_norm 2.2186 (2.2186) [2022-10-01 17:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][100/1251] eta 0:06:04 lr 0.000155 time 0.2889 (0.3166) loss 6.0505 (6.1919) grad_norm 2.3310 (2.2792) [2022-10-01 17:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][200/1251] eta 0:05:19 lr 0.000159 time 0.2942 (0.3036) loss 6.3353 (6.1427) grad_norm 2.0253 (2.3026) [2022-10-01 17:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][300/1251] eta 0:04:44 lr 0.000163 time 0.2899 (0.2991) loss 5.9657 (6.1478) grad_norm 2.1892 (2.3204) [2022-10-01 17:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][400/1251] eta 0:04:12 lr 0.000167 time 0.2877 (0.2969) loss 5.7259 (6.1285) grad_norm 2.1802 (2.3390) [2022-10-01 17:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][500/1251] eta 0:03:41 lr 0.000171 time 0.2888 (0.2955) loss 6.3193 (6.1263) grad_norm 3.0455 (2.3306) [2022-10-01 17:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][600/1251] eta 0:03:11 lr 0.000175 time 0.2887 (0.2946) loss 6.2239 (6.1127) grad_norm 2.2042 (2.3447) [2022-10-01 17:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][700/1251] eta 0:02:41 lr 0.000179 time 0.2903 (0.2940) loss 5.6669 (6.1089) grad_norm 4.0808 (2.3480) [2022-10-01 17:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][800/1251] eta 0:02:12 lr 0.000183 time 0.2874 (0.2935) loss 6.0038 (6.1016) grad_norm 2.6026 (2.3633) [2022-10-01 17:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][900/1251] eta 0:01:42 lr 0.000187 time 0.2888 (0.2931) loss 6.2393 (6.0989) grad_norm 1.7141 (2.3618) [2022-10-01 17:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1000/1251] eta 0:01:13 lr 0.000191 time 0.2899 (0.2928) loss 6.1887 (6.0935) grad_norm 2.0786 (2.3649) [2022-10-01 17:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1100/1251] eta 0:00:44 lr 0.000195 time 0.2933 (0.2926) loss 6.4370 (6.0831) grad_norm 2.1433 (2.3648) [2022-10-01 17:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1200/1251] eta 0:00:14 lr 0.000199 time 0.2897 (0.2924) loss 6.0211 (6.0706) grad_norm 2.5619 (2.3706) [2022-10-01 17:15:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 3 training takes 0:06:05 [2022-10-01 17:15:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.128 (3.128) Loss 4.3188 (4.3188) Acc@1 19.238 (19.238) Acc@5 41.113 (41.113) [2022-10-01 17:16:09 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 18.534 Acc@5 39.826 [2022-10-01 17:16:09 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 18.5% [2022-10-01 17:16:09 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 18.53% [2022-10-01 17:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][0/1251] eta 1:06:41 lr 0.000201 time 3.1983 (3.1983) loss 6.3155 (6.3155) grad_norm 3.5726 (3.5726) [2022-10-01 17:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][100/1251] eta 0:06:06 lr 0.000205 time 0.2886 (0.3188) loss 6.2699 (5.9007) grad_norm 2.0661 (2.3306) [2022-10-01 17:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][200/1251] eta 0:05:19 lr 0.000209 time 0.2908 (0.3044) loss 6.2229 (5.9249) grad_norm 1.9629 (2.3893) [2022-10-01 17:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][300/1251] eta 0:04:45 lr 0.000213 time 0.2912 (0.2998) loss 5.7585 (5.9424) grad_norm 2.2866 (2.3991) [2022-10-01 17:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][400/1251] eta 0:04:13 lr 0.000217 time 0.2912 (0.2973) loss 6.2847 (5.9266) grad_norm 2.6228 (2.3830) [2022-10-01 17:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][500/1251] eta 0:03:42 lr 0.000221 time 0.2869 (0.2958) loss 6.1394 (5.9231) grad_norm 2.3145 (2.4166) [2022-10-01 17:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][600/1251] eta 0:03:11 lr 0.000225 time 0.2910 (0.2949) loss 5.3866 (5.9090) grad_norm 2.3496 (2.4306) [2022-10-01 17:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][700/1251] eta 0:02:42 lr 0.000229 time 0.2876 (0.2940) loss 5.6238 (5.9074) grad_norm 2.1004 (2.4284) [2022-10-01 17:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][800/1251] eta 0:02:12 lr 0.000233 time 0.2905 (0.2934) loss 6.2417 (5.8955) grad_norm 2.9855 (2.4333) [2022-10-01 17:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][900/1251] eta 0:01:42 lr 0.000237 time 0.2874 (0.2928) loss 6.1343 (5.8926) grad_norm 2.6773 (2.4273) [2022-10-01 17:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1000/1251] eta 0:01:13 lr 0.000241 time 0.2884 (0.2924) loss 6.0324 (5.8916) grad_norm 2.2520 (2.4267) [2022-10-01 17:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1100/1251] eta 0:00:44 lr 0.000245 time 0.2900 (0.2920) loss 5.5111 (5.8835) grad_norm 2.7704 (2.4220) [2022-10-01 17:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1200/1251] eta 0:00:14 lr 0.000249 time 0.2885 (0.2917) loss 5.2286 (5.8752) grad_norm 2.2586 (2.4274) [2022-10-01 17:22:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 4 training takes 0:06:05 [2022-10-01 17:22:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.256 (2.256) Loss 3.9298 (3.9298) Acc@1 24.609 (24.609) Acc@5 47.559 (47.559) [2022-10-01 17:22:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 24.576 Acc@5 47.950 [2022-10-01 17:22:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 24.6% [2022-10-01 17:22:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 24.58% [2022-10-01 17:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][0/1251] eta 0:57:07 lr 0.000251 time 2.7395 (2.7395) loss 5.8243 (5.8243) grad_norm 2.2754 (2.2754) [2022-10-01 17:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][100/1251] eta 0:06:00 lr 0.000255 time 0.2894 (0.3133) loss 5.6168 (5.8511) grad_norm 2.6964 (2.3228) [2022-10-01 17:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][200/1251] eta 0:05:15 lr 0.000259 time 0.2883 (0.3005) loss 5.0327 (5.7997) grad_norm 3.0338 (2.3376) [2022-10-01 17:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][300/1251] eta 0:04:41 lr 0.000263 time 0.2909 (0.2964) loss 5.5154 (5.7734) grad_norm 2.2876 (2.3606) [2022-10-01 17:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][400/1251] eta 0:04:10 lr 0.000267 time 0.2901 (0.2943) loss 5.6988 (5.7623) grad_norm 1.8066 (2.3726) [2022-10-01 17:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][500/1251] eta 0:03:40 lr 0.000271 time 0.2896 (0.2930) loss 6.2257 (5.7480) grad_norm 2.6599 (2.3656) [2022-10-01 17:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][600/1251] eta 0:03:10 lr 0.000275 time 0.2877 (0.2920) loss 5.5225 (5.7362) grad_norm 2.1558 (2.3667) [2022-10-01 17:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][700/1251] eta 0:02:40 lr 0.000279 time 0.2900 (0.2914) loss 5.6136 (5.7200) grad_norm 2.5863 (2.3780) [2022-10-01 17:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][800/1251] eta 0:02:11 lr 0.000283 time 0.2909 (0.2910) loss 5.8336 (5.7071) grad_norm 1.9120 (2.3677) [2022-10-01 17:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][900/1251] eta 0:01:41 lr 0.000287 time 0.2828 (0.2906) loss 6.2502 (5.7052) grad_norm 2.9921 (2.3547) [2022-10-01 17:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1000/1251] eta 0:01:12 lr 0.000291 time 0.2897 (0.2902) loss 5.6443 (5.7001) grad_norm 1.7239 (2.3580) [2022-10-01 17:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1100/1251] eta 0:00:43 lr 0.000295 time 0.2870 (0.2900) loss 5.8982 (5.6972) grad_norm 2.0995 (2.3518) [2022-10-01 17:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1200/1251] eta 0:00:14 lr 0.000299 time 0.2918 (0.2898) loss 5.7473 (5.6858) grad_norm 2.2302 (2.3526) [2022-10-01 17:28:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 5 training takes 0:06:02 [2022-10-01 17:28:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.908 (2.908) Loss 3.4121 (3.4121) Acc@1 31.055 (31.055) Acc@5 58.301 (58.301) [2022-10-01 17:28:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 30.132 Acc@5 54.896 [2022-10-01 17:28:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 30.1% [2022-10-01 17:28:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 30.13% [2022-10-01 17:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][0/1251] eta 1:07:06 lr 0.000301 time 3.2190 (3.2190) loss 4.8109 (4.8109) grad_norm 1.9395 (1.9395) [2022-10-01 17:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][100/1251] eta 0:06:07 lr 0.000305 time 0.2886 (0.3189) loss 5.9748 (5.5595) grad_norm 2.0417 (2.2498) [2022-10-01 17:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][200/1251] eta 0:05:19 lr 0.000309 time 0.2907 (0.3040) loss 5.5806 (5.5916) grad_norm 2.7733 (2.2962) [2022-10-01 17:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][300/1251] eta 0:04:44 lr 0.000313 time 0.2905 (0.2989) loss 5.6239 (5.5617) grad_norm 2.1918 (2.2753) [2022-10-01 17:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][400/1251] eta 0:04:12 lr 0.000317 time 0.2902 (0.2963) loss 5.9343 (5.5770) grad_norm 2.3046 (2.3026) [2022-10-01 17:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][500/1251] eta 0:03:41 lr 0.000321 time 0.2859 (0.2946) loss 5.8380 (5.5596) grad_norm 2.1983 (2.2742) [2022-10-01 17:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][600/1251] eta 0:03:11 lr 0.000325 time 0.2871 (0.2935) loss 5.9235 (5.5496) grad_norm 1.7265 (2.2827) [2022-10-01 17:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][700/1251] eta 0:02:41 lr 0.000329 time 0.2923 (0.2927) loss 5.8804 (5.5458) grad_norm 2.7799 (2.2915) [2022-10-01 17:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][800/1251] eta 0:02:11 lr 0.000333 time 0.2872 (0.2921) loss 4.8829 (5.5395) grad_norm 2.1656 (2.2839) [2022-10-01 17:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][900/1251] eta 0:01:42 lr 0.000337 time 0.2879 (0.2917) loss 5.4070 (5.5393) grad_norm 2.5408 (2.2822) [2022-10-01 17:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1000/1251] eta 0:01:13 lr 0.000341 time 0.2883 (0.2914) loss 6.0047 (5.5383) grad_norm 1.7984 (2.2795) [2022-10-01 17:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1100/1251] eta 0:00:43 lr 0.000345 time 0.2855 (0.2910) loss 5.7984 (5.5384) grad_norm 2.3015 (2.2712) [2022-10-01 17:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1200/1251] eta 0:00:14 lr 0.000349 time 0.2893 (0.2908) loss 5.2571 (5.5295) grad_norm 1.8742 (2.2620) [2022-10-01 17:34:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 6 training takes 0:06:03 [2022-10-01 17:34:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.849 (2.849) Loss 3.2705 (3.2705) Acc@1 35.840 (35.840) Acc@5 57.129 (57.129) [2022-10-01 17:34:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 33.906 Acc@5 59.350 [2022-10-01 17:34:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 33.9% [2022-10-01 17:34:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 33.91% [2022-10-01 17:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][0/1251] eta 0:59:17 lr 0.000351 time 2.8436 (2.8436) loss 5.6448 (5.6448) grad_norm 1.8578 (1.8578) [2022-10-01 17:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][100/1251] eta 0:06:04 lr 0.000355 time 0.2915 (0.3163) loss 4.9631 (5.4833) grad_norm 2.1733 (2.1286) [2022-10-01 17:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][200/1251] eta 0:05:18 lr 0.000359 time 0.2881 (0.3027) loss 5.7841 (5.4614) grad_norm 1.9500 (2.2077) [2022-10-01 17:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][300/1251] eta 0:04:43 lr 0.000363 time 0.2885 (0.2983) loss 5.0254 (5.4439) grad_norm 2.1438 (2.1723) [2022-10-01 17:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][400/1251] eta 0:04:11 lr 0.000367 time 0.2886 (0.2961) loss 5.3875 (5.4383) grad_norm 1.9760 (2.1806) [2022-10-01 17:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][500/1251] eta 0:03:41 lr 0.000371 time 0.2893 (0.2948) loss 5.3340 (5.4303) grad_norm 2.0213 (2.1986) [2022-10-01 17:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][600/1251] eta 0:03:11 lr 0.000375 time 0.2881 (0.2938) loss 5.3447 (5.4222) grad_norm 1.9337 (2.1937) [2022-10-01 17:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][700/1251] eta 0:02:41 lr 0.000379 time 0.2882 (0.2931) loss 5.5156 (5.4263) grad_norm 2.2953 (2.1904) [2022-10-01 17:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][800/1251] eta 0:02:11 lr 0.000383 time 0.2893 (0.2926) loss 5.9514 (5.4193) grad_norm 2.5719 (2.1827) [2022-10-01 17:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][900/1251] eta 0:01:42 lr 0.000387 time 0.2869 (0.2921) loss 4.7490 (5.4079) grad_norm 2.0940 (2.1770) [2022-10-01 17:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1000/1251] eta 0:01:13 lr 0.000391 time 0.2879 (0.2917) loss 5.7854 (5.4111) grad_norm 1.8931 (2.1610) [2022-10-01 17:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1100/1251] eta 0:00:44 lr 0.000395 time 0.2901 (0.2914) loss 5.4661 (5.4008) grad_norm 1.9557 (2.1548) [2022-10-01 17:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1200/1251] eta 0:00:14 lr 0.000399 time 0.2932 (0.2912) loss 4.3394 (5.3945) grad_norm 2.0681 (2.1441) [2022-10-01 17:41:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 7 training takes 0:06:04 [2022-10-01 17:41:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.305 (3.305) Loss 2.9531 (2.9531) Acc@1 39.160 (39.160) Acc@5 62.598 (62.598) [2022-10-01 17:41:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 38.074 Acc@5 63.778 [2022-10-01 17:41:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 38.1% [2022-10-01 17:41:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 38.07% [2022-10-01 17:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][0/1251] eta 1:05:35 lr 0.000401 time 3.1458 (3.1458) loss 5.4528 (5.4528) grad_norm 2.1651 (2.1651) [2022-10-01 17:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][100/1251] eta 0:06:07 lr 0.000405 time 0.2986 (0.3196) loss 4.8553 (5.3851) grad_norm 2.3733 (2.1182) [2022-10-01 17:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][200/1251] eta 0:05:20 lr 0.000409 time 0.2895 (0.3050) loss 4.6769 (5.3717) grad_norm 2.2431 (2.0618) [2022-10-01 17:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][300/1251] eta 0:04:45 lr 0.000413 time 0.2932 (0.2999) loss 4.7300 (5.3672) grad_norm 1.9823 (2.0658) [2022-10-01 17:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][400/1251] eta 0:04:13 lr 0.000417 time 0.2956 (0.2974) loss 5.4682 (5.3408) grad_norm 1.7217 (2.0588) [2022-10-01 17:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][500/1251] eta 0:03:42 lr 0.000421 time 0.2904 (0.2958) loss 5.2993 (5.3269) grad_norm 1.5816 (2.0496) [2022-10-01 17:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][600/1251] eta 0:03:11 lr 0.000425 time 0.2861 (0.2949) loss 5.5838 (5.3045) grad_norm 1.9338 (2.0377) [2022-10-01 17:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][700/1251] eta 0:02:42 lr 0.000429 time 0.2914 (0.2942) loss 4.6573 (5.3028) grad_norm 1.8150 (2.0423) [2022-10-01 17:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][800/1251] eta 0:02:12 lr 0.000433 time 0.2886 (0.2935) loss 5.3474 (5.2958) grad_norm 1.6064 (2.0289) [2022-10-01 17:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][900/1251] eta 0:01:42 lr 0.000437 time 0.2936 (0.2931) loss 5.3498 (5.2931) grad_norm 1.8426 (2.0194) [2022-10-01 17:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1000/1251] eta 0:01:13 lr 0.000441 time 0.2905 (0.2928) loss 5.6980 (5.2880) grad_norm 1.7850 (2.0133) [2022-10-01 17:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1100/1251] eta 0:00:44 lr 0.000445 time 0.2928 (0.2926) loss 5.4427 (5.2858) grad_norm 2.0637 (2.0022) [2022-10-01 17:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1200/1251] eta 0:00:14 lr 0.000449 time 0.2886 (0.2924) loss 4.4379 (5.2809) grad_norm 1.5785 (1.9975) [2022-10-01 17:47:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 8 training takes 0:06:05 [2022-10-01 17:47:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.400 (2.400) Loss 2.8184 (2.8184) Acc@1 38.965 (38.965) Acc@5 66.113 (66.113) [2022-10-01 17:47:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 41.114 Acc@5 66.824 [2022-10-01 17:47:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 41.1% [2022-10-01 17:47:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 41.11% [2022-10-01 17:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][0/1251] eta 0:46:01 lr 0.000451 time 2.2075 (2.2075) loss 4.4769 (4.4769) grad_norm 1.7545 (1.7545) [2022-10-01 17:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][100/1251] eta 0:06:02 lr 0.000455 time 0.2883 (0.3148) loss 5.5393 (5.2626) grad_norm 1.8943 (1.9149) [2022-10-01 17:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][200/1251] eta 0:05:17 lr 0.000459 time 0.2923 (0.3020) loss 5.1049 (5.2543) grad_norm 1.9020 (1.9208) [2022-10-01 17:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][300/1251] eta 0:04:43 lr 0.000463 time 0.2931 (0.2977) loss 4.5248 (5.2657) grad_norm 1.8604 (1.9054) [2022-10-01 17:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][400/1251] eta 0:04:11 lr 0.000467 time 0.2898 (0.2957) loss 4.8631 (5.2315) grad_norm 1.6780 (1.9002) [2022-10-01 17:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][500/1251] eta 0:03:42 lr 0.000471 time 0.3078 (0.2959) loss 5.4725 (5.2105) grad_norm 1.5756 (1.9060) [2022-10-01 17:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][600/1251] eta 0:03:28 lr 0.000475 time 0.8278 (0.3202) loss 4.8827 (5.2085) grad_norm 2.0608 (1.9076) [2022-10-01 17:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][700/1251] eta 0:03:26 lr 0.000478 time 0.6110 (0.3741) loss 5.7638 (5.2033) grad_norm 1.8369 (1.9029) [2022-10-01 17:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][800/1251] eta 0:03:07 lr 0.000482 time 0.7752 (0.4160) loss 4.2502 (5.1969) grad_norm 1.7221 (1.8995) [2022-10-01 17:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][900/1251] eta 0:02:37 lr 0.000486 time 0.6987 (0.4476) loss 5.8803 (5.1830) grad_norm 1.8616 (1.8969) [2022-10-01 17:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1000/1251] eta 0:01:58 lr 0.000490 time 0.7078 (0.4709) loss 5.7180 (5.1704) grad_norm 2.6617 (1.8985) [2022-10-01 17:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1100/1251] eta 0:01:12 lr 0.000494 time 0.2903 (0.4786) loss 4.9975 (5.1681) grad_norm 1.3933 (1.8915) [2022-10-01 17:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1200/1251] eta 0:00:23 lr 0.000498 time 0.2869 (0.4628) loss 4.5360 (5.1541) grad_norm 1.9517 (1.8826) [2022-10-01 17:57:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 9 training takes 0:09:30 [2022-10-01 17:57:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.702 (2.702) Loss 2.6213 (2.6213) Acc@1 42.383 (42.383) Acc@5 70.215 (70.215) [2022-10-01 17:57:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 43.676 Acc@5 69.462 [2022-10-01 17:57:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 43.7% [2022-10-01 17:57:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 43.68% [2022-10-01 17:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][0/1251] eta 1:08:46 lr 0.000501 time 3.2982 (3.2982) loss 5.4259 (5.4259) grad_norm 1.9638 (1.9638) [2022-10-01 17:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][100/1251] eta 0:06:07 lr 0.000504 time 0.2891 (0.3192) loss 4.8651 (5.1121) grad_norm 2.0125 (1.8294) [2022-10-01 17:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][200/1251] eta 0:05:19 lr 0.000508 time 0.2895 (0.3045) loss 5.1709 (5.1394) grad_norm 1.7248 (1.8020) [2022-10-01 17:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][300/1251] eta 0:04:44 lr 0.000512 time 0.2898 (0.2993) loss 5.4752 (5.1158) grad_norm 1.7437 (1.8059) [2022-10-01 17:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][400/1251] eta 0:04:12 lr 0.000516 time 0.2881 (0.2968) loss 4.1091 (5.1250) grad_norm 1.4142 (1.8083) [2022-10-01 17:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][500/1251] eta 0:03:41 lr 0.000520 time 0.2910 (0.2952) loss 5.4991 (5.1135) grad_norm 1.8140 (1.7961) [2022-10-01 18:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][600/1251] eta 0:03:11 lr 0.000524 time 0.2896 (0.2942) loss 5.1631 (5.1159) grad_norm 1.6650 (1.7961) [2022-10-01 18:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][700/1251] eta 0:02:41 lr 0.000528 time 0.2896 (0.2934) loss 5.6554 (5.1136) grad_norm 1.6512 (1.7943) [2022-10-01 18:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][800/1251] eta 0:02:12 lr 0.000532 time 0.2880 (0.2928) loss 4.0436 (5.0983) grad_norm 1.8445 (1.7886) [2022-10-01 18:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][900/1251] eta 0:01:42 lr 0.000536 time 0.2964 (0.2923) loss 4.1461 (5.0889) grad_norm 2.2364 (1.7782) [2022-10-01 18:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1000/1251] eta 0:01:13 lr 0.000540 time 0.2932 (0.2920) loss 5.7101 (5.0919) grad_norm 1.9273 (1.7766) [2022-10-01 18:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1100/1251] eta 0:00:44 lr 0.000544 time 0.2928 (0.2917) loss 5.3912 (5.0769) grad_norm 1.5299 (1.7738) [2022-10-01 18:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1200/1251] eta 0:00:14 lr 0.000548 time 0.2887 (0.2915) loss 5.5545 (5.0701) grad_norm 2.0159 (1.7698) [2022-10-01 18:03:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 10 training takes 0:06:04 [2022-10-01 18:03:22 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saving...... [2022-10-01 18:03:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saved !!! [2022-10-01 18:03:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.667 (2.667) Loss 2.5550 (2.5550) Acc@1 44.824 (44.824) Acc@5 71.680 (71.680) [2022-10-01 18:03:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 46.876 Acc@5 72.648 [2022-10-01 18:03:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 46.9% [2022-10-01 18:03:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 46.88% [2022-10-01 18:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][0/1251] eta 1:08:53 lr 0.000550 time 3.3039 (3.3039) loss 5.4742 (5.4742) grad_norm 1.4987 (1.4987) [2022-10-01 18:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][100/1251] eta 0:06:08 lr 0.000554 time 0.2893 (0.3198) loss 4.8561 (5.0247) grad_norm 1.7139 (1.7978) [2022-10-01 18:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][200/1251] eta 0:05:20 lr 0.000558 time 0.2884 (0.3048) loss 4.2810 (4.9829) grad_norm 1.5042 (1.7538) [2022-10-01 18:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][300/1251] eta 0:04:45 lr 0.000562 time 0.2898 (0.3000) loss 5.5350 (5.0056) grad_norm 1.4418 (1.7456) [2022-10-01 18:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][400/1251] eta 0:04:13 lr 0.000566 time 0.2885 (0.2975) loss 5.3795 (5.0179) grad_norm 1.3728 (1.7182) [2022-10-01 18:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][500/1251] eta 0:03:42 lr 0.000570 time 0.2916 (0.2960) loss 5.0814 (5.0005) grad_norm 1.6450 (1.7131) [2022-10-01 18:06:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][600/1251] eta 0:03:12 lr 0.000574 time 0.2908 (0.2950) loss 5.4849 (4.9916) grad_norm 1.4930 (1.7122) [2022-10-01 18:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][700/1251] eta 0:02:42 lr 0.000578 time 0.2918 (0.2942) loss 5.4313 (5.0028) grad_norm 1.5253 (1.7011) [2022-10-01 18:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][800/1251] eta 0:02:12 lr 0.000582 time 0.2857 (0.2936) loss 5.1530 (5.0090) grad_norm 1.4924 (1.6997) [2022-10-01 18:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][900/1251] eta 0:01:42 lr 0.000586 time 0.2891 (0.2932) loss 4.2391 (5.0069) grad_norm 1.3443 (1.6948) [2022-10-01 18:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1000/1251] eta 0:01:13 lr 0.000590 time 0.2857 (0.2929) loss 5.1783 (4.9894) grad_norm 1.3836 (1.6898) [2022-10-01 18:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1100/1251] eta 0:00:44 lr 0.000594 time 0.2891 (0.2925) loss 5.7349 (4.9867) grad_norm 1.3621 (1.6841) [2022-10-01 18:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1200/1251] eta 0:00:14 lr 0.000598 time 0.2877 (0.2922) loss 4.7684 (4.9851) grad_norm 1.5580 (1.6821) [2022-10-01 18:09:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 11 training takes 0:06:05 [2022-10-01 18:09:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.624 (2.624) Loss 2.2763 (2.2763) Acc@1 50.293 (50.293) Acc@5 77.148 (77.148) [2022-10-01 18:09:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 48.862 Acc@5 74.346 [2022-10-01 18:09:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 48.9% [2022-10-01 18:09:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 48.86% [2022-10-01 18:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][0/1251] eta 1:06:57 lr 0.000600 time 3.2114 (3.2114) loss 5.6795 (5.6795) grad_norm 1.6223 (1.6223) [2022-10-01 18:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][100/1251] eta 0:06:07 lr 0.000604 time 0.2858 (0.3195) loss 4.5086 (4.8798) grad_norm 1.6924 (1.6550) [2022-10-01 18:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][200/1251] eta 0:05:20 lr 0.000608 time 0.2927 (0.3047) loss 5.0603 (4.9121) grad_norm 1.5198 (1.6360) [2022-10-01 18:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][300/1251] eta 0:04:45 lr 0.000612 time 0.2869 (0.3000) loss 4.9994 (4.9248) grad_norm 1.5166 (1.6311) [2022-10-01 18:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][400/1251] eta 0:04:13 lr 0.000616 time 0.2939 (0.2976) loss 5.1964 (4.9117) grad_norm 1.6286 (1.6327) [2022-10-01 18:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][500/1251] eta 0:03:42 lr 0.000620 time 0.2875 (0.2960) loss 5.5953 (4.9024) grad_norm 1.5388 (1.6271) [2022-10-01 18:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][600/1251] eta 0:03:11 lr 0.000624 time 0.2917 (0.2949) loss 5.2551 (4.9121) grad_norm 1.4001 (1.6135) [2022-10-01 18:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][700/1251] eta 0:02:42 lr 0.000628 time 0.2858 (0.2940) loss 4.7561 (4.9112) grad_norm 1.7086 (1.6018) [2022-10-01 18:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][800/1251] eta 0:02:12 lr 0.000632 time 0.2919 (0.2934) loss 3.9712 (4.9068) grad_norm 1.4078 (1.5970) [2022-10-01 18:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][900/1251] eta 0:01:42 lr 0.000636 time 0.2927 (0.2929) loss 5.3508 (4.9184) grad_norm 1.5960 (1.5972) [2022-10-01 18:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1000/1251] eta 0:01:13 lr 0.000640 time 0.2914 (0.2925) loss 4.8772 (4.9049) grad_norm 1.5895 (1.5945) [2022-10-01 18:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1100/1251] eta 0:00:44 lr 0.000644 time 0.2846 (0.2922) loss 4.8743 (4.9023) grad_norm 1.6835 (1.5929) [2022-10-01 18:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1200/1251] eta 0:00:14 lr 0.000648 time 0.2933 (0.2919) loss 4.2940 (4.9033) grad_norm 1.3631 (1.5843) [2022-10-01 18:15:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 12 training takes 0:06:05 [2022-10-01 18:16:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.781 (2.781) Loss 2.2631 (2.2631) Acc@1 50.000 (50.000) Acc@5 75.586 (75.586) [2022-10-01 18:16:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 50.414 Acc@5 75.634 [2022-10-01 18:16:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 50.4% [2022-10-01 18:16:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 50.41% [2022-10-01 18:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][0/1251] eta 0:56:37 lr 0.000650 time 2.7161 (2.7161) loss 3.5211 (3.5211) grad_norm 1.7930 (1.7930) [2022-10-01 18:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][100/1251] eta 0:06:02 lr 0.000654 time 0.2892 (0.3152) loss 3.8201 (4.7817) grad_norm 1.3923 (1.5540) [2022-10-01 18:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][200/1251] eta 0:05:18 lr 0.000658 time 0.2889 (0.3028) loss 4.3528 (4.8066) grad_norm 1.5799 (1.5339) [2022-10-01 18:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][300/1251] eta 0:04:43 lr 0.000662 time 0.2909 (0.2985) loss 5.2601 (4.7978) grad_norm 1.5574 (1.5444) [2022-10-01 18:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][400/1251] eta 0:04:12 lr 0.000666 time 0.2876 (0.2964) loss 5.4226 (4.8103) grad_norm 2.0557 (1.5407) [2022-10-01 18:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][500/1251] eta 0:03:41 lr 0.000670 time 0.2899 (0.2950) loss 5.4307 (4.8248) grad_norm 1.3475 (1.5423) [2022-10-01 18:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][600/1251] eta 0:03:11 lr 0.000674 time 0.2870 (0.2941) loss 5.4253 (4.8263) grad_norm 1.6694 (1.5444) [2022-10-01 18:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][700/1251] eta 0:02:41 lr 0.000678 time 0.2894 (0.2934) loss 5.9181 (4.8274) grad_norm 1.4230 (1.5489) [2022-10-01 18:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][800/1251] eta 0:02:12 lr 0.000682 time 0.2886 (0.2929) loss 4.5554 (4.8166) grad_norm 1.3306 (1.5475) [2022-10-01 18:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][900/1251] eta 0:01:42 lr 0.000686 time 0.2926 (0.2925) loss 4.0536 (4.8242) grad_norm 1.2953 (1.5439) [2022-10-01 18:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1000/1251] eta 0:01:13 lr 0.000690 time 0.2957 (0.2922) loss 5.0204 (4.8212) grad_norm 1.3440 (1.5418) [2022-10-01 18:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1100/1251] eta 0:00:44 lr 0.000694 time 0.2902 (0.2919) loss 5.3306 (4.8199) grad_norm 1.4237 (1.5376) [2022-10-01 18:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1200/1251] eta 0:00:14 lr 0.000698 time 0.2880 (0.2917) loss 4.6166 (4.8159) grad_norm 1.5327 (1.5379) [2022-10-01 18:22:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 13 training takes 0:06:05 [2022-10-01 18:22:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.635 (2.635) Loss 2.1723 (2.1723) Acc@1 53.906 (53.906) Acc@5 76.953 (76.953) [2022-10-01 18:22:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 52.344 Acc@5 77.310 [2022-10-01 18:22:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 52.3% [2022-10-01 18:22:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 52.34% [2022-10-01 18:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][0/1251] eta 0:48:28 lr 0.000700 time 2.3246 (2.3246) loss 5.2348 (5.2348) grad_norm 1.3364 (1.3364) [2022-10-01 18:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][100/1251] eta 0:05:58 lr 0.000704 time 0.2886 (0.3112) loss 4.4641 (4.8266) grad_norm 1.3558 (1.5047) [2022-10-01 18:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][200/1251] eta 0:05:14 lr 0.000708 time 0.2853 (0.2995) loss 5.1760 (4.7861) grad_norm 1.5693 (1.4774) [2022-10-01 18:23:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][300/1251] eta 0:04:41 lr 0.000712 time 0.2875 (0.2956) loss 4.5914 (4.7794) grad_norm 1.6851 (1.4812) [2022-10-01 18:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][400/1251] eta 0:04:09 lr 0.000716 time 0.2872 (0.2936) loss 4.9976 (4.7693) grad_norm 1.2319 (1.4950) [2022-10-01 18:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][500/1251] eta 0:03:39 lr 0.000720 time 0.2885 (0.2925) loss 5.5717 (4.7576) grad_norm 1.2403 (1.4945) [2022-10-01 18:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][600/1251] eta 0:03:09 lr 0.000724 time 0.2831 (0.2916) loss 4.9894 (4.7698) grad_norm 1.3395 (1.4876) [2022-10-01 18:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][700/1251] eta 0:02:40 lr 0.000728 time 0.2890 (0.2911) loss 4.8346 (4.7722) grad_norm 1.4540 (1.4841) [2022-10-01 18:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][800/1251] eta 0:02:11 lr 0.000732 time 0.2914 (0.2907) loss 3.6646 (4.7833) grad_norm 1.6610 (1.4870) [2022-10-01 18:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][900/1251] eta 0:01:41 lr 0.000736 time 0.2894 (0.2905) loss 4.2399 (4.7837) grad_norm 1.7508 (1.4869) [2022-10-01 18:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1000/1251] eta 0:01:12 lr 0.000740 time 0.2857 (0.2903) loss 4.4251 (4.7782) grad_norm 1.2602 (1.4835) [2022-10-01 18:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1100/1251] eta 0:00:43 lr 0.000744 time 0.2908 (0.2901) loss 5.3379 (4.7721) grad_norm 1.6222 (1.4847) [2022-10-01 18:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1200/1251] eta 0:00:14 lr 0.000748 time 0.2869 (0.2900) loss 4.9040 (4.7669) grad_norm 1.1309 (1.4794) [2022-10-01 18:28:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 14 training takes 0:06:02 [2022-10-01 18:28:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.963 (2.963) Loss 2.0344 (2.0344) Acc@1 54.395 (54.395) Acc@5 80.078 (80.078) [2022-10-01 18:28:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 54.174 Acc@5 78.642 [2022-10-01 18:28:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 54.2% [2022-10-01 18:28:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 54.17% [2022-10-01 18:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][0/1251] eta 1:05:41 lr 0.000750 time 3.1508 (3.1508) loss 4.9622 (4.9622) grad_norm 1.5589 (1.5589) [2022-10-01 18:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][100/1251] eta 0:06:06 lr 0.000754 time 0.2911 (0.3183) loss 5.4451 (4.7200) grad_norm 1.3952 (1.4161) [2022-10-01 18:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][200/1251] eta 0:05:19 lr 0.000758 time 0.2882 (0.3039) loss 5.5519 (4.7107) grad_norm 1.3926 (1.4339) [2022-10-01 18:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][300/1251] eta 0:04:44 lr 0.000762 time 0.2888 (0.2992) loss 5.1016 (4.7145) grad_norm 1.2397 (1.4436) [2022-10-01 18:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][400/1251] eta 0:04:12 lr 0.000766 time 0.2919 (0.2967) loss 4.1529 (4.7134) grad_norm 1.2921 (1.4415) [2022-10-01 18:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][500/1251] eta 0:03:41 lr 0.000770 time 0.2873 (0.2952) loss 4.2388 (4.7297) grad_norm 1.7089 (1.4440) [2022-10-01 18:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][600/1251] eta 0:03:11 lr 0.000774 time 0.2888 (0.2941) loss 4.4613 (4.7259) grad_norm 1.3544 (1.4351) [2022-10-01 18:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][700/1251] eta 0:02:41 lr 0.000778 time 0.2877 (0.2933) loss 4.8856 (4.7230) grad_norm 1.4073 (1.4361) [2022-10-01 18:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][800/1251] eta 0:02:12 lr 0.000782 time 0.2886 (0.2927) loss 4.8135 (4.7261) grad_norm 2.2834 (1.4411) [2022-10-01 18:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][900/1251] eta 0:01:42 lr 0.000786 time 0.2863 (0.2922) loss 4.2891 (4.7208) grad_norm 1.4387 (1.4374) [2022-10-01 18:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1000/1251] eta 0:01:13 lr 0.000790 time 0.2881 (0.2919) loss 4.9687 (4.7211) grad_norm 1.2893 (1.4332) [2022-10-01 18:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1100/1251] eta 0:00:44 lr 0.000794 time 0.2870 (0.2915) loss 5.3998 (4.7097) grad_norm 1.3239 (1.4298) [2022-10-01 18:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1200/1251] eta 0:00:14 lr 0.000798 time 0.2897 (0.2913) loss 5.6830 (4.7138) grad_norm 1.6018 (1.4253) [2022-10-01 18:34:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 15 training takes 0:06:04 [2022-10-01 18:34:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.669 (2.669) Loss 1.9670 (1.9670) Acc@1 56.445 (56.445) Acc@5 80.469 (80.469) [2022-10-01 18:35:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 55.368 Acc@5 79.548 [2022-10-01 18:35:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 55.4% [2022-10-01 18:35:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 55.37% [2022-10-01 18:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][0/1251] eta 0:56:36 lr 0.000800 time 2.7149 (2.7149) loss 4.9091 (4.9091) grad_norm 1.4752 (1.4752) [2022-10-01 18:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][100/1251] eta 0:06:01 lr 0.000804 time 0.2897 (0.3136) loss 4.8080 (4.7331) grad_norm 1.4952 (1.4310) [2022-10-01 18:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][200/1251] eta 0:05:16 lr 0.000808 time 0.2884 (0.3013) loss 4.9895 (4.6646) grad_norm 1.3378 (1.4051) [2022-10-01 18:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][300/1251] eta 0:04:42 lr 0.000812 time 0.2854 (0.2969) loss 4.6075 (4.6742) grad_norm 1.2405 (1.4094) [2022-10-01 18:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][400/1251] eta 0:04:10 lr 0.000816 time 0.2858 (0.2945) loss 4.9033 (4.6634) grad_norm 1.8038 (1.4073) [2022-10-01 18:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][500/1251] eta 0:03:40 lr 0.000820 time 0.2936 (0.2931) loss 5.0739 (4.6743) grad_norm 1.4640 (1.4045) [2022-10-01 18:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][600/1251] eta 0:03:10 lr 0.000824 time 0.2859 (0.2922) loss 4.9106 (4.6866) grad_norm 1.8336 (1.4096) [2022-10-01 18:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][700/1251] eta 0:02:40 lr 0.000828 time 0.2882 (0.2915) loss 4.9993 (4.6843) grad_norm 1.3812 (1.4053) [2022-10-01 18:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][800/1251] eta 0:02:11 lr 0.000832 time 0.2859 (0.2909) loss 5.1688 (4.6826) grad_norm 1.4606 (1.4001) [2022-10-01 18:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][900/1251] eta 0:01:41 lr 0.000836 time 0.2876 (0.2905) loss 5.0233 (4.6787) grad_norm 1.2851 (1.3958) [2022-10-01 18:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1000/1251] eta 0:01:12 lr 0.000840 time 0.2856 (0.2902) loss 4.9369 (4.6739) grad_norm 1.4344 (1.3957) [2022-10-01 18:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1100/1251] eta 0:00:43 lr 0.000844 time 0.2906 (0.2899) loss 4.8424 (4.6644) grad_norm 1.6019 (1.4000) [2022-10-01 18:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1200/1251] eta 0:00:14 lr 0.000848 time 0.2927 (0.2897) loss 4.8531 (4.6656) grad_norm 1.3291 (1.3976) [2022-10-01 18:41:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 16 training takes 0:06:02 [2022-10-01 18:41:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.882 (2.882) Loss 2.0137 (2.0137) Acc@1 55.762 (55.762) Acc@5 79.199 (79.199) [2022-10-01 18:41:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 56.660 Acc@5 80.698 [2022-10-01 18:41:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 56.7% [2022-10-01 18:41:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 56.66% [2022-10-01 18:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][0/1251] eta 0:50:14 lr 0.000850 time 2.4094 (2.4094) loss 4.1925 (4.1925) grad_norm 1.3510 (1.3510) [2022-10-01 18:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][100/1251] eta 0:06:02 lr 0.000854 time 0.2964 (0.3152) loss 5.0112 (4.5881) grad_norm 1.5534 (1.3526) [2022-10-01 18:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][200/1251] eta 0:05:18 lr 0.000858 time 0.2867 (0.3030) loss 5.1326 (4.5663) grad_norm 1.2063 (1.3503) [2022-10-01 18:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][300/1251] eta 0:04:44 lr 0.000862 time 0.2915 (0.2988) loss 4.7557 (4.5528) grad_norm 1.3216 (1.3594) [2022-10-01 18:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][400/1251] eta 0:04:12 lr 0.000866 time 0.2889 (0.2970) loss 5.2048 (4.5531) grad_norm 1.4587 (1.3715) [2022-10-01 18:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][500/1251] eta 0:03:42 lr 0.000870 time 0.2936 (0.2958) loss 3.9803 (4.5809) grad_norm 1.3314 (1.3674) [2022-10-01 18:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][600/1251] eta 0:03:12 lr 0.000874 time 0.2889 (0.2950) loss 5.3793 (4.5884) grad_norm 1.4616 (1.3616) [2022-10-01 18:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][700/1251] eta 0:02:42 lr 0.000878 time 0.2953 (0.2944) loss 4.8243 (4.5951) grad_norm 1.2026 (1.3561) [2022-10-01 18:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][800/1251] eta 0:02:12 lr 0.000882 time 0.2871 (0.2941) loss 4.9334 (4.6012) grad_norm 1.2240 (1.3524) [2022-10-01 18:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][900/1251] eta 0:01:43 lr 0.000886 time 0.2884 (0.2937) loss 5.2477 (4.5954) grad_norm 1.1879 (1.3443) [2022-10-01 18:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1000/1251] eta 0:01:13 lr 0.000890 time 0.2891 (0.2935) loss 5.0673 (4.6040) grad_norm 1.1301 (1.3467) [2022-10-01 18:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1100/1251] eta 0:00:44 lr 0.000894 time 0.2957 (0.2933) loss 4.8650 (4.6134) grad_norm 1.1776 (1.3440) [2022-10-01 18:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1200/1251] eta 0:00:14 lr 0.000898 time 0.2891 (0.2931) loss 5.2772 (4.6215) grad_norm 1.7296 (1.3420) [2022-10-01 18:47:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 17 training takes 0:06:06 [2022-10-01 18:47:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.654 (2.654) Loss 1.8608 (1.8608) Acc@1 58.984 (58.984) Acc@5 83.398 (83.398) [2022-10-01 18:47:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 57.038 Acc@5 81.310 [2022-10-01 18:47:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 57.0% [2022-10-01 18:47:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 57.04% [2022-10-01 18:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][0/1251] eta 0:43:18 lr 0.000900 time 2.0771 (2.0771) loss 4.2163 (4.2163) grad_norm 1.2747 (1.2747) [2022-10-01 18:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][100/1251] eta 0:05:59 lr 0.000904 time 0.2911 (0.3127) loss 4.3034 (4.5144) grad_norm 1.6668 (1.3472) [2022-10-01 18:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][200/1251] eta 0:05:16 lr 0.000908 time 0.2945 (0.3010) loss 5.2936 (4.5310) grad_norm 1.2531 (1.3555) [2022-10-01 18:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][300/1251] eta 0:04:42 lr 0.000912 time 0.2880 (0.2971) loss 4.7962 (4.5715) grad_norm 1.3159 (1.3450) [2022-10-01 18:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][400/1251] eta 0:04:11 lr 0.000916 time 0.2932 (0.2952) loss 4.8918 (4.5580) grad_norm 1.6274 (1.3510) [2022-10-01 18:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][500/1251] eta 0:03:40 lr 0.000920 time 0.2879 (0.2940) loss 4.5868 (4.5623) grad_norm 1.8621 (1.3404) [2022-10-01 18:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][600/1251] eta 0:03:10 lr 0.000924 time 0.2890 (0.2932) loss 5.2882 (4.5533) grad_norm 1.1511 (1.3340) [2022-10-01 18:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][700/1251] eta 0:02:41 lr 0.000928 time 0.2889 (0.2926) loss 4.8514 (4.5522) grad_norm 1.6758 (1.3267) [2022-10-01 18:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][800/1251] eta 0:02:11 lr 0.000932 time 0.2891 (0.2921) loss 4.4598 (4.5584) grad_norm 1.3051 (1.3210) [2022-10-01 18:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][900/1251] eta 0:01:42 lr 0.000936 time 0.2880 (0.2917) loss 4.7058 (4.5474) grad_norm 1.3620 (1.3204) [2022-10-01 18:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1000/1251] eta 0:01:13 lr 0.000940 time 0.2940 (0.2914) loss 4.9257 (4.5523) grad_norm 1.2811 (1.3164) [2022-10-01 18:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1100/1251] eta 0:00:43 lr 0.000944 time 0.2884 (0.2912) loss 5.1242 (4.5624) grad_norm 1.3414 (1.3173) [2022-10-01 18:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1200/1251] eta 0:00:14 lr 0.000948 time 0.2952 (0.2910) loss 5.4373 (4.5641) grad_norm 1.2489 (1.3174) [2022-10-01 18:53:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 18 training takes 0:06:04 [2022-10-01 18:53:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.933 (2.933) Loss 1.8510 (1.8510) Acc@1 58.398 (58.398) Acc@5 82.422 (82.422) [2022-10-01 18:53:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 57.862 Acc@5 81.644 [2022-10-01 18:53:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 57.9% [2022-10-01 18:53:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 57.86% [2022-10-01 18:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][0/1251] eta 1:04:55 lr 0.000950 time 3.1139 (3.1139) loss 4.4823 (4.4823) grad_norm 1.2667 (1.2667) [2022-10-01 18:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][100/1251] eta 0:06:06 lr 0.000954 time 0.2893 (0.3185) loss 4.7149 (4.5063) grad_norm 1.5639 (1.2914) [2022-10-01 18:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][200/1251] eta 0:05:20 lr 0.000958 time 0.2903 (0.3046) loss 3.5295 (4.5323) grad_norm 1.3533 (1.2757) [2022-10-01 18:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][300/1251] eta 0:04:45 lr 0.000962 time 0.2856 (0.2998) loss 3.6358 (4.5295) grad_norm 1.2991 (1.2734) [2022-10-01 18:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][400/1251] eta 0:04:13 lr 0.000966 time 0.2894 (0.2975) loss 3.7042 (4.5503) grad_norm 1.0225 (1.2698) [2022-10-01 18:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][500/1251] eta 0:03:42 lr 0.000970 time 0.2871 (0.2960) loss 4.3912 (4.5518) grad_norm 1.3208 (1.2593) [2022-10-01 18:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][600/1251] eta 0:03:11 lr 0.000974 time 0.2884 (0.2949) loss 4.1724 (4.5475) grad_norm 1.2708 (1.2694) [2022-10-01 18:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][700/1251] eta 0:02:42 lr 0.000978 time 0.2853 (0.2941) loss 4.6294 (4.5362) grad_norm 1.3880 (1.2692) [2022-10-01 18:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][800/1251] eta 0:02:12 lr 0.000982 time 0.2906 (0.2935) loss 4.6025 (4.5382) grad_norm 1.0803 (1.2707) [2022-10-01 18:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][900/1251] eta 0:01:42 lr 0.000986 time 0.2883 (0.2930) loss 4.7402 (4.5437) grad_norm 1.0872 (1.2667) [2022-10-01 18:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1000/1251] eta 0:01:13 lr 0.000990 time 0.2890 (0.2925) loss 4.7378 (4.5569) grad_norm 1.0263 (1.2678) [2022-10-01 18:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1100/1251] eta 0:00:44 lr 0.000994 time 0.2837 (0.2920) loss 5.5357 (4.5513) grad_norm 1.6492 (1.2677) [2022-10-01 18:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1200/1251] eta 0:00:14 lr 0.000998 time 0.2862 (0.2916) loss 4.8824 (4.5499) grad_norm 1.2344 (1.2666) [2022-10-01 18:59:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 19 training takes 0:06:04 [2022-10-01 19:00:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.300 (3.300) Loss 1.8464 (1.8464) Acc@1 61.230 (61.230) Acc@5 81.641 (81.641) [2022-10-01 19:00:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 59.258 Acc@5 82.858 [2022-10-01 19:00:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 59.3% [2022-10-01 19:00:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 59.26% [2022-10-01 19:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][0/1251] eta 0:58:32 lr 0.000989 time 2.8077 (2.8077) loss 4.3921 (4.3921) grad_norm 1.1121 (1.1121) [2022-10-01 19:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][100/1251] eta 0:06:01 lr 0.000989 time 0.2892 (0.3139) loss 4.1687 (4.4282) grad_norm 1.5529 (1.2411) [2022-10-01 19:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][200/1251] eta 0:05:16 lr 0.000989 time 0.2869 (0.3011) loss 4.0257 (4.4880) grad_norm 1.0587 (1.2407) [2022-10-01 19:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][300/1251] eta 0:04:42 lr 0.000989 time 0.2902 (0.2967) loss 3.4341 (4.5060) grad_norm 1.0837 (1.2372) [2022-10-01 19:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][400/1251] eta 0:04:10 lr 0.000989 time 0.2893 (0.2945) loss 5.0672 (4.4976) grad_norm 1.3159 (1.2462) [2022-10-01 19:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][500/1251] eta 0:03:40 lr 0.000989 time 0.2926 (0.2933) loss 4.7883 (4.4828) grad_norm 1.1644 (1.2444) [2022-10-01 19:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][600/1251] eta 0:03:10 lr 0.000989 time 0.2884 (0.2924) loss 4.8448 (4.4829) grad_norm 1.4983 (1.2427) [2022-10-01 19:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][700/1251] eta 0:02:40 lr 0.000989 time 0.2897 (0.2917) loss 5.2123 (4.4795) grad_norm 1.1252 (1.2400) [2022-10-01 19:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][800/1251] eta 0:02:11 lr 0.000988 time 0.2879 (0.2913) loss 4.3666 (4.4765) grad_norm 1.1664 (1.2379) [2022-10-01 19:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][900/1251] eta 0:01:42 lr 0.000988 time 0.2913 (0.2909) loss 4.7730 (4.4718) grad_norm 1.6105 (1.2393) [2022-10-01 19:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1000/1251] eta 0:01:12 lr 0.000988 time 0.2887 (0.2906) loss 5.0198 (4.4627) grad_norm 1.1972 (1.2401) [2022-10-01 19:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1100/1251] eta 0:00:43 lr 0.000988 time 0.2887 (0.2904) loss 4.5811 (4.4576) grad_norm 1.1518 (1.2367) [2022-10-01 19:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1200/1251] eta 0:00:14 lr 0.000988 time 0.2881 (0.2901) loss 4.4243 (4.4590) grad_norm 0.9343 (1.2367) [2022-10-01 19:06:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 20 training takes 0:06:03 [2022-10-01 19:06:14 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saving...... [2022-10-01 19:06:14 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saved !!! [2022-10-01 19:06:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.885 (2.885) Loss 1.7491 (1.7491) Acc@1 59.277 (59.277) Acc@5 84.961 (84.961) [2022-10-01 19:06:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 60.286 Acc@5 83.450 [2022-10-01 19:06:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 60.3% [2022-10-01 19:06:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 60.29% [2022-10-01 19:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][0/1251] eta 0:52:18 lr 0.000988 time 2.5089 (2.5089) loss 3.8975 (3.8975) grad_norm 1.0303 (1.0303) [2022-10-01 19:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][100/1251] eta 0:06:08 lr 0.000988 time 0.2925 (0.3199) loss 4.7421 (4.5255) grad_norm 0.9830 (1.2302) [2022-10-01 19:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][200/1251] eta 0:05:21 lr 0.000988 time 0.2883 (0.3057) loss 4.6155 (4.5234) grad_norm 1.2402 (1.2159) [2022-10-01 19:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][300/1251] eta 0:04:46 lr 0.000988 time 0.2889 (0.3009) loss 4.4639 (4.5197) grad_norm 1.2348 (1.2171) [2022-10-01 19:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][400/1251] eta 0:04:13 lr 0.000988 time 0.2887 (0.2984) loss 4.0734 (4.4978) grad_norm 1.0791 (1.2203) [2022-10-01 19:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][500/1251] eta 0:03:42 lr 0.000988 time 0.2896 (0.2969) loss 4.9709 (4.4999) grad_norm 1.1796 (1.2183) [2022-10-01 19:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][600/1251] eta 0:03:12 lr 0.000988 time 0.2926 (0.2958) loss 5.0882 (4.5026) grad_norm 1.8110 (1.2191) [2022-10-01 19:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][700/1251] eta 0:02:42 lr 0.000987 time 0.2930 (0.2950) loss 4.1696 (4.5015) grad_norm 1.2374 (1.2163) [2022-10-01 19:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][800/1251] eta 0:02:12 lr 0.000987 time 0.2892 (0.2945) loss 3.3003 (4.4879) grad_norm 1.2628 (1.2085) [2022-10-01 19:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][900/1251] eta 0:01:43 lr 0.000987 time 0.2870 (0.2940) loss 4.1376 (4.4956) grad_norm 1.2811 (1.2106) [2022-10-01 19:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1000/1251] eta 0:01:13 lr 0.000987 time 0.2895 (0.2936) loss 4.7637 (4.4846) grad_norm 1.1962 (1.2091) [2022-10-01 19:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1100/1251] eta 0:00:44 lr 0.000987 time 0.2914 (0.2933) loss 5.3232 (4.4779) grad_norm 1.2965 (1.2106) [2022-10-01 19:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1200/1251] eta 0:00:14 lr 0.000987 time 0.2908 (0.2931) loss 4.5274 (4.4698) grad_norm 1.1246 (1.2058) [2022-10-01 19:12:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 21 training takes 0:06:06 [2022-10-01 19:12:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.413 (2.413) Loss 1.7280 (1.7280) Acc@1 60.645 (60.645) Acc@5 83.496 (83.496) [2022-10-01 19:12:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 60.778 Acc@5 84.182 [2022-10-01 19:12:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 60.8% [2022-10-01 19:12:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 60.78% [2022-10-01 19:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][0/1251] eta 0:48:36 lr 0.000987 time 2.3310 (2.3310) loss 5.2219 (5.2219) grad_norm 1.0767 (1.0767) [2022-10-01 19:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][100/1251] eta 0:06:02 lr 0.000987 time 0.2877 (0.3153) loss 4.6367 (4.3973) grad_norm 1.0381 (1.1827) [2022-10-01 19:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][200/1251] eta 0:05:16 lr 0.000987 time 0.2891 (0.3015) loss 4.5190 (4.4036) grad_norm 1.3527 (1.2019) [2022-10-01 19:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][300/1251] eta 0:04:42 lr 0.000987 time 0.2850 (0.2969) loss 4.3234 (4.4028) grad_norm 1.2575 (1.2110) [2022-10-01 19:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][400/1251] eta 0:04:10 lr 0.000987 time 0.2874 (0.2947) loss 4.6792 (4.4070) grad_norm 0.9602 (1.2153) [2022-10-01 19:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][500/1251] eta 0:03:40 lr 0.000986 time 0.2890 (0.2933) loss 4.9207 (4.4082) grad_norm 1.2780 (1.2068) [2022-10-01 19:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][600/1251] eta 0:03:10 lr 0.000986 time 0.2877 (0.2923) loss 4.3822 (4.4049) grad_norm 1.0377 (1.2046) [2022-10-01 19:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][700/1251] eta 0:02:40 lr 0.000986 time 0.2911 (0.2917) loss 4.7881 (4.4010) grad_norm 1.1363 (1.1993) [2022-10-01 19:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][800/1251] eta 0:02:11 lr 0.000986 time 0.2863 (0.2911) loss 4.6787 (4.4052) grad_norm 1.3235 (1.2051) [2022-10-01 19:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][900/1251] eta 0:01:42 lr 0.000986 time 0.2877 (0.2907) loss 4.6186 (4.3948) grad_norm 1.1566 (1.2039) [2022-10-01 19:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1000/1251] eta 0:01:12 lr 0.000986 time 0.2875 (0.2904) loss 4.7102 (4.3937) grad_norm 1.1467 (1.2053) [2022-10-01 19:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1100/1251] eta 0:00:43 lr 0.000986 time 0.2863 (0.2902) loss 5.2356 (4.4011) grad_norm 1.3556 (1.2018) [2022-10-01 19:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1200/1251] eta 0:00:14 lr 0.000986 time 0.2871 (0.2900) loss 4.4553 (4.4056) grad_norm 1.1031 (1.1980) [2022-10-01 19:18:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 22 training takes 0:06:02 [2022-10-01 19:18:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.183 (3.183) Loss 1.7344 (1.7344) Acc@1 61.133 (61.133) Acc@5 83.496 (83.496) [2022-10-01 19:19:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 61.904 Acc@5 84.788 [2022-10-01 19:19:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 61.9% [2022-10-01 19:19:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 61.90% [2022-10-01 19:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][0/1251] eta 1:02:20 lr 0.000986 time 2.9904 (2.9904) loss 3.5903 (3.5903) grad_norm 1.2082 (1.2082) [2022-10-01 19:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][100/1251] eta 0:06:04 lr 0.000986 time 0.2901 (0.3164) loss 4.0894 (4.4148) grad_norm 1.0916 (1.1673) [2022-10-01 19:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][200/1251] eta 0:05:18 lr 0.000986 time 0.2877 (0.3028) loss 4.2717 (4.3481) grad_norm 1.0273 (1.1817) [2022-10-01 19:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][300/1251] eta 0:04:43 lr 0.000985 time 0.2900 (0.2983) loss 4.2239 (4.3553) grad_norm 0.9238 (1.1790) [2022-10-01 19:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][400/1251] eta 0:04:11 lr 0.000985 time 0.2907 (0.2961) loss 4.7929 (4.3622) grad_norm 1.1291 (1.1824) [2022-10-01 19:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][500/1251] eta 0:03:41 lr 0.000985 time 0.2894 (0.2949) loss 4.4745 (4.3553) grad_norm 1.1604 (1.1783) [2022-10-01 19:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][600/1251] eta 0:03:11 lr 0.000985 time 0.2906 (0.2939) loss 5.3997 (4.3826) grad_norm 1.2483 (1.1749) [2022-10-01 19:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][700/1251] eta 0:02:41 lr 0.000985 time 0.2846 (0.2933) loss 4.6962 (4.3743) grad_norm 1.0889 (1.1728) [2022-10-01 19:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][800/1251] eta 0:02:11 lr 0.000985 time 0.2875 (0.2927) loss 4.6833 (4.3798) grad_norm 1.0885 (1.1709) [2022-10-01 19:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][900/1251] eta 0:01:42 lr 0.000985 time 0.2874 (0.2922) loss 4.5965 (4.3726) grad_norm 1.1628 (1.1688) [2022-10-01 19:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1000/1251] eta 0:01:13 lr 0.000985 time 0.2883 (0.2918) loss 3.6604 (4.3713) grad_norm 1.0490 (1.1689) [2022-10-01 19:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1100/1251] eta 0:00:44 lr 0.000985 time 0.2866 (0.2914) loss 5.2218 (4.3727) grad_norm 0.9922 (1.1701) [2022-10-01 19:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1200/1251] eta 0:00:14 lr 0.000985 time 0.2874 (0.2912) loss 3.0340 (4.3719) grad_norm 0.9833 (1.1717) [2022-10-01 19:25:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 23 training takes 0:06:04 [2022-10-01 19:25:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.818 (2.818) Loss 1.7318 (1.7318) Acc@1 63.379 (63.379) Acc@5 85.254 (85.254) [2022-10-01 19:25:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 62.126 Acc@5 84.922 [2022-10-01 19:25:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 62.1% [2022-10-01 19:25:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 62.13% [2022-10-01 19:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][0/1251] eta 1:06:56 lr 0.000984 time 3.2103 (3.2103) loss 4.9461 (4.9461) grad_norm 1.0260 (1.0260) [2022-10-01 19:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][100/1251] eta 0:06:06 lr 0.000984 time 0.2879 (0.3188) loss 4.7460 (4.3854) grad_norm 1.1866 (1.1794) [2022-10-01 19:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][200/1251] eta 0:05:20 lr 0.000984 time 0.2890 (0.3045) loss 3.2025 (4.3789) grad_norm 1.0378 (1.1596) [2022-10-01 19:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][300/1251] eta 0:04:44 lr 0.000984 time 0.2970 (0.2995) loss 4.2967 (4.3707) grad_norm 1.0886 (1.1696) [2022-10-01 19:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][400/1251] eta 0:04:12 lr 0.000984 time 0.2890 (0.2970) loss 4.5868 (4.3575) grad_norm 1.0723 (1.1672) [2022-10-01 19:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][500/1251] eta 0:03:41 lr 0.000984 time 0.2839 (0.2956) loss 3.5250 (4.3480) grad_norm 1.0499 (1.1659) [2022-10-01 19:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][600/1251] eta 0:03:11 lr 0.000984 time 0.2909 (0.2946) loss 3.9255 (4.3366) grad_norm 1.2030 (1.1637) [2022-10-01 19:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][700/1251] eta 0:02:41 lr 0.000984 time 0.2909 (0.2938) loss 4.8070 (4.3317) grad_norm 1.1647 (1.1638) [2022-10-01 19:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][800/1251] eta 0:02:12 lr 0.000984 time 0.2938 (0.2932) loss 4.4675 (4.3524) grad_norm 1.0637 (1.1632) [2022-10-01 19:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][900/1251] eta 0:01:42 lr 0.000984 time 0.2899 (0.2927) loss 3.3177 (4.3550) grad_norm 1.0390 (1.1633) [2022-10-01 19:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1000/1251] eta 0:01:13 lr 0.000983 time 0.2924 (0.2923) loss 5.1617 (4.3465) grad_norm 1.0405 (1.1622) [2022-10-01 19:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1100/1251] eta 0:00:44 lr 0.000983 time 0.2872 (0.2920) loss 4.5791 (4.3561) grad_norm 1.2101 (1.1608) [2022-10-01 19:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1200/1251] eta 0:00:14 lr 0.000983 time 0.2921 (0.2917) loss 4.2382 (4.3527) grad_norm 0.9813 (1.1602) [2022-10-01 19:31:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 24 training takes 0:06:05 [2022-10-01 19:31:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.701 (2.701) Loss 1.6676 (1.6676) Acc@1 60.938 (60.938) Acc@5 84.473 (84.473) [2022-10-01 19:31:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 63.372 Acc@5 85.742 [2022-10-01 19:31:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 63.4% [2022-10-01 19:31:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 63.37% [2022-10-01 19:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][0/1251] eta 1:09:14 lr 0.000983 time 3.3212 (3.3212) loss 5.3581 (5.3581) grad_norm 1.2725 (1.2725) [2022-10-01 19:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][100/1251] eta 0:06:07 lr 0.000983 time 0.2874 (0.3191) loss 4.9474 (4.3125) grad_norm 0.9691 (1.2192) [2022-10-01 19:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][200/1251] eta 0:05:19 lr 0.000983 time 0.2899 (0.3040) loss 4.5992 (4.3358) grad_norm 1.0570 (1.1712) [2022-10-01 19:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][300/1251] eta 0:04:44 lr 0.000983 time 0.2903 (0.2991) loss 5.0464 (4.3291) grad_norm 1.1321 (1.1747) [2022-10-01 19:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][400/1251] eta 0:04:12 lr 0.000983 time 0.2891 (0.2967) loss 5.0254 (4.3302) grad_norm 1.1184 (1.1674) [2022-10-01 19:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][500/1251] eta 0:03:41 lr 0.000983 time 0.2903 (0.2951) loss 4.6073 (4.3356) grad_norm 0.9572 (1.1576) [2022-10-01 19:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][600/1251] eta 0:03:11 lr 0.000982 time 0.2894 (0.2941) loss 4.6799 (4.3335) grad_norm 1.0899 (1.1526) [2022-10-01 19:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][700/1251] eta 0:02:41 lr 0.000982 time 0.2893 (0.2934) loss 3.3702 (4.3285) grad_norm 1.1465 (1.1569) [2022-10-01 19:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][800/1251] eta 0:02:12 lr 0.000982 time 0.2887 (0.2928) loss 4.4307 (4.3194) grad_norm 0.9832 (1.1520) [2022-10-01 19:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][900/1251] eta 0:01:42 lr 0.000982 time 0.2880 (0.2924) loss 4.3374 (4.3326) grad_norm 1.3487 (1.1561) [2022-10-01 19:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1000/1251] eta 0:01:13 lr 0.000982 time 0.2905 (0.2920) loss 3.4959 (4.3330) grad_norm 1.0535 (1.1550) [2022-10-01 19:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1100/1251] eta 0:00:44 lr 0.000982 time 0.2880 (0.2917) loss 3.7318 (4.3360) grad_norm 1.1150 (1.1568) [2022-10-01 19:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1200/1251] eta 0:00:14 lr 0.000982 time 0.2883 (0.2914) loss 5.1668 (4.3332) grad_norm 1.3441 (1.1514) [2022-10-01 19:37:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 25 training takes 0:06:04 [2022-10-01 19:37:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.849 (2.849) Loss 1.5415 (1.5415) Acc@1 64.160 (64.160) Acc@5 87.695 (87.695) [2022-10-01 19:37:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 63.174 Acc@5 85.958 [2022-10-01 19:37:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 63.2% [2022-10-01 19:37:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 63.37% [2022-10-01 19:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][0/1251] eta 1:05:55 lr 0.000982 time 3.1619 (3.1619) loss 4.4805 (4.4805) grad_norm 1.0376 (1.0376) [2022-10-01 19:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][100/1251] eta 0:06:06 lr 0.000982 time 0.2959 (0.3185) loss 5.0026 (4.2606) grad_norm 1.3175 (1.1403) [2022-10-01 19:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][200/1251] eta 0:05:19 lr 0.000982 time 0.2903 (0.3038) loss 4.5829 (4.2790) grad_norm 1.2526 (1.1305) [2022-10-01 19:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][300/1251] eta 0:04:44 lr 0.000981 time 0.2930 (0.2991) loss 5.1454 (4.3083) grad_norm 1.6340 (1.1331) [2022-10-01 19:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][400/1251] eta 0:04:12 lr 0.000981 time 0.2930 (0.2967) loss 4.4678 (4.2981) grad_norm 1.3455 (1.1322) [2022-10-01 19:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][500/1251] eta 0:03:41 lr 0.000981 time 0.2909 (0.2951) loss 4.6651 (4.2916) grad_norm 1.3920 (1.1304) [2022-10-01 19:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][600/1251] eta 0:03:11 lr 0.000981 time 0.2896 (0.2941) loss 4.6647 (4.3051) grad_norm 0.9581 (1.1283) [2022-10-01 19:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][700/1251] eta 0:02:41 lr 0.000981 time 0.2910 (0.2934) loss 4.9592 (4.2936) grad_norm 1.0558 (1.1307) [2022-10-01 19:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][800/1251] eta 0:02:12 lr 0.000981 time 0.2877 (0.2927) loss 4.9502 (4.3021) grad_norm 1.1993 (1.1329) [2022-10-01 19:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][900/1251] eta 0:01:42 lr 0.000981 time 0.2933 (0.2923) loss 4.7384 (4.2935) grad_norm 0.8206 (1.1316) [2022-10-01 19:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1000/1251] eta 0:01:13 lr 0.000981 time 0.2901 (0.2919) loss 4.6141 (4.2994) grad_norm 1.1944 (1.1321) [2022-10-01 19:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1100/1251] eta 0:00:44 lr 0.000981 time 0.2896 (0.2916) loss 4.4431 (4.3073) grad_norm 1.1856 (1.1320) [2022-10-01 19:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1200/1251] eta 0:00:14 lr 0.000980 time 0.2856 (0.2913) loss 4.5079 (4.2928) grad_norm 1.3597 (1.1317) [2022-10-01 19:43:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 26 training takes 0:06:04 [2022-10-01 19:44:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.487 (2.487) Loss 1.5492 (1.5492) Acc@1 62.305 (62.305) Acc@5 86.816 (86.816) [2022-10-01 19:44:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 63.958 Acc@5 86.296 [2022-10-01 19:44:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 64.0% [2022-10-01 19:44:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 63.96% [2022-10-01 19:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][0/1251] eta 1:07:37 lr 0.000980 time 3.2432 (3.2432) loss 4.6911 (4.6911) grad_norm 1.3245 (1.3245) [2022-10-01 19:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][100/1251] eta 0:06:08 lr 0.000980 time 0.2895 (0.3204) loss 4.5513 (4.3559) grad_norm 1.1184 (1.1583) [2022-10-01 19:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][200/1251] eta 0:05:20 lr 0.000980 time 0.2909 (0.3053) loss 4.6172 (4.2618) grad_norm 0.8858 (1.1535) [2022-10-01 19:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][300/1251] eta 0:04:45 lr 0.000980 time 0.2941 (0.3003) loss 3.9849 (4.2497) grad_norm 1.0684 (1.1545) [2022-10-01 19:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][400/1251] eta 0:04:13 lr 0.000980 time 0.2873 (0.2978) loss 3.6182 (4.2338) grad_norm 1.1062 (1.1520) [2022-10-01 19:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][500/1251] eta 0:03:42 lr 0.000980 time 0.2889 (0.2963) loss 4.3368 (4.2192) grad_norm 0.9680 (1.1438) [2022-10-01 19:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][600/1251] eta 0:03:12 lr 0.000980 time 0.2882 (0.2954) loss 4.5496 (4.2376) grad_norm 1.0699 (1.1445) [2022-10-01 19:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][700/1251] eta 0:02:42 lr 0.000980 time 0.2952 (0.2946) loss 3.9841 (4.2336) grad_norm 1.2408 (1.1408) [2022-10-01 19:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][800/1251] eta 0:02:12 lr 0.000979 time 0.2864 (0.2941) loss 4.2534 (4.2398) grad_norm 1.1073 (1.1449) [2022-10-01 19:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][900/1251] eta 0:01:43 lr 0.000979 time 0.2868 (0.2936) loss 3.3721 (4.2244) grad_norm 1.1165 (1.1411) [2022-10-01 19:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1000/1251] eta 0:01:13 lr 0.000979 time 0.2875 (0.2932) loss 5.0969 (4.2408) grad_norm 1.0944 (1.1397) [2022-10-01 19:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1100/1251] eta 0:00:44 lr 0.000979 time 0.2909 (0.2929) loss 3.6746 (4.2450) grad_norm 0.8497 (1.1352) [2022-10-01 19:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1200/1251] eta 0:00:14 lr 0.000979 time 0.2873 (0.2926) loss 3.4155 (4.2459) grad_norm 0.9934 (1.1349) [2022-10-01 19:50:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 27 training takes 0:06:06 [2022-10-01 19:50:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.939 (2.939) Loss 1.5715 (1.5715) Acc@1 62.988 (62.988) Acc@5 86.914 (86.914) [2022-10-01 19:50:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 64.834 Acc@5 86.824 [2022-10-01 19:50:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 64.8% [2022-10-01 19:50:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 64.83% [2022-10-01 19:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][0/1251] eta 1:07:06 lr 0.000979 time 3.2187 (3.2187) loss 4.6021 (4.6021) grad_norm 1.0197 (1.0197) [2022-10-01 19:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][100/1251] eta 0:06:08 lr 0.000979 time 0.2957 (0.3201) loss 4.8042 (4.2397) grad_norm 1.0265 (1.1402) [2022-10-01 19:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][200/1251] eta 0:05:21 lr 0.000979 time 0.2874 (0.3054) loss 3.8727 (4.2955) grad_norm 1.3070 (1.1319) [2022-10-01 19:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][300/1251] eta 0:04:45 lr 0.000979 time 0.2906 (0.3004) loss 3.4754 (4.2747) grad_norm 1.2128 (1.1347) [2022-10-01 19:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][400/1251] eta 0:04:13 lr 0.000978 time 0.2873 (0.2978) loss 3.9006 (4.2361) grad_norm 1.2330 (1.1420) [2022-10-01 19:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][500/1251] eta 0:03:42 lr 0.000978 time 0.2884 (0.2961) loss 3.9640 (4.2413) grad_norm 1.3183 (1.1417) [2022-10-01 19:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][600/1251] eta 0:03:11 lr 0.000978 time 0.2881 (0.2948) loss 4.3755 (4.2529) grad_norm 0.9309 (1.1389) [2022-10-01 19:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][700/1251] eta 0:02:41 lr 0.000978 time 0.2898 (0.2939) loss 4.3295 (4.2325) grad_norm 0.8122 (1.1439) [2022-10-01 19:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][800/1251] eta 0:02:12 lr 0.000978 time 0.2866 (0.2933) loss 4.3382 (4.2311) grad_norm 1.2743 (1.1407) [2022-10-01 19:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][900/1251] eta 0:01:42 lr 0.000978 time 0.2864 (0.2927) loss 4.4085 (4.2196) grad_norm 1.2058 (1.1421) [2022-10-01 19:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1000/1251] eta 0:01:13 lr 0.000978 time 0.2854 (0.2922) loss 4.0496 (4.2162) grad_norm 1.1827 (1.1411) [2022-10-01 19:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1100/1251] eta 0:00:44 lr 0.000978 time 0.2860 (0.2919) loss 4.1593 (4.2145) grad_norm 0.9446 (1.1388) [2022-10-01 19:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1200/1251] eta 0:00:14 lr 0.000977 time 0.2898 (0.2915) loss 4.4588 (4.2122) grad_norm 1.0818 (1.1337) [2022-10-01 19:56:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 28 training takes 0:06:04 [2022-10-01 19:56:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.002 (3.002) Loss 1.4737 (1.4737) Acc@1 66.895 (66.895) Acc@5 87.598 (87.598) [2022-10-01 19:56:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.642 Acc@5 87.160 [2022-10-01 19:56:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.6% [2022-10-01 19:56:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.64% [2022-10-01 19:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][0/1251] eta 1:08:22 lr 0.000977 time 3.2798 (3.2798) loss 4.6174 (4.6174) grad_norm 0.9908 (0.9908) [2022-10-01 19:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][100/1251] eta 0:06:06 lr 0.000977 time 0.2887 (0.3188) loss 4.9882 (4.0918) grad_norm 1.4492 (1.1207) [2022-10-01 19:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][200/1251] eta 0:05:19 lr 0.000977 time 0.2907 (0.3042) loss 3.1976 (4.1247) grad_norm 1.0425 (1.1432) [2022-10-01 19:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][300/1251] eta 0:04:44 lr 0.000977 time 0.2976 (0.2993) loss 4.1850 (4.1526) grad_norm 1.1486 (1.1355) [2022-10-01 19:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][400/1251] eta 0:04:12 lr 0.000977 time 0.2853 (0.2966) loss 4.0079 (4.1728) grad_norm 0.9847 (1.1347) [2022-10-01 19:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][500/1251] eta 0:03:41 lr 0.000977 time 0.2875 (0.2951) loss 3.5387 (4.1745) grad_norm 1.0502 (1.1319) [2022-10-01 19:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][600/1251] eta 0:03:11 lr 0.000977 time 0.2905 (0.2940) loss 4.2536 (4.1681) grad_norm 1.2484 (1.1259) [2022-10-01 20:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][700/1251] eta 0:02:41 lr 0.000976 time 0.2870 (0.2933) loss 4.8203 (4.1814) grad_norm 0.8553 (1.1240) [2022-10-01 20:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][800/1251] eta 0:02:12 lr 0.000976 time 0.2908 (0.2928) loss 4.7101 (4.1854) grad_norm 1.4540 (1.1194) [2022-10-01 20:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][900/1251] eta 0:01:42 lr 0.000976 time 0.2917 (0.2924) loss 4.8706 (4.1875) grad_norm 1.1575 (1.1180) [2022-10-01 20:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1000/1251] eta 0:01:13 lr 0.000976 time 0.2888 (0.2921) loss 5.3263 (4.1839) grad_norm 1.1669 (1.1179) [2022-10-01 20:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1100/1251] eta 0:00:44 lr 0.000976 time 0.2909 (0.2918) loss 4.1367 (4.1857) grad_norm 1.1222 (1.1181) [2022-10-01 20:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1200/1251] eta 0:00:14 lr 0.000976 time 0.2914 (0.2916) loss 4.5278 (4.1860) grad_norm 1.0818 (1.1186) [2022-10-01 20:02:52 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 29 training takes 0:06:04 [2022-10-01 20:02:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.186 (3.186) Loss 1.5157 (1.5157) Acc@1 65.625 (65.625) Acc@5 87.695 (87.695) [2022-10-01 20:03:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.700 Acc@5 87.478 [2022-10-01 20:03:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.7% [2022-10-01 20:03:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.70% [2022-10-01 20:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][0/1251] eta 0:47:20 lr 0.000976 time 2.2708 (2.2708) loss 4.8886 (4.8886) grad_norm 1.0819 (1.0819) [2022-10-01 20:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][100/1251] eta 0:06:02 lr 0.000976 time 0.2879 (0.3153) loss 3.4066 (4.1775) grad_norm 1.3196 (1.1342) [2022-10-01 20:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][200/1251] eta 0:05:17 lr 0.000976 time 0.2904 (0.3025) loss 4.9839 (4.2143) grad_norm 1.4265 (1.1418) [2022-10-01 20:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][300/1251] eta 0:04:43 lr 0.000975 time 0.2901 (0.2982) loss 4.3122 (4.1821) grad_norm 1.0526 (1.1332) [2022-10-01 20:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][400/1251] eta 0:04:11 lr 0.000975 time 0.2891 (0.2959) loss 4.6410 (4.1911) grad_norm 1.0183 (1.1339) [2022-10-01 20:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][500/1251] eta 0:03:41 lr 0.000975 time 0.2929 (0.2945) loss 3.1819 (4.1681) grad_norm 1.0920 (1.1248) [2022-10-01 20:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][600/1251] eta 0:03:11 lr 0.000975 time 0.2903 (0.2936) loss 2.8258 (4.1770) grad_norm 0.9915 (1.1191) [2022-10-01 20:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][700/1251] eta 0:02:41 lr 0.000975 time 0.2884 (0.2929) loss 3.8068 (4.1821) grad_norm 1.0708 (1.1189) [2022-10-01 20:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][800/1251] eta 0:02:11 lr 0.000975 time 0.2887 (0.2923) loss 4.3058 (4.1762) grad_norm 1.2300 (1.1167) [2022-10-01 20:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][900/1251] eta 0:01:42 lr 0.000975 time 0.2925 (0.2919) loss 3.7219 (4.1762) grad_norm 1.1253 (1.1179) [2022-10-01 20:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1000/1251] eta 0:01:13 lr 0.000974 time 0.2915 (0.2916) loss 4.2727 (4.1765) grad_norm 1.4250 (1.1164) [2022-10-01 20:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1100/1251] eta 0:00:43 lr 0.000974 time 0.2849 (0.2913) loss 4.0493 (4.1689) grad_norm 1.2870 (1.1153) [2022-10-01 20:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1200/1251] eta 0:00:14 lr 0.000974 time 0.2881 (0.2910) loss 4.4706 (4.1683) grad_norm 1.0355 (1.1134) [2022-10-01 20:09:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 30 training takes 0:06:04 [2022-10-01 20:09:09 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saving...... [2022-10-01 20:09:10 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saved !!! [2022-10-01 20:09:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.072 (3.072) Loss 1.4835 (1.4835) Acc@1 66.992 (66.992) Acc@5 87.500 (87.500) [2022-10-01 20:09:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 65.784 Acc@5 87.574 [2022-10-01 20:09:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 65.8% [2022-10-01 20:09:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 65.78% [2022-10-01 20:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][0/1251] eta 1:08:32 lr 0.000974 time 3.2871 (3.2871) loss 3.1150 (3.1150) grad_norm 1.3975 (1.3975) [2022-10-01 20:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][100/1251] eta 0:06:08 lr 0.000974 time 0.2857 (0.3199) loss 4.3986 (4.0423) grad_norm 1.0071 (1.1353) [2022-10-01 20:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][200/1251] eta 0:05:20 lr 0.000974 time 0.2850 (0.3050) loss 4.2969 (4.1168) grad_norm 1.1205 (1.1202) [2022-10-01 20:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][300/1251] eta 0:04:45 lr 0.000974 time 0.2921 (0.3000) loss 3.8543 (4.1355) grad_norm 0.9809 (1.1210) [2022-10-01 20:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][400/1251] eta 0:04:13 lr 0.000974 time 0.2940 (0.2975) loss 4.0206 (4.1479) grad_norm 1.2693 (1.1186) [2022-10-01 20:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][500/1251] eta 0:03:42 lr 0.000973 time 0.2895 (0.2959) loss 3.8013 (4.1506) grad_norm 1.0964 (1.1190) [2022-10-01 20:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][600/1251] eta 0:03:11 lr 0.000973 time 0.2898 (0.2947) loss 4.7192 (4.1463) grad_norm 1.2935 (1.1251) [2022-10-01 20:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][700/1251] eta 0:02:41 lr 0.000973 time 0.2915 (0.2939) loss 4.6747 (4.1578) grad_norm 0.9976 (1.1197) [2022-10-01 20:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][800/1251] eta 0:02:12 lr 0.000973 time 0.2890 (0.2932) loss 4.0560 (4.1623) grad_norm 1.0759 (1.1139) [2022-10-01 20:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][900/1251] eta 0:01:42 lr 0.000973 time 0.2896 (0.2928) loss 4.4049 (4.1679) grad_norm 1.1723 (1.1174) [2022-10-01 20:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1000/1251] eta 0:01:13 lr 0.000973 time 0.2932 (0.2924) loss 4.3518 (4.1692) grad_norm 1.5068 (1.1180) [2022-10-01 20:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1100/1251] eta 0:00:44 lr 0.000973 time 0.2910 (0.2921) loss 3.0916 (4.1670) grad_norm 1.1326 (1.1218) [2022-10-01 20:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1200/1251] eta 0:00:14 lr 0.000973 time 0.2870 (0.2918) loss 4.3968 (4.1646) grad_norm 1.0431 (1.1208) [2022-10-01 20:15:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 31 training takes 0:06:05 [2022-10-01 20:15:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.035 (3.035) Loss 1.4181 (1.4181) Acc@1 68.750 (68.750) Acc@5 87.500 (87.500) [2022-10-01 20:15:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 66.310 Acc@5 87.726 [2022-10-01 20:15:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 66.3% [2022-10-01 20:15:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 66.31% [2022-10-01 20:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][0/1251] eta 1:00:07 lr 0.000972 time 2.8837 (2.8837) loss 3.8253 (3.8253) grad_norm 1.0736 (1.0736) [2022-10-01 20:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][100/1251] eta 0:06:04 lr 0.000972 time 0.2906 (0.3168) loss 3.9264 (4.1191) grad_norm 1.0665 (1.1231) [2022-10-01 20:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][200/1251] eta 0:05:19 lr 0.000972 time 0.2883 (0.3037) loss 3.9367 (4.1353) grad_norm 1.0104 (1.1276) [2022-10-01 20:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][300/1251] eta 0:04:44 lr 0.000972 time 0.2935 (0.2993) loss 4.6936 (4.1687) grad_norm 1.0927 (1.1208) [2022-10-01 20:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][400/1251] eta 0:04:12 lr 0.000972 time 0.2884 (0.2970) loss 3.3285 (4.1599) grad_norm 1.0601 (1.1221) [2022-10-01 20:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][500/1251] eta 0:03:41 lr 0.000972 time 0.2886 (0.2956) loss 3.9182 (4.1847) grad_norm 1.0410 (1.1207) [2022-10-01 20:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][600/1251] eta 0:03:11 lr 0.000972 time 0.2901 (0.2946) loss 3.2584 (4.1598) grad_norm 1.0141 (1.1198) [2022-10-01 20:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][700/1251] eta 0:02:41 lr 0.000972 time 0.2978 (0.2939) loss 4.2409 (4.1549) grad_norm 0.8650 (1.1142) [2022-10-01 20:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][800/1251] eta 0:02:12 lr 0.000971 time 0.2905 (0.2935) loss 3.8930 (4.1408) grad_norm 1.1366 (1.1174) [2022-10-01 20:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][900/1251] eta 0:01:42 lr 0.000971 time 0.2927 (0.2931) loss 4.6521 (4.1379) grad_norm 1.4125 (1.1135) [2022-10-01 20:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1000/1251] eta 0:01:13 lr 0.000971 time 0.2865 (0.2928) loss 4.1792 (4.1409) grad_norm 0.8685 (1.1138) [2022-10-01 20:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1100/1251] eta 0:00:44 lr 0.000971 time 0.2933 (0.2925) loss 4.2744 (4.1489) grad_norm 0.9323 (1.1144) [2022-10-01 20:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1200/1251] eta 0:00:14 lr 0.000971 time 0.2895 (0.2923) loss 4.9606 (4.1511) grad_norm 1.1197 (1.1165) [2022-10-01 20:21:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 32 training takes 0:06:05 [2022-10-01 20:21:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.345 (2.345) Loss 1.4455 (1.4455) Acc@1 66.504 (66.504) Acc@5 88.672 (88.672) [2022-10-01 20:21:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 66.660 Acc@5 88.188 [2022-10-01 20:21:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 66.7% [2022-10-01 20:21:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 66.66% [2022-10-01 20:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][0/1251] eta 0:47:37 lr 0.000971 time 2.2843 (2.2843) loss 4.4891 (4.4891) grad_norm 1.0320 (1.0320) [2022-10-01 20:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][100/1251] eta 0:06:02 lr 0.000971 time 0.2868 (0.3149) loss 4.0257 (4.1208) grad_norm 1.0596 (1.1172) [2022-10-01 20:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][200/1251] eta 0:05:16 lr 0.000970 time 0.2868 (0.3014) loss 3.3994 (4.1038) grad_norm 0.9308 (1.0932) [2022-10-01 20:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][300/1251] eta 0:04:42 lr 0.000970 time 0.2853 (0.2970) loss 4.2592 (4.1124) grad_norm 1.0311 (1.1034) [2022-10-01 20:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][400/1251] eta 0:04:10 lr 0.000970 time 0.2866 (0.2947) loss 3.5003 (4.0890) grad_norm 1.0747 (1.0973) [2022-10-01 20:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][500/1251] eta 0:03:40 lr 0.000970 time 0.2876 (0.2934) loss 4.2610 (4.0777) grad_norm 1.0805 (1.1003) [2022-10-01 20:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][600/1251] eta 0:03:10 lr 0.000970 time 0.2909 (0.2924) loss 3.5061 (4.0694) grad_norm 0.8338 (1.1025) [2022-10-01 20:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][700/1251] eta 0:02:40 lr 0.000970 time 0.2890 (0.2917) loss 4.6933 (4.0883) grad_norm 1.2074 (1.1043) [2022-10-01 20:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][800/1251] eta 0:02:11 lr 0.000970 time 0.2866 (0.2912) loss 3.1766 (4.0932) grad_norm 1.0124 (1.1067) [2022-10-01 20:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][900/1251] eta 0:01:42 lr 0.000969 time 0.2862 (0.2908) loss 3.3052 (4.0989) grad_norm 1.1162 (1.1071) [2022-10-01 20:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1000/1251] eta 0:01:12 lr 0.000969 time 0.2845 (0.2904) loss 4.4868 (4.1038) grad_norm 1.1425 (1.1067) [2022-10-01 20:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1100/1251] eta 0:00:43 lr 0.000969 time 0.2878 (0.2900) loss 4.0420 (4.1050) grad_norm 1.2698 (1.1060) [2022-10-01 20:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1200/1251] eta 0:00:14 lr 0.000969 time 0.2878 (0.2898) loss 4.6239 (4.1103) grad_norm 1.1863 (1.1064) [2022-10-01 20:28:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 33 training takes 0:06:02 [2022-10-01 20:28:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.642 (2.642) Loss 1.4385 (1.4385) Acc@1 67.090 (67.090) Acc@5 88.184 (88.184) [2022-10-01 20:28:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.142 Acc@5 88.246 [2022-10-01 20:28:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.1% [2022-10-01 20:28:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.14% [2022-10-01 20:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][0/1251] eta 1:09:45 lr 0.000969 time 3.3456 (3.3456) loss 4.2873 (4.2873) grad_norm 1.4069 (1.4069) [2022-10-01 20:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][100/1251] eta 0:06:08 lr 0.000969 time 0.2922 (0.3203) loss 4.2529 (4.0057) grad_norm 1.0913 (1.0945) [2022-10-01 20:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][200/1251] eta 0:05:20 lr 0.000969 time 0.2926 (0.3049) loss 4.4363 (4.0505) grad_norm 1.1147 (1.1066) [2022-10-01 20:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][300/1251] eta 0:04:45 lr 0.000969 time 0.2889 (0.2999) loss 4.2016 (4.0690) grad_norm 1.1027 (1.1112) [2022-10-01 20:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][400/1251] eta 0:04:13 lr 0.000968 time 0.2943 (0.2974) loss 3.5037 (4.0637) grad_norm 1.1224 (1.1181) [2022-10-01 20:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][500/1251] eta 0:03:42 lr 0.000968 time 0.2865 (0.2958) loss 4.4365 (4.0585) grad_norm 1.1482 (1.1188) [2022-10-01 20:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][600/1251] eta 0:03:11 lr 0.000968 time 0.2921 (0.2947) loss 3.5399 (4.0707) grad_norm 1.0115 (1.1158) [2022-10-01 20:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][700/1251] eta 0:02:41 lr 0.000968 time 0.2853 (0.2939) loss 4.4544 (4.0843) grad_norm 0.9848 (1.1146) [2022-10-01 20:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][800/1251] eta 0:02:12 lr 0.000968 time 0.2923 (0.2933) loss 4.2758 (4.0960) grad_norm 1.2739 (1.1155) [2022-10-01 20:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][900/1251] eta 0:01:42 lr 0.000968 time 0.2867 (0.2928) loss 4.2919 (4.1061) grad_norm 1.0527 (1.1168) [2022-10-01 20:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1000/1251] eta 0:01:13 lr 0.000967 time 0.2885 (0.2923) loss 4.8440 (4.1042) grad_norm 1.0925 (1.1167) [2022-10-01 20:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1100/1251] eta 0:00:44 lr 0.000967 time 0.2865 (0.2919) loss 3.8846 (4.1023) grad_norm 0.9926 (1.1140) [2022-10-01 20:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1200/1251] eta 0:00:14 lr 0.000967 time 0.2862 (0.2917) loss 5.0352 (4.1038) grad_norm 1.0137 (1.1131) [2022-10-01 20:34:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 34 training takes 0:06:05 [2022-10-01 20:34:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.206 (3.206) Loss 1.3274 (1.3274) Acc@1 68.750 (68.750) Acc@5 89.355 (89.355) [2022-10-01 20:34:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.416 Acc@5 88.220 [2022-10-01 20:34:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.4% [2022-10-01 20:34:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.42% [2022-10-01 20:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][0/1251] eta 1:09:18 lr 0.000967 time 3.3242 (3.3242) loss 4.4211 (4.4211) grad_norm 1.1985 (1.1985) [2022-10-01 20:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][100/1251] eta 0:06:07 lr 0.000967 time 0.2893 (0.3189) loss 3.1177 (4.0582) grad_norm 1.3894 (1.1155) [2022-10-01 20:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][200/1251] eta 0:05:19 lr 0.000967 time 0.2878 (0.3037) loss 2.9908 (4.0831) grad_norm 1.2475 (1.1201) [2022-10-01 20:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][300/1251] eta 0:04:44 lr 0.000967 time 0.2916 (0.2987) loss 4.5414 (4.1053) grad_norm 1.1558 (1.1053) [2022-10-01 20:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][400/1251] eta 0:04:11 lr 0.000967 time 0.2867 (0.2961) loss 4.6674 (4.1298) grad_norm 1.1001 (1.1059) [2022-10-01 20:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][500/1251] eta 0:03:41 lr 0.000966 time 0.2908 (0.2946) loss 4.2196 (4.1320) grad_norm 0.9496 (1.1064) [2022-10-01 20:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][600/1251] eta 0:03:10 lr 0.000966 time 0.2861 (0.2934) loss 3.8811 (4.1403) grad_norm 0.9420 (1.1045) [2022-10-01 20:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][700/1251] eta 0:02:41 lr 0.000966 time 0.2866 (0.2926) loss 4.1304 (4.1214) grad_norm 1.0639 (1.1050) [2022-10-01 20:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][800/1251] eta 0:02:11 lr 0.000966 time 0.2867 (0.2920) loss 4.5964 (4.1151) grad_norm 1.0417 (1.1058) [2022-10-01 20:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][900/1251] eta 0:01:42 lr 0.000966 time 0.2862 (0.2915) loss 3.3268 (4.1163) grad_norm 0.9509 (1.1051) [2022-10-01 20:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1000/1251] eta 0:01:13 lr 0.000966 time 0.2878 (0.2911) loss 4.1466 (4.1217) grad_norm 1.0557 (1.1029) [2022-10-01 20:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1100/1251] eta 0:00:43 lr 0.000965 time 0.2854 (0.2908) loss 4.8975 (4.1131) grad_norm 1.0629 (1.1019) [2022-10-01 20:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1200/1251] eta 0:00:14 lr 0.000965 time 0.2885 (0.2906) loss 3.1278 (4.1125) grad_norm 1.3235 (1.1056) [2022-10-01 20:40:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 35 training takes 0:06:03 [2022-10-01 20:40:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.152 (3.152) Loss 1.3990 (1.3990) Acc@1 68.164 (68.164) Acc@5 88.867 (88.867) [2022-10-01 20:40:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.322 Acc@5 88.496 [2022-10-01 20:40:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.3% [2022-10-01 20:40:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.42% [2022-10-01 20:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][0/1251] eta 0:46:31 lr 0.000965 time 2.2311 (2.2311) loss 4.4826 (4.4826) grad_norm 1.1017 (1.1017) [2022-10-01 20:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][100/1251] eta 0:06:02 lr 0.000965 time 0.2901 (0.3153) loss 4.1925 (4.1510) grad_norm 0.9807 (1.0972) [2022-10-01 20:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][200/1251] eta 0:05:18 lr 0.000965 time 0.2908 (0.3029) loss 4.7906 (4.1069) grad_norm 1.0989 (1.1151) [2022-10-01 20:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][300/1251] eta 0:04:44 lr 0.000965 time 0.2887 (0.2988) loss 4.5269 (4.1155) grad_norm 1.1906 (1.1075) [2022-10-01 20:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][400/1251] eta 0:04:12 lr 0.000965 time 0.2899 (0.2967) loss 4.8536 (4.1050) grad_norm 1.2311 (1.1045) [2022-10-01 20:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][500/1251] eta 0:03:41 lr 0.000964 time 0.2911 (0.2954) loss 3.7794 (4.0809) grad_norm 1.1339 (1.1105) [2022-10-01 20:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][600/1251] eta 0:03:11 lr 0.000964 time 0.2972 (0.2946) loss 4.4052 (4.0750) grad_norm 0.9845 (1.1130) [2022-10-01 20:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][700/1251] eta 0:02:41 lr 0.000964 time 0.2902 (0.2940) loss 4.1238 (4.0766) grad_norm 1.0194 (1.1131) [2022-10-01 20:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][800/1251] eta 0:02:12 lr 0.000964 time 0.2868 (0.2935) loss 3.5491 (4.0722) grad_norm 1.2279 (1.1135) [2022-10-01 20:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][900/1251] eta 0:01:42 lr 0.000964 time 0.2864 (0.2930) loss 3.5405 (4.0634) grad_norm 1.0992 (1.1178) [2022-10-01 20:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1000/1251] eta 0:01:13 lr 0.000964 time 0.2927 (0.2927) loss 3.2269 (4.0741) grad_norm 1.1349 (1.1175) [2022-10-01 20:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1100/1251] eta 0:00:44 lr 0.000964 time 0.2917 (0.2924) loss 3.9685 (4.0736) grad_norm 1.1597 (1.1160) [2022-10-01 20:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1200/1251] eta 0:00:14 lr 0.000963 time 0.2875 (0.2922) loss 4.4763 (4.0808) grad_norm 1.0149 (1.1131) [2022-10-01 20:46:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 36 training takes 0:06:05 [2022-10-01 20:46:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.305 (2.305) Loss 1.4302 (1.4302) Acc@1 68.359 (68.359) Acc@5 88.086 (88.086) [2022-10-01 20:47:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.924 Acc@5 88.658 [2022-10-01 20:47:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.9% [2022-10-01 20:47:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.92% [2022-10-01 20:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][0/1251] eta 1:00:09 lr 0.000963 time 2.8849 (2.8849) loss 4.2942 (4.2942) grad_norm 1.4255 (1.4255) [2022-10-01 20:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][100/1251] eta 0:06:01 lr 0.000963 time 0.2900 (0.3144) loss 4.5733 (4.1124) grad_norm 1.0347 (1.1304) [2022-10-01 20:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][200/1251] eta 0:05:15 lr 0.000963 time 0.2853 (0.3006) loss 4.4928 (4.0858) grad_norm 1.2952 (1.1349) [2022-10-01 20:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][300/1251] eta 0:04:41 lr 0.000963 time 0.2876 (0.2961) loss 3.3759 (4.0611) grad_norm 0.9843 (1.1182) [2022-10-01 20:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][400/1251] eta 0:04:09 lr 0.000963 time 0.2849 (0.2937) loss 4.1028 (4.0634) grad_norm 1.1992 (1.1127) [2022-10-01 20:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][500/1251] eta 0:03:39 lr 0.000963 time 0.2871 (0.2923) loss 3.2099 (4.0650) grad_norm 1.0278 (1.1050) [2022-10-01 20:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][600/1251] eta 0:03:09 lr 0.000962 time 0.2866 (0.2913) loss 4.8162 (4.0742) grad_norm 1.1392 (1.1081) [2022-10-01 20:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][700/1251] eta 0:02:40 lr 0.000962 time 0.2851 (0.2906) loss 3.1214 (4.0780) grad_norm 1.0073 (1.1103) [2022-10-01 20:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][800/1251] eta 0:02:10 lr 0.000962 time 0.2859 (0.2901) loss 3.7629 (4.0873) grad_norm 1.2722 (1.1107) [2022-10-01 20:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][900/1251] eta 0:01:41 lr 0.000962 time 0.2868 (0.2897) loss 4.0750 (4.0828) grad_norm 1.1617 (1.1066) [2022-10-01 20:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1000/1251] eta 0:01:12 lr 0.000962 time 0.2914 (0.2894) loss 4.3847 (4.0836) grad_norm 1.2757 (1.1049) [2022-10-01 20:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1100/1251] eta 0:00:43 lr 0.000962 time 0.2853 (0.2892) loss 4.8809 (4.0828) grad_norm 0.9330 (1.1051) [2022-10-01 20:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1200/1251] eta 0:00:14 lr 0.000961 time 0.2865 (0.2891) loss 4.6903 (4.0885) grad_norm 1.1516 (1.1053) [2022-10-01 20:53:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 37 training takes 0:06:01 [2022-10-01 20:53:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.300 (3.300) Loss 1.3638 (1.3638) Acc@1 69.434 (69.434) Acc@5 88.965 (88.965) [2022-10-01 20:53:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 67.640 Acc@5 88.614 [2022-10-01 20:53:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 67.6% [2022-10-01 20:53:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 67.92% [2022-10-01 20:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][0/1251] eta 1:06:52 lr 0.000961 time 3.2078 (3.2078) loss 4.2377 (4.2377) grad_norm 1.4562 (1.4562) [2022-10-01 20:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][100/1251] eta 0:06:09 lr 0.000961 time 0.2915 (0.3206) loss 2.9748 (4.0663) grad_norm 1.1496 (1.1246) [2022-10-01 20:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][200/1251] eta 0:05:21 lr 0.000961 time 0.2961 (0.3059) loss 4.2449 (4.0923) grad_norm 0.8949 (1.1162) [2022-10-01 20:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][300/1251] eta 0:04:46 lr 0.000961 time 0.2872 (0.3009) loss 4.0382 (4.1061) grad_norm 1.1290 (1.1186) [2022-10-01 20:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][400/1251] eta 0:04:13 lr 0.000961 time 0.2986 (0.2983) loss 4.7100 (4.1041) grad_norm 1.2240 (1.1187) [2022-10-01 20:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][500/1251] eta 0:03:42 lr 0.000961 time 0.2875 (0.2968) loss 4.1004 (4.0944) grad_norm 1.2826 (1.1229) [2022-10-01 20:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][600/1251] eta 0:03:12 lr 0.000960 time 0.2983 (0.2957) loss 4.6170 (4.0788) grad_norm 1.0517 (1.1192) [2022-10-01 20:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][700/1251] eta 0:02:42 lr 0.000960 time 0.2874 (0.2950) loss 5.0876 (4.0847) grad_norm 1.1887 (1.1234) [2022-10-01 20:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][800/1251] eta 0:02:12 lr 0.000960 time 0.2978 (0.2944) loss 4.1036 (4.0799) grad_norm 1.1868 (1.1248) [2022-10-01 20:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][900/1251] eta 0:01:43 lr 0.000960 time 0.2887 (0.2938) loss 4.1796 (4.0877) grad_norm 1.0763 (1.1246) [2022-10-01 20:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1000/1251] eta 0:01:13 lr 0.000960 time 0.2912 (0.2934) loss 4.9518 (4.0858) grad_norm 0.9001 (1.1219) [2022-10-01 20:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1100/1251] eta 0:00:44 lr 0.000960 time 0.2861 (0.2931) loss 2.9435 (4.0822) grad_norm 1.1142 (1.1220) [2022-10-01 20:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1200/1251] eta 0:00:14 lr 0.000959 time 0.2948 (0.2928) loss 3.4094 (4.0756) grad_norm 1.0244 (1.1206) [2022-10-01 20:59:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 38 training takes 0:06:06 [2022-10-01 20:59:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.158 (3.158) Loss 1.3234 (1.3234) Acc@1 69.629 (69.629) Acc@5 89.160 (89.160) [2022-10-01 20:59:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.400 Acc@5 89.050 [2022-10-01 20:59:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-10-01 20:59:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.40% [2022-10-01 20:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][0/1251] eta 1:04:48 lr 0.000959 time 3.1087 (3.1087) loss 3.7300 (3.7300) grad_norm 0.9448 (0.9448) [2022-10-01 21:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][100/1251] eta 0:06:04 lr 0.000959 time 0.2866 (0.3165) loss 3.6518 (4.0388) grad_norm 1.0508 (1.1236) [2022-10-01 21:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][200/1251] eta 0:05:17 lr 0.000959 time 0.2884 (0.3020) loss 2.8881 (4.0846) grad_norm 1.2699 (1.1041) [2022-10-01 21:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][300/1251] eta 0:04:42 lr 0.000959 time 0.2914 (0.2973) loss 4.9940 (4.0714) grad_norm 1.0335 (1.1175) [2022-10-01 21:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][400/1251] eta 0:04:10 lr 0.000959 time 0.2904 (0.2948) loss 4.4424 (4.0712) grad_norm 0.9907 (1.1105) [2022-10-01 21:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][500/1251] eta 0:03:40 lr 0.000958 time 0.2911 (0.2934) loss 4.4119 (4.0652) grad_norm 1.2669 (1.1094) [2022-10-01 21:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][600/1251] eta 0:03:10 lr 0.000958 time 0.2883 (0.2924) loss 4.2413 (4.0785) grad_norm 1.0000 (1.1115) [2022-10-01 21:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][700/1251] eta 0:02:40 lr 0.000958 time 0.2866 (0.2918) loss 3.5337 (4.0734) grad_norm 1.0881 (1.1079) [2022-10-01 21:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][800/1251] eta 0:02:11 lr 0.000958 time 0.2882 (0.2912) loss 4.5250 (4.0674) grad_norm 1.0106 (1.1054) [2022-10-01 21:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][900/1251] eta 0:01:42 lr 0.000958 time 0.2887 (0.2908) loss 3.9069 (4.0658) grad_norm 1.0105 (1.1032) [2022-10-01 21:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1000/1251] eta 0:01:12 lr 0.000958 time 0.2900 (0.2904) loss 4.1927 (4.0601) grad_norm 1.1303 (1.1041) [2022-10-01 21:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1100/1251] eta 0:00:43 lr 0.000957 time 0.2890 (0.2902) loss 4.1835 (4.0621) grad_norm 1.1142 (1.1036) [2022-10-01 21:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1200/1251] eta 0:00:14 lr 0.000957 time 0.2844 (0.2899) loss 4.3143 (4.0718) grad_norm 1.3686 (1.1045) [2022-10-01 21:05:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 39 training takes 0:06:02 [2022-10-01 21:05:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.846 (2.846) Loss 1.4096 (1.4096) Acc@1 68.750 (68.750) Acc@5 88.672 (88.672) [2022-10-01 21:05:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.392 Acc@5 89.054 [2022-10-01 21:05:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-10-01 21:05:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.40% [2022-10-01 21:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][0/1251] eta 0:45:43 lr 0.000957 time 2.1928 (2.1928) loss 3.0981 (3.0981) grad_norm 1.0222 (1.0222) [2022-10-01 21:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][100/1251] eta 0:06:03 lr 0.000957 time 0.2847 (0.3155) loss 4.6429 (3.9802) grad_norm 0.9222 (1.1100) [2022-10-01 21:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][200/1251] eta 0:05:17 lr 0.000957 time 0.2873 (0.3018) loss 3.8719 (3.9908) grad_norm 1.2632 (1.1147) [2022-10-01 21:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][300/1251] eta 0:04:42 lr 0.000957 time 0.2876 (0.2972) loss 4.6583 (4.0121) grad_norm 1.0669 (1.1187) [2022-10-01 21:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][400/1251] eta 0:04:11 lr 0.000957 time 0.2864 (0.2949) loss 4.7425 (4.0544) grad_norm 1.1671 (1.1159) [2022-10-01 21:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][500/1251] eta 0:03:40 lr 0.000956 time 0.2857 (0.2935) loss 4.7361 (4.0595) grad_norm 1.2117 (1.1193) [2022-10-01 21:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][600/1251] eta 0:03:10 lr 0.000956 time 0.2869 (0.2925) loss 4.0898 (4.0517) grad_norm 1.4352 (1.1106) [2022-10-01 21:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][700/1251] eta 0:02:40 lr 0.000956 time 0.2858 (0.2918) loss 3.9844 (4.0665) grad_norm 0.9770 (1.1078) [2022-10-01 21:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][800/1251] eta 0:02:11 lr 0.000956 time 0.2851 (0.2912) loss 3.4574 (4.0510) grad_norm 0.9654 (1.1070) [2022-10-01 21:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][900/1251] eta 0:01:42 lr 0.000956 time 0.2867 (0.2908) loss 4.1311 (4.0663) grad_norm 1.2363 (1.1058) [2022-10-01 21:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1000/1251] eta 0:01:12 lr 0.000956 time 0.2860 (0.2905) loss 4.0243 (4.0627) grad_norm 1.2169 (1.1038) [2022-10-01 21:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1100/1251] eta 0:00:43 lr 0.000955 time 0.2871 (0.2903) loss 4.3917 (4.0574) grad_norm 1.1043 (1.1025) [2022-10-01 21:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1200/1251] eta 0:00:14 lr 0.000955 time 0.2879 (0.2901) loss 4.7669 (4.0478) grad_norm 1.3786 (1.1024) [2022-10-01 21:11:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 40 training takes 0:06:03 [2022-10-01 21:11:59 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saving...... [2022-10-01 21:11:59 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saved !!! [2022-10-01 21:12:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.333 (2.333) Loss 1.3944 (1.3944) Acc@1 68.164 (68.164) Acc@5 87.988 (87.988) [2022-10-01 21:12:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.744 Acc@5 89.162 [2022-10-01 21:12:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-01 21:12:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.74% [2022-10-01 21:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][0/1251] eta 0:58:21 lr 0.000955 time 2.7993 (2.7993) loss 4.4149 (4.4149) grad_norm 0.8944 (0.8944) [2022-10-01 21:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][100/1251] eta 0:06:02 lr 0.000955 time 0.2908 (0.3150) loss 3.9865 (4.0262) grad_norm 1.1711 (1.1086) [2022-10-01 21:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][200/1251] eta 0:05:17 lr 0.000955 time 0.2858 (0.3018) loss 4.2904 (4.0285) grad_norm 1.0885 (1.0900) [2022-10-01 21:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][300/1251] eta 0:04:42 lr 0.000955 time 0.2866 (0.2975) loss 4.8029 (4.0561) grad_norm 1.1223 (1.0951) [2022-10-01 21:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][400/1251] eta 0:04:11 lr 0.000954 time 0.2875 (0.2953) loss 4.6724 (4.0662) grad_norm 0.9225 (1.1026) [2022-10-01 21:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][500/1251] eta 0:03:40 lr 0.000954 time 0.2888 (0.2940) loss 2.8296 (4.0408) grad_norm 0.9703 (1.1021) [2022-10-01 21:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][600/1251] eta 0:03:10 lr 0.000954 time 0.2868 (0.2931) loss 4.3276 (4.0336) grad_norm 1.0797 (1.1027) [2022-10-01 21:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][700/1251] eta 0:02:41 lr 0.000954 time 0.2892 (0.2925) loss 3.8559 (4.0396) grad_norm 1.0246 (1.1048) [2022-10-01 21:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][800/1251] eta 0:02:11 lr 0.000954 time 0.2858 (0.2920) loss 3.0229 (4.0346) grad_norm 1.2650 (1.1034) [2022-10-01 21:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][900/1251] eta 0:01:42 lr 0.000954 time 0.2886 (0.2915) loss 3.9501 (4.0465) grad_norm 1.1514 (1.1005) [2022-10-01 21:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1000/1251] eta 0:01:13 lr 0.000953 time 0.2893 (0.2912) loss 3.7127 (4.0457) grad_norm 1.0744 (1.0990) [2022-10-01 21:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1100/1251] eta 0:00:43 lr 0.000953 time 0.2891 (0.2910) loss 3.5161 (4.0481) grad_norm 0.9584 (1.0979) [2022-10-01 21:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1200/1251] eta 0:00:14 lr 0.000953 time 0.2862 (0.2908) loss 3.1577 (4.0498) grad_norm 1.0778 (1.0995) [2022-10-01 21:18:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 41 training takes 0:06:04 [2022-10-01 21:18:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.217 (3.217) Loss 1.3267 (1.3267) Acc@1 69.043 (69.043) Acc@5 90.625 (90.625) [2022-10-01 21:18:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.830 Acc@5 89.374 [2022-10-01 21:18:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.8% [2022-10-01 21:18:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.83% [2022-10-01 21:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][0/1251] eta 0:53:13 lr 0.000953 time 2.5531 (2.5531) loss 4.1009 (4.1009) grad_norm 0.8160 (0.8160) [2022-10-01 21:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][100/1251] eta 0:05:58 lr 0.000953 time 0.2874 (0.3110) loss 4.9786 (4.0995) grad_norm 1.1145 (1.0827) [2022-10-01 21:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][200/1251] eta 0:05:14 lr 0.000953 time 0.2867 (0.2990) loss 3.1934 (4.0187) grad_norm 0.9780 (1.1110) [2022-10-01 21:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][300/1251] eta 0:04:40 lr 0.000952 time 0.2871 (0.2950) loss 4.2567 (4.0001) grad_norm 1.5681 (1.1213) [2022-10-01 21:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][400/1251] eta 0:04:09 lr 0.000952 time 0.2869 (0.2929) loss 3.7447 (4.0006) grad_norm 1.2833 (1.1145) [2022-10-01 21:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][500/1251] eta 0:03:39 lr 0.000952 time 0.2862 (0.2917) loss 3.8052 (4.0082) grad_norm 1.1047 (1.1156) [2022-10-01 21:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][600/1251] eta 0:03:09 lr 0.000952 time 0.2857 (0.2909) loss 3.5916 (4.0104) grad_norm 1.1217 (1.1152) [2022-10-01 21:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][700/1251] eta 0:02:39 lr 0.000952 time 0.2904 (0.2904) loss 3.6549 (4.0022) grad_norm 1.1218 (1.1147) [2022-10-01 21:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][800/1251] eta 0:02:10 lr 0.000951 time 0.2907 (0.2899) loss 3.4375 (3.9950) grad_norm 0.9446 (1.1146) [2022-10-01 21:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][900/1251] eta 0:01:41 lr 0.000951 time 0.2855 (0.2896) loss 4.8088 (3.9924) grad_norm 0.9681 (1.1136) [2022-10-01 21:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1000/1251] eta 0:01:12 lr 0.000951 time 0.2863 (0.2893) loss 3.3624 (3.9977) grad_norm 1.1938 (1.1113) [2022-10-01 21:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1100/1251] eta 0:00:43 lr 0.000951 time 0.2856 (0.2891) loss 4.0732 (3.9974) grad_norm 1.0313 (1.1121) [2022-10-01 21:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1200/1251] eta 0:00:14 lr 0.000951 time 0.2882 (0.2889) loss 4.5533 (3.9999) grad_norm 0.9624 (1.1118) [2022-10-01 21:24:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 42 training takes 0:06:01 [2022-10-01 21:24:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.076 (3.076) Loss 1.3892 (1.3892) Acc@1 66.797 (66.797) Acc@5 89.258 (89.258) [2022-10-01 21:24:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.754 Acc@5 89.478 [2022-10-01 21:24:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.8% [2022-10-01 21:24:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.83% [2022-10-01 21:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][0/1251] eta 1:07:27 lr 0.000951 time 3.2354 (3.2354) loss 4.0649 (4.0649) grad_norm 1.1162 (1.1162) [2022-10-01 21:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][100/1251] eta 0:06:09 lr 0.000950 time 0.2921 (0.3212) loss 4.0650 (4.0535) grad_norm 1.0709 (1.1082) [2022-10-01 21:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][200/1251] eta 0:05:21 lr 0.000950 time 0.2888 (0.3059) loss 4.1272 (4.0253) grad_norm 1.0349 (1.1165) [2022-10-01 21:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][300/1251] eta 0:04:46 lr 0.000950 time 0.2896 (0.3007) loss 4.0005 (4.0355) grad_norm 1.0958 (1.1138) [2022-10-01 21:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][400/1251] eta 0:04:13 lr 0.000950 time 0.2892 (0.2979) loss 3.0453 (4.0120) grad_norm 1.1469 (1.1145) [2022-10-01 21:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][500/1251] eta 0:03:42 lr 0.000950 time 0.2858 (0.2963) loss 4.1680 (4.0258) grad_norm 0.9119 (1.1148) [2022-10-01 21:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][600/1251] eta 0:03:12 lr 0.000950 time 0.2892 (0.2952) loss 4.0751 (4.0190) grad_norm 1.0359 (1.1126) [2022-10-01 21:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][700/1251] eta 0:02:42 lr 0.000949 time 0.2894 (0.2943) loss 4.6350 (4.0082) grad_norm 0.9380 (1.1120) [2022-10-01 21:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][800/1251] eta 0:02:12 lr 0.000949 time 0.2892 (0.2936) loss 4.7898 (4.0162) grad_norm 1.3813 (1.1127) [2022-10-01 21:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][900/1251] eta 0:01:42 lr 0.000949 time 0.2899 (0.2933) loss 4.2631 (4.0111) grad_norm 1.3311 (1.1163) [2022-10-01 21:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1000/1251] eta 0:01:13 lr 0.000949 time 0.2916 (0.2929) loss 3.4408 (3.9968) grad_norm 1.0629 (1.1158) [2022-10-01 21:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1100/1251] eta 0:00:44 lr 0.000949 time 0.2874 (0.2926) loss 4.6211 (4.0014) grad_norm 1.3237 (1.1168) [2022-10-01 21:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1200/1251] eta 0:00:14 lr 0.000948 time 0.2918 (0.2922) loss 2.9134 (3.9975) grad_norm 1.1926 (1.1161) [2022-10-01 21:30:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 43 training takes 0:06:05 [2022-10-01 21:30:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.449 (2.449) Loss 1.4503 (1.4503) Acc@1 66.797 (66.797) Acc@5 87.988 (87.988) [2022-10-01 21:31:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 68.672 Acc@5 89.384 [2022-10-01 21:31:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-10-01 21:31:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 68.83% [2022-10-01 21:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][0/1251] eta 0:51:47 lr 0.000948 time 2.4838 (2.4838) loss 3.7833 (3.7833) grad_norm 1.2309 (1.2309) [2022-10-01 21:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][100/1251] eta 0:06:04 lr 0.000948 time 0.2894 (0.3170) loss 4.1041 (4.0533) grad_norm 0.9699 (1.1269) [2022-10-01 21:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][200/1251] eta 0:05:19 lr 0.000948 time 0.2948 (0.3042) loss 2.9190 (4.0082) grad_norm 1.2758 (1.1263) [2022-10-01 21:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][300/1251] eta 0:04:45 lr 0.000948 time 0.2902 (0.3000) loss 4.1019 (3.9993) grad_norm 1.0191 (1.1223) [2022-10-01 21:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][400/1251] eta 0:04:13 lr 0.000948 time 0.2900 (0.2977) loss 3.9047 (4.0098) grad_norm 1.1177 (1.1119) [2022-10-01 21:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][500/1251] eta 0:03:42 lr 0.000947 time 0.2894 (0.2962) loss 3.0112 (4.0087) grad_norm 1.0080 (1.1093) [2022-10-01 21:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][600/1251] eta 0:03:12 lr 0.000947 time 0.2908 (0.2952) loss 3.9381 (4.0178) grad_norm 1.0445 (1.1115) [2022-10-01 21:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][700/1251] eta 0:02:42 lr 0.000947 time 0.2948 (0.2945) loss 2.7720 (4.0029) grad_norm 0.9140 (1.1149) [2022-10-01 21:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][800/1251] eta 0:02:12 lr 0.000947 time 0.2905 (0.2938) loss 3.4463 (3.9944) grad_norm 1.4306 (1.1171) [2022-10-01 21:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][900/1251] eta 0:01:42 lr 0.000947 time 0.2892 (0.2933) loss 3.7693 (3.9942) grad_norm 1.2066 (1.1176) [2022-10-01 21:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1000/1251] eta 0:01:13 lr 0.000947 time 0.2879 (0.2929) loss 4.2448 (4.0000) grad_norm 1.0705 (1.1145) [2022-10-01 21:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1100/1251] eta 0:00:44 lr 0.000946 time 0.2905 (0.2927) loss 4.4659 (3.9962) grad_norm 1.1604 (1.1110) [2022-10-01 21:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1200/1251] eta 0:00:14 lr 0.000946 time 0.2859 (0.2924) loss 3.9490 (3.9939) grad_norm 0.8633 (1.1126) [2022-10-01 21:37:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 44 training takes 0:06:05 [2022-10-01 21:37:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.153 (3.153) Loss 1.3064 (1.3064) Acc@1 68.945 (68.945) Acc@5 89.551 (89.551) [2022-10-01 21:37:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.632 Acc@5 89.602 [2022-10-01 21:37:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.6% [2022-10-01 21:37:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.63% [2022-10-01 21:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][0/1251] eta 1:06:49 lr 0.000946 time 3.2046 (3.2046) loss 4.8474 (4.8474) grad_norm 1.0907 (1.0907) [2022-10-01 21:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][100/1251] eta 0:06:08 lr 0.000946 time 0.2901 (0.3198) loss 4.1353 (4.1367) grad_norm 1.1773 (1.1077) [2022-10-01 21:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][200/1251] eta 0:05:20 lr 0.000946 time 0.2908 (0.3051) loss 4.6822 (4.0576) grad_norm 1.1938 (1.1221) [2022-10-01 21:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][300/1251] eta 0:04:45 lr 0.000945 time 0.2937 (0.3002) loss 3.6126 (4.0312) grad_norm 1.1848 (1.1193) [2022-10-01 21:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][400/1251] eta 0:04:13 lr 0.000945 time 0.2876 (0.2978) loss 4.0700 (4.0312) grad_norm 1.0097 (1.1169) [2022-10-01 21:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][500/1251] eta 0:03:42 lr 0.000945 time 0.2959 (0.2964) loss 2.8344 (4.0246) grad_norm 1.1137 (1.1120) [2022-10-01 21:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][600/1251] eta 0:03:12 lr 0.000945 time 0.2888 (0.2954) loss 4.0755 (4.0188) grad_norm 0.9825 (1.1129) [2022-10-01 21:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][700/1251] eta 0:02:42 lr 0.000945 time 0.2903 (0.2947) loss 4.4419 (4.0162) grad_norm 1.1837 (1.1158) [2022-10-01 21:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][800/1251] eta 0:02:12 lr 0.000945 time 0.2894 (0.2942) loss 4.3146 (4.0081) grad_norm 1.3028 (1.1180) [2022-10-01 21:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][900/1251] eta 0:01:43 lr 0.000944 time 0.2854 (0.2937) loss 3.7351 (4.0010) grad_norm 1.0377 (1.1178) [2022-10-01 21:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1000/1251] eta 0:01:13 lr 0.000944 time 0.2896 (0.2932) loss 4.4069 (3.9927) grad_norm 1.1415 (1.1182) [2022-10-01 21:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1100/1251] eta 0:00:44 lr 0.000944 time 0.2868 (0.2928) loss 3.0123 (3.9963) grad_norm 1.3234 (1.1179) [2022-10-01 21:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1200/1251] eta 0:00:14 lr 0.000944 time 0.2887 (0.2924) loss 4.7497 (3.9891) grad_norm 0.9779 (1.1176) [2022-10-01 21:43:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 45 training takes 0:06:05 [2022-10-01 21:43:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.209 (3.209) Loss 1.2801 (1.2801) Acc@1 71.973 (71.973) Acc@5 90.234 (90.234) [2022-10-01 21:43:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.412 Acc@5 89.510 [2022-10-01 21:43:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.4% [2022-10-01 21:43:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.63% [2022-10-01 21:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][0/1251] eta 0:56:36 lr 0.000944 time 2.7154 (2.7154) loss 3.0987 (3.0987) grad_norm 1.1890 (1.1890) [2022-10-01 21:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][100/1251] eta 0:06:05 lr 0.000943 time 0.2939 (0.3173) loss 4.2648 (4.0641) grad_norm 1.0812 (1.0996) [2022-10-01 21:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][200/1251] eta 0:05:19 lr 0.000943 time 0.2912 (0.3042) loss 4.2117 (4.0159) grad_norm 1.3620 (1.1146) [2022-10-01 21:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][300/1251] eta 0:04:45 lr 0.000943 time 0.2926 (0.2998) loss 4.3629 (4.0038) grad_norm 0.9941 (1.1139) [2022-10-01 21:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][400/1251] eta 0:04:13 lr 0.000943 time 0.2931 (0.2976) loss 4.3128 (3.9710) grad_norm 1.0302 (1.1174) [2022-10-01 21:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][500/1251] eta 0:03:42 lr 0.000943 time 0.2881 (0.2962) loss 4.6035 (3.9786) grad_norm 1.1918 (1.1184) [2022-10-01 21:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][600/1251] eta 0:03:12 lr 0.000943 time 0.2904 (0.2951) loss 3.3881 (3.9839) grad_norm 1.1301 (1.1193) [2022-10-01 21:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][700/1251] eta 0:02:42 lr 0.000942 time 0.2858 (0.2943) loss 3.9287 (3.9748) grad_norm 0.9681 (1.1160) [2022-10-01 21:47:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][800/1251] eta 0:02:12 lr 0.000942 time 0.2925 (0.2938) loss 4.2313 (3.9812) grad_norm 1.3787 (1.1187) [2022-10-01 21:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][900/1251] eta 0:01:42 lr 0.000942 time 0.2902 (0.2934) loss 4.3669 (3.9739) grad_norm 1.1268 (1.1169) [2022-10-01 21:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1000/1251] eta 0:01:13 lr 0.000942 time 0.2918 (0.2929) loss 3.2216 (3.9753) grad_norm 1.1775 (1.1180) [2022-10-01 21:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1100/1251] eta 0:00:44 lr 0.000942 time 0.2870 (0.2925) loss 2.7930 (3.9641) grad_norm 1.1358 (1.1172) [2022-10-01 21:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1200/1251] eta 0:00:14 lr 0.000941 time 0.2915 (0.2922) loss 4.3512 (3.9671) grad_norm 1.0123 (1.1169) [2022-10-01 21:49:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 46 training takes 0:06:05 [2022-10-01 21:49:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.896 (2.896) Loss 1.2854 (1.2854) Acc@1 69.141 (69.141) Acc@5 90.430 (90.430) [2022-10-01 21:49:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.744 Acc@5 89.956 [2022-10-01 21:49:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.7% [2022-10-01 21:49:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.74% [2022-10-01 21:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][0/1251] eta 0:52:40 lr 0.000941 time 2.5263 (2.5263) loss 4.2631 (4.2631) grad_norm 1.1542 (1.1542) [2022-10-01 21:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][100/1251] eta 0:06:01 lr 0.000941 time 0.2873 (0.3145) loss 3.4590 (3.9319) grad_norm 1.0864 (1.1090) [2022-10-01 21:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][200/1251] eta 0:05:16 lr 0.000941 time 0.2885 (0.3014) loss 4.0397 (3.9488) grad_norm 1.2405 (1.1292) [2022-10-01 21:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][300/1251] eta 0:04:42 lr 0.000941 time 0.2836 (0.2969) loss 4.5091 (3.9465) grad_norm 1.2392 (1.1278) [2022-10-01 21:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][400/1251] eta 0:04:10 lr 0.000940 time 0.2900 (0.2947) loss 4.3624 (3.9481) grad_norm 1.3085 (1.1297) [2022-10-01 21:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][500/1251] eta 0:03:40 lr 0.000940 time 0.2873 (0.2934) loss 4.0704 (3.9469) grad_norm 1.1557 (1.1281) [2022-10-01 21:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][600/1251] eta 0:03:10 lr 0.000940 time 0.2879 (0.2926) loss 3.3719 (3.9714) grad_norm 1.0692 (1.1271) [2022-10-01 21:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][700/1251] eta 0:02:40 lr 0.000940 time 0.2897 (0.2919) loss 4.2821 (3.9661) grad_norm 1.0259 (1.1239) [2022-10-01 21:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][800/1251] eta 0:02:11 lr 0.000940 time 0.2852 (0.2914) loss 4.4512 (3.9795) grad_norm 1.0024 (1.1304) [2022-10-01 21:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][900/1251] eta 0:01:42 lr 0.000939 time 0.2883 (0.2910) loss 2.7679 (3.9663) grad_norm 1.0900 (1.1321) [2022-10-01 21:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1000/1251] eta 0:01:12 lr 0.000939 time 0.2875 (0.2907) loss 4.8177 (3.9643) grad_norm 1.1453 (1.1297) [2022-10-01 21:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1100/1251] eta 0:00:43 lr 0.000939 time 0.2853 (0.2905) loss 4.1486 (3.9678) grad_norm 1.0027 (1.1271) [2022-10-01 21:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1200/1251] eta 0:00:14 lr 0.000939 time 0.2864 (0.2903) loss 4.1947 (3.9719) grad_norm 1.0765 (1.1261) [2022-10-01 21:56:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 47 training takes 0:06:03 [2022-10-01 21:56:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.194 (3.194) Loss 1.2831 (1.2831) Acc@1 70.117 (70.117) Acc@5 90.625 (90.625) [2022-10-01 21:56:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.566 Acc@5 89.732 [2022-10-01 21:56:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.6% [2022-10-01 21:56:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.74% [2022-10-01 21:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][0/1251] eta 1:06:12 lr 0.000939 time 3.1753 (3.1753) loss 4.1991 (4.1991) grad_norm 1.1780 (1.1780) [2022-10-01 21:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][100/1251] eta 0:06:04 lr 0.000939 time 0.2849 (0.3165) loss 3.2169 (4.0030) grad_norm 1.2635 (1.1301) [2022-10-01 21:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][200/1251] eta 0:05:17 lr 0.000938 time 0.2873 (0.3023) loss 4.2375 (3.9809) grad_norm 1.1205 (1.1218) [2022-10-01 21:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][300/1251] eta 0:04:42 lr 0.000938 time 0.2885 (0.2976) loss 4.4036 (3.9497) grad_norm 0.9931 (1.1163) [2022-10-01 21:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][400/1251] eta 0:04:11 lr 0.000938 time 0.2901 (0.2951) loss 4.2260 (3.9521) grad_norm 1.1236 (1.1166) [2022-10-01 21:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][500/1251] eta 0:03:40 lr 0.000938 time 0.2875 (0.2937) loss 4.2500 (3.9652) grad_norm 1.2524 (1.1205) [2022-10-01 21:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][600/1251] eta 0:03:10 lr 0.000938 time 0.2893 (0.2927) loss 4.0004 (3.9698) grad_norm 0.9900 (1.1224) [2022-10-01 21:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][700/1251] eta 0:02:40 lr 0.000937 time 0.2902 (0.2920) loss 3.9651 (3.9759) grad_norm 1.1445 (1.1240) [2022-10-01 22:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][800/1251] eta 0:02:11 lr 0.000937 time 0.2892 (0.2914) loss 4.2405 (3.9829) grad_norm 1.1154 (1.1227) [2022-10-01 22:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][900/1251] eta 0:01:42 lr 0.000937 time 0.2849 (0.2910) loss 4.1757 (3.9750) grad_norm 1.0767 (1.1238) [2022-10-01 22:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1000/1251] eta 0:01:12 lr 0.000937 time 0.2879 (0.2906) loss 3.6241 (3.9750) grad_norm 1.3712 (1.1245) [2022-10-01 22:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1100/1251] eta 0:00:43 lr 0.000937 time 0.2866 (0.2903) loss 4.2065 (3.9729) grad_norm 1.0668 (1.1247) [2022-10-01 22:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1200/1251] eta 0:00:14 lr 0.000936 time 0.2886 (0.2901) loss 3.6979 (3.9696) grad_norm 1.1321 (1.1224) [2022-10-01 22:02:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 48 training takes 0:06:03 [2022-10-01 22:02:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.329 (2.329) Loss 1.3097 (1.3097) Acc@1 71.484 (71.484) Acc@5 89.844 (89.844) [2022-10-01 22:02:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 69.754 Acc@5 89.872 [2022-10-01 22:02:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 69.8% [2022-10-01 22:02:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 69.75% [2022-10-01 22:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][0/1251] eta 0:49:30 lr 0.000936 time 2.3746 (2.3746) loss 4.3620 (4.3620) grad_norm 1.2475 (1.2475) [2022-10-01 22:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][100/1251] eta 0:06:03 lr 0.000936 time 0.2823 (0.3155) loss 3.4664 (3.9216) grad_norm 1.0530 (1.1428) [2022-10-01 22:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][200/1251] eta 0:05:17 lr 0.000936 time 0.2883 (0.3025) loss 4.0093 (3.9108) grad_norm 1.0256 (1.1394) [2022-10-01 22:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][300/1251] eta 0:04:43 lr 0.000936 time 0.2862 (0.2983) loss 4.0064 (3.9086) grad_norm 1.1828 (1.1389) [2022-10-01 22:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][400/1251] eta 0:04:12 lr 0.000935 time 0.2884 (0.2961) loss 4.0654 (3.9188) grad_norm 1.0204 (1.1355) [2022-10-01 22:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][500/1251] eta 0:03:41 lr 0.000935 time 0.2924 (0.2949) loss 3.5963 (3.9253) grad_norm 1.1004 (1.1300) [2022-10-01 22:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][600/1251] eta 0:03:11 lr 0.000935 time 0.2855 (0.2942) loss 4.2177 (3.9213) grad_norm 0.8983 (1.1315) [2022-10-01 22:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][700/1251] eta 0:02:41 lr 0.000935 time 0.2897 (0.2936) loss 4.1098 (3.9341) grad_norm 1.0783 (1.1286) [2022-10-01 22:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][800/1251] eta 0:02:12 lr 0.000935 time 0.2849 (0.2932) loss 4.0947 (3.9324) grad_norm 0.9007 (1.1267) [2022-10-01 22:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][900/1251] eta 0:01:42 lr 0.000934 time 0.2901 (0.2928) loss 2.5454 (3.9346) grad_norm 1.0865 (1.1258) [2022-10-01 22:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1000/1251] eta 0:01:13 lr 0.000934 time 0.2841 (0.2925) loss 4.5598 (3.9274) grad_norm 0.9382 (1.1246) [2022-10-01 22:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1100/1251] eta 0:00:44 lr 0.000934 time 0.2920 (0.2922) loss 3.6225 (3.9286) grad_norm 1.1306 (1.1220) [2022-10-01 22:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1200/1251] eta 0:00:14 lr 0.000934 time 0.2863 (0.2919) loss 4.6683 (3.9297) grad_norm 1.0541 (1.1221) [2022-10-01 22:08:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 49 training takes 0:06:05 [2022-10-01 22:08:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.149 (3.149) Loss 1.3373 (1.3373) Acc@1 68.848 (68.848) Acc@5 87.695 (87.695) [2022-10-01 22:08:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.016 Acc@5 90.002 [2022-10-01 22:08:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-10-01 22:08:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.02% [2022-10-01 22:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][0/1251] eta 1:09:44 lr 0.000934 time 3.3448 (3.3448) loss 3.3920 (3.3920) grad_norm 1.2376 (1.2376) [2022-10-01 22:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][100/1251] eta 0:06:10 lr 0.000933 time 0.2944 (0.3217) loss 4.2883 (3.8897) grad_norm 1.0856 (1.1297) [2022-10-01 22:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][200/1251] eta 0:05:22 lr 0.000933 time 0.2981 (0.3067) loss 4.0607 (3.9433) grad_norm 1.3155 (1.1220) [2022-10-01 22:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][300/1251] eta 0:04:46 lr 0.000933 time 0.2928 (0.3016) loss 2.6909 (3.9375) grad_norm 0.9169 (1.1302) [2022-10-01 22:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][400/1251] eta 0:04:14 lr 0.000933 time 0.2908 (0.2990) loss 3.8389 (3.9676) grad_norm 1.3486 (1.1341) [2022-10-01 22:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][500/1251] eta 0:03:43 lr 0.000933 time 0.2928 (0.2975) loss 4.2174 (3.9682) grad_norm 1.0128 (1.1246) [2022-10-01 22:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][600/1251] eta 0:03:12 lr 0.000932 time 0.2929 (0.2964) loss 3.4706 (3.9697) grad_norm 1.2270 (1.1286) [2022-10-01 22:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][700/1251] eta 0:02:42 lr 0.000932 time 0.2931 (0.2955) loss 3.4196 (3.9595) grad_norm 1.0970 (1.1244) [2022-10-01 22:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][800/1251] eta 0:02:12 lr 0.000932 time 0.2900 (0.2947) loss 3.0994 (3.9636) grad_norm 1.1743 (1.1264) [2022-10-01 22:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][900/1251] eta 0:01:43 lr 0.000932 time 0.2900 (0.2941) loss 4.7552 (3.9644) grad_norm 1.2152 (1.1259) [2022-10-01 22:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1000/1251] eta 0:01:13 lr 0.000932 time 0.2933 (0.2936) loss 3.0967 (3.9592) grad_norm 1.0322 (1.1237) [2022-10-01 22:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1100/1251] eta 0:00:44 lr 0.000931 time 0.2906 (0.2931) loss 3.6348 (3.9547) grad_norm 1.0428 (1.1234) [2022-10-01 22:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1200/1251] eta 0:00:14 lr 0.000931 time 0.2900 (0.2928) loss 3.3376 (3.9570) grad_norm 1.1264 (1.1249) [2022-10-01 22:14:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 50 training takes 0:06:06 [2022-10-01 22:14:54 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saving...... [2022-10-01 22:14:54 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saved !!! [2022-10-01 22:14:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.124 (2.124) Loss 1.3264 (1.3264) Acc@1 70.801 (70.801) Acc@5 89.551 (89.551) [2022-10-01 22:15:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.012 Acc@5 89.994 [2022-10-01 22:15:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-10-01 22:15:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.02% [2022-10-01 22:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][0/1251] eta 1:01:48 lr 0.000931 time 2.9644 (2.9644) loss 2.7773 (2.7773) grad_norm 1.0113 (1.0113) [2022-10-01 22:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][100/1251] eta 0:06:02 lr 0.000931 time 0.2874 (0.3152) loss 3.2509 (4.0019) grad_norm 1.1306 (1.1442) [2022-10-01 22:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][200/1251] eta 0:05:17 lr 0.000931 time 0.2919 (0.3022) loss 4.4050 (3.9626) grad_norm 1.2441 (1.1297) [2022-10-01 22:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][300/1251] eta 0:04:42 lr 0.000930 time 0.2847 (0.2972) loss 4.0069 (3.9468) grad_norm 1.1525 (1.1265) [2022-10-01 22:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][400/1251] eta 0:04:10 lr 0.000930 time 0.2911 (0.2949) loss 3.9182 (3.9248) grad_norm 1.2759 (1.1397) [2022-10-01 22:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][500/1251] eta 0:03:40 lr 0.000930 time 0.2878 (0.2935) loss 4.3916 (3.9216) grad_norm 1.1230 (1.1343) [2022-10-01 22:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][600/1251] eta 0:03:10 lr 0.000930 time 0.2920 (0.2924) loss 3.7712 (3.9228) grad_norm 0.9382 (1.1292) [2022-10-01 22:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][700/1251] eta 0:02:40 lr 0.000930 time 0.2840 (0.2917) loss 3.1811 (3.9249) grad_norm 0.9706 (1.1336) [2022-10-01 22:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][800/1251] eta 0:02:11 lr 0.000929 time 0.2881 (0.2911) loss 3.6518 (3.9224) grad_norm 1.2116 (1.1315) [2022-10-01 22:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][900/1251] eta 0:01:42 lr 0.000929 time 0.2964 (0.2907) loss 3.9252 (3.9222) grad_norm 1.2221 (1.1319) [2022-10-01 22:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1000/1251] eta 0:01:12 lr 0.000929 time 0.2924 (0.2903) loss 3.2545 (3.9169) grad_norm 1.2501 (1.1337) [2022-10-01 22:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1100/1251] eta 0:00:43 lr 0.000929 time 0.2845 (0.2900) loss 4.2692 (3.9209) grad_norm 1.0801 (1.1317) [2022-10-01 22:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1200/1251] eta 0:00:14 lr 0.000929 time 0.2891 (0.2898) loss 3.3426 (3.9235) grad_norm 1.0642 (1.1298) [2022-10-01 22:21:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 51 training takes 0:06:02 [2022-10-01 22:21:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.445 (2.445) Loss 1.3899 (1.3899) Acc@1 68.750 (68.750) Acc@5 88.965 (88.965) [2022-10-01 22:21:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.034 Acc@5 89.854 [2022-10-01 22:21:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-10-01 22:21:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.03% [2022-10-01 22:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][0/1251] eta 1:00:30 lr 0.000928 time 2.9024 (2.9024) loss 3.7879 (3.7879) grad_norm 1.1496 (1.1496) [2022-10-01 22:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][100/1251] eta 0:06:03 lr 0.000928 time 0.2868 (0.3154) loss 4.8472 (3.8824) grad_norm 1.0560 (1.1225) [2022-10-01 22:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][200/1251] eta 0:05:17 lr 0.000928 time 0.2865 (0.3020) loss 3.8398 (3.9614) grad_norm 1.1112 (1.1266) [2022-10-01 22:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][300/1251] eta 0:04:42 lr 0.000928 time 0.2891 (0.2975) loss 4.1655 (3.9336) grad_norm 1.0889 (1.1322) [2022-10-01 22:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][400/1251] eta 0:04:11 lr 0.000928 time 0.2868 (0.2953) loss 4.5898 (3.9512) grad_norm 1.0183 (1.1300) [2022-10-01 22:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][500/1251] eta 0:03:40 lr 0.000927 time 0.2897 (0.2940) loss 4.5394 (3.9665) grad_norm 0.9044 (1.1246) [2022-10-01 22:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][600/1251] eta 0:03:10 lr 0.000927 time 0.2884 (0.2933) loss 3.8271 (3.9535) grad_norm 1.1832 (1.1309) [2022-10-01 22:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][700/1251] eta 0:02:41 lr 0.000927 time 0.2888 (0.2927) loss 4.0392 (3.9411) grad_norm 1.2356 (1.1323) [2022-10-01 22:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][800/1251] eta 0:02:11 lr 0.000927 time 0.2845 (0.2922) loss 4.5749 (3.9360) grad_norm 1.1511 (1.1333) [2022-10-01 22:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][900/1251] eta 0:01:42 lr 0.000926 time 0.2881 (0.2918) loss 4.1413 (3.9416) grad_norm 0.8825 (1.1346) [2022-10-01 22:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1000/1251] eta 0:01:13 lr 0.000926 time 0.2862 (0.2915) loss 3.9430 (3.9387) grad_norm 1.2928 (1.1369) [2022-10-01 22:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1100/1251] eta 0:00:43 lr 0.000926 time 0.2880 (0.2912) loss 3.9662 (3.9426) grad_norm 1.0582 (1.1352) [2022-10-01 22:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1200/1251] eta 0:00:14 lr 0.000926 time 0.2876 (0.2910) loss 3.6749 (3.9450) grad_norm 1.1686 (1.1346) [2022-10-01 22:27:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 52 training takes 0:06:04 [2022-10-01 22:27:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.209 (3.209) Loss 1.3508 (1.3508) Acc@1 69.922 (69.922) Acc@5 89.258 (89.258) [2022-10-01 22:27:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.548 Acc@5 90.336 [2022-10-01 22:27:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.5% [2022-10-01 22:27:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.55% [2022-10-01 22:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][0/1251] eta 1:08:20 lr 0.000926 time 3.2776 (3.2776) loss 4.2038 (4.2038) grad_norm 1.3584 (1.3584) [2022-10-01 22:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][100/1251] eta 0:06:07 lr 0.000925 time 0.2879 (0.3194) loss 3.5891 (3.9259) grad_norm 0.9852 (1.1550) [2022-10-01 22:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][200/1251] eta 0:05:19 lr 0.000925 time 0.2866 (0.3043) loss 4.6362 (3.8863) grad_norm 1.0053 (1.1381) [2022-10-01 22:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][300/1251] eta 0:04:44 lr 0.000925 time 0.2881 (0.2992) loss 2.8985 (3.9082) grad_norm 1.3550 (1.1438) [2022-10-01 22:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][400/1251] eta 0:04:12 lr 0.000925 time 0.2855 (0.2966) loss 4.2094 (3.9124) grad_norm 1.1269 (1.1366) [2022-10-01 22:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][500/1251] eta 0:03:41 lr 0.000925 time 0.2892 (0.2951) loss 4.1514 (3.9198) grad_norm 0.9009 (1.1425) [2022-10-01 22:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][600/1251] eta 0:03:11 lr 0.000924 time 0.2868 (0.2939) loss 4.3494 (3.9342) grad_norm 0.9778 (1.1424) [2022-10-01 22:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][700/1251] eta 0:02:41 lr 0.000924 time 0.2894 (0.2931) loss 3.2303 (3.9337) grad_norm 1.1256 (1.1418) [2022-10-01 22:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][800/1251] eta 0:02:11 lr 0.000924 time 0.2907 (0.2926) loss 3.3828 (3.9240) grad_norm 1.2333 (1.1446) [2022-10-01 22:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][900/1251] eta 0:01:42 lr 0.000924 time 0.2855 (0.2920) loss 4.0356 (3.9225) grad_norm 0.9641 (1.1418) [2022-10-01 22:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1000/1251] eta 0:01:13 lr 0.000923 time 0.2870 (0.2916) loss 3.0320 (3.9201) grad_norm 1.1529 (1.1377) [2022-10-01 22:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1100/1251] eta 0:00:43 lr 0.000923 time 0.2872 (0.2912) loss 3.7731 (3.9154) grad_norm 0.9746 (1.1377) [2022-10-01 22:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1200/1251] eta 0:00:14 lr 0.000923 time 0.2877 (0.2909) loss 4.2272 (3.9154) grad_norm 1.1733 (1.1417) [2022-10-01 22:33:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 53 training takes 0:06:04 [2022-10-01 22:33:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.883 (2.883) Loss 1.2236 (1.2236) Acc@1 71.973 (71.973) Acc@5 90.918 (90.918) [2022-10-01 22:33:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.368 Acc@5 90.334 [2022-10-01 22:33:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-10-01 22:33:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.55% [2022-10-01 22:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][0/1251] eta 1:04:33 lr 0.000923 time 3.0965 (3.0965) loss 2.6332 (2.6332) grad_norm 1.0385 (1.0385) [2022-10-01 22:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][100/1251] eta 0:06:03 lr 0.000923 time 0.2854 (0.3155) loss 4.2011 (3.9368) grad_norm 1.1192 (1.1386) [2022-10-01 22:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][200/1251] eta 0:05:16 lr 0.000922 time 0.2848 (0.3016) loss 3.9091 (3.9464) grad_norm 0.9904 (1.1330) [2022-10-01 22:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][300/1251] eta 0:04:42 lr 0.000922 time 0.2855 (0.2969) loss 3.5504 (3.9220) grad_norm 1.0117 (1.1351) [2022-10-01 22:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][400/1251] eta 0:04:10 lr 0.000922 time 0.2876 (0.2946) loss 4.2495 (3.9122) grad_norm 0.9970 (1.1318) [2022-10-01 22:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][500/1251] eta 0:03:40 lr 0.000922 time 0.2865 (0.2932) loss 2.9218 (3.9053) grad_norm 1.2818 (1.1289) [2022-10-01 22:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][600/1251] eta 0:03:10 lr 0.000922 time 0.2882 (0.2922) loss 3.9709 (3.9103) grad_norm 1.6213 (1.1366) [2022-10-01 22:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][700/1251] eta 0:02:40 lr 0.000921 time 0.2871 (0.2915) loss 3.9210 (3.9311) grad_norm 1.0848 (1.1373) [2022-10-01 22:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][800/1251] eta 0:02:11 lr 0.000921 time 0.2867 (0.2909) loss 4.3710 (3.9367) grad_norm 1.4047 (1.1408) [2022-10-01 22:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][900/1251] eta 0:01:41 lr 0.000921 time 0.2865 (0.2904) loss 4.5503 (3.9466) grad_norm 1.2068 (1.1436) [2022-10-01 22:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1000/1251] eta 0:01:12 lr 0.000921 time 0.2871 (0.2901) loss 4.0172 (3.9479) grad_norm 1.1606 (1.1407) [2022-10-01 22:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1100/1251] eta 0:00:43 lr 0.000920 time 0.2884 (0.2898) loss 4.5010 (3.9500) grad_norm 0.9198 (1.1407) [2022-10-01 22:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1200/1251] eta 0:00:14 lr 0.000920 time 0.2863 (0.2895) loss 4.6062 (3.9459) grad_norm 0.9807 (1.1403) [2022-10-01 22:39:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 54 training takes 0:06:02 [2022-10-01 22:40:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.484 (2.484) Loss 1.2310 (1.2310) Acc@1 72.070 (72.070) Acc@5 90.430 (90.430) [2022-10-01 22:40:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.612 Acc@5 90.352 [2022-10-01 22:40:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.6% [2022-10-01 22:40:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.61% [2022-10-01 22:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][0/1251] eta 1:03:19 lr 0.000920 time 3.0375 (3.0375) loss 3.7298 (3.7298) grad_norm 1.0449 (1.0449) [2022-10-01 22:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][100/1251] eta 0:06:06 lr 0.000920 time 0.2877 (0.3185) loss 4.2662 (3.8960) grad_norm 0.9658 (1.1481) [2022-10-01 22:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][200/1251] eta 0:05:20 lr 0.000920 time 0.2908 (0.3047) loss 4.2095 (3.9457) grad_norm 1.2579 (1.1500) [2022-10-01 22:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][300/1251] eta 0:04:45 lr 0.000919 time 0.2933 (0.3001) loss 3.5193 (3.9101) grad_norm 1.1080 (1.1466) [2022-10-01 22:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][400/1251] eta 0:04:13 lr 0.000919 time 0.2951 (0.2981) loss 2.7995 (3.9036) grad_norm 1.1278 (1.1449) [2022-10-01 22:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][500/1251] eta 0:03:42 lr 0.000919 time 0.2905 (0.2966) loss 3.1920 (3.9056) grad_norm 1.1034 (1.1444) [2022-10-01 22:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][600/1251] eta 0:03:12 lr 0.000919 time 0.2960 (0.2956) loss 3.6789 (3.9082) grad_norm 1.0150 (1.1430) [2022-10-01 22:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][700/1251] eta 0:02:42 lr 0.000919 time 0.2936 (0.2949) loss 4.9801 (3.9124) grad_norm 0.9600 (1.1413) [2022-10-01 22:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][800/1251] eta 0:02:12 lr 0.000918 time 0.2944 (0.2943) loss 3.2178 (3.9032) grad_norm 1.0542 (1.1414) [2022-10-01 22:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][900/1251] eta 0:01:43 lr 0.000918 time 0.2886 (0.2939) loss 4.1507 (3.9031) grad_norm 1.1279 (1.1423) [2022-10-01 22:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1000/1251] eta 0:01:13 lr 0.000918 time 0.2924 (0.2935) loss 4.2033 (3.9131) grad_norm 1.2136 (1.1434) [2022-10-01 22:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1100/1251] eta 0:00:44 lr 0.000918 time 0.2969 (0.2932) loss 4.3028 (3.9073) grad_norm 1.2718 (1.1431) [2022-10-01 22:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1200/1251] eta 0:00:14 lr 0.000917 time 0.2920 (0.2930) loss 3.0339 (3.9035) grad_norm 1.0677 (1.1436) [2022-10-01 22:46:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 55 training takes 0:06:06 [2022-10-01 22:46:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.740 (2.740) Loss 1.2292 (1.2292) Acc@1 71.289 (71.289) Acc@5 91.504 (91.504) [2022-10-01 22:46:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.414 Acc@5 90.234 [2022-10-01 22:46:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-10-01 22:46:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.61% [2022-10-01 22:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][0/1251] eta 1:11:53 lr 0.000917 time 3.4480 (3.4480) loss 3.8297 (3.8297) grad_norm 1.0627 (1.0627) [2022-10-01 22:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][100/1251] eta 0:06:09 lr 0.000917 time 0.2883 (0.3213) loss 3.1117 (3.8818) grad_norm 1.0052 (1.1235) [2022-10-01 22:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][200/1251] eta 0:05:21 lr 0.000917 time 0.2893 (0.3055) loss 3.0467 (3.8798) grad_norm 1.1361 (1.1181) [2022-10-01 22:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][300/1251] eta 0:04:45 lr 0.000917 time 0.2927 (0.3002) loss 4.5871 (3.8952) grad_norm 1.4376 (1.1315) [2022-10-01 22:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][400/1251] eta 0:04:13 lr 0.000916 time 0.2917 (0.2977) loss 4.4019 (3.9162) grad_norm 1.1023 (1.1297) [2022-10-01 22:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][500/1251] eta 0:03:42 lr 0.000916 time 0.2875 (0.2959) loss 4.3002 (3.9177) grad_norm 1.0464 (1.1369) [2022-10-01 22:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][600/1251] eta 0:03:11 lr 0.000916 time 0.2889 (0.2947) loss 3.9734 (3.9240) grad_norm 1.5049 (1.1341) [2022-10-01 22:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][700/1251] eta 0:02:41 lr 0.000916 time 0.2870 (0.2938) loss 3.2763 (3.9284) grad_norm 1.0982 (1.1337) [2022-10-01 22:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][800/1251] eta 0:02:12 lr 0.000915 time 0.2917 (0.2932) loss 3.9731 (3.9186) grad_norm 1.2870 (1.1347) [2022-10-01 22:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][900/1251] eta 0:01:42 lr 0.000915 time 0.2897 (0.2927) loss 4.1579 (3.9246) grad_norm 1.0886 (1.1400) [2022-10-01 22:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1000/1251] eta 0:01:13 lr 0.000915 time 0.2850 (0.2923) loss 4.2992 (3.9310) grad_norm 0.9793 (1.1400) [2022-10-01 22:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1100/1251] eta 0:00:44 lr 0.000915 time 0.2868 (0.2919) loss 3.2567 (3.9237) grad_norm 1.1200 (1.1387) [2022-10-01 22:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1200/1251] eta 0:00:14 lr 0.000915 time 0.2871 (0.2916) loss 3.2037 (3.9214) grad_norm 1.2682 (1.1392) [2022-10-01 22:52:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 56 training takes 0:06:05 [2022-10-01 22:52:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.172 (2.172) Loss 1.3066 (1.3066) Acc@1 69.141 (69.141) Acc@5 90.430 (90.430) [2022-10-01 22:52:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.788 Acc@5 90.342 [2022-10-01 22:52:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.8% [2022-10-01 22:52:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.79% [2022-10-01 22:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][0/1251] eta 0:55:58 lr 0.000914 time 2.6849 (2.6849) loss 2.4225 (2.4225) grad_norm 1.1584 (1.1584) [2022-10-01 22:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][100/1251] eta 0:06:02 lr 0.000914 time 0.2879 (0.3152) loss 4.1639 (3.8201) grad_norm 1.2357 (1.1506) [2022-10-01 22:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][200/1251] eta 0:05:17 lr 0.000914 time 0.2902 (0.3022) loss 3.1776 (3.8792) grad_norm 1.3288 (1.1605) [2022-10-01 22:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][300/1251] eta 0:04:43 lr 0.000914 time 0.2879 (0.2977) loss 3.4223 (3.8668) grad_norm 1.0333 (1.1631) [2022-10-01 22:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][400/1251] eta 0:04:11 lr 0.000913 time 0.2916 (0.2954) loss 4.1211 (3.8950) grad_norm 1.0537 (1.1518) [2022-10-01 22:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][500/1251] eta 0:03:40 lr 0.000913 time 0.2853 (0.2942) loss 4.3345 (3.8885) grad_norm 1.1541 (1.1485) [2022-10-01 22:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][600/1251] eta 0:03:10 lr 0.000913 time 0.2899 (0.2933) loss 3.0793 (3.8792) grad_norm 1.0574 (1.1462) [2022-10-01 22:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][700/1251] eta 0:02:41 lr 0.000913 time 0.2876 (0.2926) loss 4.8852 (3.8839) grad_norm 1.2038 (1.1450) [2022-10-01 22:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][800/1251] eta 0:02:11 lr 0.000913 time 0.2894 (0.2921) loss 3.7101 (3.8901) grad_norm 1.1848 (1.1497) [2022-10-01 22:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][900/1251] eta 0:01:42 lr 0.000912 time 0.2868 (0.2917) loss 4.2149 (3.8810) grad_norm 1.0285 (1.1507) [2022-10-01 22:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1000/1251] eta 0:01:13 lr 0.000912 time 0.2893 (0.2914) loss 3.8111 (3.8931) grad_norm 1.1536 (1.1527) [2022-10-01 22:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1100/1251] eta 0:00:43 lr 0.000912 time 0.2892 (0.2911) loss 4.2136 (3.8875) grad_norm 1.0063 (1.1504) [2022-10-01 22:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1200/1251] eta 0:00:14 lr 0.000912 time 0.2890 (0.2910) loss 4.2157 (3.8852) grad_norm 1.4016 (1.1513) [2022-10-01 22:58:52 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 57 training takes 0:06:04 [2022-10-01 22:58:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.358 (2.358) Loss 1.3473 (1.3473) Acc@1 67.871 (67.871) Acc@5 90.332 (90.332) [2022-10-01 22:59:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.874 Acc@5 90.622 [2022-10-01 22:59:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-10-01 22:59:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 70.87% [2022-10-01 22:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][0/1251] eta 0:56:08 lr 0.000911 time 2.6924 (2.6924) loss 4.1280 (4.1280) grad_norm 1.1373 (1.1373) [2022-10-01 22:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][100/1251] eta 0:06:00 lr 0.000911 time 0.2910 (0.3133) loss 4.3207 (3.8341) grad_norm 1.2244 (1.1593) [2022-10-01 23:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][200/1251] eta 0:05:16 lr 0.000911 time 0.2868 (0.3007) loss 3.5404 (3.8738) grad_norm 1.1033 (1.1681) [2022-10-01 23:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][300/1251] eta 0:04:42 lr 0.000911 time 0.2917 (0.2968) loss 4.1874 (3.8609) grad_norm 1.4237 (1.1717) [2022-10-01 23:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][400/1251] eta 0:04:10 lr 0.000911 time 0.2889 (0.2947) loss 3.6286 (3.8806) grad_norm 1.1493 (1.1711) [2022-10-01 23:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][500/1251] eta 0:03:40 lr 0.000910 time 0.2901 (0.2935) loss 2.7426 (3.8922) grad_norm 1.1906 (1.1670) [2022-10-01 23:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][600/1251] eta 0:03:10 lr 0.000910 time 0.2871 (0.2927) loss 4.2377 (3.9075) grad_norm 1.1432 (1.1645) [2022-10-01 23:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][700/1251] eta 0:02:40 lr 0.000910 time 0.2889 (0.2921) loss 4.0839 (3.9138) grad_norm 1.1516 (1.1585) [2022-10-01 23:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][800/1251] eta 0:02:11 lr 0.000910 time 0.2875 (0.2916) loss 2.8326 (3.9130) grad_norm 1.0568 (1.1550) [2022-10-01 23:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][900/1251] eta 0:01:42 lr 0.000909 time 0.2880 (0.2912) loss 3.8887 (3.9111) grad_norm 1.1314 (1.1551) [2022-10-01 23:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1000/1251] eta 0:01:13 lr 0.000909 time 0.2903 (0.2909) loss 3.7265 (3.9083) grad_norm 1.6490 (1.1575) [2022-10-01 23:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1100/1251] eta 0:00:43 lr 0.000909 time 0.2921 (0.2908) loss 4.2585 (3.9055) grad_norm 0.9830 (1.1583) [2022-10-01 23:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1200/1251] eta 0:00:14 lr 0.000909 time 0.2873 (0.2905) loss 4.2356 (3.8982) grad_norm 1.1271 (1.1582) [2022-10-01 23:05:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 58 training takes 0:06:03 [2022-10-01 23:05:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.302 (3.302) Loss 1.2731 (1.2731) Acc@1 69.531 (69.531) Acc@5 89.746 (89.746) [2022-10-01 23:05:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.068 Acc@5 90.626 [2022-10-01 23:05:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-10-01 23:05:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.07% [2022-10-01 23:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][0/1251] eta 1:08:00 lr 0.000908 time 3.2615 (3.2615) loss 4.2809 (4.2809) grad_norm 1.1879 (1.1879) [2022-10-01 23:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][100/1251] eta 0:06:08 lr 0.000908 time 0.2881 (0.3199) loss 3.6025 (3.8351) grad_norm 0.9760 (1.1228) [2022-10-01 23:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][200/1251] eta 0:05:20 lr 0.000908 time 0.2932 (0.3049) loss 4.3617 (3.8412) grad_norm 0.9995 (1.1232) [2022-10-01 23:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][300/1251] eta 0:04:45 lr 0.000908 time 0.2877 (0.2998) loss 4.1774 (3.8655) grad_norm 1.0920 (1.1404) [2022-10-01 23:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][400/1251] eta 0:04:12 lr 0.000908 time 0.2928 (0.2973) loss 3.5927 (3.8686) grad_norm 1.3129 (1.1382) [2022-10-01 23:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][500/1251] eta 0:03:42 lr 0.000907 time 0.2863 (0.2957) loss 3.4819 (3.8622) grad_norm 1.3646 (1.1398) [2022-10-01 23:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][600/1251] eta 0:03:11 lr 0.000907 time 0.2939 (0.2947) loss 4.0598 (3.8810) grad_norm 0.9475 (1.1445) [2022-10-01 23:08:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][700/1251] eta 0:02:41 lr 0.000907 time 0.2859 (0.2940) loss 4.3188 (3.8789) grad_norm 1.2089 (1.1495) [2022-10-01 23:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][800/1251] eta 0:02:12 lr 0.000907 time 0.2901 (0.2934) loss 3.0218 (3.8813) grad_norm 1.2027 (1.1539) [2022-10-01 23:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][900/1251] eta 0:01:42 lr 0.000906 time 0.2861 (0.2929) loss 3.7528 (3.8847) grad_norm 1.2048 (1.1525) [2022-10-01 23:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1000/1251] eta 0:01:13 lr 0.000906 time 0.2928 (0.2925) loss 3.1335 (3.8931) grad_norm 1.0740 (1.1513) [2022-10-01 23:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1100/1251] eta 0:00:44 lr 0.000906 time 0.2866 (0.2922) loss 4.2725 (3.8945) grad_norm 1.0873 (1.1509) [2022-10-01 23:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1200/1251] eta 0:00:14 lr 0.000906 time 0.2894 (0.2919) loss 4.0999 (3.8896) grad_norm 0.9108 (1.1515) [2022-10-01 23:11:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 59 training takes 0:06:05 [2022-10-01 23:11:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.935 (2.935) Loss 1.2640 (1.2640) Acc@1 70.508 (70.508) Acc@5 91.309 (91.309) [2022-10-01 23:11:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 70.908 Acc@5 90.578 [2022-10-01 23:11:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-10-01 23:11:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.07% [2022-10-01 23:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][0/1251] eta 0:50:20 lr 0.000905 time 2.4143 (2.4143) loss 3.8410 (3.8410) grad_norm 1.0268 (1.0268) [2022-10-01 23:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][100/1251] eta 0:06:03 lr 0.000905 time 0.2902 (0.3161) loss 4.8860 (3.8623) grad_norm 1.1061 (1.1449) [2022-10-01 23:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][200/1251] eta 0:05:18 lr 0.000905 time 0.2918 (0.3035) loss 4.1005 (3.8883) grad_norm 1.1693 (1.1495) [2022-10-01 23:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][300/1251] eta 0:04:44 lr 0.000905 time 0.2957 (0.2995) loss 4.4633 (3.8848) grad_norm 1.1100 (1.1584) [2022-10-01 23:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][400/1251] eta 0:04:13 lr 0.000904 time 0.2890 (0.2974) loss 4.6009 (3.8969) grad_norm 0.9339 (1.1490) [2022-10-01 23:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][500/1251] eta 0:03:42 lr 0.000904 time 0.2890 (0.2960) loss 3.7603 (3.8985) grad_norm 0.9844 (1.1483) [2022-10-01 23:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][600/1251] eta 0:03:12 lr 0.000904 time 0.2894 (0.2951) loss 3.5874 (3.9106) grad_norm 1.1555 (1.1475) [2022-10-01 23:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][700/1251] eta 0:02:42 lr 0.000904 time 0.2894 (0.2943) loss 3.9521 (3.9148) grad_norm 1.2106 (1.1492) [2022-10-01 23:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][800/1251] eta 0:02:12 lr 0.000904 time 0.2897 (0.2938) loss 2.7348 (3.9084) grad_norm 1.1469 (1.1533) [2022-10-01 23:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][900/1251] eta 0:01:42 lr 0.000903 time 0.2902 (0.2933) loss 2.7547 (3.9015) grad_norm 1.2102 (1.1521) [2022-10-01 23:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1000/1251] eta 0:01:13 lr 0.000903 time 0.2902 (0.2930) loss 3.0070 (3.8919) grad_norm 1.0655 (1.1492) [2022-10-01 23:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1100/1251] eta 0:00:44 lr 0.000903 time 0.2914 (0.2927) loss 2.9019 (3.8969) grad_norm 1.1605 (1.1507) [2022-10-01 23:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1200/1251] eta 0:00:14 lr 0.000903 time 0.2951 (0.2925) loss 4.3754 (3.8992) grad_norm 1.0265 (1.1496) [2022-10-01 23:17:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 60 training takes 0:06:06 [2022-10-01 23:17:46 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saving...... [2022-10-01 23:17:46 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saved !!! [2022-10-01 23:17:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.753 (2.753) Loss 1.2560 (1.2560) Acc@1 70.605 (70.605) Acc@5 89.648 (89.648) [2022-10-01 23:17:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.188 Acc@5 90.600 [2022-10-01 23:17:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-01 23:17:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.19% [2022-10-01 23:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][0/1251] eta 0:58:07 lr 0.000902 time 2.7878 (2.7878) loss 4.3480 (4.3480) grad_norm 1.1085 (1.1085) [2022-10-01 23:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][100/1251] eta 0:06:05 lr 0.000902 time 0.2935 (0.3174) loss 4.5093 (3.8600) grad_norm 0.9482 (1.1299) [2022-10-01 23:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][200/1251] eta 0:05:20 lr 0.000902 time 0.2924 (0.3054) loss 3.8748 (3.8554) grad_norm 1.3090 (1.1369) [2022-10-01 23:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][300/1251] eta 0:04:45 lr 0.000902 time 0.2905 (0.3006) loss 4.5975 (3.8711) grad_norm 0.9799 (1.1464) [2022-10-01 23:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][400/1251] eta 0:04:13 lr 0.000901 time 0.2951 (0.2984) loss 4.4059 (3.8769) grad_norm 1.0384 (1.1455) [2022-10-01 23:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][500/1251] eta 0:03:43 lr 0.000901 time 0.2920 (0.2971) loss 2.8451 (3.8738) grad_norm 1.3271 (1.1410) [2022-10-01 23:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][600/1251] eta 0:03:12 lr 0.000901 time 0.2897 (0.2960) loss 3.4454 (3.8877) grad_norm 1.2596 (1.1454) [2022-10-01 23:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][700/1251] eta 0:02:42 lr 0.000901 time 0.2907 (0.2953) loss 3.1527 (3.8783) grad_norm 0.9186 (1.1415) [2022-10-01 23:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][800/1251] eta 0:02:13 lr 0.000900 time 0.2901 (0.2949) loss 4.0412 (3.8857) grad_norm 1.0847 (1.1408) [2022-10-01 23:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][900/1251] eta 0:01:43 lr 0.000900 time 0.2882 (0.2944) loss 4.4115 (3.8952) grad_norm 1.0922 (1.1461) [2022-10-01 23:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1000/1251] eta 0:01:13 lr 0.000900 time 0.2910 (0.2943) loss 4.1850 (3.8948) grad_norm 1.0443 (1.1454) [2022-10-01 23:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1100/1251] eta 0:00:44 lr 0.000900 time 0.2913 (0.2940) loss 3.6536 (3.9000) grad_norm 1.4451 (1.1463) [2022-10-01 23:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1200/1251] eta 0:00:14 lr 0.000899 time 0.2903 (0.2940) loss 2.9713 (3.8960) grad_norm 1.1629 (1.1456) [2022-10-01 23:24:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 61 training takes 0:06:07 [2022-10-01 23:24:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.174 (3.174) Loss 1.2835 (1.2835) Acc@1 70.996 (70.996) Acc@5 90.332 (90.332) [2022-10-01 23:24:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.384 Acc@5 90.838 [2022-10-01 23:24:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-10-01 23:24:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.38% [2022-10-01 23:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][0/1251] eta 1:10:34 lr 0.000899 time 3.3849 (3.3849) loss 4.1336 (4.1336) grad_norm 1.2626 (1.2626) [2022-10-01 23:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][100/1251] eta 0:06:09 lr 0.000899 time 0.2865 (0.3213) loss 4.3257 (3.8212) grad_norm 1.0168 (1.1791) [2022-10-01 23:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][200/1251] eta 0:05:21 lr 0.000899 time 0.2895 (0.3056) loss 4.0561 (3.7957) grad_norm 1.2999 (1.1706) [2022-10-01 23:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][300/1251] eta 0:04:46 lr 0.000899 time 0.2895 (0.3009) loss 4.0960 (3.8247) grad_norm 1.1081 (1.1591) [2022-10-01 23:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][400/1251] eta 0:04:13 lr 0.000898 time 0.2957 (0.2984) loss 4.4119 (3.8418) grad_norm 1.3072 (1.1676) [2022-10-01 23:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][500/1251] eta 0:03:42 lr 0.000898 time 0.2931 (0.2966) loss 3.8710 (3.8415) grad_norm 1.0898 (1.1646) [2022-10-01 23:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][600/1251] eta 0:03:12 lr 0.000898 time 0.2886 (0.2953) loss 3.5482 (3.8534) grad_norm 1.1636 (1.1608) [2022-10-01 23:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][700/1251] eta 0:02:42 lr 0.000898 time 0.2870 (0.2944) loss 4.2510 (3.8581) grad_norm 1.2250 (1.1591) [2022-10-01 23:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][800/1251] eta 0:02:12 lr 0.000897 time 0.2898 (0.2937) loss 2.8433 (3.8563) grad_norm 1.0753 (1.1588) [2022-10-01 23:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][900/1251] eta 0:01:42 lr 0.000897 time 0.2910 (0.2931) loss 3.1198 (3.8507) grad_norm 1.0822 (1.1568) [2022-10-01 23:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1000/1251] eta 0:01:13 lr 0.000897 time 0.2922 (0.2927) loss 3.7448 (3.8501) grad_norm 1.0619 (1.1576) [2022-10-01 23:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1100/1251] eta 0:00:44 lr 0.000897 time 0.2854 (0.2923) loss 4.2759 (3.8572) grad_norm 1.2820 (1.1566) [2022-10-01 23:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1200/1251] eta 0:00:14 lr 0.000896 time 0.2912 (0.2920) loss 4.3684 (3.8639) grad_norm 1.0734 (1.1566) [2022-10-01 23:30:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 62 training takes 0:06:05 [2022-10-01 23:30:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.056 (3.056) Loss 1.2792 (1.2792) Acc@1 70.703 (70.703) Acc@5 90.527 (90.527) [2022-10-01 23:30:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.124 Acc@5 90.712 [2022-10-01 23:30:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-10-01 23:30:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.38% [2022-10-01 23:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][0/1251] eta 1:04:02 lr 0.000896 time 3.0718 (3.0718) loss 3.9783 (3.9783) grad_norm 0.9772 (0.9772) [2022-10-01 23:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][100/1251] eta 0:06:08 lr 0.000896 time 0.2925 (0.3200) loss 4.3351 (3.8524) grad_norm 1.2630 (1.1332) [2022-10-01 23:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][200/1251] eta 0:05:21 lr 0.000896 time 0.2886 (0.3057) loss 3.7773 (3.8252) grad_norm 1.1220 (1.1597) [2022-10-01 23:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][300/1251] eta 0:04:46 lr 0.000895 time 0.2909 (0.3009) loss 3.3727 (3.8227) grad_norm 0.9872 (1.1570) [2022-10-01 23:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][400/1251] eta 0:04:13 lr 0.000895 time 0.2893 (0.2983) loss 3.4683 (3.8431) grad_norm 0.9811 (1.1594) [2022-10-01 23:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][500/1251] eta 0:03:42 lr 0.000895 time 0.2878 (0.2968) loss 3.7330 (3.8662) grad_norm 1.0452 (1.1565) [2022-10-01 23:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][600/1251] eta 0:03:12 lr 0.000895 time 0.2937 (0.2958) loss 3.1434 (3.8671) grad_norm 1.0087 (1.1597) [2022-10-01 23:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][700/1251] eta 0:02:42 lr 0.000894 time 0.2910 (0.2950) loss 4.1849 (3.8832) grad_norm 1.0061 (1.1602) [2022-10-01 23:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][800/1251] eta 0:02:12 lr 0.000894 time 0.2898 (0.2944) loss 4.5800 (3.8720) grad_norm 0.9476 (1.1585) [2022-10-01 23:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][900/1251] eta 0:01:43 lr 0.000894 time 0.2915 (0.2940) loss 3.7090 (3.8650) grad_norm 1.0818 (1.1642) [2022-10-01 23:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1000/1251] eta 0:01:13 lr 0.000894 time 0.2899 (0.2936) loss 4.4306 (3.8651) grad_norm 1.0854 (1.1672) [2022-10-01 23:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1100/1251] eta 0:00:44 lr 0.000893 time 0.2895 (0.2933) loss 3.5378 (3.8609) grad_norm 1.1857 (1.1656) [2022-10-01 23:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1200/1251] eta 0:00:14 lr 0.000893 time 0.2880 (0.2930) loss 3.1635 (3.8679) grad_norm 0.9555 (1.1624) [2022-10-01 23:36:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 63 training takes 0:06:06 [2022-10-01 23:36:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.434 (2.434) Loss 1.2050 (1.2050) Acc@1 71.777 (71.777) Acc@5 91.504 (91.504) [2022-10-01 23:36:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.292 Acc@5 90.864 [2022-10-01 23:36:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.3% [2022-10-01 23:36:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.38% [2022-10-01 23:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][0/1251] eta 1:00:52 lr 0.000893 time 2.9197 (2.9197) loss 3.8439 (3.8439) grad_norm 1.0407 (1.0407) [2022-10-01 23:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][100/1251] eta 0:06:04 lr 0.000893 time 0.2873 (0.3167) loss 3.9912 (3.8559) grad_norm 1.0591 (1.1771) [2022-10-01 23:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][200/1251] eta 0:05:18 lr 0.000892 time 0.2874 (0.3029) loss 3.5348 (3.8542) grad_norm 1.1943 (1.1596) [2022-10-01 23:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][300/1251] eta 0:04:43 lr 0.000892 time 0.2896 (0.2983) loss 3.5022 (3.8578) grad_norm 0.9957 (1.1629) [2022-10-01 23:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][400/1251] eta 0:04:11 lr 0.000892 time 0.2855 (0.2959) loss 3.7777 (3.8513) grad_norm 0.9896 (1.1642) [2022-10-01 23:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][500/1251] eta 0:03:41 lr 0.000892 time 0.2898 (0.2945) loss 3.7264 (3.8488) grad_norm 1.0349 (1.1652) [2022-10-01 23:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][600/1251] eta 0:03:11 lr 0.000891 time 0.2880 (0.2935) loss 4.8275 (3.8577) grad_norm 1.0611 (1.1601) [2022-10-01 23:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][700/1251] eta 0:02:41 lr 0.000891 time 0.2906 (0.2929) loss 3.5698 (3.8491) grad_norm 1.0629 (1.1602) [2022-10-01 23:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][800/1251] eta 0:02:11 lr 0.000891 time 0.2863 (0.2923) loss 4.6680 (3.8552) grad_norm 1.1439 (1.1609) [2022-10-01 23:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][900/1251] eta 0:01:42 lr 0.000891 time 0.2911 (0.2919) loss 4.3123 (3.8670) grad_norm 1.0801 (1.1597) [2022-10-01 23:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1000/1251] eta 0:01:13 lr 0.000890 time 0.2886 (0.2916) loss 3.8558 (3.8717) grad_norm 1.6200 (1.1591) [2022-10-01 23:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1100/1251] eta 0:00:43 lr 0.000890 time 0.2871 (0.2912) loss 3.7155 (3.8698) grad_norm 1.2667 (1.1591) [2022-10-01 23:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1200/1251] eta 0:00:14 lr 0.000890 time 0.2875 (0.2910) loss 4.0049 (3.8672) grad_norm 1.1354 (1.1618) [2022-10-01 23:43:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 64 training takes 0:06:04 [2022-10-01 23:43:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.215 (3.215) Loss 1.2274 (1.2274) Acc@1 69.629 (69.629) Acc@5 90.918 (90.918) [2022-10-01 23:43:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.682 Acc@5 90.926 [2022-10-01 23:43:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-10-01 23:43:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.68% [2022-10-01 23:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][0/1251] eta 1:08:20 lr 0.000890 time 3.2778 (3.2778) loss 3.2374 (3.2374) grad_norm 1.0380 (1.0380) [2022-10-01 23:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][100/1251] eta 0:06:08 lr 0.000889 time 0.2932 (0.3201) loss 4.6586 (3.7704) grad_norm 1.1512 (1.1281) [2022-10-01 23:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][200/1251] eta 0:05:21 lr 0.000889 time 0.2894 (0.3055) loss 4.5806 (3.7853) grad_norm 1.0924 (1.1502) [2022-10-01 23:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][300/1251] eta 0:04:45 lr 0.000889 time 0.2894 (0.3003) loss 3.8076 (3.7880) grad_norm 0.9514 (1.1505) [2022-10-01 23:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][400/1251] eta 0:04:13 lr 0.000889 time 0.2895 (0.2977) loss 3.0428 (3.8096) grad_norm 1.0256 (1.1496) [2022-10-01 23:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][500/1251] eta 0:03:42 lr 0.000888 time 0.2885 (0.2961) loss 2.9259 (3.8172) grad_norm 1.0957 (1.1565) [2022-10-01 23:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][600/1251] eta 0:03:12 lr 0.000888 time 0.2871 (0.2951) loss 4.1823 (3.8298) grad_norm 1.0583 (1.1546) [2022-10-01 23:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][700/1251] eta 0:02:42 lr 0.000888 time 0.2898 (0.2943) loss 4.2417 (3.8444) grad_norm 1.2150 (1.1571) [2022-10-01 23:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][800/1251] eta 0:02:12 lr 0.000888 time 0.2907 (0.2938) loss 3.6180 (3.8618) grad_norm 1.0404 (1.1555) [2022-10-01 23:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][900/1251] eta 0:01:42 lr 0.000887 time 0.2888 (0.2934) loss 3.3299 (3.8604) grad_norm 0.9521 (1.1549) [2022-10-01 23:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1000/1251] eta 0:01:13 lr 0.000887 time 0.2907 (0.2931) loss 2.4720 (3.8598) grad_norm 1.0293 (1.1548) [2022-10-01 23:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1100/1251] eta 0:00:44 lr 0.000887 time 0.2887 (0.2928) loss 3.8605 (3.8577) grad_norm 1.1793 (1.1588) [2022-10-01 23:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1200/1251] eta 0:00:14 lr 0.000887 time 0.2939 (0.2926) loss 4.4572 (3.8662) grad_norm 1.0306 (1.1593) [2022-10-01 23:49:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 65 training takes 0:06:06 [2022-10-01 23:49:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.214 (3.214) Loss 1.2205 (1.2205) Acc@1 72.656 (72.656) Acc@5 89.844 (89.844) [2022-10-01 23:49:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.182 Acc@5 90.862 [2022-10-01 23:49:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-10-01 23:49:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.68% [2022-10-01 23:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][0/1251] eta 1:06:19 lr 0.000886 time 3.1811 (3.1811) loss 3.0444 (3.0444) grad_norm 1.6642 (1.6642) [2022-10-01 23:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][100/1251] eta 0:06:08 lr 0.000886 time 0.2915 (0.3197) loss 3.7629 (3.8543) grad_norm 0.9875 (1.1944) [2022-10-01 23:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][200/1251] eta 0:05:20 lr 0.000886 time 0.2890 (0.3049) loss 3.1658 (3.8687) grad_norm 1.0535 (1.1985) [2022-10-01 23:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][300/1251] eta 0:04:45 lr 0.000886 time 0.2936 (0.3000) loss 3.8459 (3.8602) grad_norm 1.2387 (1.1916) [2022-10-01 23:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][400/1251] eta 0:04:13 lr 0.000885 time 0.2873 (0.2976) loss 4.1876 (3.8683) grad_norm 1.3882 (1.1793) [2022-10-01 23:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][500/1251] eta 0:03:42 lr 0.000885 time 0.2927 (0.2961) loss 3.9193 (3.8438) grad_norm 1.0622 (1.1829) [2022-10-01 23:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][600/1251] eta 0:03:12 lr 0.000885 time 0.2869 (0.2951) loss 3.7060 (3.8566) grad_norm 1.3674 (1.1827) [2022-10-01 23:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][700/1251] eta 0:02:42 lr 0.000885 time 0.2933 (0.2943) loss 3.2055 (3.8599) grad_norm 1.0031 (1.1776) [2022-10-01 23:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][800/1251] eta 0:02:12 lr 0.000884 time 0.2867 (0.2937) loss 3.9672 (3.8565) grad_norm 1.1959 (1.1730) [2022-10-01 23:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][900/1251] eta 0:01:42 lr 0.000884 time 0.2932 (0.2932) loss 2.9991 (3.8558) grad_norm 1.1063 (1.1725) [2022-10-01 23:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1000/1251] eta 0:01:13 lr 0.000884 time 0.2985 (0.2928) loss 2.8747 (3.8565) grad_norm 1.2004 (1.1672) [2022-10-01 23:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1100/1251] eta 0:00:44 lr 0.000883 time 0.2942 (0.2925) loss 4.2542 (3.8614) grad_norm 1.1632 (1.1708) [2022-10-01 23:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1200/1251] eta 0:00:14 lr 0.000883 time 0.2879 (0.2923) loss 4.1407 (3.8622) grad_norm 0.9776 (1.1701) [2022-10-01 23:55:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 66 training takes 0:06:05 [2022-10-01 23:55:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.202 (2.202) Loss 1.1546 (1.1546) Acc@1 73.242 (73.242) Acc@5 91.602 (91.602) [2022-10-01 23:55:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.476 Acc@5 90.796 [2022-10-01 23:55:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-10-01 23:55:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.68% [2022-10-01 23:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][0/1251] eta 1:02:55 lr 0.000883 time 3.0178 (3.0178) loss 3.9027 (3.9027) grad_norm 1.1310 (1.1310) [2022-10-01 23:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][100/1251] eta 0:06:04 lr 0.000883 time 0.2894 (0.3169) loss 3.1708 (3.8110) grad_norm 1.1589 (1.1823) [2022-10-01 23:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][200/1251] eta 0:05:18 lr 0.000883 time 0.2925 (0.3029) loss 4.1818 (3.8268) grad_norm 0.9605 (1.1642) [2022-10-01 23:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][300/1251] eta 0:04:43 lr 0.000882 time 0.2929 (0.2981) loss 4.4736 (3.8360) grad_norm 1.3509 (1.1668) [2022-10-01 23:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][400/1251] eta 0:04:11 lr 0.000882 time 0.2902 (0.2958) loss 3.0559 (3.8500) grad_norm 1.4236 (1.1720) [2022-10-01 23:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][500/1251] eta 0:03:41 lr 0.000882 time 0.2896 (0.2944) loss 4.1941 (3.8573) grad_norm 1.1105 (1.1740) [2022-10-01 23:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][600/1251] eta 0:03:11 lr 0.000881 time 0.2891 (0.2935) loss 4.2068 (3.8631) grad_norm 1.2457 (1.1753) [2022-10-01 23:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][700/1251] eta 0:02:41 lr 0.000881 time 0.2890 (0.2928) loss 3.7990 (3.8604) grad_norm 1.4052 (1.1723) [2022-10-01 23:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][800/1251] eta 0:02:11 lr 0.000881 time 0.2893 (0.2923) loss 2.8054 (3.8568) grad_norm 1.2293 (1.1709) [2022-10-02 00:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][900/1251] eta 0:01:42 lr 0.000881 time 0.2911 (0.2920) loss 2.9553 (3.8663) grad_norm 1.2056 (1.1693) [2022-10-02 00:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1000/1251] eta 0:01:13 lr 0.000880 time 0.2899 (0.2916) loss 3.6474 (3.8647) grad_norm 1.1179 (1.1671) [2022-10-02 00:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1100/1251] eta 0:00:43 lr 0.000880 time 0.2862 (0.2913) loss 4.2130 (3.8698) grad_norm 1.3005 (1.1679) [2022-10-02 00:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1200/1251] eta 0:00:14 lr 0.000880 time 0.2900 (0.2910) loss 4.4729 (3.8714) grad_norm 1.0356 (1.1665) [2022-10-02 00:01:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 67 training takes 0:06:04 [2022-10-02 00:01:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.241 (2.241) Loss 1.1257 (1.1257) Acc@1 73.926 (73.926) Acc@5 92.383 (92.383) [2022-10-02 00:02:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.558 Acc@5 90.946 [2022-10-02 00:02:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-10-02 00:02:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.68% [2022-10-02 00:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][0/1251] eta 1:06:25 lr 0.000880 time 3.1859 (3.1859) loss 4.8112 (4.8112) grad_norm 1.1784 (1.1784) [2022-10-02 00:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][100/1251] eta 0:06:07 lr 0.000879 time 0.2880 (0.3194) loss 2.5622 (3.8654) grad_norm 1.0527 (1.1722) [2022-10-02 00:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][200/1251] eta 0:05:20 lr 0.000879 time 0.2914 (0.3050) loss 3.1641 (3.8366) grad_norm 1.1606 (1.1698) [2022-10-02 00:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][300/1251] eta 0:04:45 lr 0.000879 time 0.2863 (0.3002) loss 4.6305 (3.8412) grad_norm 0.9909 (1.1728) [2022-10-02 00:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][400/1251] eta 0:04:13 lr 0.000879 time 0.2933 (0.2977) loss 4.1228 (3.8442) grad_norm 1.0015 (1.1788) [2022-10-02 00:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][500/1251] eta 0:03:42 lr 0.000878 time 0.2859 (0.2962) loss 4.0933 (3.8494) grad_norm 1.1284 (1.1773) [2022-10-02 00:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][600/1251] eta 0:03:12 lr 0.000878 time 0.2900 (0.2951) loss 3.3048 (3.8470) grad_norm 1.2919 (1.1754) [2022-10-02 00:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][700/1251] eta 0:02:42 lr 0.000878 time 0.2865 (0.2942) loss 4.2930 (3.8529) grad_norm 1.2151 (1.1697) [2022-10-02 00:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][800/1251] eta 0:02:12 lr 0.000878 time 0.2878 (0.2936) loss 2.9210 (3.8569) grad_norm 1.2061 (1.1707) [2022-10-02 00:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][900/1251] eta 0:01:42 lr 0.000877 time 0.2858 (0.2931) loss 3.7916 (3.8493) grad_norm 0.9203 (1.1710) [2022-10-02 00:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1000/1251] eta 0:01:13 lr 0.000877 time 0.2877 (0.2927) loss 4.7700 (3.8438) grad_norm 1.0906 (1.1723) [2022-10-02 00:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1100/1251] eta 0:00:44 lr 0.000877 time 0.2862 (0.2923) loss 3.0091 (3.8399) grad_norm 1.2664 (1.1717) [2022-10-02 00:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1200/1251] eta 0:00:14 lr 0.000876 time 0.2855 (0.2920) loss 2.6424 (3.8380) grad_norm 1.1769 (1.1728) [2022-10-02 00:08:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 68 training takes 0:06:05 [2022-10-02 00:08:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.541 (2.541) Loss 1.2314 (1.2314) Acc@1 71.387 (71.387) Acc@5 90.625 (90.625) [2022-10-02 00:08:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.730 Acc@5 91.050 [2022-10-02 00:08:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-10-02 00:08:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 71.73% [2022-10-02 00:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][0/1251] eta 1:07:50 lr 0.000876 time 3.2539 (3.2539) loss 3.5745 (3.5745) grad_norm 1.1292 (1.1292) [2022-10-02 00:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][100/1251] eta 0:06:06 lr 0.000876 time 0.2921 (0.3187) loss 4.1567 (3.8593) grad_norm 1.0603 (1.1705) [2022-10-02 00:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][200/1251] eta 0:05:19 lr 0.000876 time 0.2921 (0.3038) loss 4.1707 (3.8242) grad_norm 1.4062 (1.1745) [2022-10-02 00:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][300/1251] eta 0:04:44 lr 0.000875 time 0.2894 (0.2989) loss 4.0644 (3.8209) grad_norm 1.1154 (1.1693) [2022-10-02 00:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][400/1251] eta 0:04:12 lr 0.000875 time 0.2888 (0.2965) loss 3.8606 (3.8078) grad_norm 1.0825 (1.1721) [2022-10-02 00:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][500/1251] eta 0:03:41 lr 0.000875 time 0.2872 (0.2951) loss 4.0011 (3.7995) grad_norm 1.0558 (1.1704) [2022-10-02 00:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][600/1251] eta 0:03:11 lr 0.000875 time 0.2883 (0.2940) loss 4.3210 (3.8147) grad_norm 1.1601 (1.1656) [2022-10-02 00:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][700/1251] eta 0:02:41 lr 0.000874 time 0.2890 (0.2933) loss 3.7957 (3.8121) grad_norm 1.0547 (1.1690) [2022-10-02 00:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][800/1251] eta 0:02:11 lr 0.000874 time 0.2871 (0.2927) loss 4.2820 (3.8186) grad_norm 1.3557 (1.1665) [2022-10-02 00:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][900/1251] eta 0:01:42 lr 0.000874 time 0.2890 (0.2921) loss 3.3509 (3.8177) grad_norm 1.2351 (1.1640) [2022-10-02 00:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1000/1251] eta 0:01:13 lr 0.000874 time 0.2951 (0.2918) loss 3.6811 (3.8284) grad_norm 1.1230 (1.1659) [2022-10-02 00:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1100/1251] eta 0:00:44 lr 0.000873 time 0.2948 (0.2915) loss 3.6424 (3.8314) grad_norm 1.0946 (1.1681) [2022-10-02 00:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1200/1251] eta 0:00:14 lr 0.000873 time 0.2880 (0.2912) loss 2.7496 (3.8293) grad_norm 1.2076 (1.1686) [2022-10-02 00:14:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 69 training takes 0:06:04 [2022-10-02 00:14:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.177 (3.177) Loss 1.2228 (1.2228) Acc@1 72.070 (72.070) Acc@5 90.723 (90.723) [2022-10-02 00:14:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.030 Acc@5 91.066 [2022-10-02 00:14:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-10-02 00:14:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.03% [2022-10-02 00:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][0/1251] eta 1:04:11 lr 0.000873 time 3.0788 (3.0788) loss 3.9995 (3.9995) grad_norm 0.9691 (0.9691) [2022-10-02 00:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][100/1251] eta 0:06:07 lr 0.000873 time 0.2876 (0.3190) loss 3.1292 (3.8134) grad_norm 1.4221 (1.1762) [2022-10-02 00:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][200/1251] eta 0:05:20 lr 0.000872 time 0.2918 (0.3049) loss 3.2119 (3.8246) grad_norm 1.3176 (1.1828) [2022-10-02 00:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][300/1251] eta 0:04:45 lr 0.000872 time 0.2903 (0.3002) loss 4.5275 (3.8241) grad_norm 0.9803 (1.1728) [2022-10-02 00:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][400/1251] eta 0:04:13 lr 0.000872 time 0.2902 (0.2978) loss 4.2136 (3.8205) grad_norm 1.2524 (1.1710) [2022-10-02 00:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][500/1251] eta 0:03:42 lr 0.000871 time 0.2868 (0.2963) loss 3.5797 (3.8201) grad_norm 1.0862 (1.1717) [2022-10-02 00:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][600/1251] eta 0:03:12 lr 0.000871 time 0.2881 (0.2953) loss 4.1424 (3.8107) grad_norm 1.2542 (1.1738) [2022-10-02 00:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][700/1251] eta 0:02:42 lr 0.000871 time 0.2885 (0.2946) loss 3.8284 (3.8151) grad_norm 1.2706 (1.1685) [2022-10-02 00:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][800/1251] eta 0:02:12 lr 0.000871 time 0.2892 (0.2941) loss 4.2471 (3.8145) grad_norm 1.0129 (1.1692) [2022-10-02 00:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][900/1251] eta 0:01:43 lr 0.000870 time 0.2873 (0.2937) loss 4.5574 (3.8290) grad_norm 1.3139 (1.1688) [2022-10-02 00:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1000/1251] eta 0:01:13 lr 0.000870 time 0.2896 (0.2933) loss 4.6515 (3.8342) grad_norm 1.1321 (1.1690) [2022-10-02 00:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1100/1251] eta 0:00:44 lr 0.000870 time 0.2857 (0.2930) loss 4.6611 (3.8450) grad_norm 1.1399 (1.1718) [2022-10-02 00:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1200/1251] eta 0:00:14 lr 0.000870 time 0.2890 (0.2927) loss 3.7693 (3.8440) grad_norm 1.0442 (1.1697) [2022-10-02 00:20:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 70 training takes 0:06:06 [2022-10-02 00:20:50 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saving...... [2022-10-02 00:20:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saved !!! [2022-10-02 00:20:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.785 (2.785) Loss 1.1911 (1.1911) Acc@1 72.852 (72.852) Acc@5 90.723 (90.723) [2022-10-02 00:21:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.674 Acc@5 90.986 [2022-10-02 00:21:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.7% [2022-10-02 00:21:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.03% [2022-10-02 00:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][0/1251] eta 0:47:51 lr 0.000869 time 2.2957 (2.2957) loss 3.3227 (3.3227) grad_norm 1.1179 (1.1179) [2022-10-02 00:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][100/1251] eta 0:06:05 lr 0.000869 time 0.2887 (0.3176) loss 4.2049 (3.8391) grad_norm 1.3441 (1.1860) [2022-10-02 00:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][200/1251] eta 0:05:19 lr 0.000869 time 0.2873 (0.3037) loss 3.9604 (3.8112) grad_norm 1.0364 (1.1732) [2022-10-02 00:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][300/1251] eta 0:04:44 lr 0.000869 time 0.2862 (0.2990) loss 4.4407 (3.7899) grad_norm 1.1212 (1.1762) [2022-10-02 00:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][400/1251] eta 0:04:12 lr 0.000868 time 0.2895 (0.2965) loss 3.3000 (3.7791) grad_norm 1.0108 (1.1683) [2022-10-02 00:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][500/1251] eta 0:03:41 lr 0.000868 time 0.2898 (0.2949) loss 3.6612 (3.7812) grad_norm 1.0522 (1.1678) [2022-10-02 00:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][600/1251] eta 0:03:11 lr 0.000868 time 0.2896 (0.2940) loss 3.0311 (3.7795) grad_norm 1.1440 (1.1611) [2022-10-02 00:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][700/1251] eta 0:02:41 lr 0.000867 time 0.2922 (0.2933) loss 4.0382 (3.7991) grad_norm 1.1848 (1.1638) [2022-10-02 00:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][800/1251] eta 0:02:12 lr 0.000867 time 0.2886 (0.2928) loss 4.5218 (3.8131) grad_norm 1.1324 (1.1641) [2022-10-02 00:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][900/1251] eta 0:01:42 lr 0.000867 time 0.2896 (0.2924) loss 2.7593 (3.8099) grad_norm 1.0071 (1.1663) [2022-10-02 00:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1000/1251] eta 0:01:13 lr 0.000867 time 0.2898 (0.2921) loss 3.7836 (3.8158) grad_norm 1.0454 (1.1658) [2022-10-02 00:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1100/1251] eta 0:00:44 lr 0.000866 time 0.2897 (0.2918) loss 3.8996 (3.8151) grad_norm 1.0862 (1.1647) [2022-10-02 00:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1200/1251] eta 0:00:14 lr 0.000866 time 0.2876 (0.2915) loss 4.3020 (3.8128) grad_norm 1.4201 (1.1668) [2022-10-02 00:27:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 71 training takes 0:06:04 [2022-10-02 00:27:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.097 (3.097) Loss 1.2386 (1.2386) Acc@1 72.070 (72.070) Acc@5 90.234 (90.234) [2022-10-02 00:27:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.086 Acc@5 91.194 [2022-10-02 00:27:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-10-02 00:27:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.09% [2022-10-02 00:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][0/1251] eta 1:10:12 lr 0.000866 time 3.3676 (3.3676) loss 3.9860 (3.9860) grad_norm 1.0230 (1.0230) [2022-10-02 00:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][100/1251] eta 0:06:06 lr 0.000866 time 0.2912 (0.3188) loss 2.8163 (3.7927) grad_norm 1.2023 (1.1929) [2022-10-02 00:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][200/1251] eta 0:05:18 lr 0.000865 time 0.2862 (0.3035) loss 4.6133 (3.8177) grad_norm 1.4955 (1.2035) [2022-10-02 00:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][300/1251] eta 0:04:43 lr 0.000865 time 0.2937 (0.2984) loss 4.4376 (3.7945) grad_norm 1.0036 (1.1891) [2022-10-02 00:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][400/1251] eta 0:04:11 lr 0.000865 time 0.2852 (0.2959) loss 4.5122 (3.8017) grad_norm 1.1746 (1.1844) [2022-10-02 00:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][500/1251] eta 0:03:41 lr 0.000864 time 0.2922 (0.2943) loss 3.0766 (3.8106) grad_norm 1.0731 (1.1881) [2022-10-02 00:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][600/1251] eta 0:03:10 lr 0.000864 time 0.2857 (0.2933) loss 4.2895 (3.7948) grad_norm 1.2218 (1.1863) [2022-10-02 00:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][700/1251] eta 0:02:41 lr 0.000864 time 0.2899 (0.2925) loss 4.4430 (3.8082) grad_norm 1.3299 (1.1854) [2022-10-02 00:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][800/1251] eta 0:02:11 lr 0.000864 time 0.2847 (0.2919) loss 4.2800 (3.8156) grad_norm 1.1801 (1.1865) [2022-10-02 00:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][900/1251] eta 0:01:42 lr 0.000863 time 0.2931 (0.2914) loss 4.0464 (3.8205) grad_norm 1.2858 (1.1860) [2022-10-02 00:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1000/1251] eta 0:01:13 lr 0.000863 time 0.2868 (0.2910) loss 4.0221 (3.8173) grad_norm 1.1881 (1.1865) [2022-10-02 00:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1100/1251] eta 0:00:43 lr 0.000863 time 0.2920 (0.2907) loss 4.7432 (3.8249) grad_norm 1.1244 (1.1873) [2022-10-02 00:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1200/1251] eta 0:00:14 lr 0.000862 time 0.2866 (0.2904) loss 3.2202 (3.8207) grad_norm 1.3927 (1.1842) [2022-10-02 00:33:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 72 training takes 0:06:03 [2022-10-02 00:33:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.589 (3.589) Loss 1.2738 (1.2738) Acc@1 70.117 (70.117) Acc@5 90.234 (90.234) [2022-10-02 00:33:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.110 Acc@5 91.282 [2022-10-02 00:33:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-10-02 00:33:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.11% [2022-10-02 00:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][0/1251] eta 1:08:24 lr 0.000862 time 3.2811 (3.2811) loss 3.9057 (3.9057) grad_norm 1.1401 (1.1401) [2022-10-02 00:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][100/1251] eta 0:06:07 lr 0.000862 time 0.2882 (0.3190) loss 3.5454 (3.6923) grad_norm 1.1634 (1.1922) [2022-10-02 00:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][200/1251] eta 0:05:19 lr 0.000862 time 0.2898 (0.3041) loss 3.3664 (3.7415) grad_norm 1.1871 (1.1771) [2022-10-02 00:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][300/1251] eta 0:04:44 lr 0.000861 time 0.2906 (0.2992) loss 3.6244 (3.7702) grad_norm 1.1806 (1.1722) [2022-10-02 00:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][400/1251] eta 0:04:12 lr 0.000861 time 0.2885 (0.2966) loss 3.3717 (3.7731) grad_norm 1.0918 (1.1721) [2022-10-02 00:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][500/1251] eta 0:03:41 lr 0.000861 time 0.2903 (0.2950) loss 4.5109 (3.7586) grad_norm 1.1292 (1.1773) [2022-10-02 00:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][600/1251] eta 0:03:11 lr 0.000861 time 0.2881 (0.2938) loss 3.8147 (3.7522) grad_norm 1.2461 (1.1764) [2022-10-02 00:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][700/1251] eta 0:02:41 lr 0.000860 time 0.2865 (0.2930) loss 4.4777 (3.7627) grad_norm 1.3805 (1.1744) [2022-10-02 00:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][800/1251] eta 0:02:11 lr 0.000860 time 0.2889 (0.2923) loss 4.4667 (3.7622) grad_norm 1.0902 (1.1763) [2022-10-02 00:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][900/1251] eta 0:01:42 lr 0.000860 time 0.2870 (0.2917) loss 4.5521 (3.7712) grad_norm 1.1175 (1.1764) [2022-10-02 00:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1000/1251] eta 0:01:13 lr 0.000859 time 0.2868 (0.2912) loss 4.2330 (3.7734) grad_norm 1.1755 (1.1747) [2022-10-02 00:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1100/1251] eta 0:00:43 lr 0.000859 time 0.2862 (0.2909) loss 4.1002 (3.7724) grad_norm 1.2155 (1.1778) [2022-10-02 00:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1200/1251] eta 0:00:14 lr 0.000859 time 0.2860 (0.2906) loss 2.8911 (3.7726) grad_norm 1.0763 (1.1790) [2022-10-02 00:39:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 73 training takes 0:06:03 [2022-10-02 00:39:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.628 (2.628) Loss 1.1673 (1.1673) Acc@1 73.828 (73.828) Acc@5 91.504 (91.504) [2022-10-02 00:39:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.056 Acc@5 91.352 [2022-10-02 00:39:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-10-02 00:39:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.11% [2022-10-02 00:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][0/1251] eta 0:49:34 lr 0.000859 time 2.3777 (2.3777) loss 3.7550 (3.7550) grad_norm 1.2340 (1.2340) [2022-10-02 00:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][100/1251] eta 0:06:07 lr 0.000858 time 0.2898 (0.3193) loss 4.0344 (3.7893) grad_norm 1.2951 (1.1963) [2022-10-02 00:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][200/1251] eta 0:05:20 lr 0.000858 time 0.2911 (0.3051) loss 2.6496 (3.7957) grad_norm 1.4831 (1.1917) [2022-10-02 00:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][300/1251] eta 0:04:45 lr 0.000858 time 0.2972 (0.3002) loss 4.4860 (3.7678) grad_norm 1.3702 (1.1926) [2022-10-02 00:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][400/1251] eta 0:04:13 lr 0.000858 time 0.2913 (0.2976) loss 3.8154 (3.8007) grad_norm 1.0999 (1.1864) [2022-10-02 00:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][500/1251] eta 0:03:42 lr 0.000857 time 0.2908 (0.2960) loss 3.4132 (3.8041) grad_norm 1.1321 (1.1898) [2022-10-02 00:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][600/1251] eta 0:03:11 lr 0.000857 time 0.2930 (0.2949) loss 3.1575 (3.8121) grad_norm 1.2080 (1.1928) [2022-10-02 00:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][700/1251] eta 0:02:41 lr 0.000857 time 0.2882 (0.2940) loss 4.1678 (3.8041) grad_norm 1.0907 (1.1958) [2022-10-02 00:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][800/1251] eta 0:02:12 lr 0.000856 time 0.2879 (0.2933) loss 4.5547 (3.7999) grad_norm 1.1737 (1.1967) [2022-10-02 00:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][900/1251] eta 0:01:42 lr 0.000856 time 0.2881 (0.2928) loss 4.3976 (3.8041) grad_norm 1.4608 (1.1983) [2022-10-02 00:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1000/1251] eta 0:01:13 lr 0.000856 time 0.2883 (0.2924) loss 4.0096 (3.8055) grad_norm 1.1543 (1.1985) [2022-10-02 00:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1100/1251] eta 0:00:44 lr 0.000855 time 0.2873 (0.2920) loss 4.1719 (3.8084) grad_norm 1.1335 (1.1982) [2022-10-02 00:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1200/1251] eta 0:00:14 lr 0.000855 time 0.2899 (0.2917) loss 3.1778 (3.8058) grad_norm 1.0319 (1.1976) [2022-10-02 00:45:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 74 training takes 0:06:05 [2022-10-02 00:46:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.046 (3.046) Loss 1.2200 (1.2200) Acc@1 70.215 (70.215) Acc@5 91.211 (91.211) [2022-10-02 00:46:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.136 Acc@5 91.226 [2022-10-02 00:46:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.1% [2022-10-02 00:46:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.14% [2022-10-02 00:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][0/1251] eta 1:09:22 lr 0.000855 time 3.3272 (3.3272) loss 4.2151 (4.2151) grad_norm 0.9611 (0.9611) [2022-10-02 00:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][100/1251] eta 0:06:07 lr 0.000855 time 0.2918 (0.3197) loss 3.8779 (3.7897) grad_norm 0.9475 (1.1579) [2022-10-02 00:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][200/1251] eta 0:05:19 lr 0.000854 time 0.2888 (0.3043) loss 4.3321 (3.8269) grad_norm 1.1417 (1.1790) [2022-10-02 00:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][300/1251] eta 0:04:44 lr 0.000854 time 0.2866 (0.2991) loss 3.8322 (3.8291) grad_norm 1.1788 (1.1845) [2022-10-02 00:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][400/1251] eta 0:04:12 lr 0.000854 time 0.2917 (0.2964) loss 4.3876 (3.8267) grad_norm 1.1195 (1.1873) [2022-10-02 00:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][500/1251] eta 0:03:41 lr 0.000854 time 0.2868 (0.2949) loss 3.4030 (3.8255) grad_norm 1.2315 (1.1842) [2022-10-02 00:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][600/1251] eta 0:03:11 lr 0.000853 time 0.2858 (0.2938) loss 3.4760 (3.8378) grad_norm 1.1388 (1.1849) [2022-10-02 00:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][700/1251] eta 0:02:41 lr 0.000853 time 0.2885 (0.2929) loss 3.6345 (3.8463) grad_norm 1.1575 (1.1866) [2022-10-02 00:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][800/1251] eta 0:02:11 lr 0.000853 time 0.2868 (0.2922) loss 3.3059 (3.8427) grad_norm 1.1075 (1.1897) [2022-10-02 00:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][900/1251] eta 0:01:42 lr 0.000852 time 0.2883 (0.2918) loss 4.0190 (3.8398) grad_norm 1.1682 (1.1889) [2022-10-02 00:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1000/1251] eta 0:01:13 lr 0.000852 time 0.2867 (0.2914) loss 4.5055 (3.8418) grad_norm 1.4854 (1.1883) [2022-10-02 00:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1100/1251] eta 0:00:43 lr 0.000852 time 0.2895 (0.2911) loss 3.7476 (3.8419) grad_norm 1.1393 (1.1887) [2022-10-02 00:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1200/1251] eta 0:00:14 lr 0.000851 time 0.2868 (0.2908) loss 4.5560 (3.8384) grad_norm 1.2373 (1.1893) [2022-10-02 00:52:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 75 training takes 0:06:03 [2022-10-02 00:52:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.642 (2.642) Loss 1.2756 (1.2756) Acc@1 72.070 (72.070) Acc@5 90.625 (90.625) [2022-10-02 00:52:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 71.890 Acc@5 91.276 [2022-10-02 00:52:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 71.9% [2022-10-02 00:52:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.14% [2022-10-02 00:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][0/1251] eta 1:04:24 lr 0.000851 time 3.0893 (3.0893) loss 4.0601 (4.0601) grad_norm 1.1915 (1.1915) [2022-10-02 00:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][100/1251] eta 0:06:05 lr 0.000851 time 0.2883 (0.3177) loss 3.9353 (3.8602) grad_norm 1.2278 (1.2170) [2022-10-02 00:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][200/1251] eta 0:05:19 lr 0.000851 time 0.2842 (0.3037) loss 2.6761 (3.8411) grad_norm 1.4152 (1.2116) [2022-10-02 00:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][300/1251] eta 0:04:44 lr 0.000850 time 0.2865 (0.2989) loss 3.3617 (3.7949) grad_norm 1.3650 (1.2135) [2022-10-02 00:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][400/1251] eta 0:04:12 lr 0.000850 time 0.2897 (0.2966) loss 2.8869 (3.7853) grad_norm 1.3430 (1.2143) [2022-10-02 00:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][500/1251] eta 0:03:41 lr 0.000850 time 0.2903 (0.2952) loss 3.1818 (3.7915) grad_norm 1.0679 (1.2069) [2022-10-02 00:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][600/1251] eta 0:03:11 lr 0.000850 time 0.2876 (0.2941) loss 4.1468 (3.7975) grad_norm 1.2161 (1.2074) [2022-10-02 00:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][700/1251] eta 0:02:41 lr 0.000849 time 0.2905 (0.2934) loss 4.5672 (3.8127) grad_norm 1.2170 (1.2079) [2022-10-02 00:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][800/1251] eta 0:02:12 lr 0.000849 time 0.2871 (0.2928) loss 3.9551 (3.8173) grad_norm 1.0682 (1.2037) [2022-10-02 00:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][900/1251] eta 0:01:42 lr 0.000849 time 0.2903 (0.2924) loss 4.1739 (3.8177) grad_norm 1.3326 (1.2013) [2022-10-02 00:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1000/1251] eta 0:01:13 lr 0.000848 time 0.2849 (0.2920) loss 4.1966 (3.8092) grad_norm 1.1573 (1.2032) [2022-10-02 00:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1100/1251] eta 0:00:44 lr 0.000848 time 0.2888 (0.2917) loss 3.9084 (3.8111) grad_norm 1.4648 (1.2044) [2022-10-02 00:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1200/1251] eta 0:00:14 lr 0.000848 time 0.2872 (0.2914) loss 4.5029 (3.8034) grad_norm 1.2182 (1.2038) [2022-10-02 00:58:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 76 training takes 0:06:04 [2022-10-02 00:58:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.400 (2.400) Loss 1.2272 (1.2272) Acc@1 71.387 (71.387) Acc@5 91.016 (91.016) [2022-10-02 00:58:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.264 Acc@5 91.302 [2022-10-02 00:58:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.3% [2022-10-02 00:58:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.26% [2022-10-02 00:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][0/1251] eta 1:08:11 lr 0.000848 time 3.2702 (3.2702) loss 2.8817 (2.8817) grad_norm 1.0583 (1.0583) [2022-10-02 00:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][100/1251] eta 0:06:09 lr 0.000847 time 0.2901 (0.3209) loss 3.5484 (3.8519) grad_norm 1.0676 (1.1727) [2022-10-02 00:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][200/1251] eta 0:05:21 lr 0.000847 time 0.2879 (0.3055) loss 2.8387 (3.8431) grad_norm 1.0119 (1.1843) [2022-10-02 01:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][300/1251] eta 0:04:45 lr 0.000847 time 0.2859 (0.3003) loss 3.8666 (3.8488) grad_norm 1.1786 (1.1862) [2022-10-02 01:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][400/1251] eta 0:04:13 lr 0.000846 time 0.2892 (0.2977) loss 3.7523 (3.8474) grad_norm 0.9307 (1.1924) [2022-10-02 01:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][500/1251] eta 0:03:42 lr 0.000846 time 0.2881 (0.2961) loss 2.8320 (3.8518) grad_norm 1.4641 (1.2027) [2022-10-02 01:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][600/1251] eta 0:03:12 lr 0.000846 time 0.2895 (0.2950) loss 3.7844 (3.8384) grad_norm 1.3037 (1.2060) [2022-10-02 01:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][700/1251] eta 0:02:42 lr 0.000846 time 0.2877 (0.2941) loss 4.4011 (3.8336) grad_norm 1.3188 (1.2056) [2022-10-02 01:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][800/1251] eta 0:02:12 lr 0.000845 time 0.2920 (0.2935) loss 2.7575 (3.8336) grad_norm 1.2258 (1.2014) [2022-10-02 01:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][900/1251] eta 0:01:42 lr 0.000845 time 0.2884 (0.2929) loss 2.6839 (3.8268) grad_norm 1.1463 (1.2028) [2022-10-02 01:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1000/1251] eta 0:01:13 lr 0.000845 time 0.2887 (0.2925) loss 2.9210 (3.8259) grad_norm 1.2645 (1.2057) [2022-10-02 01:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1100/1251] eta 0:00:44 lr 0.000844 time 0.2857 (0.2922) loss 4.5731 (3.8288) grad_norm 1.2581 (1.2089) [2022-10-02 01:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1200/1251] eta 0:00:14 lr 0.000844 time 0.2900 (0.2919) loss 3.9764 (3.8167) grad_norm 1.0532 (1.2071) [2022-10-02 01:04:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 77 training takes 0:06:05 [2022-10-02 01:04:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.545 (2.545) Loss 1.2124 (1.2124) Acc@1 72.070 (72.070) Acc@5 91.895 (91.895) [2022-10-02 01:05:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.330 Acc@5 91.350 [2022-10-02 01:05:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.3% [2022-10-02 01:05:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.33% [2022-10-02 01:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][0/1251] eta 1:08:01 lr 0.000844 time 3.2626 (3.2626) loss 4.0860 (4.0860) grad_norm 1.8120 (1.8120) [2022-10-02 01:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][100/1251] eta 0:06:06 lr 0.000844 time 0.2869 (0.3186) loss 4.0913 (3.8263) grad_norm 1.1247 (1.2577) [2022-10-02 01:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][200/1251] eta 0:05:19 lr 0.000843 time 0.2864 (0.3036) loss 4.2135 (3.7702) grad_norm 1.2807 (1.2131) [2022-10-02 01:06:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][300/1251] eta 0:04:44 lr 0.000843 time 0.2863 (0.2988) loss 2.4952 (3.7654) grad_norm 1.1572 (1.2135) [2022-10-02 01:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][400/1251] eta 0:04:12 lr 0.000843 time 0.2941 (0.2962) loss 4.5265 (3.7629) grad_norm 1.1557 (1.2087) [2022-10-02 01:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][500/1251] eta 0:03:41 lr 0.000842 time 0.2864 (0.2947) loss 4.6189 (3.7753) grad_norm 1.1445 (1.2100) [2022-10-02 01:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][600/1251] eta 0:03:11 lr 0.000842 time 0.2921 (0.2936) loss 2.3093 (3.7628) grad_norm 1.0480 (1.2047) [2022-10-02 01:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][700/1251] eta 0:02:41 lr 0.000842 time 0.2865 (0.2928) loss 4.2776 (3.7680) grad_norm 1.2313 (1.2066) [2022-10-02 01:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][800/1251] eta 0:02:11 lr 0.000841 time 0.2874 (0.2924) loss 3.0033 (3.7826) grad_norm 1.3181 (1.2042) [2022-10-02 01:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][900/1251] eta 0:01:42 lr 0.000841 time 0.2881 (0.2920) loss 3.7302 (3.7870) grad_norm 1.2442 (1.2068) [2022-10-02 01:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1000/1251] eta 0:01:13 lr 0.000841 time 0.2912 (0.2917) loss 2.6816 (3.7969) grad_norm 1.1276 (1.2072) [2022-10-02 01:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1100/1251] eta 0:00:44 lr 0.000841 time 0.2875 (0.2915) loss 2.8566 (3.8001) grad_norm 1.1495 (1.2103) [2022-10-02 01:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1200/1251] eta 0:00:14 lr 0.000840 time 0.2911 (0.2913) loss 4.7764 (3.8036) grad_norm 1.3965 (1.2091) [2022-10-02 01:11:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 78 training takes 0:06:04 [2022-10-02 01:11:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.706 (2.706) Loss 1.0630 (1.0630) Acc@1 76.172 (76.172) Acc@5 92.578 (92.578) [2022-10-02 01:11:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.706 Acc@5 91.510 [2022-10-02 01:11:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-10-02 01:11:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.71% [2022-10-02 01:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][0/1251] eta 1:11:22 lr 0.000840 time 3.4236 (3.4236) loss 2.9139 (2.9139) grad_norm 1.1078 (1.1078) [2022-10-02 01:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][100/1251] eta 0:06:09 lr 0.000840 time 0.2882 (0.3212) loss 3.6344 (3.7030) grad_norm 1.4009 (1.1966) [2022-10-02 01:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][200/1251] eta 0:05:21 lr 0.000839 time 0.2916 (0.3061) loss 3.8558 (3.7747) grad_norm 1.1648 (1.2059) [2022-10-02 01:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][300/1251] eta 0:04:46 lr 0.000839 time 0.2904 (0.3010) loss 3.7022 (3.7711) grad_norm 1.2024 (1.2061) [2022-10-02 01:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][400/1251] eta 0:04:13 lr 0.000839 time 0.2892 (0.2984) loss 2.8729 (3.7677) grad_norm 1.1212 (1.2063) [2022-10-02 01:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][500/1251] eta 0:03:42 lr 0.000839 time 0.2861 (0.2968) loss 3.5891 (3.7919) grad_norm 1.1860 (1.2077) [2022-10-02 01:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][600/1251] eta 0:03:12 lr 0.000838 time 0.2973 (0.2957) loss 3.5143 (3.8067) grad_norm 1.2964 (1.2120) [2022-10-02 01:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][700/1251] eta 0:02:42 lr 0.000838 time 0.2862 (0.2949) loss 4.3644 (3.8138) grad_norm 1.0910 (1.2115) [2022-10-02 01:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][800/1251] eta 0:02:12 lr 0.000838 time 0.2893 (0.2943) loss 3.8252 (3.8035) grad_norm 1.2802 (1.2140) [2022-10-02 01:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][900/1251] eta 0:01:43 lr 0.000837 time 0.2875 (0.2938) loss 3.6597 (3.8001) grad_norm 1.5338 (1.2163) [2022-10-02 01:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1000/1251] eta 0:01:13 lr 0.000837 time 0.2886 (0.2933) loss 3.4650 (3.7962) grad_norm 1.1726 (1.2133) [2022-10-02 01:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1100/1251] eta 0:00:44 lr 0.000837 time 0.2886 (0.2930) loss 4.3709 (3.7976) grad_norm 1.0369 (1.2124) [2022-10-02 01:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1200/1251] eta 0:00:14 lr 0.000836 time 0.2905 (0.2926) loss 3.4051 (3.7986) grad_norm 1.2148 (1.2140) [2022-10-02 01:17:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 79 training takes 0:06:06 [2022-10-02 01:17:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.176 (3.176) Loss 1.2068 (1.2068) Acc@1 71.387 (71.387) Acc@5 91.504 (91.504) [2022-10-02 01:17:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.468 Acc@5 91.344 [2022-10-02 01:17:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-02 01:17:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.71% [2022-10-02 01:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][0/1251] eta 0:55:52 lr 0.000836 time 2.6799 (2.6799) loss 3.1161 (3.1161) grad_norm 1.0705 (1.0705) [2022-10-02 01:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][100/1251] eta 0:06:07 lr 0.000836 time 0.2932 (0.3190) loss 4.0704 (3.7652) grad_norm 1.3466 (1.2200) [2022-10-02 01:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][200/1251] eta 0:05:20 lr 0.000836 time 0.2884 (0.3047) loss 4.0922 (3.7873) grad_norm 1.0775 (1.2161) [2022-10-02 01:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][300/1251] eta 0:04:45 lr 0.000835 time 0.2899 (0.2999) loss 3.2611 (3.7989) grad_norm 1.0767 (1.2119) [2022-10-02 01:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][400/1251] eta 0:04:13 lr 0.000835 time 0.2881 (0.2976) loss 3.6086 (3.7802) grad_norm 1.2022 (1.2207) [2022-10-02 01:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][500/1251] eta 0:03:42 lr 0.000835 time 0.2923 (0.2961) loss 4.0908 (3.7732) grad_norm 1.0740 (1.2188) [2022-10-02 01:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][600/1251] eta 0:03:12 lr 0.000834 time 0.2905 (0.2951) loss 3.1859 (3.7682) grad_norm 1.0538 (1.2156) [2022-10-02 01:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][700/1251] eta 0:02:42 lr 0.000834 time 0.2870 (0.2942) loss 3.1237 (3.7716) grad_norm 1.1858 (1.2128) [2022-10-02 01:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][800/1251] eta 0:02:12 lr 0.000834 time 0.2896 (0.2936) loss 4.1686 (3.7761) grad_norm 1.7009 (1.2136) [2022-10-02 01:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][900/1251] eta 0:01:42 lr 0.000833 time 0.2860 (0.2930) loss 4.0891 (3.7669) grad_norm 1.0258 (1.2112) [2022-10-02 01:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1000/1251] eta 0:01:13 lr 0.000833 time 0.2901 (0.2925) loss 2.9574 (3.7691) grad_norm 1.1580 (1.2150) [2022-10-02 01:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1100/1251] eta 0:00:44 lr 0.000833 time 0.2870 (0.2920) loss 2.7925 (3.7713) grad_norm 1.0711 (1.2137) [2022-10-02 01:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1200/1251] eta 0:00:14 lr 0.000833 time 0.2897 (0.2917) loss 4.2678 (3.7698) grad_norm 1.0772 (1.2135) [2022-10-02 01:23:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 80 training takes 0:06:05 [2022-10-02 01:23:44 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saving...... [2022-10-02 01:23:45 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saved !!! [2022-10-02 01:23:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.766 (2.766) Loss 1.1714 (1.1714) Acc@1 72.949 (72.949) Acc@5 91.895 (91.895) [2022-10-02 01:23:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.626 Acc@5 91.442 [2022-10-02 01:23:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-10-02 01:23:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.71% [2022-10-02 01:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][0/1251] eta 1:05:49 lr 0.000832 time 3.1574 (3.1574) loss 3.1056 (3.1056) grad_norm 1.0838 (1.0838) [2022-10-02 01:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][100/1251] eta 0:06:06 lr 0.000832 time 0.2873 (0.3184) loss 3.8444 (3.7638) grad_norm 1.1563 (1.2173) [2022-10-02 01:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][200/1251] eta 0:05:19 lr 0.000832 time 0.2895 (0.3038) loss 3.4845 (3.7686) grad_norm 1.2296 (1.2030) [2022-10-02 01:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][300/1251] eta 0:04:44 lr 0.000831 time 0.2878 (0.2989) loss 4.0933 (3.7681) grad_norm 1.1859 (1.2062) [2022-10-02 01:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][400/1251] eta 0:04:12 lr 0.000831 time 0.2885 (0.2965) loss 4.3243 (3.7837) grad_norm 1.2437 (1.2049) [2022-10-02 01:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][500/1251] eta 0:03:41 lr 0.000831 time 0.2868 (0.2950) loss 4.3000 (3.7944) grad_norm 1.2777 (1.2046) [2022-10-02 01:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][600/1251] eta 0:03:11 lr 0.000830 time 0.2872 (0.2940) loss 4.0517 (3.7930) grad_norm 1.3525 (1.2072) [2022-10-02 01:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][700/1251] eta 0:02:41 lr 0.000830 time 0.2895 (0.2935) loss 4.1730 (3.7981) grad_norm 1.0447 (1.2106) [2022-10-02 01:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][800/1251] eta 0:02:12 lr 0.000830 time 0.2947 (0.2930) loss 4.2829 (3.7871) grad_norm 1.2923 (1.2089) [2022-10-02 01:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][900/1251] eta 0:01:42 lr 0.000830 time 0.2929 (0.2925) loss 4.3076 (3.7967) grad_norm 1.2260 (1.2071) [2022-10-02 01:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1000/1251] eta 0:01:13 lr 0.000829 time 0.2879 (0.2921) loss 3.9641 (3.7941) grad_norm 1.2648 (1.2084) [2022-10-02 01:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1100/1251] eta 0:00:44 lr 0.000829 time 0.2910 (0.2917) loss 3.5427 (3.7899) grad_norm 1.1882 (1.2077) [2022-10-02 01:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1200/1251] eta 0:00:14 lr 0.000829 time 0.2884 (0.2915) loss 3.3150 (3.7877) grad_norm 1.1955 (1.2090) [2022-10-02 01:30:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 81 training takes 0:06:04 [2022-10-02 01:30:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.383 (2.383) Loss 1.1848 (1.1848) Acc@1 71.094 (71.094) Acc@5 91.797 (91.797) [2022-10-02 01:30:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.522 Acc@5 91.526 [2022-10-02 01:30:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-02 01:30:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.71% [2022-10-02 01:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][0/1251] eta 1:06:06 lr 0.000828 time 3.1707 (3.1707) loss 2.8856 (2.8856) grad_norm 1.0591 (1.0591) [2022-10-02 01:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][100/1251] eta 0:06:09 lr 0.000828 time 0.2901 (0.3211) loss 4.2576 (3.7530) grad_norm 1.2416 (1.2105) [2022-10-02 01:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][200/1251] eta 0:05:22 lr 0.000828 time 0.2910 (0.3068) loss 3.1404 (3.7790) grad_norm 1.2109 (1.1955) [2022-10-02 01:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][300/1251] eta 0:04:47 lr 0.000828 time 0.2893 (0.3020) loss 3.9348 (3.7885) grad_norm 1.3127 (1.2025) [2022-10-02 01:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][400/1251] eta 0:04:14 lr 0.000827 time 0.2906 (0.2994) loss 2.8789 (3.7823) grad_norm 1.3234 (1.2039) [2022-10-02 01:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][500/1251] eta 0:03:43 lr 0.000827 time 0.2868 (0.2979) loss 3.9159 (3.7652) grad_norm 0.9732 (1.2068) [2022-10-02 01:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][600/1251] eta 0:03:13 lr 0.000827 time 0.2906 (0.2968) loss 4.5378 (3.7640) grad_norm 1.3011 (1.2073) [2022-10-02 01:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][700/1251] eta 0:02:43 lr 0.000826 time 0.2896 (0.2959) loss 4.5955 (3.7760) grad_norm 1.3943 (1.2089) [2022-10-02 01:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][800/1251] eta 0:02:13 lr 0.000826 time 0.2914 (0.2953) loss 3.0630 (3.7742) grad_norm 1.0052 (1.2058) [2022-10-02 01:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][900/1251] eta 0:01:43 lr 0.000826 time 0.2894 (0.2948) loss 3.9654 (3.7792) grad_norm 1.3318 (1.2069) [2022-10-02 01:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1000/1251] eta 0:01:13 lr 0.000825 time 0.2924 (0.2943) loss 3.3516 (3.7852) grad_norm 1.0746 (1.2044) [2022-10-02 01:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1100/1251] eta 0:00:44 lr 0.000825 time 0.2884 (0.2940) loss 4.5481 (3.7822) grad_norm 1.0950 (1.2062) [2022-10-02 01:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1200/1251] eta 0:00:14 lr 0.000825 time 0.2925 (0.2937) loss 4.6567 (3.7770) grad_norm 1.2512 (1.2096) [2022-10-02 01:36:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 82 training takes 0:06:07 [2022-10-02 01:36:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.981 (2.981) Loss 1.2773 (1.2773) Acc@1 70.605 (70.605) Acc@5 91.504 (91.504) [2022-10-02 01:36:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.476 Acc@5 91.532 [2022-10-02 01:36:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-10-02 01:36:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.71% [2022-10-02 01:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][0/1251] eta 1:01:14 lr 0.000825 time 2.9371 (2.9371) loss 3.4254 (3.4254) grad_norm 1.0703 (1.0703) [2022-10-02 01:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][100/1251] eta 0:06:02 lr 0.000824 time 0.2842 (0.3151) loss 3.4652 (3.8421) grad_norm 1.3526 (1.2152) [2022-10-02 01:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][200/1251] eta 0:05:17 lr 0.000824 time 0.2875 (0.3018) loss 3.3566 (3.8033) grad_norm 1.1615 (1.2050) [2022-10-02 01:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][300/1251] eta 0:04:42 lr 0.000824 time 0.2854 (0.2974) loss 4.0674 (3.7893) grad_norm 1.1136 (1.2250) [2022-10-02 01:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][400/1251] eta 0:04:11 lr 0.000823 time 0.2871 (0.2952) loss 3.5417 (3.7707) grad_norm 1.0824 (1.2196) [2022-10-02 01:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][500/1251] eta 0:03:40 lr 0.000823 time 0.2862 (0.2938) loss 3.2408 (3.7613) grad_norm 1.2534 (1.2169) [2022-10-02 01:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][600/1251] eta 0:03:10 lr 0.000823 time 0.2864 (0.2929) loss 2.6068 (3.7565) grad_norm 1.2916 (1.2150) [2022-10-02 01:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][700/1251] eta 0:02:41 lr 0.000822 time 0.2879 (0.2922) loss 2.7724 (3.7618) grad_norm 1.0178 (1.2104) [2022-10-02 01:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][800/1251] eta 0:02:11 lr 0.000822 time 0.2868 (0.2917) loss 3.4257 (3.7574) grad_norm 1.1407 (1.2135) [2022-10-02 01:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][900/1251] eta 0:01:42 lr 0.000822 time 0.2861 (0.2914) loss 3.3300 (3.7574) grad_norm 1.2651 (1.2124) [2022-10-02 01:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1000/1251] eta 0:01:13 lr 0.000821 time 0.2869 (0.2911) loss 4.3821 (3.7601) grad_norm 1.3375 (1.2133) [2022-10-02 01:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1100/1251] eta 0:00:43 lr 0.000821 time 0.2890 (0.2908) loss 3.6601 (3.7535) grad_norm 1.1718 (1.2137) [2022-10-02 01:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1200/1251] eta 0:00:14 lr 0.000821 time 0.2875 (0.2906) loss 3.4198 (3.7582) grad_norm 1.0880 (1.2142) [2022-10-02 01:42:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 83 training takes 0:06:03 [2022-10-02 01:42:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.494 (2.494) Loss 1.2108 (1.2108) Acc@1 72.363 (72.363) Acc@5 91.113 (91.113) [2022-10-02 01:42:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.866 Acc@5 91.616 [2022-10-02 01:42:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-10-02 01:42:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.87% [2022-10-02 01:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][0/1251] eta 1:08:39 lr 0.000821 time 3.2931 (3.2931) loss 3.4944 (3.4944) grad_norm 1.0712 (1.0712) [2022-10-02 01:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][100/1251] eta 0:06:09 lr 0.000820 time 0.2897 (0.3206) loss 2.7983 (3.7616) grad_norm 0.9899 (1.1759) [2022-10-02 01:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][200/1251] eta 0:05:21 lr 0.000820 time 0.2937 (0.3057) loss 4.1650 (3.7635) grad_norm 1.2069 (1.1947) [2022-10-02 01:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][300/1251] eta 0:04:45 lr 0.000820 time 0.2869 (0.3004) loss 2.6251 (3.7447) grad_norm 1.3267 (1.2141) [2022-10-02 01:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][400/1251] eta 0:04:13 lr 0.000819 time 0.2912 (0.2978) loss 2.6246 (3.7714) grad_norm 1.1955 (1.2189) [2022-10-02 01:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][500/1251] eta 0:03:42 lr 0.000819 time 0.2900 (0.2964) loss 3.8591 (3.7825) grad_norm 1.2524 (1.2119) [2022-10-02 01:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][600/1251] eta 0:03:12 lr 0.000819 time 0.2883 (0.2954) loss 3.9805 (3.7754) grad_norm 1.1019 (1.2133) [2022-10-02 01:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][700/1251] eta 0:02:42 lr 0.000818 time 0.2878 (0.2946) loss 2.9299 (3.7823) grad_norm 1.1868 (1.2116) [2022-10-02 01:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][800/1251] eta 0:02:12 lr 0.000818 time 0.2869 (0.2939) loss 3.8654 (3.7822) grad_norm 1.1757 (1.2102) [2022-10-02 01:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][900/1251] eta 0:01:42 lr 0.000818 time 0.2891 (0.2934) loss 4.0374 (3.7888) grad_norm 1.2045 (1.2162) [2022-10-02 01:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1000/1251] eta 0:01:13 lr 0.000817 time 0.2875 (0.2930) loss 2.9594 (3.7896) grad_norm 1.7412 (1.2192) [2022-10-02 01:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1100/1251] eta 0:00:44 lr 0.000817 time 0.2881 (0.2925) loss 3.1051 (3.7867) grad_norm 1.1764 (1.2225) [2022-10-02 01:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1200/1251] eta 0:00:14 lr 0.000817 time 0.2859 (0.2921) loss 4.4157 (3.7855) grad_norm 1.0558 (1.2185) [2022-10-02 01:48:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 84 training takes 0:06:05 [2022-10-02 01:49:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.817 (2.817) Loss 1.1432 (1.1432) Acc@1 72.168 (72.168) Acc@5 92.578 (92.578) [2022-10-02 01:49:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.834 Acc@5 91.664 [2022-10-02 01:49:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-02 01:49:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.87% [2022-10-02 01:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][0/1251] eta 0:49:28 lr 0.000817 time 2.3731 (2.3731) loss 2.9677 (2.9677) grad_norm 1.3390 (1.3390) [2022-10-02 01:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][100/1251] eta 0:06:03 lr 0.000816 time 0.2883 (0.3158) loss 2.7285 (3.7673) grad_norm 1.1862 (1.2231) [2022-10-02 01:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][200/1251] eta 0:05:18 lr 0.000816 time 0.2879 (0.3035) loss 3.9622 (3.7555) grad_norm 1.3215 (1.2138) [2022-10-02 01:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][300/1251] eta 0:04:44 lr 0.000816 time 0.2897 (0.2995) loss 4.5268 (3.7605) grad_norm 1.3332 (1.2177) [2022-10-02 01:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][400/1251] eta 0:04:13 lr 0.000815 time 0.2897 (0.2975) loss 4.1054 (3.7699) grad_norm 1.1775 (1.2177) [2022-10-02 01:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][500/1251] eta 0:03:42 lr 0.000815 time 0.2906 (0.2964) loss 3.3045 (3.7574) grad_norm 1.1333 (1.2162) [2022-10-02 01:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][600/1251] eta 0:03:12 lr 0.000815 time 0.2945 (0.2956) loss 3.3210 (3.7578) grad_norm 1.1338 (1.2212) [2022-10-02 01:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][700/1251] eta 0:02:42 lr 0.000814 time 0.2858 (0.2949) loss 4.7940 (3.7583) grad_norm 1.0318 (1.2233) [2022-10-02 01:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][800/1251] eta 0:02:12 lr 0.000814 time 0.2912 (0.2945) loss 3.8830 (3.7556) grad_norm 1.6797 (1.2262) [2022-10-02 01:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][900/1251] eta 0:01:43 lr 0.000814 time 0.2859 (0.2941) loss 4.2329 (3.7532) grad_norm 1.1541 (1.2259) [2022-10-02 01:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1000/1251] eta 0:01:13 lr 0.000813 time 0.2943 (0.2937) loss 3.4184 (3.7568) grad_norm 1.1791 (1.2210) [2022-10-02 01:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1100/1251] eta 0:00:44 lr 0.000813 time 0.2860 (0.2934) loss 4.3797 (3.7565) grad_norm 1.1102 (1.2218) [2022-10-02 01:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1200/1251] eta 0:00:14 lr 0.000813 time 0.2890 (0.2931) loss 4.1887 (3.7620) grad_norm 1.4241 (1.2237) [2022-10-02 01:55:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 85 training takes 0:06:06 [2022-10-02 01:55:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.013 (3.013) Loss 1.1384 (1.1384) Acc@1 75.098 (75.098) Acc@5 92.090 (92.090) [2022-10-02 01:55:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.842 Acc@5 91.724 [2022-10-02 01:55:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-02 01:55:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 72.87% [2022-10-02 01:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][0/1251] eta 1:09:22 lr 0.000812 time 3.3276 (3.3276) loss 4.4212 (4.4212) grad_norm 1.0709 (1.0709) [2022-10-02 01:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][100/1251] eta 0:06:07 lr 0.000812 time 0.2858 (0.3191) loss 4.4146 (3.8380) grad_norm 1.1056 (1.2246) [2022-10-02 01:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][200/1251] eta 0:05:19 lr 0.000812 time 0.2976 (0.3042) loss 4.0376 (3.8175) grad_norm 1.1582 (1.2203) [2022-10-02 01:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][300/1251] eta 0:04:44 lr 0.000811 time 0.2870 (0.2990) loss 3.9084 (3.8550) grad_norm 1.1865 (1.2085) [2022-10-02 01:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][400/1251] eta 0:04:12 lr 0.000811 time 0.2881 (0.2964) loss 4.0713 (3.8292) grad_norm 1.1164 (1.2157) [2022-10-02 01:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][500/1251] eta 0:03:41 lr 0.000811 time 0.2865 (0.2948) loss 3.8742 (3.8044) grad_norm 1.2251 (1.2194) [2022-10-02 01:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][600/1251] eta 0:03:11 lr 0.000811 time 0.2901 (0.2936) loss 4.5699 (3.7972) grad_norm 1.1358 (1.2224) [2022-10-02 01:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][700/1251] eta 0:02:41 lr 0.000810 time 0.2874 (0.2928) loss 3.0880 (3.8028) grad_norm 1.0335 (1.2232) [2022-10-02 01:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][800/1251] eta 0:02:11 lr 0.000810 time 0.2885 (0.2923) loss 4.5081 (3.7954) grad_norm 1.2123 (1.2225) [2022-10-02 01:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][900/1251] eta 0:01:42 lr 0.000810 time 0.2864 (0.2918) loss 4.3402 (3.7965) grad_norm 1.1081 (1.2201) [2022-10-02 02:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1000/1251] eta 0:01:13 lr 0.000809 time 0.2890 (0.2915) loss 3.9005 (3.7864) grad_norm 1.0533 (1.2230) [2022-10-02 02:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1100/1251] eta 0:00:43 lr 0.000809 time 0.2860 (0.2911) loss 3.8485 (3.7829) grad_norm 1.2044 (1.2236) [2022-10-02 02:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1200/1251] eta 0:00:14 lr 0.000809 time 0.2903 (0.2909) loss 4.4232 (3.7781) grad_norm 1.2444 (1.2251) [2022-10-02 02:01:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 86 training takes 0:06:04 [2022-10-02 02:01:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.383 (3.383) Loss 1.2462 (1.2462) Acc@1 69.434 (69.434) Acc@5 91.016 (91.016) [2022-10-02 02:01:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.996 Acc@5 91.582 [2022-10-02 02:01:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-02 02:01:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.00% [2022-10-02 02:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][0/1251] eta 1:06:00 lr 0.000808 time 3.1655 (3.1655) loss 3.9191 (3.9191) grad_norm 1.3450 (1.3450) [2022-10-02 02:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][100/1251] eta 0:06:07 lr 0.000808 time 0.2940 (0.3189) loss 4.0427 (3.7973) grad_norm 1.0611 (1.2082) [2022-10-02 02:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][200/1251] eta 0:05:19 lr 0.000808 time 0.2880 (0.3044) loss 3.3029 (3.6993) grad_norm 1.4548 (1.2231) [2022-10-02 02:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][300/1251] eta 0:04:44 lr 0.000807 time 0.2905 (0.2996) loss 3.5556 (3.7181) grad_norm 1.0892 (1.2375) [2022-10-02 02:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][400/1251] eta 0:04:13 lr 0.000807 time 0.2891 (0.2974) loss 2.5181 (3.7221) grad_norm 1.3525 (1.2445) [2022-10-02 02:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][500/1251] eta 0:03:42 lr 0.000807 time 0.2925 (0.2961) loss 4.1596 (3.7308) grad_norm 1.2677 (1.2478) [2022-10-02 02:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][600/1251] eta 0:03:12 lr 0.000806 time 0.2891 (0.2951) loss 3.3885 (3.7442) grad_norm 1.2022 (1.2401) [2022-10-02 02:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][700/1251] eta 0:02:42 lr 0.000806 time 0.2913 (0.2944) loss 3.1829 (3.7441) grad_norm 1.0672 (1.2426) [2022-10-02 02:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][800/1251] eta 0:02:12 lr 0.000806 time 0.2880 (0.2936) loss 3.6389 (3.7448) grad_norm 1.4854 (1.2381) [2022-10-02 02:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][900/1251] eta 0:01:42 lr 0.000805 time 0.2903 (0.2930) loss 3.6991 (3.7319) grad_norm 1.2699 (1.2374) [2022-10-02 02:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1000/1251] eta 0:01:13 lr 0.000805 time 0.2851 (0.2925) loss 2.3977 (3.7328) grad_norm 1.1292 (1.2334) [2022-10-02 02:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1100/1251] eta 0:00:44 lr 0.000805 time 0.2873 (0.2921) loss 4.4610 (3.7411) grad_norm 1.0711 (1.2323) [2022-10-02 02:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1200/1251] eta 0:00:14 lr 0.000804 time 0.2843 (0.2917) loss 3.9336 (3.7471) grad_norm 1.2016 (1.2325) [2022-10-02 02:07:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 87 training takes 0:06:05 [2022-10-02 02:07:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.860 (2.860) Loss 1.0566 (1.0566) Acc@1 74.414 (74.414) Acc@5 93.848 (93.848) [2022-10-02 02:08:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.038 Acc@5 91.798 [2022-10-02 02:08:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-02 02:08:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.04% [2022-10-02 02:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][0/1251] eta 0:56:12 lr 0.000804 time 2.6962 (2.6962) loss 4.2656 (4.2656) grad_norm 1.6228 (1.6228) [2022-10-02 02:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][100/1251] eta 0:06:07 lr 0.000804 time 0.2896 (0.3194) loss 4.1050 (3.7106) grad_norm 1.1896 (1.2566) [2022-10-02 02:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][200/1251] eta 0:05:20 lr 0.000804 time 0.2925 (0.3053) loss 4.3890 (3.7215) grad_norm 1.2899 (1.2464) [2022-10-02 02:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][300/1251] eta 0:04:45 lr 0.000803 time 0.2961 (0.3004) loss 4.1623 (3.7243) grad_norm 1.1305 (1.2389) [2022-10-02 02:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][400/1251] eta 0:04:13 lr 0.000803 time 0.2886 (0.2978) loss 3.4842 (3.7113) grad_norm 1.1068 (1.2341) [2022-10-02 02:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][500/1251] eta 0:03:42 lr 0.000803 time 0.2928 (0.2962) loss 2.9155 (3.7275) grad_norm 1.1009 (1.2345) [2022-10-02 02:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][600/1251] eta 0:03:12 lr 0.000802 time 0.2870 (0.2950) loss 4.0501 (3.7289) grad_norm 1.2513 (1.2362) [2022-10-02 02:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][700/1251] eta 0:02:42 lr 0.000802 time 0.2919 (0.2942) loss 4.5158 (3.7277) grad_norm 1.3236 (1.2394) [2022-10-02 02:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][800/1251] eta 0:02:12 lr 0.000802 time 0.2870 (0.2934) loss 2.6967 (3.7289) grad_norm 1.1333 (1.2371) [2022-10-02 02:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][900/1251] eta 0:01:42 lr 0.000801 time 0.2966 (0.2928) loss 3.1903 (3.7367) grad_norm 1.3294 (1.2341) [2022-10-02 02:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1000/1251] eta 0:01:13 lr 0.000801 time 0.2880 (0.2923) loss 4.2400 (3.7497) grad_norm 1.2350 (1.2330) [2022-10-02 02:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1100/1251] eta 0:00:44 lr 0.000801 time 0.2874 (0.2918) loss 3.2986 (3.7567) grad_norm 1.1430 (1.2340) [2022-10-02 02:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1200/1251] eta 0:00:14 lr 0.000800 time 0.2852 (0.2915) loss 3.8314 (3.7556) grad_norm 1.3899 (1.2330) [2022-10-02 02:14:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 88 training takes 0:06:04 [2022-10-02 02:14:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.171 (3.171) Loss 1.1962 (1.1962) Acc@1 72.559 (72.559) Acc@5 91.895 (91.895) [2022-10-02 02:14:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 72.840 Acc@5 91.796 [2022-10-02 02:14:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-10-02 02:14:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.04% [2022-10-02 02:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][0/1251] eta 0:53:22 lr 0.000800 time 2.5596 (2.5596) loss 3.8191 (3.8191) grad_norm 1.0683 (1.0683) [2022-10-02 02:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][100/1251] eta 0:06:04 lr 0.000800 time 0.2924 (0.3170) loss 3.7945 (3.6321) grad_norm 1.0029 (1.2571) [2022-10-02 02:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][200/1251] eta 0:05:19 lr 0.000799 time 0.2917 (0.3037) loss 4.1593 (3.7011) grad_norm 1.0430 (1.2404) [2022-10-02 02:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][300/1251] eta 0:04:44 lr 0.000799 time 0.2933 (0.2989) loss 4.2841 (3.7367) grad_norm 1.4600 (1.2472) [2022-10-02 02:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][400/1251] eta 0:04:12 lr 0.000799 time 0.2858 (0.2965) loss 4.2226 (3.7278) grad_norm 1.1671 (1.2500) [2022-10-02 02:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][500/1251] eta 0:03:41 lr 0.000798 time 0.2894 (0.2950) loss 4.1028 (3.7271) grad_norm 1.1685 (1.2507) [2022-10-02 02:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][600/1251] eta 0:03:11 lr 0.000798 time 0.2928 (0.2940) loss 4.4846 (3.7308) grad_norm 1.1993 (1.2498) [2022-10-02 02:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][700/1251] eta 0:02:41 lr 0.000798 time 0.2914 (0.2933) loss 3.7947 (3.7414) grad_norm 1.0627 (1.2445) [2022-10-02 02:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][800/1251] eta 0:02:12 lr 0.000797 time 0.2891 (0.2927) loss 3.8695 (3.7476) grad_norm 1.6286 (1.2452) [2022-10-02 02:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][900/1251] eta 0:01:42 lr 0.000797 time 0.2905 (0.2922) loss 3.4932 (3.7503) grad_norm 1.2015 (1.2417) [2022-10-02 02:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1000/1251] eta 0:01:13 lr 0.000797 time 0.2894 (0.2918) loss 3.1143 (3.7532) grad_norm 1.1433 (1.2396) [2022-10-02 02:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1100/1251] eta 0:00:44 lr 0.000796 time 0.2908 (0.2915) loss 4.2412 (3.7551) grad_norm 1.0280 (1.2381) [2022-10-02 02:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1200/1251] eta 0:00:14 lr 0.000796 time 0.2878 (0.2912) loss 4.1470 (3.7480) grad_norm 1.3957 (1.2370) [2022-10-02 02:20:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 89 training takes 0:06:04 [2022-10-02 02:20:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.947 (2.947) Loss 1.1605 (1.1605) Acc@1 74.121 (74.121) Acc@5 91.699 (91.699) [2022-10-02 02:20:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.006 Acc@5 91.794 [2022-10-02 02:20:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-10-02 02:20:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.04% [2022-10-02 02:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][0/1251] eta 1:06:31 lr 0.000796 time 3.1906 (3.1906) loss 4.0502 (4.0502) grad_norm 1.0467 (1.0467) [2022-10-02 02:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][100/1251] eta 0:06:05 lr 0.000796 time 0.2897 (0.3173) loss 4.4529 (3.7748) grad_norm 1.3223 (1.2315) [2022-10-02 02:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][200/1251] eta 0:05:18 lr 0.000795 time 0.2906 (0.3027) loss 3.8258 (3.7095) grad_norm 1.1296 (1.2294) [2022-10-02 02:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][300/1251] eta 0:04:43 lr 0.000795 time 0.2910 (0.2981) loss 4.1725 (3.7275) grad_norm 1.1994 (1.2334) [2022-10-02 02:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][400/1251] eta 0:04:11 lr 0.000795 time 0.2884 (0.2959) loss 3.1852 (3.7396) grad_norm 1.2072 (1.2333) [2022-10-02 02:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][500/1251] eta 0:03:41 lr 0.000794 time 0.2871 (0.2944) loss 3.6590 (3.7457) grad_norm 1.3138 (1.2290) [2022-10-02 02:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][600/1251] eta 0:03:11 lr 0.000794 time 0.2904 (0.2934) loss 3.7412 (3.7345) grad_norm 1.1472 (1.2366) [2022-10-02 02:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][700/1251] eta 0:02:41 lr 0.000794 time 0.2868 (0.2927) loss 4.0452 (3.7363) grad_norm 1.3799 (1.2406) [2022-10-02 02:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][800/1251] eta 0:02:11 lr 0.000793 time 0.2912 (0.2922) loss 4.4301 (3.7350) grad_norm 1.0461 (1.2402) [2022-10-02 02:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][900/1251] eta 0:01:42 lr 0.000793 time 0.2852 (0.2918) loss 3.8308 (3.7395) grad_norm 1.0460 (1.2371) [2022-10-02 02:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1000/1251] eta 0:01:13 lr 0.000793 time 0.2898 (0.2914) loss 3.3126 (3.7410) grad_norm 1.2718 (1.2340) [2022-10-02 02:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1100/1251] eta 0:00:43 lr 0.000792 time 0.2850 (0.2912) loss 3.4633 (3.7387) grad_norm 1.3049 (1.2342) [2022-10-02 02:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1200/1251] eta 0:00:14 lr 0.000792 time 0.2900 (0.2910) loss 2.9969 (3.7335) grad_norm 1.0782 (1.2326) [2022-10-02 02:26:42 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 90 training takes 0:06:04 [2022-10-02 02:26:42 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saving...... [2022-10-02 02:26:43 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saved !!! [2022-10-02 02:26:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.071 (3.071) Loss 1.2015 (1.2015) Acc@1 70.996 (70.996) Acc@5 91.504 (91.504) [2022-10-02 02:26:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.392 Acc@5 91.860 [2022-10-02 02:26:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-02 02:26:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.39% [2022-10-02 02:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][0/1251] eta 1:06:54 lr 0.000792 time 3.2089 (3.2089) loss 3.4245 (3.4245) grad_norm 1.3323 (1.3323) [2022-10-02 02:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][100/1251] eta 0:06:08 lr 0.000791 time 0.2891 (0.3198) loss 3.9683 (3.7820) grad_norm 1.0898 (1.2345) [2022-10-02 02:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][200/1251] eta 0:05:20 lr 0.000791 time 0.2924 (0.3049) loss 4.1722 (3.7610) grad_norm 1.2623 (1.2495) [2022-10-02 02:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][300/1251] eta 0:04:45 lr 0.000791 time 0.2932 (0.3001) loss 3.8392 (3.7441) grad_norm 1.1651 (1.2503) [2022-10-02 02:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][400/1251] eta 0:04:13 lr 0.000790 time 0.2939 (0.2976) loss 2.9971 (3.7662) grad_norm 1.8666 (1.2479) [2022-10-02 02:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][500/1251] eta 0:03:42 lr 0.000790 time 0.2860 (0.2962) loss 3.5022 (3.7587) grad_norm 1.2511 (1.2428) [2022-10-02 02:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][600/1251] eta 0:03:12 lr 0.000790 time 0.2914 (0.2952) loss 4.1823 (3.7641) grad_norm 1.1002 (1.2449) [2022-10-02 02:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][700/1251] eta 0:02:42 lr 0.000789 time 0.2872 (0.2943) loss 4.4792 (3.7749) grad_norm 1.2275 (1.2430) [2022-10-02 02:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][800/1251] eta 0:02:12 lr 0.000789 time 0.2935 (0.2937) loss 3.9016 (3.7777) grad_norm 1.1898 (1.2441) [2022-10-02 02:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][900/1251] eta 0:01:42 lr 0.000789 time 0.2874 (0.2933) loss 4.1085 (3.7805) grad_norm 1.2282 (1.2449) [2022-10-02 02:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1000/1251] eta 0:01:13 lr 0.000788 time 0.2946 (0.2929) loss 4.0544 (3.7843) grad_norm 1.2823 (1.2419) [2022-10-02 02:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1100/1251] eta 0:00:44 lr 0.000788 time 0.2859 (0.2925) loss 3.7054 (3.7839) grad_norm 1.2766 (1.2426) [2022-10-02 02:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1200/1251] eta 0:00:14 lr 0.000788 time 0.2943 (0.2923) loss 2.8488 (3.7765) grad_norm 1.1533 (1.2406) [2022-10-02 02:33:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 91 training takes 0:06:05 [2022-10-02 02:33:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.561 (2.561) Loss 1.0876 (1.0876) Acc@1 75.488 (75.488) Acc@5 91.699 (91.699) [2022-10-02 02:33:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.228 Acc@5 91.802 [2022-10-02 02:33:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.2% [2022-10-02 02:33:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.39% [2022-10-02 02:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][0/1251] eta 1:11:52 lr 0.000788 time 3.4475 (3.4475) loss 3.7150 (3.7150) grad_norm 1.4957 (1.4957) [2022-10-02 02:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][100/1251] eta 0:06:09 lr 0.000787 time 0.2906 (0.3211) loss 4.0218 (3.6591) grad_norm 0.9655 (1.2578) [2022-10-02 02:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][200/1251] eta 0:05:21 lr 0.000787 time 0.2906 (0.3058) loss 3.3213 (3.7409) grad_norm 1.4102 (1.2554) [2022-10-02 02:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][300/1251] eta 0:04:45 lr 0.000786 time 0.2881 (0.3006) loss 3.5716 (3.7477) grad_norm 0.9587 (1.2431) [2022-10-02 02:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][400/1251] eta 0:04:13 lr 0.000786 time 0.2885 (0.2979) loss 3.7621 (3.7672) grad_norm 1.1526 (1.2388) [2022-10-02 02:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][500/1251] eta 0:03:42 lr 0.000786 time 0.2888 (0.2961) loss 2.8098 (3.7729) grad_norm 1.5934 (1.2410) [2022-10-02 02:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][600/1251] eta 0:03:12 lr 0.000785 time 0.2935 (0.2951) loss 4.4170 (3.7712) grad_norm 1.3825 (1.2385) [2022-10-02 02:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][700/1251] eta 0:02:42 lr 0.000785 time 0.2859 (0.2941) loss 3.8256 (3.7717) grad_norm 1.3927 (1.2402) [2022-10-02 02:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][800/1251] eta 0:02:12 lr 0.000785 time 0.2912 (0.2934) loss 4.0789 (3.7707) grad_norm 1.1400 (1.2375) [2022-10-02 02:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][900/1251] eta 0:01:42 lr 0.000784 time 0.2897 (0.2929) loss 4.3342 (3.7656) grad_norm 1.2765 (1.2360) [2022-10-02 02:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1000/1251] eta 0:01:13 lr 0.000784 time 0.2925 (0.2924) loss 4.4702 (3.7590) grad_norm 1.0724 (1.2378) [2022-10-02 02:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1100/1251] eta 0:00:44 lr 0.000784 time 0.2908 (0.2920) loss 3.9987 (3.7563) grad_norm 1.1566 (1.2360) [2022-10-02 02:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1200/1251] eta 0:00:14 lr 0.000783 time 0.2876 (0.2917) loss 3.2975 (3.7567) grad_norm 1.0548 (1.2362) [2022-10-02 02:39:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 92 training takes 0:06:05 [2022-10-02 02:39:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.465 (2.465) Loss 1.2323 (1.2323) Acc@1 72.559 (72.559) Acc@5 90.430 (90.430) [2022-10-02 02:39:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.360 Acc@5 91.850 [2022-10-02 02:39:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-02 02:39:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.39% [2022-10-02 02:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][0/1251] eta 1:09:04 lr 0.000783 time 3.3126 (3.3126) loss 4.3175 (4.3175) grad_norm 1.2732 (1.2732) [2022-10-02 02:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][100/1251] eta 0:06:07 lr 0.000783 time 0.2919 (0.3190) loss 4.2831 (3.8218) grad_norm 1.2801 (1.2505) [2022-10-02 02:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][200/1251] eta 0:05:19 lr 0.000783 time 0.2924 (0.3044) loss 4.1054 (3.7951) grad_norm 1.0730 (1.2560) [2022-10-02 02:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][300/1251] eta 0:04:44 lr 0.000782 time 0.2950 (0.2994) loss 2.5494 (3.8001) grad_norm 1.1707 (1.2545) [2022-10-02 02:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][400/1251] eta 0:04:12 lr 0.000782 time 0.2906 (0.2968) loss 3.1746 (3.7881) grad_norm 1.2600 (1.2563) [2022-10-02 02:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][500/1251] eta 0:03:41 lr 0.000782 time 0.2860 (0.2951) loss 4.2335 (3.7869) grad_norm 1.5862 (1.2604) [2022-10-02 02:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][600/1251] eta 0:03:11 lr 0.000781 time 0.2879 (0.2940) loss 3.7328 (3.7788) grad_norm 1.1545 (1.2648) [2022-10-02 02:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][700/1251] eta 0:02:41 lr 0.000781 time 0.2878 (0.2931) loss 4.3189 (3.7750) grad_norm 1.6317 (1.2584) [2022-10-02 02:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][800/1251] eta 0:02:11 lr 0.000780 time 0.2888 (0.2924) loss 2.9681 (3.7779) grad_norm 1.3113 (1.2584) [2022-10-02 02:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][900/1251] eta 0:01:42 lr 0.000780 time 0.2881 (0.2919) loss 2.6619 (3.7712) grad_norm 1.2403 (1.2543) [2022-10-02 02:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1000/1251] eta 0:01:13 lr 0.000780 time 0.2864 (0.2915) loss 4.5485 (3.7680) grad_norm 1.3844 (1.2534) [2022-10-02 02:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1100/1251] eta 0:00:43 lr 0.000779 time 0.2861 (0.2912) loss 3.7686 (3.7667) grad_norm 1.1272 (1.2555) [2022-10-02 02:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1200/1251] eta 0:00:14 lr 0.000779 time 0.2870 (0.2909) loss 3.5731 (3.7636) grad_norm 1.2137 (1.2538) [2022-10-02 02:45:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 93 training takes 0:06:04 [2022-10-02 02:45:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.024 (3.024) Loss 1.2528 (1.2528) Acc@1 70.117 (70.117) Acc@5 91.211 (91.211) [2022-10-02 02:45:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.420 Acc@5 91.932 [2022-10-02 02:45:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-02 02:45:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.42% [2022-10-02 02:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][0/1251] eta 1:10:23 lr 0.000779 time 3.3761 (3.3761) loss 3.4339 (3.4339) grad_norm 1.0120 (1.0120) [2022-10-02 02:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][100/1251] eta 0:06:09 lr 0.000779 time 0.2886 (0.3212) loss 4.2984 (3.6835) grad_norm 1.1848 (1.2577) [2022-10-02 02:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][200/1251] eta 0:05:20 lr 0.000778 time 0.2900 (0.3054) loss 2.6818 (3.7284) grad_norm 1.1875 (1.2503) [2022-10-02 02:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][300/1251] eta 0:04:45 lr 0.000778 time 0.2859 (0.3000) loss 3.3044 (3.7274) grad_norm 1.0838 (1.2499) [2022-10-02 02:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][400/1251] eta 0:04:12 lr 0.000778 time 0.2909 (0.2972) loss 3.7883 (3.7585) grad_norm 1.0953 (1.2483) [2022-10-02 02:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][500/1251] eta 0:03:41 lr 0.000777 time 0.2865 (0.2956) loss 4.3422 (3.7395) grad_norm 1.3655 (1.2457) [2022-10-02 02:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][600/1251] eta 0:03:11 lr 0.000777 time 0.2881 (0.2945) loss 2.8628 (3.7368) grad_norm 1.0207 (1.2460) [2022-10-02 02:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][700/1251] eta 0:02:41 lr 0.000777 time 0.2869 (0.2936) loss 3.6232 (3.7484) grad_norm 1.3014 (1.2482) [2022-10-02 02:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][800/1251] eta 0:02:12 lr 0.000776 time 0.2881 (0.2930) loss 3.6479 (3.7555) grad_norm 1.1064 (1.2514) [2022-10-02 02:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][900/1251] eta 0:01:42 lr 0.000776 time 0.2905 (0.2925) loss 4.2790 (3.7510) grad_norm 1.1053 (1.2522) [2022-10-02 02:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1000/1251] eta 0:01:13 lr 0.000775 time 0.2880 (0.2921) loss 4.3053 (3.7465) grad_norm 1.2225 (1.2504) [2022-10-02 02:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1100/1251] eta 0:00:44 lr 0.000775 time 0.2854 (0.2917) loss 3.6810 (3.7449) grad_norm 1.2772 (1.2507) [2022-10-02 02:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1200/1251] eta 0:00:14 lr 0.000775 time 0.2894 (0.2914) loss 3.2289 (3.7392) grad_norm 1.7028 (1.2514) [2022-10-02 02:51:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 94 training takes 0:06:04 [2022-10-02 02:51:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.827 (2.827) Loss 1.1303 (1.1303) Acc@1 72.949 (72.949) Acc@5 92.480 (92.480) [2022-10-02 02:52:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.386 Acc@5 92.072 [2022-10-02 02:52:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-02 02:52:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.42% [2022-10-02 02:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][0/1251] eta 0:57:02 lr 0.000775 time 2.7359 (2.7359) loss 4.1048 (4.1048) grad_norm 1.1327 (1.1327) [2022-10-02 02:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][100/1251] eta 0:06:04 lr 0.000774 time 0.2876 (0.3166) loss 2.9275 (3.7496) grad_norm 1.1953 (1.2546) [2022-10-02 02:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][200/1251] eta 0:05:18 lr 0.000774 time 0.2899 (0.3027) loss 3.3639 (3.7312) grad_norm 1.5521 (1.2623) [2022-10-02 02:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][300/1251] eta 0:04:43 lr 0.000774 time 0.2865 (0.2980) loss 4.2157 (3.7641) grad_norm 1.0947 (1.2712) [2022-10-02 02:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][400/1251] eta 0:04:11 lr 0.000773 time 0.2908 (0.2958) loss 4.4270 (3.7614) grad_norm 1.2164 (1.2631) [2022-10-02 02:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][500/1251] eta 0:03:41 lr 0.000773 time 0.2857 (0.2944) loss 2.4895 (3.7469) grad_norm 1.1961 (1.2622) [2022-10-02 02:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][600/1251] eta 0:03:11 lr 0.000773 time 0.2890 (0.2935) loss 2.4573 (3.7487) grad_norm 1.3525 (1.2663) [2022-10-02 02:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][700/1251] eta 0:02:41 lr 0.000772 time 0.2885 (0.2928) loss 3.4890 (3.7441) grad_norm 1.3334 (1.2673) [2022-10-02 02:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][800/1251] eta 0:02:11 lr 0.000772 time 0.2859 (0.2922) loss 3.7644 (3.7367) grad_norm 1.0909 (1.2650) [2022-10-02 02:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][900/1251] eta 0:01:42 lr 0.000771 time 0.2854 (0.2916) loss 3.7647 (3.7371) grad_norm 1.3312 (1.2639) [2022-10-02 02:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1000/1251] eta 0:01:13 lr 0.000771 time 0.2857 (0.2912) loss 3.0813 (3.7340) grad_norm 1.1752 (1.2625) [2022-10-02 02:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1100/1251] eta 0:00:43 lr 0.000771 time 0.2874 (0.2909) loss 4.5003 (3.7343) grad_norm 1.1377 (1.2630) [2022-10-02 02:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1200/1251] eta 0:00:14 lr 0.000770 time 0.2825 (0.2906) loss 3.9123 (3.7312) grad_norm 1.0021 (1.2638) [2022-10-02 02:58:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 95 training takes 0:06:03 [2022-10-02 02:58:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.077 (3.077) Loss 1.1514 (1.1514) Acc@1 75.977 (75.977) Acc@5 91.406 (91.406) [2022-10-02 02:58:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.404 Acc@5 91.930 [2022-10-02 02:58:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-10-02 02:58:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.42% [2022-10-02 02:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][0/1251] eta 1:06:09 lr 0.000770 time 3.1731 (3.1731) loss 4.4992 (4.4992) grad_norm 1.3058 (1.3058) [2022-10-02 02:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][100/1251] eta 0:06:09 lr 0.000770 time 0.2901 (0.3212) loss 4.3682 (3.6228) grad_norm 1.3500 (1.2882) [2022-10-02 02:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][200/1251] eta 0:05:21 lr 0.000770 time 0.2920 (0.3064) loss 2.6962 (3.6931) grad_norm 1.4384 (1.2686) [2022-10-02 02:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][300/1251] eta 0:04:46 lr 0.000769 time 0.2901 (0.3013) loss 2.4461 (3.6950) grad_norm 1.0510 (1.2660) [2022-10-02 03:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][400/1251] eta 0:04:14 lr 0.000769 time 0.2901 (0.2986) loss 2.7295 (3.7099) grad_norm 1.4594 (1.2634) [2022-10-02 03:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][500/1251] eta 0:03:42 lr 0.000768 time 0.2931 (0.2968) loss 2.9656 (3.7078) grad_norm 1.3343 (1.2661) [2022-10-02 03:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][600/1251] eta 0:03:12 lr 0.000768 time 0.2887 (0.2955) loss 3.8889 (3.7304) grad_norm 1.5954 (1.2673) [2022-10-02 03:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][700/1251] eta 0:02:42 lr 0.000768 time 0.2903 (0.2946) loss 4.2877 (3.7331) grad_norm 1.3045 (1.2683) [2022-10-02 03:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][800/1251] eta 0:02:12 lr 0.000767 time 0.2867 (0.2939) loss 4.3864 (3.7425) grad_norm 1.3799 (1.2686) [2022-10-02 03:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][900/1251] eta 0:01:42 lr 0.000767 time 0.2887 (0.2934) loss 4.4127 (3.7455) grad_norm 1.3127 (1.2668) [2022-10-02 03:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1000/1251] eta 0:01:13 lr 0.000767 time 0.2891 (0.2929) loss 3.5227 (3.7387) grad_norm 1.2848 (1.2654) [2022-10-02 03:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1100/1251] eta 0:00:44 lr 0.000766 time 0.2885 (0.2925) loss 3.3618 (3.7392) grad_norm 1.3495 (1.2669) [2022-10-02 03:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1200/1251] eta 0:00:14 lr 0.000766 time 0.2858 (0.2922) loss 3.9890 (3.7434) grad_norm 1.1678 (1.2669) [2022-10-02 03:04:28 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 96 training takes 0:06:05 [2022-10-02 03:04:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.772 (2.772) Loss 1.0017 (1.0017) Acc@1 76.562 (76.562) Acc@5 93.457 (93.457) [2022-10-02 03:04:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.456 Acc@5 91.926 [2022-10-02 03:04:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-02 03:04:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.46% [2022-10-02 03:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][0/1251] eta 1:06:20 lr 0.000766 time 3.1815 (3.1815) loss 3.5047 (3.5047) grad_norm 1.5119 (1.5119) [2022-10-02 03:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][100/1251] eta 0:06:08 lr 0.000765 time 0.2936 (0.3203) loss 4.2054 (3.6822) grad_norm 1.5699 (1.2326) [2022-10-02 03:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][200/1251] eta 0:05:20 lr 0.000765 time 0.2881 (0.3053) loss 3.4053 (3.7573) grad_norm 1.0866 (1.2404) [2022-10-02 03:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][300/1251] eta 0:04:45 lr 0.000765 time 0.2953 (0.3005) loss 4.3803 (3.7552) grad_norm 1.0887 (1.2523) [2022-10-02 03:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][400/1251] eta 0:04:13 lr 0.000764 time 0.2884 (0.2980) loss 4.1376 (3.7590) grad_norm 1.1197 (1.2561) [2022-10-02 03:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][500/1251] eta 0:03:42 lr 0.000764 time 0.2948 (0.2964) loss 4.2595 (3.7457) grad_norm 1.1193 (1.2506) [2022-10-02 03:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][600/1251] eta 0:03:12 lr 0.000764 time 0.2868 (0.2954) loss 4.3607 (3.7417) grad_norm 1.1823 (1.2528) [2022-10-02 03:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][700/1251] eta 0:02:42 lr 0.000763 time 0.2941 (0.2946) loss 3.8547 (3.7450) grad_norm 1.5068 (1.2566) [2022-10-02 03:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][800/1251] eta 0:02:12 lr 0.000763 time 0.2864 (0.2940) loss 4.8211 (3.7344) grad_norm 1.1892 (1.2585) [2022-10-02 03:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][900/1251] eta 0:01:43 lr 0.000763 time 0.2899 (0.2936) loss 4.2456 (3.7383) grad_norm 1.1047 (1.2595) [2022-10-02 03:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1000/1251] eta 0:01:13 lr 0.000762 time 0.2870 (0.2931) loss 2.6730 (3.7261) grad_norm 1.1602 (1.2566) [2022-10-02 03:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1100/1251] eta 0:00:44 lr 0.000762 time 0.2910 (0.2927) loss 4.0190 (3.7195) grad_norm 1.3283 (1.2543) [2022-10-02 03:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1200/1251] eta 0:00:14 lr 0.000762 time 0.2876 (0.2923) loss 3.3451 (3.7235) grad_norm 1.1750 (1.2559) [2022-10-02 03:10:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 97 training takes 0:06:05 [2022-10-02 03:10:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.101 (3.101) Loss 1.2555 (1.2555) Acc@1 71.777 (71.777) Acc@5 91.504 (91.504) [2022-10-02 03:10:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.594 Acc@5 91.912 [2022-10-02 03:10:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-02 03:10:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.59% [2022-10-02 03:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][0/1251] eta 0:55:02 lr 0.000761 time 2.6397 (2.6397) loss 4.2950 (4.2950) grad_norm 1.1173 (1.1173) [2022-10-02 03:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][100/1251] eta 0:06:03 lr 0.000761 time 0.2910 (0.3157) loss 4.3102 (3.7061) grad_norm 1.1416 (1.2959) [2022-10-02 03:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][200/1251] eta 0:05:17 lr 0.000761 time 0.2896 (0.3020) loss 4.5296 (3.6917) grad_norm 1.4174 (1.2769) [2022-10-02 03:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][300/1251] eta 0:04:42 lr 0.000760 time 0.2864 (0.2973) loss 3.7582 (3.7089) grad_norm 1.2374 (1.2699) [2022-10-02 03:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][400/1251] eta 0:04:10 lr 0.000760 time 0.2870 (0.2949) loss 3.2829 (3.7158) grad_norm 1.4063 (1.2732) [2022-10-02 03:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][500/1251] eta 0:03:40 lr 0.000760 time 0.2884 (0.2935) loss 3.6491 (3.7091) grad_norm 1.2684 (1.2682) [2022-10-02 03:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][600/1251] eta 0:03:10 lr 0.000759 time 0.2877 (0.2925) loss 3.7892 (3.7033) grad_norm 1.3740 (1.2732) [2022-10-02 03:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][700/1251] eta 0:02:40 lr 0.000759 time 0.2857 (0.2918) loss 2.8945 (3.7010) grad_norm 1.1243 (1.2712) [2022-10-02 03:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][800/1251] eta 0:02:11 lr 0.000759 time 0.2866 (0.2913) loss 3.3356 (3.7040) grad_norm 1.3288 (1.2732) [2022-10-02 03:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][900/1251] eta 0:01:42 lr 0.000758 time 0.2886 (0.2909) loss 3.5454 (3.6999) grad_norm 1.4312 (1.2708) [2022-10-02 03:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1000/1251] eta 0:01:12 lr 0.000758 time 0.2867 (0.2906) loss 4.0089 (3.7077) grad_norm 1.0685 (1.2670) [2022-10-02 03:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1100/1251] eta 0:00:43 lr 0.000758 time 0.2877 (0.2903) loss 3.9889 (3.7119) grad_norm 1.2922 (1.2688) [2022-10-02 03:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1200/1251] eta 0:00:14 lr 0.000757 time 0.2864 (0.2902) loss 2.7168 (3.7135) grad_norm 1.0629 (1.2709) [2022-10-02 03:17:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 98 training takes 0:06:03 [2022-10-02 03:17:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.759 (2.759) Loss 1.1482 (1.1482) Acc@1 73.633 (73.633) Acc@5 92.090 (92.090) [2022-10-02 03:17:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.452 Acc@5 91.984 [2022-10-02 03:17:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-02 03:17:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.59% [2022-10-02 03:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][0/1251] eta 1:07:01 lr 0.000757 time 3.2147 (3.2147) loss 3.0402 (3.0402) grad_norm 1.3500 (1.3500) [2022-10-02 03:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][100/1251] eta 0:06:06 lr 0.000757 time 0.2861 (0.3183) loss 2.4576 (3.6753) grad_norm 1.1999 (1.2711) [2022-10-02 03:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][200/1251] eta 0:05:19 lr 0.000756 time 0.2879 (0.3038) loss 3.1214 (3.6653) grad_norm 1.3615 (1.2917) [2022-10-02 03:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][300/1251] eta 0:04:44 lr 0.000756 time 0.2858 (0.2989) loss 2.6664 (3.6688) grad_norm 1.3059 (1.2898) [2022-10-02 03:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][400/1251] eta 0:04:12 lr 0.000756 time 0.2880 (0.2965) loss 3.1878 (3.7080) grad_norm 1.1507 (1.2866) [2022-10-02 03:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][500/1251] eta 0:03:41 lr 0.000755 time 0.2844 (0.2951) loss 3.0193 (3.7085) grad_norm 1.1625 (1.2720) [2022-10-02 03:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][600/1251] eta 0:03:11 lr 0.000755 time 0.2919 (0.2941) loss 4.4601 (3.6957) grad_norm 1.1371 (1.2692) [2022-10-02 03:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][700/1251] eta 0:02:41 lr 0.000754 time 0.2869 (0.2934) loss 3.5595 (3.6949) grad_norm 1.3459 (1.2675) [2022-10-02 03:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][800/1251] eta 0:02:12 lr 0.000754 time 0.2932 (0.2928) loss 4.1831 (3.6941) grad_norm 1.2997 (1.2671) [2022-10-02 03:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][900/1251] eta 0:01:42 lr 0.000754 time 0.2875 (0.2924) loss 3.3136 (3.7055) grad_norm 1.5056 (1.2703) [2022-10-02 03:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1000/1251] eta 0:01:13 lr 0.000753 time 0.2930 (0.2921) loss 3.3898 (3.7107) grad_norm 1.2979 (1.2682) [2022-10-02 03:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1100/1251] eta 0:00:44 lr 0.000753 time 0.2887 (0.2918) loss 2.8197 (3.7141) grad_norm 1.2522 (1.2665) [2022-10-02 03:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1200/1251] eta 0:00:14 lr 0.000753 time 0.2919 (0.2916) loss 3.8466 (3.7187) grad_norm 1.1303 (1.2677) [2022-10-02 03:23:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 99 training takes 0:06:05 [2022-10-02 03:23:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.345 (2.345) Loss 1.0128 (1.0128) Acc@1 77.539 (77.539) Acc@5 93.066 (93.066) [2022-10-02 03:23:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.512 Acc@5 92.138 [2022-10-02 03:23:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.5% [2022-10-02 03:23:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.59% [2022-10-02 03:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][0/1251] eta 1:06:15 lr 0.000753 time 3.1776 (3.1776) loss 2.9960 (2.9960) grad_norm 1.4010 (1.4010) [2022-10-02 03:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][100/1251] eta 0:06:06 lr 0.000752 time 0.2838 (0.3181) loss 2.9544 (3.7127) grad_norm 1.7474 (1.2509) [2022-10-02 03:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][200/1251] eta 0:05:19 lr 0.000752 time 0.2914 (0.3037) loss 3.1958 (3.6593) grad_norm 1.0803 (1.2637) [2022-10-02 03:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][300/1251] eta 0:04:44 lr 0.000751 time 0.2869 (0.2987) loss 3.4850 (3.6850) grad_norm 1.2907 (1.2596) [2022-10-02 03:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][400/1251] eta 0:04:12 lr 0.000751 time 0.2872 (0.2961) loss 3.5585 (3.6983) grad_norm 1.3294 (1.2585) [2022-10-02 03:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][500/1251] eta 0:03:41 lr 0.000751 time 0.2881 (0.2944) loss 3.9610 (3.7129) grad_norm 1.1754 (1.2622) [2022-10-02 03:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][600/1251] eta 0:03:10 lr 0.000750 time 0.2882 (0.2933) loss 3.5857 (3.7239) grad_norm 1.2116 (1.2686) [2022-10-02 03:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][700/1251] eta 0:02:41 lr 0.000750 time 0.2880 (0.2924) loss 4.1461 (3.7274) grad_norm 1.3444 (1.2684) [2022-10-02 03:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][800/1251] eta 0:02:11 lr 0.000750 time 0.2866 (0.2917) loss 3.7352 (3.7139) grad_norm 1.1638 (1.2690) [2022-10-02 03:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][900/1251] eta 0:01:42 lr 0.000749 time 0.2883 (0.2912) loss 4.4009 (3.7111) grad_norm 1.2993 (1.2701) [2022-10-02 03:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1000/1251] eta 0:01:12 lr 0.000749 time 0.2846 (0.2908) loss 3.4993 (3.7103) grad_norm 1.3575 (1.2730) [2022-10-02 03:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1100/1251] eta 0:00:43 lr 0.000749 time 0.2850 (0.2905) loss 3.4347 (3.7142) grad_norm 1.3947 (1.2772) [2022-10-02 03:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1200/1251] eta 0:00:14 lr 0.000748 time 0.2838 (0.2901) loss 3.2783 (3.7139) grad_norm 1.2024 (1.2767) [2022-10-02 03:29:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 100 training takes 0:06:03 [2022-10-02 03:29:36 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saving...... [2022-10-02 03:29:36 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saved !!! [2022-10-02 03:29:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.481 (2.481) Loss 1.2016 (1.2016) Acc@1 72.754 (72.754) Acc@5 90.918 (90.918) [2022-10-02 03:29:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.560 Acc@5 92.182 [2022-10-02 03:29:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-02 03:29:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.59% [2022-10-02 03:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][0/1251] eta 1:05:32 lr 0.000748 time 3.1438 (3.1438) loss 3.2942 (3.2942) grad_norm 1.2299 (1.2299) [2022-10-02 03:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][100/1251] eta 0:06:06 lr 0.000748 time 0.2881 (0.3188) loss 4.1605 (3.7328) grad_norm 1.2815 (1.2873) [2022-10-02 03:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][200/1251] eta 0:05:20 lr 0.000747 time 0.2920 (0.3047) loss 3.4984 (3.7285) grad_norm 1.0986 (1.2811) [2022-10-02 03:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][300/1251] eta 0:04:45 lr 0.000747 time 0.2848 (0.3000) loss 3.8195 (3.6964) grad_norm 1.4486 (1.2788) [2022-10-02 03:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][400/1251] eta 0:04:13 lr 0.000747 time 0.2911 (0.2976) loss 2.8905 (3.6945) grad_norm 1.0900 (1.2818) [2022-10-02 03:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][500/1251] eta 0:03:42 lr 0.000746 time 0.2857 (0.2960) loss 2.7951 (3.6908) grad_norm 1.1806 (1.2763) [2022-10-02 03:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][600/1251] eta 0:03:12 lr 0.000746 time 0.2893 (0.2950) loss 3.3514 (3.7033) grad_norm 1.2401 (1.2816) [2022-10-02 03:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][700/1251] eta 0:02:42 lr 0.000745 time 0.2906 (0.2943) loss 4.0085 (3.7015) grad_norm 1.2639 (1.2796) [2022-10-02 03:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][800/1251] eta 0:02:12 lr 0.000745 time 0.2931 (0.2936) loss 4.0242 (3.7102) grad_norm 1.2055 (1.2758) [2022-10-02 03:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][900/1251] eta 0:01:42 lr 0.000745 time 0.2863 (0.2932) loss 3.5977 (3.7086) grad_norm 1.2796 (1.2749) [2022-10-02 03:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1000/1251] eta 0:01:13 lr 0.000744 time 0.2920 (0.2928) loss 3.2756 (3.7070) grad_norm 1.2940 (1.2778) [2022-10-02 03:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1100/1251] eta 0:00:44 lr 0.000744 time 0.2880 (0.2926) loss 3.8144 (3.7092) grad_norm 1.4531 (1.2754) [2022-10-02 03:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1200/1251] eta 0:00:14 lr 0.000744 time 0.2925 (0.2924) loss 4.0678 (3.7134) grad_norm 1.1607 (1.2769) [2022-10-02 03:35:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 101 training takes 0:06:06 [2022-10-02 03:35:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.160 (3.160) Loss 1.1927 (1.1927) Acc@1 70.801 (70.801) Acc@5 91.504 (91.504) [2022-10-02 03:36:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.834 Acc@5 92.122 [2022-10-02 03:36:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-02 03:36:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.83% [2022-10-02 03:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][0/1251] eta 1:09:38 lr 0.000743 time 3.3404 (3.3404) loss 3.8927 (3.8927) grad_norm 1.1101 (1.1101) [2022-10-02 03:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][100/1251] eta 0:06:08 lr 0.000743 time 0.2899 (0.3198) loss 4.4487 (3.6724) grad_norm 1.7012 (1.2746) [2022-10-02 03:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][200/1251] eta 0:05:20 lr 0.000743 time 0.2873 (0.3046) loss 4.1394 (3.6505) grad_norm 1.2921 (1.2725) [2022-10-02 03:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][300/1251] eta 0:04:44 lr 0.000742 time 0.2903 (0.2996) loss 3.7926 (3.7082) grad_norm 1.6733 (1.2746) [2022-10-02 03:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][400/1251] eta 0:04:12 lr 0.000742 time 0.3008 (0.2971) loss 3.9278 (3.7227) grad_norm 1.1281 (1.2801) [2022-10-02 03:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][500/1251] eta 0:03:41 lr 0.000742 time 0.2882 (0.2955) loss 3.3714 (3.7134) grad_norm 1.3776 (1.2815) [2022-10-02 03:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][600/1251] eta 0:03:11 lr 0.000741 time 0.2866 (0.2946) loss 3.4805 (3.7032) grad_norm 1.2286 (1.2787) [2022-10-02 03:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][700/1251] eta 0:02:41 lr 0.000741 time 0.2897 (0.2939) loss 4.6014 (3.6995) grad_norm 1.2916 (1.2788) [2022-10-02 03:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][800/1251] eta 0:02:12 lr 0.000741 time 0.2867 (0.2933) loss 3.8183 (3.7016) grad_norm 1.3207 (1.2767) [2022-10-02 03:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][900/1251] eta 0:01:42 lr 0.000740 time 0.2882 (0.2929) loss 4.3206 (3.7038) grad_norm 1.3078 (1.2768) [2022-10-02 03:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1000/1251] eta 0:01:13 lr 0.000740 time 0.2863 (0.2926) loss 4.1701 (3.7067) grad_norm 1.4220 (1.2794) [2022-10-02 03:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1100/1251] eta 0:00:44 lr 0.000739 time 0.2898 (0.2923) loss 4.4564 (3.7057) grad_norm 1.2926 (1.2821) [2022-10-02 03:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1200/1251] eta 0:00:14 lr 0.000739 time 0.2874 (0.2920) loss 3.2325 (3.7008) grad_norm 1.3319 (1.2835) [2022-10-02 03:42:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 102 training takes 0:06:05 [2022-10-02 03:42:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.437 (2.437) Loss 1.1496 (1.1496) Acc@1 72.461 (72.461) Acc@5 92.578 (92.578) [2022-10-02 03:42:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.574 Acc@5 92.134 [2022-10-02 03:42:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-10-02 03:42:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.83% [2022-10-02 03:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][0/1251] eta 0:46:12 lr 0.000739 time 2.2165 (2.2165) loss 3.8037 (3.8037) grad_norm 1.2457 (1.2457) [2022-10-02 03:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][100/1251] eta 0:06:02 lr 0.000739 time 0.2902 (0.3150) loss 3.7599 (3.6892) grad_norm 1.2380 (1.2973) [2022-10-02 03:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][200/1251] eta 0:05:17 lr 0.000738 time 0.2918 (0.3020) loss 3.8690 (3.6871) grad_norm 1.4409 (1.3040) [2022-10-02 03:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][300/1251] eta 0:04:42 lr 0.000738 time 0.2877 (0.2973) loss 3.0496 (3.6708) grad_norm 1.1262 (1.2940) [2022-10-02 03:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][400/1251] eta 0:04:11 lr 0.000737 time 0.2869 (0.2951) loss 4.0987 (3.6735) grad_norm 1.2261 (1.2840) [2022-10-02 03:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][500/1251] eta 0:03:40 lr 0.000737 time 0.2886 (0.2936) loss 3.9010 (3.6702) grad_norm 1.3796 (1.2849) [2022-10-02 03:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][600/1251] eta 0:03:10 lr 0.000737 time 0.2912 (0.2927) loss 2.7185 (3.6714) grad_norm 1.2713 (1.2840) [2022-10-02 03:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][700/1251] eta 0:02:40 lr 0.000736 time 0.2880 (0.2921) loss 2.6904 (3.6860) grad_norm 1.4181 (1.2821) [2022-10-02 03:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][800/1251] eta 0:02:11 lr 0.000736 time 0.2886 (0.2916) loss 4.0266 (3.6802) grad_norm 1.1679 (1.2878) [2022-10-02 03:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][900/1251] eta 0:01:42 lr 0.000736 time 0.2859 (0.2912) loss 3.9245 (3.6794) grad_norm 1.2155 (1.2901) [2022-10-02 03:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1000/1251] eta 0:01:13 lr 0.000735 time 0.2903 (0.2909) loss 3.6924 (3.6927) grad_norm 1.1228 (1.2883) [2022-10-02 03:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1100/1251] eta 0:00:43 lr 0.000735 time 0.2872 (0.2906) loss 3.9735 (3.6917) grad_norm 1.1832 (1.2889) [2022-10-02 03:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1200/1251] eta 0:00:14 lr 0.000735 time 0.2881 (0.2904) loss 4.3192 (3.6973) grad_norm 1.0772 (1.2909) [2022-10-02 03:48:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 103 training takes 0:06:03 [2022-10-02 03:48:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.126 (2.126) Loss 1.1133 (1.1133) Acc@1 75.098 (75.098) Acc@5 91.797 (91.797) [2022-10-02 03:48:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.886 Acc@5 92.182 [2022-10-02 03:48:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-10-02 03:48:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.89% [2022-10-02 03:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][0/1251] eta 0:48:59 lr 0.000734 time 2.3496 (2.3496) loss 3.6932 (3.6932) grad_norm 1.2075 (1.2075) [2022-10-02 03:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][100/1251] eta 0:05:57 lr 0.000734 time 0.2898 (0.3103) loss 3.8705 (3.6553) grad_norm 1.1598 (1.2898) [2022-10-02 03:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][200/1251] eta 0:05:14 lr 0.000734 time 0.2883 (0.2991) loss 3.2227 (3.6988) grad_norm 1.2221 (1.2927) [2022-10-02 03:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][300/1251] eta 0:04:40 lr 0.000733 time 0.2863 (0.2951) loss 3.9691 (3.7021) grad_norm 1.5607 (1.2777) [2022-10-02 03:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][400/1251] eta 0:04:09 lr 0.000733 time 0.2885 (0.2931) loss 3.6346 (3.6957) grad_norm 1.5644 (1.2789) [2022-10-02 03:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][500/1251] eta 0:03:39 lr 0.000732 time 0.2851 (0.2919) loss 3.9164 (3.6895) grad_norm 1.4816 (1.2803) [2022-10-02 03:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][600/1251] eta 0:03:09 lr 0.000732 time 0.2872 (0.2911) loss 3.5670 (3.6932) grad_norm 1.0831 (1.2783) [2022-10-02 03:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][700/1251] eta 0:02:40 lr 0.000732 time 0.2864 (0.2906) loss 4.0299 (3.6895) grad_norm 1.3500 (1.2793) [2022-10-02 03:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][800/1251] eta 0:02:10 lr 0.000731 time 0.2877 (0.2902) loss 4.2719 (3.6919) grad_norm 1.2979 (1.2811) [2022-10-02 03:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][900/1251] eta 0:01:41 lr 0.000731 time 0.2884 (0.2898) loss 3.8765 (3.6896) grad_norm 1.1648 (1.2816) [2022-10-02 03:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1000/1251] eta 0:01:12 lr 0.000731 time 0.2903 (0.2897) loss 3.7434 (3.6821) grad_norm 1.6044 (1.2822) [2022-10-02 03:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1100/1251] eta 0:00:43 lr 0.000730 time 0.2889 (0.2895) loss 4.3591 (3.6809) grad_norm 1.3004 (1.2837) [2022-10-02 03:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1200/1251] eta 0:00:14 lr 0.000730 time 0.2875 (0.2893) loss 2.6490 (3.6807) grad_norm 1.1498 (1.2868) [2022-10-02 03:54:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 104 training takes 0:06:02 [2022-10-02 03:54:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.233 (3.233) Loss 1.1863 (1.1863) Acc@1 71.973 (71.973) Acc@5 91.602 (91.602) [2022-10-02 03:54:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.754 Acc@5 92.334 [2022-10-02 03:54:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.8% [2022-10-02 03:54:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.89% [2022-10-02 03:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][0/1251] eta 1:01:19 lr 0.000730 time 2.9413 (2.9413) loss 3.8201 (3.8201) grad_norm 1.7263 (1.7263) [2022-10-02 03:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][100/1251] eta 0:06:05 lr 0.000729 time 0.2913 (0.3179) loss 4.2815 (3.6352) grad_norm 1.2665 (1.3197) [2022-10-02 03:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][200/1251] eta 0:05:20 lr 0.000729 time 0.2868 (0.3045) loss 4.2191 (3.6326) grad_norm 1.1380 (1.2984) [2022-10-02 03:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][300/1251] eta 0:04:45 lr 0.000729 time 0.2904 (0.2999) loss 4.1297 (3.6603) grad_norm 1.3938 (1.3037) [2022-10-02 03:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][400/1251] eta 0:04:13 lr 0.000728 time 0.2898 (0.2974) loss 3.2997 (3.6505) grad_norm 1.2427 (1.3053) [2022-10-02 03:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][500/1251] eta 0:03:42 lr 0.000728 time 0.2878 (0.2959) loss 2.9614 (3.6566) grad_norm 1.1587 (1.3081) [2022-10-02 03:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][600/1251] eta 0:03:11 lr 0.000728 time 0.2872 (0.2948) loss 3.4269 (3.6490) grad_norm 1.2513 (1.3051) [2022-10-02 03:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][700/1251] eta 0:02:42 lr 0.000727 time 0.2910 (0.2941) loss 4.1600 (3.6581) grad_norm 1.3034 (1.2985) [2022-10-02 03:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][800/1251] eta 0:02:12 lr 0.000727 time 0.2970 (0.2935) loss 4.1021 (3.6659) grad_norm 1.2246 (1.2970) [2022-10-02 03:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][900/1251] eta 0:01:42 lr 0.000726 time 0.2923 (0.2930) loss 3.4553 (3.6710) grad_norm 1.4057 (1.2982) [2022-10-02 03:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1000/1251] eta 0:01:13 lr 0.000726 time 0.2909 (0.2926) loss 3.9219 (3.6733) grad_norm 1.1879 (1.2949) [2022-10-02 04:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1100/1251] eta 0:00:44 lr 0.000726 time 0.2914 (0.2923) loss 3.0061 (3.6786) grad_norm 1.1315 (1.2939) [2022-10-02 04:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1200/1251] eta 0:00:14 lr 0.000725 time 0.2896 (0.2920) loss 4.0611 (3.6871) grad_norm 1.1515 (1.2953) [2022-10-02 04:01:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 105 training takes 0:06:05 [2022-10-02 04:01:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.492 (2.492) Loss 1.1445 (1.1445) Acc@1 73.633 (73.633) Acc@5 91.895 (91.895) [2022-10-02 04:01:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.868 Acc@5 92.230 [2022-10-02 04:01:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-10-02 04:01:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 73.89% [2022-10-02 04:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][0/1251] eta 0:48:05 lr 0.000725 time 2.3063 (2.3063) loss 3.8390 (3.8390) grad_norm 1.2867 (1.2867) [2022-10-02 04:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][100/1251] eta 0:06:05 lr 0.000725 time 0.2915 (0.3176) loss 3.5767 (3.6192) grad_norm 1.2090 (1.2969) [2022-10-02 04:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][200/1251] eta 0:05:19 lr 0.000724 time 0.2969 (0.3042) loss 3.8187 (3.6234) grad_norm 1.2623 (1.3014) [2022-10-02 04:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][300/1251] eta 0:04:44 lr 0.000724 time 0.2892 (0.2995) loss 3.5391 (3.6697) grad_norm 1.4707 (1.2981) [2022-10-02 04:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][400/1251] eta 0:04:12 lr 0.000724 time 0.2924 (0.2972) loss 3.3097 (3.6848) grad_norm 1.1430 (1.2920) [2022-10-02 04:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][500/1251] eta 0:03:42 lr 0.000723 time 0.2916 (0.2959) loss 4.0153 (3.6957) grad_norm 1.1902 (1.2920) [2022-10-02 04:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][600/1251] eta 0:03:12 lr 0.000723 time 0.2899 (0.2950) loss 3.3131 (3.6921) grad_norm 1.1897 (1.2952) [2022-10-02 04:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][700/1251] eta 0:02:42 lr 0.000722 time 0.2925 (0.2943) loss 2.6902 (3.7069) grad_norm 1.3112 (1.2928) [2022-10-02 04:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][800/1251] eta 0:02:12 lr 0.000722 time 0.2897 (0.2938) loss 4.1113 (3.7072) grad_norm 1.3097 (1.2957) [2022-10-02 04:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][900/1251] eta 0:01:42 lr 0.000722 time 0.2887 (0.2933) loss 3.7369 (3.7111) grad_norm 1.0392 (1.2924) [2022-10-02 04:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1000/1251] eta 0:01:13 lr 0.000721 time 0.2911 (0.2929) loss 3.8403 (3.7124) grad_norm 1.3251 (1.2923) [2022-10-02 04:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1100/1251] eta 0:00:44 lr 0.000721 time 0.2924 (0.2926) loss 3.9894 (3.7130) grad_norm 1.7103 (1.2931) [2022-10-02 04:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1200/1251] eta 0:00:14 lr 0.000721 time 0.2882 (0.2923) loss 2.6325 (3.7044) grad_norm 1.2418 (1.2906) [2022-10-02 04:07:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 106 training takes 0:06:05 [2022-10-02 04:07:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.243 (3.243) Loss 1.0501 (1.0501) Acc@1 75.586 (75.586) Acc@5 93.652 (93.652) [2022-10-02 04:07:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.086 Acc@5 92.280 [2022-10-02 04:07:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-10-02 04:07:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.09% [2022-10-02 04:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][0/1251] eta 1:10:03 lr 0.000720 time 3.3604 (3.3604) loss 2.6586 (2.6586) grad_norm 1.2573 (1.2573) [2022-10-02 04:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][100/1251] eta 0:06:10 lr 0.000720 time 0.2914 (0.3216) loss 3.9063 (3.6540) grad_norm 1.3759 (1.3095) [2022-10-02 04:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][200/1251] eta 0:05:21 lr 0.000720 time 0.2954 (0.3061) loss 3.1216 (3.6342) grad_norm 1.6521 (1.3122) [2022-10-02 04:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][300/1251] eta 0:04:46 lr 0.000719 time 0.2879 (0.3008) loss 3.7936 (3.6685) grad_norm 1.3460 (1.3054) [2022-10-02 04:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][400/1251] eta 0:04:13 lr 0.000719 time 0.2916 (0.2982) loss 4.5612 (3.6809) grad_norm 1.1479 (1.3047) [2022-10-02 04:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][500/1251] eta 0:03:42 lr 0.000719 time 0.2973 (0.2964) loss 2.8159 (3.6788) grad_norm 1.4966 (1.3043) [2022-10-02 04:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][600/1251] eta 0:03:12 lr 0.000718 time 0.2906 (0.2953) loss 3.2186 (3.6737) grad_norm 1.2482 (1.3083) [2022-10-02 04:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][700/1251] eta 0:02:42 lr 0.000718 time 0.2876 (0.2945) loss 4.0229 (3.6740) grad_norm 1.1543 (1.3074) [2022-10-02 04:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][800/1251] eta 0:02:12 lr 0.000717 time 0.2910 (0.2939) loss 3.8761 (3.6715) grad_norm 1.0728 (1.3067) [2022-10-02 04:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][900/1251] eta 0:01:43 lr 0.000717 time 0.2915 (0.2936) loss 3.4690 (3.6740) grad_norm 1.3016 (1.3096) [2022-10-02 04:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1000/1251] eta 0:01:13 lr 0.000717 time 0.2940 (0.2933) loss 4.1119 (3.6836) grad_norm 1.7793 (1.3135) [2022-10-02 04:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1100/1251] eta 0:00:44 lr 0.000716 time 0.2901 (0.2931) loss 4.2790 (3.6898) grad_norm 1.2698 (1.3119) [2022-10-02 04:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1200/1251] eta 0:00:14 lr 0.000716 time 0.2918 (0.2928) loss 4.0796 (3.6871) grad_norm 1.4476 (1.3124) [2022-10-02 04:13:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 107 training takes 0:06:06 [2022-10-02 04:13:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.395 (3.395) Loss 1.0746 (1.0746) Acc@1 73.828 (73.828) Acc@5 92.188 (92.188) [2022-10-02 04:13:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.880 Acc@5 92.300 [2022-10-02 04:13:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-10-02 04:13:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.09% [2022-10-02 04:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][0/1251] eta 0:50:58 lr 0.000716 time 2.4448 (2.4448) loss 4.1101 (4.1101) grad_norm 1.3181 (1.3181) [2022-10-02 04:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][100/1251] eta 0:06:02 lr 0.000715 time 0.2940 (0.3153) loss 4.0066 (3.6994) grad_norm 1.2360 (1.3106) [2022-10-02 04:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][200/1251] eta 0:05:18 lr 0.000715 time 0.2880 (0.3028) loss 4.3105 (3.7170) grad_norm 1.1662 (1.3391) [2022-10-02 04:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][300/1251] eta 0:04:43 lr 0.000715 time 0.2886 (0.2986) loss 4.1863 (3.7238) grad_norm 1.1109 (1.3328) [2022-10-02 04:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][400/1251] eta 0:04:12 lr 0.000714 time 0.2923 (0.2964) loss 3.9659 (3.6952) grad_norm 1.0874 (1.3287) [2022-10-02 04:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][500/1251] eta 0:03:41 lr 0.000714 time 0.2876 (0.2951) loss 2.6598 (3.7088) grad_norm 1.4601 (1.3211) [2022-10-02 04:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][600/1251] eta 0:03:11 lr 0.000714 time 0.2934 (0.2943) loss 2.9069 (3.7169) grad_norm 1.1915 (1.3178) [2022-10-02 04:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][700/1251] eta 0:02:41 lr 0.000713 time 0.2902 (0.2936) loss 2.9028 (3.7153) grad_norm 1.4943 (1.3161) [2022-10-02 04:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][800/1251] eta 0:02:12 lr 0.000713 time 0.2902 (0.2931) loss 4.0055 (3.7153) grad_norm 1.2882 (1.3091) [2022-10-02 04:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][900/1251] eta 0:01:42 lr 0.000712 time 0.2904 (0.2927) loss 2.9333 (3.7169) grad_norm 1.1803 (1.3099) [2022-10-02 04:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1000/1251] eta 0:01:13 lr 0.000712 time 0.2895 (0.2923) loss 3.8346 (3.7211) grad_norm 1.1917 (1.3115) [2022-10-02 04:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1100/1251] eta 0:00:44 lr 0.000712 time 0.2880 (0.2921) loss 3.4274 (3.7191) grad_norm 1.2426 (1.3111) [2022-10-02 04:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1200/1251] eta 0:00:14 lr 0.000711 time 0.2890 (0.2918) loss 3.2528 (3.7240) grad_norm 1.1561 (1.3084) [2022-10-02 04:19:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 108 training takes 0:06:05 [2022-10-02 04:20:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.964 (2.964) Loss 1.0439 (1.0439) Acc@1 75.879 (75.879) Acc@5 92.676 (92.676) [2022-10-02 04:20:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.854 Acc@5 92.282 [2022-10-02 04:20:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-10-02 04:20:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.09% [2022-10-02 04:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][0/1251] eta 1:06:55 lr 0.000711 time 3.2099 (3.2099) loss 3.6038 (3.6038) grad_norm 1.1358 (1.1358) [2022-10-02 04:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][100/1251] eta 0:06:06 lr 0.000711 time 0.2922 (0.3188) loss 3.3021 (3.6345) grad_norm 1.1849 (1.3323) [2022-10-02 04:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][200/1251] eta 0:05:19 lr 0.000710 time 0.2915 (0.3043) loss 3.6414 (3.6007) grad_norm 1.3657 (1.3320) [2022-10-02 04:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][300/1251] eta 0:04:44 lr 0.000710 time 0.2901 (0.2994) loss 3.9998 (3.6334) grad_norm 1.5299 (1.3246) [2022-10-02 04:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][400/1251] eta 0:04:12 lr 0.000710 time 0.2888 (0.2969) loss 2.8224 (3.6340) grad_norm 1.5318 (1.3291) [2022-10-02 04:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][500/1251] eta 0:03:41 lr 0.000709 time 0.2889 (0.2954) loss 4.4133 (3.6471) grad_norm 1.5200 (1.3230) [2022-10-02 04:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][600/1251] eta 0:03:11 lr 0.000709 time 0.2916 (0.2944) loss 3.7091 (3.6540) grad_norm 1.3675 (1.3169) [2022-10-02 04:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][700/1251] eta 0:02:41 lr 0.000708 time 0.2867 (0.2936) loss 3.7685 (3.6695) grad_norm 1.4567 (1.3161) [2022-10-02 04:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][800/1251] eta 0:02:12 lr 0.000708 time 0.2934 (0.2931) loss 4.5623 (3.6679) grad_norm 1.1658 (1.3164) [2022-10-02 04:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][900/1251] eta 0:01:42 lr 0.000708 time 0.2884 (0.2926) loss 3.3163 (3.6664) grad_norm 1.1779 (1.3178) [2022-10-02 04:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1000/1251] eta 0:01:13 lr 0.000707 time 0.2885 (0.2923) loss 2.7301 (3.6694) grad_norm 1.2225 (1.3174) [2022-10-02 04:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1100/1251] eta 0:00:44 lr 0.000707 time 0.2907 (0.2919) loss 3.9989 (3.6664) grad_norm 1.2506 (1.3163) [2022-10-02 04:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1200/1251] eta 0:00:14 lr 0.000707 time 0.2904 (0.2917) loss 3.0716 (3.6653) grad_norm 1.2124 (1.3156) [2022-10-02 04:26:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 109 training takes 0:06:05 [2022-10-02 04:26:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.671 (2.671) Loss 1.0942 (1.0942) Acc@1 72.949 (72.949) Acc@5 93.066 (93.066) [2022-10-02 04:26:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.272 Acc@5 92.398 [2022-10-02 04:26:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-02 04:26:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.27% [2022-10-02 04:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][0/1251] eta 1:07:50 lr 0.000706 time 3.2538 (3.2538) loss 4.1729 (4.1729) grad_norm 1.0863 (1.0863) [2022-10-02 04:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][100/1251] eta 0:06:06 lr 0.000706 time 0.2907 (0.3188) loss 3.4164 (3.6678) grad_norm 1.3033 (1.3240) [2022-10-02 04:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][200/1251] eta 0:05:19 lr 0.000706 time 0.2886 (0.3045) loss 4.0651 (3.6811) grad_norm 1.2678 (1.3099) [2022-10-02 04:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][300/1251] eta 0:04:44 lr 0.000705 time 0.2900 (0.2995) loss 3.0226 (3.6645) grad_norm 1.2022 (1.3142) [2022-10-02 04:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][400/1251] eta 0:04:12 lr 0.000705 time 0.2882 (0.2969) loss 3.4553 (3.6674) grad_norm 1.1891 (1.3148) [2022-10-02 04:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][500/1251] eta 0:03:41 lr 0.000704 time 0.2893 (0.2953) loss 3.8992 (3.6717) grad_norm 1.3736 (1.3156) [2022-10-02 04:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][600/1251] eta 0:03:11 lr 0.000704 time 0.2881 (0.2942) loss 4.4349 (3.6737) grad_norm 1.4248 (1.3163) [2022-10-02 04:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][700/1251] eta 0:02:41 lr 0.000704 time 0.2912 (0.2933) loss 4.1371 (3.6701) grad_norm 1.3476 (1.3134) [2022-10-02 04:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][800/1251] eta 0:02:12 lr 0.000703 time 0.2920 (0.2928) loss 3.1119 (3.6626) grad_norm 1.5492 (1.3155) [2022-10-02 04:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][900/1251] eta 0:01:42 lr 0.000703 time 0.2912 (0.2924) loss 4.0662 (3.6628) grad_norm 1.1569 (1.3170) [2022-10-02 04:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1000/1251] eta 0:01:13 lr 0.000703 time 0.2919 (0.2920) loss 4.2714 (3.6581) grad_norm 1.1167 (1.3183) [2022-10-02 04:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1100/1251] eta 0:00:44 lr 0.000702 time 0.2862 (0.2918) loss 4.0349 (3.6522) grad_norm 1.4359 (1.3184) [2022-10-02 04:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1200/1251] eta 0:00:14 lr 0.000702 time 0.2892 (0.2915) loss 3.8545 (3.6525) grad_norm 1.1771 (1.3164) [2022-10-02 04:32:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 110 training takes 0:06:04 [2022-10-02 04:32:34 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saving...... [2022-10-02 04:32:34 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saved !!! [2022-10-02 04:32:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.158 (2.158) Loss 0.9859 (0.9859) Acc@1 78.516 (78.516) Acc@5 94.043 (94.043) [2022-10-02 04:32:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.160 Acc@5 92.386 [2022-10-02 04:32:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-10-02 04:32:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.27% [2022-10-02 04:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][0/1251] eta 1:00:34 lr 0.000702 time 2.9051 (2.9051) loss 3.4148 (3.4148) grad_norm 1.2769 (1.2769) [2022-10-02 04:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][100/1251] eta 0:06:05 lr 0.000701 time 0.2895 (0.3179) loss 3.7292 (3.6604) grad_norm 1.3230 (1.3212) [2022-10-02 04:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][200/1251] eta 0:05:19 lr 0.000701 time 0.2896 (0.3042) loss 3.7817 (3.6431) grad_norm 1.1504 (1.3036) [2022-10-02 04:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][300/1251] eta 0:04:45 lr 0.000700 time 0.2929 (0.2998) loss 2.6674 (3.6520) grad_norm 1.5995 (1.3059) [2022-10-02 04:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][400/1251] eta 0:04:13 lr 0.000700 time 0.2872 (0.2976) loss 2.7689 (3.6497) grad_norm 1.1281 (1.3092) [2022-10-02 04:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][500/1251] eta 0:03:42 lr 0.000700 time 0.2863 (0.2961) loss 3.7077 (3.6614) grad_norm 1.1774 (1.3105) [2022-10-02 04:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][600/1251] eta 0:03:12 lr 0.000699 time 0.2902 (0.2952) loss 3.8812 (3.6713) grad_norm 1.1838 (1.3129) [2022-10-02 04:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][700/1251] eta 0:02:42 lr 0.000699 time 0.2886 (0.2946) loss 3.8782 (3.6784) grad_norm 1.4417 (1.3121) [2022-10-02 04:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][800/1251] eta 0:02:12 lr 0.000699 time 0.2867 (0.2941) loss 2.8602 (3.6757) grad_norm 1.3063 (1.3131) [2022-10-02 04:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][900/1251] eta 0:01:43 lr 0.000698 time 0.2895 (0.2936) loss 4.4272 (3.6740) grad_norm 1.5255 (1.3153) [2022-10-02 04:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1000/1251] eta 0:01:13 lr 0.000698 time 0.2910 (0.2932) loss 3.3290 (3.6831) grad_norm 1.3238 (1.3149) [2022-10-02 04:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1100/1251] eta 0:00:44 lr 0.000697 time 0.2888 (0.2929) loss 2.4795 (3.6843) grad_norm 1.3238 (1.3155) [2022-10-02 04:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1200/1251] eta 0:00:14 lr 0.000697 time 0.2880 (0.2926) loss 4.6032 (3.6857) grad_norm 1.3287 (1.3165) [2022-10-02 04:38:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 111 training takes 0:06:06 [2022-10-02 04:38:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.111 (3.111) Loss 1.0791 (1.0791) Acc@1 75.293 (75.293) Acc@5 92.188 (92.188) [2022-10-02 04:39:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.174 Acc@5 92.434 [2022-10-02 04:39:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-10-02 04:39:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.27% [2022-10-02 04:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][0/1251] eta 1:08:20 lr 0.000697 time 3.2778 (3.2778) loss 4.2792 (4.2792) grad_norm 1.3091 (1.3091) [2022-10-02 04:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][100/1251] eta 0:06:07 lr 0.000696 time 0.2918 (0.3189) loss 3.8422 (3.6478) grad_norm 1.1553 (1.3240) [2022-10-02 04:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][200/1251] eta 0:05:19 lr 0.000696 time 0.2871 (0.3039) loss 4.0342 (3.6427) grad_norm 1.3772 (1.3390) [2022-10-02 04:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][300/1251] eta 0:04:44 lr 0.000696 time 0.2874 (0.2988) loss 3.8021 (3.6437) grad_norm 1.2853 (1.3331) [2022-10-02 04:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][400/1251] eta 0:04:12 lr 0.000695 time 0.2925 (0.2962) loss 3.6336 (3.6660) grad_norm 1.6082 (1.3314) [2022-10-02 04:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][500/1251] eta 0:03:41 lr 0.000695 time 0.2929 (0.2947) loss 3.8163 (3.6697) grad_norm 1.2511 (1.3293) [2022-10-02 04:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][600/1251] eta 0:03:11 lr 0.000695 time 0.2871 (0.2936) loss 3.9409 (3.6738) grad_norm 1.2923 (1.3358) [2022-10-02 04:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][700/1251] eta 0:02:41 lr 0.000694 time 0.2898 (0.2929) loss 3.2030 (3.6716) grad_norm 1.3880 (1.3370) [2022-10-02 04:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][800/1251] eta 0:02:11 lr 0.000694 time 0.2873 (0.2923) loss 3.2450 (3.6667) grad_norm 1.2907 (1.3362) [2022-10-02 04:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][900/1251] eta 0:01:42 lr 0.000693 time 0.2870 (0.2918) loss 3.2403 (3.6678) grad_norm 1.3141 (1.3315) [2022-10-02 04:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1000/1251] eta 0:01:13 lr 0.000693 time 0.2864 (0.2913) loss 4.2350 (3.6754) grad_norm 1.2154 (1.3322) [2022-10-02 04:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1100/1251] eta 0:00:43 lr 0.000693 time 0.2886 (0.2910) loss 3.8240 (3.6674) grad_norm 1.3859 (1.3300) [2022-10-02 04:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1200/1251] eta 0:00:14 lr 0.000692 time 0.2865 (0.2907) loss 3.7842 (3.6574) grad_norm 1.4132 (1.3298) [2022-10-02 04:45:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 112 training takes 0:06:03 [2022-10-02 04:45:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.243 (3.243) Loss 1.0561 (1.0561) Acc@1 75.293 (75.293) Acc@5 92.969 (92.969) [2022-10-02 04:45:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.296 Acc@5 92.364 [2022-10-02 04:45:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-02 04:45:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.30% [2022-10-02 04:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][0/1251] eta 0:53:33 lr 0.000692 time 2.5687 (2.5687) loss 2.7653 (2.7653) grad_norm 1.2887 (1.2887) [2022-10-02 04:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][100/1251] eta 0:05:59 lr 0.000692 time 0.2855 (0.3128) loss 3.6351 (3.6015) grad_norm 1.1603 (1.3368) [2022-10-02 04:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][200/1251] eta 0:05:15 lr 0.000691 time 0.2881 (0.3003) loss 4.6968 (3.6325) grad_norm 1.3085 (1.3294) [2022-10-02 04:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][300/1251] eta 0:04:41 lr 0.000691 time 0.2831 (0.2961) loss 4.0636 (3.6410) grad_norm 1.2394 (1.3144) [2022-10-02 04:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][400/1251] eta 0:04:10 lr 0.000690 time 0.2855 (0.2940) loss 3.9197 (3.6671) grad_norm 1.3926 (1.3181) [2022-10-02 04:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][500/1251] eta 0:03:39 lr 0.000690 time 0.2852 (0.2928) loss 3.9842 (3.6657) grad_norm 1.3107 (1.3248) [2022-10-02 04:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][600/1251] eta 0:03:10 lr 0.000690 time 0.2882 (0.2919) loss 3.7571 (3.6602) grad_norm 1.4085 (1.3326) [2022-10-02 04:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][700/1251] eta 0:02:40 lr 0.000689 time 0.2890 (0.2915) loss 4.3097 (3.6531) grad_norm 1.1294 (1.3303) [2022-10-02 04:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][800/1251] eta 0:02:11 lr 0.000689 time 0.2892 (0.2913) loss 3.5647 (3.6603) grad_norm 1.2993 (1.3330) [2022-10-02 04:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][900/1251] eta 0:01:42 lr 0.000689 time 0.2874 (0.2910) loss 4.1912 (3.6578) grad_norm 1.1677 (1.3356) [2022-10-02 04:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1000/1251] eta 0:01:12 lr 0.000688 time 0.2874 (0.2908) loss 3.9598 (3.6593) grad_norm 1.2392 (1.3390) [2022-10-02 04:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1100/1251] eta 0:00:43 lr 0.000688 time 0.2895 (0.2906) loss 4.2322 (3.6599) grad_norm 1.3060 (1.3360) [2022-10-02 04:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1200/1251] eta 0:00:14 lr 0.000687 time 0.2874 (0.2903) loss 3.8498 (3.6617) grad_norm 1.5275 (1.3362) [2022-10-02 04:51:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 113 training takes 0:06:03 [2022-10-02 04:51:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.816 (2.816) Loss 1.1171 (1.1171) Acc@1 75.977 (75.977) Acc@5 91.211 (91.211) [2022-10-02 04:51:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 73.992 Acc@5 92.310 [2022-10-02 04:51:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-10-02 04:51:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.30% [2022-10-02 04:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][0/1251] eta 0:45:13 lr 0.000687 time 2.1693 (2.1693) loss 3.8457 (3.8457) grad_norm 1.4415 (1.4415) [2022-10-02 04:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][100/1251] eta 0:06:00 lr 0.000687 time 0.2880 (0.3131) loss 4.5817 (3.6153) grad_norm 1.6491 (1.3287) [2022-10-02 04:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][200/1251] eta 0:05:16 lr 0.000686 time 0.2905 (0.3010) loss 4.1738 (3.6506) grad_norm 1.3220 (1.3274) [2022-10-02 04:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][300/1251] eta 0:04:42 lr 0.000686 time 0.2866 (0.2970) loss 3.6203 (3.6566) grad_norm 1.3186 (1.3315) [2022-10-02 04:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][400/1251] eta 0:04:10 lr 0.000686 time 0.2871 (0.2949) loss 4.2478 (3.6586) grad_norm 1.1633 (1.3330) [2022-10-02 04:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][500/1251] eta 0:03:40 lr 0.000685 time 0.2878 (0.2936) loss 2.8826 (3.6397) grad_norm 1.2223 (1.3299) [2022-10-02 04:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][600/1251] eta 0:03:10 lr 0.000685 time 0.2854 (0.2927) loss 4.0599 (3.6250) grad_norm 1.1277 (1.3299) [2022-10-02 04:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][700/1251] eta 0:02:40 lr 0.000685 time 0.2870 (0.2920) loss 2.9666 (3.6378) grad_norm 1.1675 (1.3302) [2022-10-02 04:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][800/1251] eta 0:02:11 lr 0.000684 time 0.2854 (0.2916) loss 4.4330 (3.6442) grad_norm 1.2545 (1.3299) [2022-10-02 04:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][900/1251] eta 0:01:42 lr 0.000684 time 0.2866 (0.2912) loss 3.6924 (3.6526) grad_norm 1.3666 (1.3371) [2022-10-02 04:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1000/1251] eta 0:01:13 lr 0.000683 time 0.2921 (0.2909) loss 3.7079 (3.6501) grad_norm 1.4593 (1.3375) [2022-10-02 04:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1100/1251] eta 0:00:43 lr 0.000683 time 0.2886 (0.2907) loss 3.7958 (3.6466) grad_norm 1.4349 (1.3398) [2022-10-02 04:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1200/1251] eta 0:00:14 lr 0.000683 time 0.2843 (0.2905) loss 3.8413 (3.6495) grad_norm 1.1173 (1.3383) [2022-10-02 04:57:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 114 training takes 0:06:03 [2022-10-02 04:57:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.027 (3.027) Loss 1.0748 (1.0748) Acc@1 74.902 (74.902) Acc@5 93.555 (93.555) [2022-10-02 04:57:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.182 Acc@5 92.430 [2022-10-02 04:57:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-10-02 04:57:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.30% [2022-10-02 04:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][0/1251] eta 1:05:48 lr 0.000682 time 3.1562 (3.1562) loss 4.1248 (4.1248) grad_norm 1.2781 (1.2781) [2022-10-02 04:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][100/1251] eta 0:06:06 lr 0.000682 time 0.2933 (0.3187) loss 2.6295 (3.6102) grad_norm 1.2403 (1.3294) [2022-10-02 04:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][200/1251] eta 0:05:19 lr 0.000682 time 0.2884 (0.3040) loss 4.1122 (3.5986) grad_norm 1.3075 (1.3232) [2022-10-02 04:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][300/1251] eta 0:04:44 lr 0.000681 time 0.2938 (0.2992) loss 4.2239 (3.6412) grad_norm 1.2179 (1.3380) [2022-10-02 04:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][400/1251] eta 0:04:12 lr 0.000681 time 0.2873 (0.2966) loss 2.6293 (3.6383) grad_norm 1.2303 (1.3397) [2022-10-02 05:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][500/1251] eta 0:03:41 lr 0.000680 time 0.2925 (0.2949) loss 2.4993 (3.6302) grad_norm 1.3366 (1.3559) [2022-10-02 05:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][600/1251] eta 0:03:11 lr 0.000680 time 0.2855 (0.2939) loss 3.8409 (3.6300) grad_norm 1.3411 (1.3527) [2022-10-02 05:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][700/1251] eta 0:02:41 lr 0.000680 time 0.2922 (0.2931) loss 3.7824 (3.6360) grad_norm 1.2402 (1.3526) [2022-10-02 05:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][800/1251] eta 0:02:11 lr 0.000679 time 0.2888 (0.2924) loss 4.2866 (3.6453) grad_norm 1.0917 (1.3543) [2022-10-02 05:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][900/1251] eta 0:01:42 lr 0.000679 time 0.2892 (0.2919) loss 4.0965 (3.6340) grad_norm 1.7601 (1.3512) [2022-10-02 05:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1000/1251] eta 0:01:13 lr 0.000679 time 0.2901 (0.2915) loss 3.3936 (3.6352) grad_norm 1.2626 (1.3504) [2022-10-02 05:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1100/1251] eta 0:00:43 lr 0.000678 time 0.2917 (0.2912) loss 3.1816 (3.6340) grad_norm 1.4874 (1.3483) [2022-10-02 05:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1200/1251] eta 0:00:14 lr 0.000678 time 0.2889 (0.2909) loss 3.5619 (3.6321) grad_norm 1.1809 (1.3491) [2022-10-02 05:03:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 115 training takes 0:06:04 [2022-10-02 05:04:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.274 (3.274) Loss 1.1322 (1.1322) Acc@1 74.609 (74.609) Acc@5 91.992 (91.992) [2022-10-02 05:04:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.348 Acc@5 92.584 [2022-10-02 05:04:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-02 05:04:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.35% [2022-10-02 05:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][0/1251] eta 1:08:02 lr 0.000678 time 3.2634 (3.2634) loss 3.7953 (3.7953) grad_norm 1.1893 (1.1893) [2022-10-02 05:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][100/1251] eta 0:06:04 lr 0.000677 time 0.2895 (0.3169) loss 4.3896 (3.6591) grad_norm 1.2718 (1.3757) [2022-10-02 05:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][200/1251] eta 0:05:17 lr 0.000677 time 0.2873 (0.3023) loss 3.4058 (3.6478) grad_norm 1.2409 (1.3744) [2022-10-02 05:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][300/1251] eta 0:04:42 lr 0.000676 time 0.2904 (0.2973) loss 3.8022 (3.6357) grad_norm 1.7780 (1.3722) [2022-10-02 05:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][400/1251] eta 0:04:10 lr 0.000676 time 0.2847 (0.2948) loss 2.8181 (3.6432) grad_norm 1.2874 (1.3589) [2022-10-02 05:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][500/1251] eta 0:03:40 lr 0.000676 time 0.2894 (0.2933) loss 3.4493 (3.6488) grad_norm 1.2867 (1.3583) [2022-10-02 05:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][600/1251] eta 0:03:10 lr 0.000675 time 0.2904 (0.2925) loss 3.9700 (3.6466) grad_norm 1.2260 (1.3559) [2022-10-02 05:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][700/1251] eta 0:02:40 lr 0.000675 time 0.2924 (0.2919) loss 3.3185 (3.6491) grad_norm 1.3883 (1.3537) [2022-10-02 05:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][800/1251] eta 0:02:11 lr 0.000674 time 0.2881 (0.2915) loss 4.0531 (3.6542) grad_norm 1.3135 (1.3513) [2022-10-02 05:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][900/1251] eta 0:01:42 lr 0.000674 time 0.2916 (0.2911) loss 3.8084 (3.6567) grad_norm 1.4396 (1.3469) [2022-10-02 05:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1000/1251] eta 0:01:12 lr 0.000674 time 0.2865 (0.2908) loss 3.5682 (3.6596) grad_norm 1.1226 (1.3466) [2022-10-02 05:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1100/1251] eta 0:00:43 lr 0.000673 time 0.2885 (0.2905) loss 3.7858 (3.6627) grad_norm 1.8852 (1.3474) [2022-10-02 05:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1200/1251] eta 0:00:14 lr 0.000673 time 0.2866 (0.2903) loss 3.7710 (3.6562) grad_norm 1.2139 (1.3486) [2022-10-02 05:10:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 116 training takes 0:06:03 [2022-10-02 05:10:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.664 (2.664) Loss 1.0467 (1.0467) Acc@1 76.367 (76.367) Acc@5 93.555 (93.555) [2022-10-02 05:10:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.276 Acc@5 92.492 [2022-10-02 05:10:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-10-02 05:10:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.35% [2022-10-02 05:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][0/1251] eta 1:09:21 lr 0.000673 time 3.3268 (3.3268) loss 3.8410 (3.8410) grad_norm 1.3365 (1.3365) [2022-10-02 05:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][100/1251] eta 0:06:08 lr 0.000672 time 0.2872 (0.3203) loss 4.3914 (3.6912) grad_norm 1.5370 (1.3471) [2022-10-02 05:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][200/1251] eta 0:05:20 lr 0.000672 time 0.2965 (0.3052) loss 3.1575 (3.6945) grad_norm 1.1281 (1.3426) [2022-10-02 05:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][300/1251] eta 0:04:45 lr 0.000672 time 0.2884 (0.2999) loss 2.6699 (3.6561) grad_norm 1.3086 (1.3396) [2022-10-02 05:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][400/1251] eta 0:04:13 lr 0.000671 time 0.2895 (0.2974) loss 2.8130 (3.6517) grad_norm 1.3177 (1.3415) [2022-10-02 05:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][500/1251] eta 0:03:42 lr 0.000671 time 0.2882 (0.2959) loss 3.2411 (3.6690) grad_norm 1.2818 (1.3419) [2022-10-02 05:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][600/1251] eta 0:03:11 lr 0.000670 time 0.2922 (0.2948) loss 3.4544 (3.6646) grad_norm 1.5343 (1.3505) [2022-10-02 05:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][700/1251] eta 0:02:42 lr 0.000670 time 0.2903 (0.2941) loss 3.8396 (3.6523) grad_norm 1.4475 (1.3486) [2022-10-02 05:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][800/1251] eta 0:02:12 lr 0.000670 time 0.2912 (0.2935) loss 4.2439 (3.6590) grad_norm 1.1365 (1.3441) [2022-10-02 05:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][900/1251] eta 0:01:42 lr 0.000669 time 0.2923 (0.2931) loss 4.1020 (3.6562) grad_norm 1.6655 (1.3443) [2022-10-02 05:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1000/1251] eta 0:01:13 lr 0.000669 time 0.2893 (0.2927) loss 3.1597 (3.6608) grad_norm 1.1281 (1.3484) [2022-10-02 05:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1100/1251] eta 0:00:44 lr 0.000668 time 0.2878 (0.2923) loss 4.2341 (3.6590) grad_norm 1.4590 (1.3481) [2022-10-02 05:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1200/1251] eta 0:00:14 lr 0.000668 time 0.2884 (0.2920) loss 4.0786 (3.6559) grad_norm 1.2908 (1.3507) [2022-10-02 05:16:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 117 training takes 0:06:05 [2022-10-02 05:16:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.934 (2.934) Loss 1.0752 (1.0752) Acc@1 74.512 (74.512) Acc@5 93.262 (93.262) [2022-10-02 05:16:47 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.524 Acc@5 92.282 [2022-10-02 05:16:47 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-10-02 05:16:47 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.52% [2022-10-02 05:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][0/1251] eta 1:08:43 lr 0.000668 time 3.2963 (3.2963) loss 2.5848 (2.5848) grad_norm 1.6773 (1.6773) [2022-10-02 05:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][100/1251] eta 0:06:08 lr 0.000667 time 0.2927 (0.3202) loss 3.6330 (3.6471) grad_norm 1.5421 (1.3069) [2022-10-02 05:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][200/1251] eta 0:05:20 lr 0.000667 time 0.2892 (0.3054) loss 2.8362 (3.6740) grad_norm 1.2265 (1.3263) [2022-10-02 05:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][300/1251] eta 0:04:45 lr 0.000667 time 0.2859 (0.3005) loss 3.5769 (3.6344) grad_norm 1.1904 (1.3275) [2022-10-02 05:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][400/1251] eta 0:04:13 lr 0.000666 time 0.2920 (0.2980) loss 3.1040 (3.6389) grad_norm 1.2438 (1.3253) [2022-10-02 05:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][500/1251] eta 0:03:42 lr 0.000666 time 0.2905 (0.2964) loss 3.3383 (3.6386) grad_norm 1.2708 (1.3299) [2022-10-02 05:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][600/1251] eta 0:03:12 lr 0.000665 time 0.2899 (0.2953) loss 4.2154 (3.6494) grad_norm 1.4967 (1.3295) [2022-10-02 05:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][700/1251] eta 0:02:42 lr 0.000665 time 0.2888 (0.2945) loss 3.5394 (3.6478) grad_norm 1.3820 (1.3406) [2022-10-02 05:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][800/1251] eta 0:02:12 lr 0.000665 time 0.2864 (0.2939) loss 4.1816 (3.6514) grad_norm 1.1632 (1.3412) [2022-10-02 05:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][900/1251] eta 0:01:43 lr 0.000664 time 0.2913 (0.2935) loss 3.9830 (3.6449) grad_norm 1.2518 (1.3429) [2022-10-02 05:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1000/1251] eta 0:01:13 lr 0.000664 time 0.2863 (0.2931) loss 3.9960 (3.6437) grad_norm 1.3107 (1.3480) [2022-10-02 05:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1100/1251] eta 0:00:44 lr 0.000663 time 0.2876 (0.2927) loss 3.3648 (3.6429) grad_norm 1.2880 (1.3490) [2022-10-02 05:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1200/1251] eta 0:00:14 lr 0.000663 time 0.2897 (0.2925) loss 3.8661 (3.6471) grad_norm 1.5418 (1.3515) [2022-10-02 05:22:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 118 training takes 0:06:06 [2022-10-02 05:22:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.939 (2.939) Loss 1.0758 (1.0758) Acc@1 75.684 (75.684) Acc@5 92.773 (92.773) [2022-10-02 05:23:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.568 Acc@5 92.532 [2022-10-02 05:23:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-10-02 05:23:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.57% [2022-10-02 05:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][0/1251] eta 1:01:32 lr 0.000663 time 2.9519 (2.9519) loss 4.0281 (4.0281) grad_norm 1.2286 (1.2286) [2022-10-02 05:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][100/1251] eta 0:06:03 lr 0.000662 time 0.2861 (0.3159) loss 3.7127 (3.5832) grad_norm 1.3337 (1.3384) [2022-10-02 05:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][200/1251] eta 0:05:17 lr 0.000662 time 0.2916 (0.3025) loss 4.0556 (3.5901) grad_norm 1.2355 (1.3580) [2022-10-02 05:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][300/1251] eta 0:04:43 lr 0.000662 time 0.2899 (0.2978) loss 3.4087 (3.5959) grad_norm 1.4889 (1.3464) [2022-10-02 05:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][400/1251] eta 0:04:11 lr 0.000661 time 0.2858 (0.2955) loss 3.2201 (3.5954) grad_norm 1.1975 (1.3567) [2022-10-02 05:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][500/1251] eta 0:03:41 lr 0.000661 time 0.2881 (0.2944) loss 3.2471 (3.5997) grad_norm 1.2189 (1.3547) [2022-10-02 05:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][600/1251] eta 0:03:11 lr 0.000661 time 0.2884 (0.2935) loss 3.1084 (3.6163) grad_norm 1.1842 (1.3586) [2022-10-02 05:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][700/1251] eta 0:02:41 lr 0.000660 time 0.2905 (0.2929) loss 2.5600 (3.6314) grad_norm 1.2949 (1.3626) [2022-10-02 05:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][800/1251] eta 0:02:11 lr 0.000660 time 0.2901 (0.2924) loss 4.0345 (3.6222) grad_norm 1.4701 (1.3622) [2022-10-02 05:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][900/1251] eta 0:01:42 lr 0.000659 time 0.2888 (0.2921) loss 2.5830 (3.6298) grad_norm 1.2142 (1.3635) [2022-10-02 05:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1000/1251] eta 0:01:13 lr 0.000659 time 0.2890 (0.2918) loss 3.7741 (3.6278) grad_norm 1.0634 (1.3637) [2022-10-02 05:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1100/1251] eta 0:00:44 lr 0.000659 time 0.2861 (0.2916) loss 3.8483 (3.6372) grad_norm 1.2732 (1.3598) [2022-10-02 05:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1200/1251] eta 0:00:14 lr 0.000658 time 0.2887 (0.2914) loss 4.3505 (3.6438) grad_norm 1.5063 (1.3591) [2022-10-02 05:29:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 119 training takes 0:06:04 [2022-10-02 05:29:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.894 (2.894) Loss 1.0765 (1.0765) Acc@1 75.195 (75.195) Acc@5 92.676 (92.676) [2022-10-02 05:29:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.652 Acc@5 92.632 [2022-10-02 05:29:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-02 05:29:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.65% [2022-10-02 05:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][0/1251] eta 1:05:26 lr 0.000658 time 3.1384 (3.1384) loss 2.8735 (2.8735) grad_norm 1.3902 (1.3902) [2022-10-02 05:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][100/1251] eta 0:06:05 lr 0.000658 time 0.2913 (0.3174) loss 4.2229 (3.6203) grad_norm 1.3224 (1.3995) [2022-10-02 05:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][200/1251] eta 0:05:18 lr 0.000657 time 0.2875 (0.3032) loss 4.0201 (3.6318) grad_norm 1.4921 (1.3705) [2022-10-02 05:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][300/1251] eta 0:04:43 lr 0.000657 time 0.2910 (0.2984) loss 3.1424 (3.6566) grad_norm 1.4727 (1.3663) [2022-10-02 05:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][400/1251] eta 0:04:11 lr 0.000656 time 0.2864 (0.2961) loss 2.5752 (3.6669) grad_norm 1.2954 (1.3746) [2022-10-02 05:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][500/1251] eta 0:03:41 lr 0.000656 time 0.2958 (0.2945) loss 3.3220 (3.6654) grad_norm 1.2680 (1.3786) [2022-10-02 05:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][600/1251] eta 0:03:11 lr 0.000656 time 0.2863 (0.2935) loss 3.6863 (3.6640) grad_norm 1.3105 (1.3726) [2022-10-02 05:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][700/1251] eta 0:02:41 lr 0.000655 time 0.2878 (0.2927) loss 2.7968 (3.6584) grad_norm 1.2007 (1.3726) [2022-10-02 05:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][800/1251] eta 0:02:11 lr 0.000655 time 0.2862 (0.2920) loss 3.1143 (3.6526) grad_norm 1.4016 (1.3709) [2022-10-02 05:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][900/1251] eta 0:01:42 lr 0.000654 time 0.2921 (0.2915) loss 3.3976 (3.6605) grad_norm 1.2277 (1.3720) [2022-10-02 05:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1000/1251] eta 0:01:13 lr 0.000654 time 0.2862 (0.2911) loss 3.3494 (3.6656) grad_norm 1.3785 (1.3756) [2022-10-02 05:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1100/1251] eta 0:00:43 lr 0.000654 time 0.2862 (0.2908) loss 4.0692 (3.6615) grad_norm 1.2688 (1.3747) [2022-10-02 05:35:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1200/1251] eta 0:00:14 lr 0.000653 time 0.2863 (0.2905) loss 4.4407 (3.6552) grad_norm 1.6354 (1.3732) [2022-10-02 05:35:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 120 training takes 0:06:03 [2022-10-02 05:35:27 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saving...... [2022-10-02 05:35:27 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saved !!! [2022-10-02 05:35:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.639 (2.639) Loss 1.0365 (1.0365) Acc@1 75.781 (75.781) Acc@5 93.066 (93.066) [2022-10-02 05:35:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.786 Acc@5 92.584 [2022-10-02 05:35:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-02 05:35:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-10-02 05:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][0/1251] eta 1:07:15 lr 0.000653 time 3.2256 (3.2256) loss 3.3649 (3.3649) grad_norm 1.3825 (1.3825) [2022-10-02 05:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][100/1251] eta 0:06:06 lr 0.000653 time 0.2944 (0.3183) loss 3.3298 (3.6525) grad_norm 1.0681 (1.3777) [2022-10-02 05:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][200/1251] eta 0:05:19 lr 0.000652 time 0.2896 (0.3037) loss 3.4756 (3.6000) grad_norm 1.4327 (1.3692) [2022-10-02 05:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][300/1251] eta 0:04:44 lr 0.000652 time 0.2906 (0.2988) loss 3.4147 (3.6059) grad_norm 1.4201 (1.3769) [2022-10-02 05:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][400/1251] eta 0:04:12 lr 0.000651 time 0.2917 (0.2963) loss 3.6643 (3.6139) grad_norm 1.4113 (1.3806) [2022-10-02 05:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][500/1251] eta 0:03:41 lr 0.000651 time 0.2880 (0.2947) loss 3.9749 (3.6352) grad_norm 1.4074 (1.3796) [2022-10-02 05:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][600/1251] eta 0:03:11 lr 0.000651 time 0.2904 (0.2936) loss 3.5394 (3.6346) grad_norm 1.2462 (1.3733) [2022-10-02 05:39:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][700/1251] eta 0:02:41 lr 0.000650 time 0.2896 (0.2928) loss 3.9171 (3.6386) grad_norm 1.2403 (1.3713) [2022-10-02 05:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][800/1251] eta 0:02:11 lr 0.000650 time 0.2866 (0.2922) loss 3.2571 (3.6473) grad_norm 1.3648 (1.3698) [2022-10-02 05:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][900/1251] eta 0:01:42 lr 0.000649 time 0.2897 (0.2918) loss 4.2363 (3.6442) grad_norm 1.5477 (1.3685) [2022-10-02 05:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1000/1251] eta 0:01:13 lr 0.000649 time 0.2882 (0.2914) loss 3.4929 (3.6419) grad_norm 1.4821 (1.3671) [2022-10-02 05:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1100/1251] eta 0:00:43 lr 0.000649 time 0.2896 (0.2910) loss 3.4205 (3.6404) grad_norm 1.6788 (1.3690) [2022-10-02 05:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1200/1251] eta 0:00:14 lr 0.000648 time 0.2892 (0.2907) loss 2.6702 (3.6339) grad_norm 1.2855 (1.3701) [2022-10-02 05:41:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 121 training takes 0:06:03 [2022-10-02 05:41:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.131 (3.131) Loss 1.0698 (1.0698) Acc@1 74.609 (74.609) Acc@5 92.285 (92.285) [2022-10-02 05:41:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.698 Acc@5 92.568 [2022-10-02 05:41:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-02 05:41:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-10-02 05:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][0/1251] eta 1:07:01 lr 0.000648 time 3.2149 (3.2149) loss 3.8396 (3.8396) grad_norm 1.6794 (1.6794) [2022-10-02 05:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][100/1251] eta 0:06:06 lr 0.000648 time 0.2875 (0.3182) loss 2.8439 (3.5222) grad_norm 1.2771 (1.3454) [2022-10-02 05:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][200/1251] eta 0:05:19 lr 0.000647 time 0.2901 (0.3038) loss 3.5705 (3.5491) grad_norm 1.5337 (1.3504) [2022-10-02 05:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][300/1251] eta 0:04:44 lr 0.000647 time 0.2893 (0.2992) loss 3.2207 (3.5662) grad_norm 1.2486 (1.3478) [2022-10-02 05:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][400/1251] eta 0:04:12 lr 0.000646 time 0.2959 (0.2969) loss 3.7860 (3.5627) grad_norm 1.1795 (1.3438) [2022-10-02 05:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][500/1251] eta 0:03:41 lr 0.000646 time 0.2907 (0.2956) loss 4.1800 (3.5728) grad_norm 1.2288 (1.3526) [2022-10-02 05:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][600/1251] eta 0:03:11 lr 0.000646 time 0.2982 (0.2948) loss 2.7575 (3.5719) grad_norm 1.4574 (1.3575) [2022-10-02 05:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][700/1251] eta 0:02:42 lr 0.000645 time 0.2884 (0.2941) loss 4.0748 (3.5954) grad_norm 1.3818 (1.3621) [2022-10-02 05:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][800/1251] eta 0:02:12 lr 0.000645 time 0.2917 (0.2937) loss 3.6376 (3.6103) grad_norm 1.2488 (1.3612) [2022-10-02 05:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][900/1251] eta 0:01:42 lr 0.000644 time 0.2912 (0.2933) loss 3.9385 (3.6173) grad_norm 1.2732 (1.3616) [2022-10-02 05:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1000/1251] eta 0:01:13 lr 0.000644 time 0.2918 (0.2930) loss 3.9022 (3.6158) grad_norm 1.4031 (1.3632) [2022-10-02 05:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1100/1251] eta 0:00:44 lr 0.000644 time 0.2859 (0.2928) loss 3.6416 (3.6147) grad_norm 1.1922 (1.3639) [2022-10-02 05:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1200/1251] eta 0:00:14 lr 0.000643 time 0.2906 (0.2926) loss 3.9485 (3.6172) grad_norm 1.5067 (1.3635) [2022-10-02 05:48:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 122 training takes 0:06:06 [2022-10-02 05:48:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.203 (3.203) Loss 1.0724 (1.0724) Acc@1 74.219 (74.219) Acc@5 92.773 (92.773) [2022-10-02 05:48:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.660 Acc@5 92.698 [2022-10-02 05:48:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-02 05:48:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-10-02 05:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][0/1251] eta 0:51:02 lr 0.000643 time 2.4482 (2.4482) loss 3.8473 (3.8473) grad_norm 1.2267 (1.2267) [2022-10-02 05:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][100/1251] eta 0:06:04 lr 0.000643 time 0.2839 (0.3168) loss 3.5607 (3.5703) grad_norm 1.3173 (1.4135) [2022-10-02 05:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][200/1251] eta 0:05:17 lr 0.000642 time 0.2858 (0.3024) loss 3.9608 (3.6593) grad_norm 1.5130 (1.4005) [2022-10-02 05:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][300/1251] eta 0:04:43 lr 0.000642 time 0.2863 (0.2977) loss 3.7919 (3.6153) grad_norm 2.1955 (1.4057) [2022-10-02 05:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][400/1251] eta 0:04:11 lr 0.000642 time 0.2877 (0.2953) loss 4.1118 (3.6343) grad_norm 1.4432 (1.3946) [2022-10-02 05:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][500/1251] eta 0:03:40 lr 0.000641 time 0.2878 (0.2940) loss 4.0745 (3.6323) grad_norm 1.3413 (1.3936) [2022-10-02 05:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][600/1251] eta 0:03:10 lr 0.000641 time 0.2908 (0.2929) loss 4.0901 (3.6482) grad_norm 1.4359 (1.3932) [2022-10-02 05:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][700/1251] eta 0:02:40 lr 0.000640 time 0.2863 (0.2921) loss 3.3207 (3.6475) grad_norm 1.3565 (1.3903) [2022-10-02 05:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][800/1251] eta 0:02:11 lr 0.000640 time 0.2871 (0.2915) loss 4.0321 (3.6494) grad_norm 1.2277 (1.3902) [2022-10-02 05:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][900/1251] eta 0:01:42 lr 0.000640 time 0.2860 (0.2912) loss 3.8690 (3.6477) grad_norm 1.3307 (1.3882) [2022-10-02 05:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1000/1251] eta 0:01:13 lr 0.000639 time 0.2893 (0.2912) loss 3.6115 (3.6495) grad_norm 1.7737 (1.3893) [2022-10-02 05:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1100/1251] eta 0:00:43 lr 0.000639 time 0.2894 (0.2909) loss 4.1871 (3.6496) grad_norm 1.1845 (1.3892) [2022-10-02 05:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1200/1251] eta 0:00:14 lr 0.000638 time 0.2863 (0.2909) loss 4.1256 (3.6504) grad_norm 1.3596 (1.3894) [2022-10-02 05:54:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 123 training takes 0:06:04 [2022-10-02 05:54:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.768 (2.768) Loss 1.1688 (1.1688) Acc@1 72.852 (72.852) Acc@5 91.699 (91.699) [2022-10-02 05:54:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.688 Acc@5 92.696 [2022-10-02 05:54:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-02 05:54:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.79% [2022-10-02 05:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][0/1251] eta 1:05:33 lr 0.000638 time 3.1442 (3.1442) loss 3.7843 (3.7843) grad_norm 1.4609 (1.4609) [2022-10-02 05:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][100/1251] eta 0:06:10 lr 0.000638 time 0.2896 (0.3221) loss 4.1349 (3.5780) grad_norm 1.1924 (1.4315) [2022-10-02 05:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][200/1251] eta 0:05:22 lr 0.000637 time 0.2865 (0.3065) loss 4.3990 (3.5941) grad_norm 1.4700 (1.4137) [2022-10-02 05:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][300/1251] eta 0:04:46 lr 0.000637 time 0.2887 (0.3010) loss 3.1883 (3.6144) grad_norm 1.4898 (1.4087) [2022-10-02 05:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][400/1251] eta 0:04:13 lr 0.000637 time 0.2913 (0.2981) loss 3.8327 (3.6325) grad_norm 1.3309 (1.3995) [2022-10-02 05:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][500/1251] eta 0:03:42 lr 0.000636 time 0.2876 (0.2963) loss 3.2657 (3.6407) grad_norm 1.6854 (1.3961) [2022-10-02 05:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][600/1251] eta 0:03:12 lr 0.000636 time 0.2891 (0.2951) loss 3.8924 (3.6395) grad_norm 1.2621 (1.3947) [2022-10-02 05:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][700/1251] eta 0:02:42 lr 0.000635 time 0.2871 (0.2943) loss 3.7373 (3.6322) grad_norm 1.1755 (1.3913) [2022-10-02 05:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][800/1251] eta 0:02:12 lr 0.000635 time 0.2975 (0.2937) loss 4.1365 (3.6349) grad_norm 1.4274 (1.3870) [2022-10-02 05:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][900/1251] eta 0:01:42 lr 0.000635 time 0.2917 (0.2932) loss 4.1312 (3.6305) grad_norm 1.3518 (1.3868) [2022-10-02 05:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1000/1251] eta 0:01:13 lr 0.000634 time 0.2910 (0.2928) loss 3.9435 (3.6360) grad_norm 1.7978 (1.3862) [2022-10-02 05:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1100/1251] eta 0:00:44 lr 0.000634 time 0.2877 (0.2924) loss 3.9370 (3.6407) grad_norm 1.2524 (1.3868) [2022-10-02 06:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1200/1251] eta 0:00:14 lr 0.000633 time 0.2864 (0.2921) loss 2.7097 (3.6444) grad_norm 1.4539 (1.3865) [2022-10-02 06:00:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 124 training takes 0:06:05 [2022-10-02 06:00:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.240 (3.240) Loss 1.0311 (1.0311) Acc@1 76.270 (76.270) Acc@5 92.383 (92.383) [2022-10-02 06:00:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.968 Acc@5 92.694 [2022-10-02 06:00:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-02 06:00:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.97% [2022-10-02 06:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][0/1251] eta 0:52:17 lr 0.000633 time 2.5083 (2.5083) loss 3.5631 (3.5631) grad_norm 1.2016 (1.2016) [2022-10-02 06:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][100/1251] eta 0:06:02 lr 0.000633 time 0.2914 (0.3150) loss 3.4567 (3.5627) grad_norm 1.5476 (1.3889) [2022-10-02 06:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][200/1251] eta 0:05:17 lr 0.000632 time 0.2886 (0.3025) loss 3.9952 (3.5969) grad_norm 1.2658 (1.3828) [2022-10-02 06:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][300/1251] eta 0:04:43 lr 0.000632 time 0.2898 (0.2983) loss 3.3632 (3.6426) grad_norm 1.6232 (1.4055) [2022-10-02 06:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][400/1251] eta 0:04:12 lr 0.000632 time 0.2854 (0.2961) loss 4.0892 (3.6220) grad_norm 1.8714 (1.3999) [2022-10-02 06:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][500/1251] eta 0:03:41 lr 0.000631 time 0.2886 (0.2948) loss 4.3577 (3.6078) grad_norm 1.7042 (1.4042) [2022-10-02 06:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][600/1251] eta 0:03:11 lr 0.000631 time 0.2876 (0.2939) loss 3.2704 (3.6186) grad_norm 1.4502 (1.4051) [2022-10-02 06:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][700/1251] eta 0:02:41 lr 0.000630 time 0.2896 (0.2933) loss 3.6857 (3.6103) grad_norm 1.5355 (1.4117) [2022-10-02 06:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][800/1251] eta 0:02:12 lr 0.000630 time 0.2882 (0.2928) loss 3.6795 (3.6105) grad_norm 1.5065 (1.4093) [2022-10-02 06:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][900/1251] eta 0:01:42 lr 0.000630 time 0.2893 (0.2924) loss 4.1427 (3.6138) grad_norm 1.3418 (1.4078) [2022-10-02 06:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1000/1251] eta 0:01:13 lr 0.000629 time 0.2855 (0.2921) loss 3.6399 (3.6135) grad_norm 1.1082 (1.4065) [2022-10-02 06:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1100/1251] eta 0:00:44 lr 0.000629 time 0.2903 (0.2918) loss 4.0219 (3.6115) grad_norm 1.4860 (1.4014) [2022-10-02 06:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1200/1251] eta 0:00:14 lr 0.000628 time 0.2885 (0.2916) loss 3.5628 (3.6184) grad_norm 1.3280 (1.3966) [2022-10-02 06:06:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 125 training takes 0:06:05 [2022-10-02 06:06:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.925 (2.925) Loss 1.0980 (1.0980) Acc@1 74.609 (74.609) Acc@5 92.480 (92.480) [2022-10-02 06:07:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.848 Acc@5 92.750 [2022-10-02 06:07:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-02 06:07:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.97% [2022-10-02 06:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][0/1251] eta 1:10:27 lr 0.000628 time 3.3789 (3.3789) loss 3.9225 (3.9225) grad_norm 1.2702 (1.2702) [2022-10-02 06:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][100/1251] eta 0:06:09 lr 0.000628 time 0.2945 (0.3211) loss 4.1551 (3.6698) grad_norm 1.5378 (1.3959) [2022-10-02 06:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][200/1251] eta 0:05:21 lr 0.000627 time 0.2860 (0.3055) loss 4.4565 (3.6452) grad_norm 1.5625 (1.3968) [2022-10-02 06:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][300/1251] eta 0:04:45 lr 0.000627 time 0.2926 (0.3004) loss 2.7949 (3.6614) grad_norm 1.8568 (1.3986) [2022-10-02 06:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][400/1251] eta 0:04:13 lr 0.000626 time 0.2862 (0.2978) loss 3.4718 (3.6548) grad_norm 1.2491 (1.3938) [2022-10-02 06:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][500/1251] eta 0:03:42 lr 0.000626 time 0.2890 (0.2960) loss 3.0776 (3.6441) grad_norm 1.3162 (1.3879) [2022-10-02 06:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][600/1251] eta 0:03:11 lr 0.000626 time 0.2872 (0.2947) loss 3.3125 (3.6363) grad_norm 1.7118 (1.3926) [2022-10-02 06:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][700/1251] eta 0:02:41 lr 0.000625 time 0.2939 (0.2938) loss 3.4147 (3.6377) grad_norm 1.2720 (1.3937) [2022-10-02 06:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][800/1251] eta 0:02:12 lr 0.000625 time 0.2852 (0.2931) loss 3.9676 (3.6295) grad_norm 1.3778 (1.3954) [2022-10-02 06:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][900/1251] eta 0:01:42 lr 0.000624 time 0.2900 (0.2926) loss 3.8955 (3.6316) grad_norm 1.2815 (1.3952) [2022-10-02 06:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1000/1251] eta 0:01:13 lr 0.000624 time 0.2875 (0.2922) loss 3.8466 (3.6257) grad_norm 1.2854 (1.3998) [2022-10-02 06:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1100/1251] eta 0:00:44 lr 0.000624 time 0.2901 (0.2919) loss 3.0197 (3.6236) grad_norm 1.5377 (1.3982) [2022-10-02 06:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1200/1251] eta 0:00:14 lr 0.000623 time 0.2857 (0.2915) loss 4.0585 (3.6218) grad_norm 1.5042 (1.4006) [2022-10-02 06:13:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 126 training takes 0:06:04 [2022-10-02 06:13:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.519 (3.519) Loss 1.0407 (1.0407) Acc@1 75.293 (75.293) Acc@5 92.773 (92.773) [2022-10-02 06:13:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.990 Acc@5 92.800 [2022-10-02 06:13:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-02 06:13:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 74.99% [2022-10-02 06:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][0/1251] eta 1:05:48 lr 0.000623 time 3.1563 (3.1563) loss 4.3305 (4.3305) grad_norm 1.3805 (1.3805) [2022-10-02 06:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][100/1251] eta 0:06:05 lr 0.000623 time 0.2885 (0.3177) loss 3.6280 (3.5614) grad_norm 1.8614 (1.3964) [2022-10-02 06:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][200/1251] eta 0:05:18 lr 0.000622 time 0.2852 (0.3030) loss 3.8219 (3.6296) grad_norm 1.4139 (1.3793) [2022-10-02 06:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][300/1251] eta 0:04:43 lr 0.000622 time 0.2895 (0.2981) loss 3.8475 (3.6181) grad_norm 1.2338 (1.3780) [2022-10-02 06:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][400/1251] eta 0:04:11 lr 0.000621 time 0.2863 (0.2955) loss 3.4295 (3.6284) grad_norm 1.3131 (1.3899) [2022-10-02 06:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][500/1251] eta 0:03:40 lr 0.000621 time 0.2898 (0.2941) loss 3.5031 (3.6146) grad_norm 1.4874 (1.3940) [2022-10-02 06:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][600/1251] eta 0:03:10 lr 0.000621 time 0.2889 (0.2931) loss 3.8519 (3.6059) grad_norm 1.3079 (1.3939) [2022-10-02 06:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][700/1251] eta 0:02:41 lr 0.000620 time 0.2902 (0.2924) loss 4.0325 (3.6120) grad_norm 1.3039 (1.3912) [2022-10-02 06:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][800/1251] eta 0:02:11 lr 0.000620 time 0.2854 (0.2918) loss 2.9441 (3.6081) grad_norm 1.2053 (1.3970) [2022-10-02 06:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][900/1251] eta 0:01:42 lr 0.000619 time 0.2876 (0.2914) loss 3.5367 (3.6040) grad_norm 1.4504 (1.3968) [2022-10-02 06:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1000/1251] eta 0:01:13 lr 0.000619 time 0.2862 (0.2910) loss 3.5596 (3.6140) grad_norm 1.2085 (1.3940) [2022-10-02 06:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1100/1251] eta 0:00:43 lr 0.000619 time 0.2880 (0.2907) loss 4.0835 (3.6104) grad_norm 1.3950 (1.3931) [2022-10-02 06:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1200/1251] eta 0:00:14 lr 0.000618 time 0.2876 (0.2905) loss 3.0028 (3.6079) grad_norm 1.5734 (1.3925) [2022-10-02 06:19:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 127 training takes 0:06:03 [2022-10-02 06:19:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.538 (2.538) Loss 1.1067 (1.1067) Acc@1 73.633 (73.633) Acc@5 92.480 (92.480) [2022-10-02 06:19:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.004 Acc@5 92.730 [2022-10-02 06:19:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-10-02 06:19:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.00% [2022-10-02 06:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][0/1251] eta 0:57:30 lr 0.000618 time 2.7585 (2.7585) loss 3.6308 (3.6308) grad_norm 1.5319 (1.5319) [2022-10-02 06:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][100/1251] eta 0:06:03 lr 0.000618 time 0.2880 (0.3159) loss 4.3200 (3.6334) grad_norm 1.2194 (1.3842) [2022-10-02 06:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][200/1251] eta 0:05:18 lr 0.000617 time 0.2877 (0.3032) loss 4.4472 (3.6566) grad_norm 1.3197 (1.3864) [2022-10-02 06:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][300/1251] eta 0:04:44 lr 0.000617 time 0.2890 (0.2988) loss 3.5855 (3.6173) grad_norm 1.2504 (1.3886) [2022-10-02 06:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][400/1251] eta 0:04:12 lr 0.000616 time 0.2877 (0.2966) loss 4.1448 (3.6205) grad_norm 1.5652 (1.3904) [2022-10-02 06:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][500/1251] eta 0:03:41 lr 0.000616 time 0.2857 (0.2954) loss 3.9292 (3.6165) grad_norm 1.4097 (1.3953) [2022-10-02 06:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][600/1251] eta 0:03:11 lr 0.000616 time 0.2913 (0.2946) loss 3.2690 (3.6113) grad_norm 1.5292 (1.3957) [2022-10-02 06:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][700/1251] eta 0:02:41 lr 0.000615 time 0.2864 (0.2940) loss 3.6880 (3.6079) grad_norm 1.3710 (1.3949) [2022-10-02 06:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][800/1251] eta 0:02:12 lr 0.000615 time 0.2885 (0.2935) loss 3.4656 (3.6043) grad_norm 1.6784 (1.3917) [2022-10-02 06:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][900/1251] eta 0:01:42 lr 0.000614 time 0.2878 (0.2932) loss 3.9750 (3.6034) grad_norm 1.4678 (1.3919) [2022-10-02 06:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1000/1251] eta 0:01:13 lr 0.000614 time 0.2889 (0.2929) loss 3.9331 (3.6071) grad_norm 1.2917 (1.3930) [2022-10-02 06:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1100/1251] eta 0:00:44 lr 0.000614 time 0.2861 (0.2927) loss 4.1256 (3.6080) grad_norm 1.1809 (1.3931) [2022-10-02 06:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1200/1251] eta 0:00:14 lr 0.000613 time 0.2889 (0.2925) loss 3.7923 (3.6093) grad_norm 1.3295 (1.3930) [2022-10-02 06:25:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 128 training takes 0:06:06 [2022-10-02 06:25:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.706 (2.706) Loss 1.0172 (1.0172) Acc@1 77.148 (77.148) Acc@5 93.555 (93.555) [2022-10-02 06:26:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.828 Acc@5 92.522 [2022-10-02 06:26:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-10-02 06:26:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.00% [2022-10-02 06:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][0/1251] eta 0:47:58 lr 0.000613 time 2.3008 (2.3008) loss 4.1446 (4.1446) grad_norm 1.3507 (1.3507) [2022-10-02 06:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][100/1251] eta 0:06:01 lr 0.000613 time 0.2890 (0.3137) loss 3.9448 (3.5483) grad_norm 1.5189 (1.3923) [2022-10-02 06:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][200/1251] eta 0:05:16 lr 0.000612 time 0.2877 (0.3011) loss 3.9354 (3.5755) grad_norm 1.4982 (1.4021) [2022-10-02 06:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][300/1251] eta 0:04:42 lr 0.000612 time 0.2903 (0.2970) loss 3.6596 (3.5957) grad_norm 1.2033 (1.4070) [2022-10-02 06:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][400/1251] eta 0:04:10 lr 0.000611 time 0.2916 (0.2949) loss 3.3890 (3.6198) grad_norm 1.4959 (1.3983) [2022-10-02 06:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][500/1251] eta 0:03:40 lr 0.000611 time 0.2918 (0.2936) loss 3.9806 (3.6155) grad_norm 1.3632 (1.3974) [2022-10-02 06:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][600/1251] eta 0:03:10 lr 0.000611 time 0.2886 (0.2928) loss 3.2134 (3.6208) grad_norm 1.3490 (1.3991) [2022-10-02 06:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][700/1251] eta 0:02:40 lr 0.000610 time 0.2883 (0.2921) loss 4.0261 (3.6163) grad_norm 1.6440 (1.4047) [2022-10-02 06:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][800/1251] eta 0:02:11 lr 0.000610 time 0.2901 (0.2917) loss 4.4213 (3.6179) grad_norm 1.4141 (1.4074) [2022-10-02 06:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][900/1251] eta 0:01:42 lr 0.000609 time 0.2934 (0.2914) loss 3.4614 (3.6255) grad_norm 1.3642 (1.4084) [2022-10-02 06:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1000/1251] eta 0:01:13 lr 0.000609 time 0.2906 (0.2911) loss 3.7598 (3.6255) grad_norm 1.2594 (1.4091) [2022-10-02 06:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1100/1251] eta 0:00:43 lr 0.000609 time 0.2909 (0.2908) loss 4.0415 (3.6325) grad_norm 1.4484 (1.4063) [2022-10-02 06:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1200/1251] eta 0:00:14 lr 0.000608 time 0.2883 (0.2906) loss 3.9062 (3.6306) grad_norm 1.5601 (1.4050) [2022-10-02 06:32:06 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 129 training takes 0:06:03 [2022-10-02 06:32:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.077 (3.077) Loss 1.0887 (1.0887) Acc@1 75.195 (75.195) Acc@5 92.090 (92.090) [2022-10-02 06:32:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.868 Acc@5 92.758 [2022-10-02 06:32:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-10-02 06:32:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.00% [2022-10-02 06:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][0/1251] eta 1:00:37 lr 0.000608 time 2.9075 (2.9075) loss 3.9145 (3.9145) grad_norm 1.3021 (1.3021) [2022-10-02 06:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][100/1251] eta 0:06:04 lr 0.000608 time 0.2912 (0.3169) loss 3.2776 (3.6761) grad_norm 1.5926 (1.4068) [2022-10-02 06:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][200/1251] eta 0:05:18 lr 0.000607 time 0.2833 (0.3030) loss 4.0980 (3.6869) grad_norm 1.3216 (1.4167) [2022-10-02 06:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][300/1251] eta 0:04:43 lr 0.000607 time 0.2911 (0.2982) loss 4.1490 (3.6873) grad_norm 1.3124 (1.4186) [2022-10-02 06:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][400/1251] eta 0:04:11 lr 0.000606 time 0.2886 (0.2957) loss 3.6602 (3.6692) grad_norm 1.5482 (1.4079) [2022-10-02 06:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][500/1251] eta 0:03:41 lr 0.000606 time 0.2910 (0.2943) loss 4.2755 (3.6611) grad_norm 1.5213 (1.4102) [2022-10-02 06:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][600/1251] eta 0:03:10 lr 0.000605 time 0.2871 (0.2932) loss 3.4216 (3.6574) grad_norm 1.5335 (1.4192) [2022-10-02 06:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][700/1251] eta 0:02:41 lr 0.000605 time 0.2899 (0.2926) loss 3.9399 (3.6434) grad_norm 1.1960 (1.4162) [2022-10-02 06:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][800/1251] eta 0:02:11 lr 0.000605 time 0.2867 (0.2921) loss 4.1290 (3.6426) grad_norm 1.3706 (1.4153) [2022-10-02 06:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][900/1251] eta 0:01:42 lr 0.000604 time 0.2895 (0.2917) loss 3.3383 (3.6428) grad_norm 1.3407 (1.4155) [2022-10-02 06:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1000/1251] eta 0:01:13 lr 0.000604 time 0.2876 (0.2913) loss 3.7521 (3.6501) grad_norm 1.2529 (1.4174) [2022-10-02 06:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1100/1251] eta 0:00:43 lr 0.000603 time 0.2919 (0.2911) loss 3.8967 (3.6431) grad_norm 1.5195 (1.4142) [2022-10-02 06:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1200/1251] eta 0:00:14 lr 0.000603 time 0.2860 (0.2908) loss 4.5012 (3.6459) grad_norm 1.5867 (1.4110) [2022-10-02 06:38:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 130 training takes 0:06:04 [2022-10-02 06:38:22 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saving...... [2022-10-02 06:38:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saved !!! [2022-10-02 06:38:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.814 (2.814) Loss 1.1231 (1.1231) Acc@1 73.926 (73.926) Acc@5 92.578 (92.578) [2022-10-02 06:38:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.748 Acc@5 92.766 [2022-10-02 06:38:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.7% [2022-10-02 06:38:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.00% [2022-10-02 06:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][0/1251] eta 1:08:36 lr 0.000603 time 3.2908 (3.2908) loss 3.6370 (3.6370) grad_norm 1.5264 (1.5264) [2022-10-02 06:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][100/1251] eta 0:06:07 lr 0.000602 time 0.2873 (0.3191) loss 3.8459 (3.6518) grad_norm 1.2085 (1.4059) [2022-10-02 06:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][200/1251] eta 0:05:19 lr 0.000602 time 0.2859 (0.3037) loss 2.8155 (3.6094) grad_norm 1.5186 (1.4017) [2022-10-02 06:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][300/1251] eta 0:04:44 lr 0.000602 time 0.2889 (0.2987) loss 3.2873 (3.6129) grad_norm 1.4816 (1.4013) [2022-10-02 06:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][400/1251] eta 0:04:12 lr 0.000601 time 0.2897 (0.2962) loss 3.3942 (3.6013) grad_norm 1.2673 (1.3993) [2022-10-02 06:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][500/1251] eta 0:03:41 lr 0.000601 time 0.2842 (0.2946) loss 4.0599 (3.5811) grad_norm 1.2157 (1.4078) [2022-10-02 06:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][600/1251] eta 0:03:11 lr 0.000600 time 0.2858 (0.2936) loss 4.0304 (3.5868) grad_norm 1.3347 (1.4106) [2022-10-02 06:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][700/1251] eta 0:02:41 lr 0.000600 time 0.2873 (0.2928) loss 3.6365 (3.5876) grad_norm 1.9056 (1.4146) [2022-10-02 06:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][800/1251] eta 0:02:11 lr 0.000600 time 0.2873 (0.2922) loss 3.6417 (3.5906) grad_norm 1.3569 (1.4207) [2022-10-02 06:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][900/1251] eta 0:01:42 lr 0.000599 time 0.2868 (0.2917) loss 4.2462 (3.5962) grad_norm 1.1906 (1.4166) [2022-10-02 06:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1000/1251] eta 0:01:13 lr 0.000599 time 0.2850 (0.2913) loss 3.1584 (3.5928) grad_norm 1.5618 (1.4190) [2022-10-02 06:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1100/1251] eta 0:00:43 lr 0.000598 time 0.2812 (0.2910) loss 3.9495 (3.6034) grad_norm 1.7900 (1.4156) [2022-10-02 06:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1200/1251] eta 0:00:14 lr 0.000598 time 0.2862 (0.2908) loss 2.8388 (3.6071) grad_norm 1.4839 (1.4161) [2022-10-02 06:44:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 131 training takes 0:06:03 [2022-10-02 06:44:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.797 (2.797) Loss 1.0505 (1.0505) Acc@1 75.391 (75.391) Acc@5 93.359 (93.359) [2022-10-02 06:44:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 74.882 Acc@5 92.732 [2022-10-02 06:44:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-10-02 06:44:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.00% [2022-10-02 06:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][0/1251] eta 1:06:23 lr 0.000598 time 3.1843 (3.1843) loss 4.1217 (4.1217) grad_norm 1.2044 (1.2044) [2022-10-02 06:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][100/1251] eta 0:06:08 lr 0.000597 time 0.2907 (0.3200) loss 2.7335 (3.5266) grad_norm 1.2713 (1.4368) [2022-10-02 06:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][200/1251] eta 0:05:20 lr 0.000597 time 0.2964 (0.3053) loss 2.4746 (3.5710) grad_norm 1.1989 (1.4284) [2022-10-02 06:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][300/1251] eta 0:04:45 lr 0.000597 time 0.2871 (0.3001) loss 3.2351 (3.5783) grad_norm 1.1963 (1.4295) [2022-10-02 06:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][400/1251] eta 0:04:13 lr 0.000596 time 0.2953 (0.2975) loss 4.3631 (3.5970) grad_norm 1.4038 (1.4232) [2022-10-02 06:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][500/1251] eta 0:03:42 lr 0.000596 time 0.2852 (0.2959) loss 3.9184 (3.5948) grad_norm 1.3562 (1.4208) [2022-10-02 06:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][600/1251] eta 0:03:11 lr 0.000595 time 0.2949 (0.2949) loss 4.0760 (3.6046) grad_norm 1.3245 (1.4172) [2022-10-02 06:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][700/1251] eta 0:02:42 lr 0.000595 time 0.2892 (0.2941) loss 4.0001 (3.5940) grad_norm 1.2664 (1.4150) [2022-10-02 06:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][800/1251] eta 0:02:12 lr 0.000594 time 0.2929 (0.2935) loss 2.3292 (3.5902) grad_norm 1.3099 (1.4178) [2022-10-02 06:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][900/1251] eta 0:01:42 lr 0.000594 time 0.2888 (0.2931) loss 3.6943 (3.5887) grad_norm 1.7291 (1.4161) [2022-10-02 06:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1000/1251] eta 0:01:13 lr 0.000594 time 0.2937 (0.2927) loss 3.5762 (3.5924) grad_norm 1.3670 (1.4171) [2022-10-02 06:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1100/1251] eta 0:00:44 lr 0.000593 time 0.2886 (0.2924) loss 2.5656 (3.5931) grad_norm 1.6684 (1.4178) [2022-10-02 06:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1200/1251] eta 0:00:14 lr 0.000593 time 0.2949 (0.2922) loss 3.3076 (3.6017) grad_norm 1.2990 (1.4199) [2022-10-02 06:50:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 132 training takes 0:06:05 [2022-10-02 06:51:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.660 (2.660) Loss 1.0491 (1.0491) Acc@1 72.754 (72.754) Acc@5 94.434 (94.434) [2022-10-02 06:51:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.160 Acc@5 92.914 [2022-10-02 06:51:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-02 06:51:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.16% [2022-10-02 06:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][0/1251] eta 1:09:34 lr 0.000593 time 3.3368 (3.3368) loss 4.2351 (4.2351) grad_norm 1.3810 (1.3810) [2022-10-02 06:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][100/1251] eta 0:06:08 lr 0.000592 time 0.2879 (0.3204) loss 4.3089 (3.5523) grad_norm 1.2830 (1.4144) [2022-10-02 06:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][200/1251] eta 0:05:20 lr 0.000592 time 0.2871 (0.3052) loss 3.5503 (3.6005) grad_norm 1.2257 (1.4202) [2022-10-02 06:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][300/1251] eta 0:04:45 lr 0.000591 time 0.2873 (0.3000) loss 2.8039 (3.5768) grad_norm 1.4722 (1.4122) [2022-10-02 06:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][400/1251] eta 0:04:13 lr 0.000591 time 0.2884 (0.2974) loss 2.6515 (3.5919) grad_norm 1.4843 (1.4169) [2022-10-02 06:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][500/1251] eta 0:03:41 lr 0.000591 time 0.2849 (0.2955) loss 3.2915 (3.5981) grad_norm 1.7280 (1.4177) [2022-10-02 06:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][600/1251] eta 0:03:11 lr 0.000590 time 0.2855 (0.2943) loss 4.3051 (3.6070) grad_norm 1.2906 (1.4161) [2022-10-02 06:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][700/1251] eta 0:02:41 lr 0.000590 time 0.2858 (0.2934) loss 4.0156 (3.5860) grad_norm 1.3832 (1.4197) [2022-10-02 06:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][800/1251] eta 0:02:12 lr 0.000589 time 0.2860 (0.2927) loss 4.1602 (3.5911) grad_norm 1.6704 (1.4286) [2022-10-02 06:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][900/1251] eta 0:01:42 lr 0.000589 time 0.2892 (0.2922) loss 4.3497 (3.5919) grad_norm 1.2312 (1.4264) [2022-10-02 06:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1000/1251] eta 0:01:13 lr 0.000589 time 0.2873 (0.2918) loss 3.9750 (3.5934) grad_norm 1.2539 (1.4270) [2022-10-02 06:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1100/1251] eta 0:00:44 lr 0.000588 time 0.2903 (0.2915) loss 4.1674 (3.5949) grad_norm 1.3424 (1.4236) [2022-10-02 06:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1200/1251] eta 0:00:14 lr 0.000588 time 0.2859 (0.2913) loss 3.4178 (3.6003) grad_norm 1.2565 (1.4246) [2022-10-02 06:57:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 133 training takes 0:06:04 [2022-10-02 06:57:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.070 (3.070) Loss 1.0361 (1.0361) Acc@1 75.000 (75.000) Acc@5 93.262 (93.262) [2022-10-02 06:57:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.102 Acc@5 92.924 [2022-10-02 06:57:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-10-02 06:57:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.16% [2022-10-02 06:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][0/1251] eta 1:00:29 lr 0.000588 time 2.9016 (2.9016) loss 3.4152 (3.4152) grad_norm 1.2803 (1.2803) [2022-10-02 06:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][100/1251] eta 0:06:03 lr 0.000587 time 0.2926 (0.3160) loss 2.7968 (3.5658) grad_norm 1.2664 (1.4167) [2022-10-02 06:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][200/1251] eta 0:05:17 lr 0.000587 time 0.2875 (0.3025) loss 3.7933 (3.5338) grad_norm 1.6817 (1.4450) [2022-10-02 06:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][300/1251] eta 0:04:43 lr 0.000586 time 0.2931 (0.2980) loss 3.9671 (3.5814) grad_norm 1.3548 (1.4432) [2022-10-02 06:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][400/1251] eta 0:04:11 lr 0.000586 time 0.2849 (0.2957) loss 3.9752 (3.5791) grad_norm 1.5165 (1.4337) [2022-10-02 06:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][500/1251] eta 0:03:41 lr 0.000586 time 0.2876 (0.2944) loss 4.0367 (3.5618) grad_norm 1.5190 (1.4298) [2022-10-02 07:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][600/1251] eta 0:03:11 lr 0.000585 time 0.2872 (0.2935) loss 2.8992 (3.5725) grad_norm 1.3439 (1.4363) [2022-10-02 07:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][700/1251] eta 0:02:41 lr 0.000585 time 0.2905 (0.2928) loss 2.5581 (3.5777) grad_norm 1.4589 (1.4380) [2022-10-02 07:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][800/1251] eta 0:02:11 lr 0.000584 time 0.2879 (0.2922) loss 2.8781 (3.5870) grad_norm 1.2667 (1.4406) [2022-10-02 07:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][900/1251] eta 0:01:42 lr 0.000584 time 0.2919 (0.2918) loss 3.9997 (3.5940) grad_norm 1.2973 (1.4371) [2022-10-02 07:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1000/1251] eta 0:01:13 lr 0.000583 time 0.2844 (0.2914) loss 4.1839 (3.5878) grad_norm 1.4501 (1.4376) [2022-10-02 07:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1100/1251] eta 0:00:43 lr 0.000583 time 0.2901 (0.2912) loss 3.8008 (3.5908) grad_norm 1.4023 (1.4403) [2022-10-02 07:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1200/1251] eta 0:00:14 lr 0.000583 time 0.2891 (0.2909) loss 3.3524 (3.5925) grad_norm 1.3765 (1.4361) [2022-10-02 07:03:32 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 134 training takes 0:06:04 [2022-10-02 07:03:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.810 (2.810) Loss 1.0295 (1.0295) Acc@1 77.148 (77.148) Acc@5 93.555 (93.555) [2022-10-02 07:03:45 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.292 Acc@5 93.036 [2022-10-02 07:03:45 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-02 07:03:45 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.29% [2022-10-02 07:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][0/1251] eta 0:45:30 lr 0.000582 time 2.1823 (2.1823) loss 4.0055 (4.0055) grad_norm 1.6153 (1.6153) [2022-10-02 07:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][100/1251] eta 0:06:02 lr 0.000582 time 0.2896 (0.3152) loss 4.0023 (3.5853) grad_norm 1.3839 (1.3919) [2022-10-02 07:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][200/1251] eta 0:05:18 lr 0.000582 time 0.2886 (0.3028) loss 4.2682 (3.6581) grad_norm 1.3490 (1.3942) [2022-10-02 07:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][300/1251] eta 0:04:44 lr 0.000581 time 0.2889 (0.2986) loss 3.3392 (3.6248) grad_norm 1.4128 (1.4017) [2022-10-02 07:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][400/1251] eta 0:04:12 lr 0.000581 time 0.2911 (0.2965) loss 3.6380 (3.6260) grad_norm 1.2595 (1.4125) [2022-10-02 07:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][500/1251] eta 0:03:41 lr 0.000580 time 0.2897 (0.2953) loss 3.5584 (3.6264) grad_norm 1.2270 (1.4185) [2022-10-02 07:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][600/1251] eta 0:03:11 lr 0.000580 time 0.2911 (0.2945) loss 3.9594 (3.6279) grad_norm 1.4536 (1.4120) [2022-10-02 07:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][700/1251] eta 0:02:41 lr 0.000580 time 0.2909 (0.2938) loss 4.1111 (3.6193) grad_norm 1.5485 (1.4181) [2022-10-02 07:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][800/1251] eta 0:02:12 lr 0.000579 time 0.2896 (0.2933) loss 4.1669 (3.6225) grad_norm 1.2095 (1.4185) [2022-10-02 07:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][900/1251] eta 0:01:42 lr 0.000579 time 0.2902 (0.2930) loss 3.0872 (3.6160) grad_norm 1.5149 (1.4242) [2022-10-02 07:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1000/1251] eta 0:01:13 lr 0.000578 time 0.2884 (0.2927) loss 2.6919 (3.6115) grad_norm 1.2895 (1.4251) [2022-10-02 07:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1100/1251] eta 0:00:44 lr 0.000578 time 0.2900 (0.2925) loss 4.0727 (3.6085) grad_norm 1.4921 (1.4246) [2022-10-02 07:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1200/1251] eta 0:00:14 lr 0.000578 time 0.2870 (0.2923) loss 2.8223 (3.6063) grad_norm 1.2943 (1.4275) [2022-10-02 07:09:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 135 training takes 0:06:05 [2022-10-02 07:09:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.771 (2.771) Loss 1.0508 (1.0508) Acc@1 75.586 (75.586) Acc@5 92.969 (92.969) [2022-10-02 07:10:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.316 Acc@5 92.770 [2022-10-02 07:10:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-10-02 07:10:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.32% [2022-10-02 07:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][0/1251] eta 1:05:57 lr 0.000577 time 3.1634 (3.1634) loss 3.3731 (3.3731) grad_norm 1.3724 (1.3724) [2022-10-02 07:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][100/1251] eta 0:06:11 lr 0.000577 time 0.2955 (0.3223) loss 4.1843 (3.5169) grad_norm 1.4418 (1.4564) [2022-10-02 07:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][200/1251] eta 0:05:23 lr 0.000576 time 0.2960 (0.3074) loss 4.1877 (3.6257) grad_norm 1.5284 (1.4514) [2022-10-02 07:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][300/1251] eta 0:04:47 lr 0.000576 time 0.2876 (0.3020) loss 3.8425 (3.6224) grad_norm 1.4518 (1.4543) [2022-10-02 07:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][400/1251] eta 0:04:14 lr 0.000576 time 0.2979 (0.2994) loss 2.6632 (3.6179) grad_norm 1.3433 (1.4496) [2022-10-02 07:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][500/1251] eta 0:03:43 lr 0.000575 time 0.2865 (0.2978) loss 4.1978 (3.6179) grad_norm 1.6376 (1.4595) [2022-10-02 07:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][600/1251] eta 0:03:13 lr 0.000575 time 0.2969 (0.2967) loss 3.8426 (3.6229) grad_norm 1.3774 (1.4612) [2022-10-02 07:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][700/1251] eta 0:02:43 lr 0.000574 time 0.2897 (0.2959) loss 4.0986 (3.6237) grad_norm 1.8732 (1.4617) [2022-10-02 07:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][800/1251] eta 0:02:13 lr 0.000574 time 0.2995 (0.2952) loss 4.0152 (3.6211) grad_norm 1.4321 (1.4598) [2022-10-02 07:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][900/1251] eta 0:01:43 lr 0.000574 time 0.2895 (0.2946) loss 3.8883 (3.6178) grad_norm 1.4100 (1.4573) [2022-10-02 07:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1000/1251] eta 0:01:13 lr 0.000573 time 0.2982 (0.2942) loss 3.9343 (3.6085) grad_norm 1.4015 (1.4576) [2022-10-02 07:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1100/1251] eta 0:00:44 lr 0.000573 time 0.2861 (0.2939) loss 3.8384 (3.6070) grad_norm 1.8541 (1.4560) [2022-10-02 07:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1200/1251] eta 0:00:14 lr 0.000572 time 0.2970 (0.2935) loss 3.8027 (3.6078) grad_norm 1.3776 (1.4516) [2022-10-02 07:16:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 136 training takes 0:06:07 [2022-10-02 07:16:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.472 (2.472) Loss 1.0502 (1.0502) Acc@1 75.195 (75.195) Acc@5 93.066 (93.066) [2022-10-02 07:16:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.238 Acc@5 92.890 [2022-10-02 07:16:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-10-02 07:16:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.32% [2022-10-02 07:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][0/1251] eta 1:07:33 lr 0.000572 time 3.2398 (3.2398) loss 3.3583 (3.3583) grad_norm 1.3067 (1.3067) [2022-10-02 07:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][100/1251] eta 0:06:08 lr 0.000572 time 0.2904 (0.3201) loss 2.6977 (3.5800) grad_norm 1.6350 (1.4511) [2022-10-02 07:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][200/1251] eta 0:05:20 lr 0.000571 time 0.2899 (0.3050) loss 4.1030 (3.6065) grad_norm 1.3488 (1.4540) [2022-10-02 07:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][300/1251] eta 0:04:45 lr 0.000571 time 0.2891 (0.3000) loss 2.8252 (3.5896) grad_norm 1.3107 (1.4558) [2022-10-02 07:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][400/1251] eta 0:04:13 lr 0.000571 time 0.2882 (0.2973) loss 3.9072 (3.5796) grad_norm 1.4468 (1.4530) [2022-10-02 07:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][500/1251] eta 0:03:42 lr 0.000570 time 0.2853 (0.2958) loss 3.0761 (3.5743) grad_norm 1.2284 (1.4525) [2022-10-02 07:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][600/1251] eta 0:03:11 lr 0.000570 time 0.2874 (0.2947) loss 3.7538 (3.5606) grad_norm 1.3792 (1.4492) [2022-10-02 07:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][700/1251] eta 0:02:41 lr 0.000569 time 0.2888 (0.2939) loss 2.8418 (3.5696) grad_norm 1.3930 (1.4510) [2022-10-02 07:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][800/1251] eta 0:02:12 lr 0.000569 time 0.2939 (0.2934) loss 3.5854 (3.5623) grad_norm 1.5677 (1.4576) [2022-10-02 07:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][900/1251] eta 0:01:42 lr 0.000568 time 0.2862 (0.2930) loss 3.8195 (3.5589) grad_norm 1.7933 (1.4589) [2022-10-02 07:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1000/1251] eta 0:01:13 lr 0.000568 time 0.2915 (0.2927) loss 3.5021 (3.5710) grad_norm 1.3833 (1.4578) [2022-10-02 07:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1100/1251] eta 0:00:44 lr 0.000568 time 0.2884 (0.2924) loss 3.5172 (3.5822) grad_norm 1.5327 (1.4578) [2022-10-02 07:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1200/1251] eta 0:00:14 lr 0.000567 time 0.2899 (0.2922) loss 4.4762 (3.5810) grad_norm 1.6578 (1.4586) [2022-10-02 07:22:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 137 training takes 0:06:05 [2022-10-02 07:22:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.092 (3.092) Loss 0.9809 (0.9809) Acc@1 76.855 (76.855) Acc@5 94.434 (94.434) [2022-10-02 07:22:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.534 Acc@5 93.032 [2022-10-02 07:22:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-02 07:22:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.53% [2022-10-02 07:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][0/1251] eta 1:09:27 lr 0.000567 time 3.3313 (3.3313) loss 3.9174 (3.9174) grad_norm 1.7367 (1.7367) [2022-10-02 07:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][100/1251] eta 0:06:08 lr 0.000567 time 0.2923 (0.3205) loss 2.7198 (3.6019) grad_norm 1.3980 (1.4361) [2022-10-02 07:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][200/1251] eta 0:05:20 lr 0.000566 time 0.2884 (0.3052) loss 3.6885 (3.6273) grad_norm 1.5286 (1.4639) [2022-10-02 07:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][300/1251] eta 0:04:45 lr 0.000566 time 0.2915 (0.2999) loss 2.5385 (3.6182) grad_norm 1.3574 (1.4443) [2022-10-02 07:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][400/1251] eta 0:04:12 lr 0.000565 time 0.2889 (0.2972) loss 2.7499 (3.5780) grad_norm 1.6005 (1.4551) [2022-10-02 07:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][500/1251] eta 0:03:41 lr 0.000565 time 0.2897 (0.2955) loss 4.0374 (3.5661) grad_norm 1.4211 (1.4517) [2022-10-02 07:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][600/1251] eta 0:03:11 lr 0.000565 time 0.2881 (0.2943) loss 3.8933 (3.5515) grad_norm 2.1198 (1.4570) [2022-10-02 07:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][700/1251] eta 0:02:41 lr 0.000564 time 0.2924 (0.2935) loss 3.3558 (3.5587) grad_norm 1.4618 (1.4567) [2022-10-02 07:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][800/1251] eta 0:02:12 lr 0.000564 time 0.2848 (0.2928) loss 2.9166 (3.5571) grad_norm 1.3339 (1.4532) [2022-10-02 07:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][900/1251] eta 0:01:42 lr 0.000563 time 0.2922 (0.2923) loss 2.8398 (3.5705) grad_norm 1.3056 (1.4549) [2022-10-02 07:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1000/1251] eta 0:01:13 lr 0.000563 time 0.2900 (0.2919) loss 4.2822 (3.5756) grad_norm 1.2707 (1.4543) [2022-10-02 07:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1100/1251] eta 0:00:44 lr 0.000563 time 0.2892 (0.2916) loss 3.9006 (3.5795) grad_norm 1.2782 (1.4561) [2022-10-02 07:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1200/1251] eta 0:00:14 lr 0.000562 time 0.2857 (0.2913) loss 3.9241 (3.5783) grad_norm 1.4666 (1.4595) [2022-10-02 07:28:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 138 training takes 0:06:04 [2022-10-02 07:28:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.219 (2.219) Loss 1.1862 (1.1862) Acc@1 71.387 (71.387) Acc@5 91.406 (91.406) [2022-10-02 07:29:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.484 Acc@5 92.982 [2022-10-02 07:29:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-02 07:29:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.53% [2022-10-02 07:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][0/1251] eta 0:45:04 lr 0.000562 time 2.1618 (2.1618) loss 2.5223 (2.5223) grad_norm 1.3178 (1.3178) [2022-10-02 07:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][100/1251] eta 0:05:56 lr 0.000561 time 0.2829 (0.3101) loss 3.6477 (3.4939) grad_norm 1.3061 (1.4508) [2022-10-02 07:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][200/1251] eta 0:05:14 lr 0.000561 time 0.2857 (0.2991) loss 2.7113 (3.5681) grad_norm 1.8798 (1.4555) [2022-10-02 07:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][300/1251] eta 0:04:41 lr 0.000561 time 0.2875 (0.2956) loss 4.2964 (3.5695) grad_norm 1.5114 (1.4625) [2022-10-02 07:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][400/1251] eta 0:04:10 lr 0.000560 time 0.2867 (0.2939) loss 3.4768 (3.5733) grad_norm 1.3326 (1.4607) [2022-10-02 07:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][500/1251] eta 0:03:39 lr 0.000560 time 0.2860 (0.2927) loss 3.3070 (3.5744) grad_norm 1.9086 (1.4603) [2022-10-02 07:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][600/1251] eta 0:03:10 lr 0.000559 time 0.2860 (0.2920) loss 4.1552 (3.5752) grad_norm 1.2972 (1.4597) [2022-10-02 07:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][700/1251] eta 0:02:40 lr 0.000559 time 0.2871 (0.2915) loss 4.0975 (3.5795) grad_norm 1.3917 (1.4621) [2022-10-02 07:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][800/1251] eta 0:02:11 lr 0.000559 time 0.2882 (0.2912) loss 3.5078 (3.5694) grad_norm 1.4688 (1.4666) [2022-10-02 07:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][900/1251] eta 0:01:42 lr 0.000558 time 0.2872 (0.2909) loss 4.3216 (3.5647) grad_norm 1.6280 (1.4697) [2022-10-02 07:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1000/1251] eta 0:01:12 lr 0.000558 time 0.2861 (0.2907) loss 3.8022 (3.5616) grad_norm 1.2676 (1.4678) [2022-10-02 07:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1100/1251] eta 0:00:43 lr 0.000557 time 0.2879 (0.2905) loss 3.7313 (3.5614) grad_norm 1.3730 (1.4660) [2022-10-02 07:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1200/1251] eta 0:00:14 lr 0.000557 time 0.2894 (0.2904) loss 4.2816 (3.5661) grad_norm 1.4726 (1.4689) [2022-10-02 07:35:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 139 training takes 0:06:03 [2022-10-02 07:35:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.738 (2.738) Loss 1.0756 (1.0756) Acc@1 74.902 (74.902) Acc@5 92.773 (92.773) [2022-10-02 07:35:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.590 Acc@5 92.976 [2022-10-02 07:35:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-02 07:35:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.59% [2022-10-02 07:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][0/1251] eta 1:03:20 lr 0.000557 time 3.0380 (3.0380) loss 4.2163 (4.2163) grad_norm 1.3753 (1.3753) [2022-10-02 07:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][100/1251] eta 0:06:04 lr 0.000556 time 0.2892 (0.3171) loss 2.6753 (3.5855) grad_norm 1.5047 (1.4580) [2022-10-02 07:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][200/1251] eta 0:05:19 lr 0.000556 time 0.2865 (0.3035) loss 3.8890 (3.5772) grad_norm 1.8965 (1.4649) [2022-10-02 07:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][300/1251] eta 0:04:44 lr 0.000556 time 0.2900 (0.2989) loss 3.9488 (3.5663) grad_norm 1.3245 (1.4627) [2022-10-02 07:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][400/1251] eta 0:04:12 lr 0.000555 time 0.2874 (0.2967) loss 3.5329 (3.5775) grad_norm 1.4166 (1.4626) [2022-10-02 07:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][500/1251] eta 0:03:41 lr 0.000555 time 0.2864 (0.2952) loss 3.9165 (3.5821) grad_norm 1.4103 (1.4673) [2022-10-02 07:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][600/1251] eta 0:03:11 lr 0.000554 time 0.2893 (0.2942) loss 2.6268 (3.5820) grad_norm 1.3501 (1.4647) [2022-10-02 07:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][700/1251] eta 0:02:41 lr 0.000554 time 0.2860 (0.2936) loss 3.5872 (3.5899) grad_norm 1.6409 (1.4699) [2022-10-02 07:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][800/1251] eta 0:02:12 lr 0.000553 time 0.2863 (0.2931) loss 4.3449 (3.5820) grad_norm 1.4490 (1.4714) [2022-10-02 07:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][900/1251] eta 0:01:42 lr 0.000553 time 0.2905 (0.2926) loss 3.0372 (3.5890) grad_norm 1.2431 (1.4700) [2022-10-02 07:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1000/1251] eta 0:01:13 lr 0.000553 time 0.2888 (0.2923) loss 3.2510 (3.5846) grad_norm 1.4032 (1.4747) [2022-10-02 07:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1100/1251] eta 0:00:44 lr 0.000552 time 0.2877 (0.2920) loss 4.0357 (3.5827) grad_norm 1.3124 (1.4745) [2022-10-02 07:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1200/1251] eta 0:00:14 lr 0.000552 time 0.2893 (0.2917) loss 3.7534 (3.5893) grad_norm 1.6325 (1.4711) [2022-10-02 07:41:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 140 training takes 0:06:05 [2022-10-02 07:41:22 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saving...... [2022-10-02 07:41:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saved !!! [2022-10-02 07:41:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.858 (2.858) Loss 1.0198 (1.0198) Acc@1 75.586 (75.586) Acc@5 93.457 (93.457) [2022-10-02 07:41:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.500 Acc@5 92.966 [2022-10-02 07:41:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-02 07:41:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.59% [2022-10-02 07:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][0/1251] eta 1:06:40 lr 0.000552 time 3.1979 (3.1979) loss 2.8371 (2.8371) grad_norm 1.3593 (1.3593) [2022-10-02 07:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][100/1251] eta 0:06:05 lr 0.000551 time 0.2873 (0.3180) loss 2.9906 (3.5309) grad_norm 1.3382 (1.4516) [2022-10-02 07:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][200/1251] eta 0:05:18 lr 0.000551 time 0.2881 (0.3031) loss 3.5791 (3.4602) grad_norm 1.2477 (1.4513) [2022-10-02 07:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][300/1251] eta 0:04:43 lr 0.000550 time 0.2868 (0.2981) loss 2.7200 (3.4806) grad_norm 1.4210 (1.4598) [2022-10-02 07:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][400/1251] eta 0:04:11 lr 0.000550 time 0.2894 (0.2957) loss 3.9506 (3.4911) grad_norm 1.3425 (1.4630) [2022-10-02 07:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][500/1251] eta 0:03:40 lr 0.000550 time 0.2898 (0.2941) loss 3.3232 (3.4948) grad_norm 1.5539 (1.4624) [2022-10-02 07:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][600/1251] eta 0:03:10 lr 0.000549 time 0.2886 (0.2931) loss 4.2375 (3.5083) grad_norm 1.4172 (1.4610) [2022-10-02 07:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][700/1251] eta 0:02:41 lr 0.000549 time 0.2876 (0.2924) loss 3.9587 (3.5248) grad_norm 1.4314 (1.4608) [2022-10-02 07:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][800/1251] eta 0:02:11 lr 0.000548 time 0.2875 (0.2919) loss 3.4321 (3.5254) grad_norm 1.4696 (1.4646) [2022-10-02 07:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][900/1251] eta 0:01:42 lr 0.000548 time 0.2893 (0.2915) loss 3.3540 (3.5408) grad_norm 1.8254 (1.4670) [2022-10-02 07:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1000/1251] eta 0:01:13 lr 0.000547 time 0.2932 (0.2912) loss 3.7693 (3.5520) grad_norm 1.1949 (1.4665) [2022-10-02 07:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1100/1251] eta 0:00:43 lr 0.000547 time 0.2880 (0.2909) loss 2.9438 (3.5623) grad_norm 1.6181 (1.4651) [2022-10-02 07:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1200/1251] eta 0:00:14 lr 0.000547 time 0.2883 (0.2906) loss 3.2272 (3.5723) grad_norm 1.4238 (1.4696) [2022-10-02 07:47:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 141 training takes 0:06:03 [2022-10-02 07:47:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.755 (2.755) Loss 1.1354 (1.1354) Acc@1 73.047 (73.047) Acc@5 91.602 (91.602) [2022-10-02 07:47:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.412 Acc@5 93.018 [2022-10-02 07:47:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-10-02 07:47:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.59% [2022-10-02 07:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][0/1251] eta 1:10:25 lr 0.000546 time 3.3777 (3.3777) loss 2.8451 (2.8451) grad_norm 1.3252 (1.3252) [2022-10-02 07:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][100/1251] eta 0:06:11 lr 0.000546 time 0.2889 (0.3223) loss 3.5963 (3.5422) grad_norm 1.4777 (1.5037) [2022-10-02 07:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][200/1251] eta 0:05:21 lr 0.000546 time 0.2875 (0.3063) loss 3.9442 (3.5300) grad_norm 1.1722 (1.4849) [2022-10-02 07:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][300/1251] eta 0:04:45 lr 0.000545 time 0.2920 (0.3007) loss 3.7484 (3.5493) grad_norm 1.5805 (1.4819) [2022-10-02 07:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][400/1251] eta 0:04:13 lr 0.000545 time 0.2906 (0.2978) loss 3.8451 (3.5386) grad_norm 1.3382 (1.4822) [2022-10-02 07:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][500/1251] eta 0:03:42 lr 0.000544 time 0.2881 (0.2962) loss 3.3863 (3.5423) grad_norm 1.7568 (1.4738) [2022-10-02 07:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][600/1251] eta 0:03:12 lr 0.000544 time 0.2908 (0.2951) loss 4.0868 (3.5387) grad_norm 1.4628 (1.4741) [2022-10-02 07:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][700/1251] eta 0:02:42 lr 0.000544 time 0.2849 (0.2943) loss 3.5675 (3.5418) grad_norm 1.4427 (1.4788) [2022-10-02 07:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][800/1251] eta 0:02:12 lr 0.000543 time 0.2888 (0.2938) loss 3.8996 (3.5444) grad_norm 1.7844 (1.4810) [2022-10-02 07:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][900/1251] eta 0:01:42 lr 0.000543 time 0.2911 (0.2932) loss 3.5443 (3.5547) grad_norm 1.7595 (1.4810) [2022-10-02 07:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1000/1251] eta 0:01:13 lr 0.000542 time 0.2845 (0.2928) loss 3.7252 (3.5616) grad_norm 1.3858 (1.4812) [2022-10-02 07:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1100/1251] eta 0:00:44 lr 0.000542 time 0.2887 (0.2924) loss 3.9625 (3.5613) grad_norm 1.5339 (1.4826) [2022-10-02 07:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1200/1251] eta 0:00:14 lr 0.000541 time 0.2915 (0.2920) loss 2.6515 (3.5581) grad_norm 1.3451 (1.4866) [2022-10-02 07:53:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 142 training takes 0:06:05 [2022-10-02 07:54:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.236 (3.236) Loss 1.0360 (1.0360) Acc@1 75.781 (75.781) Acc@5 92.578 (92.578) [2022-10-02 07:54:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.538 Acc@5 93.140 [2022-10-02 07:54:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-10-02 07:54:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.59% [2022-10-02 07:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][0/1251] eta 1:02:36 lr 0.000541 time 3.0027 (3.0027) loss 3.0187 (3.0187) grad_norm 1.4310 (1.4310) [2022-10-02 07:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][100/1251] eta 0:06:08 lr 0.000541 time 0.2874 (0.3198) loss 3.7729 (3.5969) grad_norm 1.4976 (1.4765) [2022-10-02 07:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][200/1251] eta 0:05:20 lr 0.000540 time 0.2913 (0.3048) loss 3.0588 (3.5876) grad_norm 1.4724 (1.4938) [2022-10-02 07:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][300/1251] eta 0:04:45 lr 0.000540 time 0.2909 (0.2999) loss 3.3477 (3.5555) grad_norm 1.3731 (1.4828) [2022-10-02 07:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][400/1251] eta 0:04:13 lr 0.000540 time 0.2876 (0.2974) loss 4.2409 (3.5405) grad_norm 1.2357 (1.4837) [2022-10-02 07:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][500/1251] eta 0:03:42 lr 0.000539 time 0.2847 (0.2958) loss 3.5838 (3.5252) grad_norm 1.5286 (1.4829) [2022-10-02 07:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][600/1251] eta 0:03:11 lr 0.000539 time 0.2904 (0.2948) loss 4.0876 (3.5434) grad_norm 1.3608 (1.4817) [2022-10-02 07:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][700/1251] eta 0:02:42 lr 0.000538 time 0.2903 (0.2941) loss 3.8560 (3.5304) grad_norm 1.4100 (1.4825) [2022-10-02 07:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][800/1251] eta 0:02:12 lr 0.000538 time 0.2887 (0.2936) loss 4.1900 (3.5250) grad_norm 1.5691 (1.4872) [2022-10-02 07:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][900/1251] eta 0:01:42 lr 0.000538 time 0.2874 (0.2932) loss 3.9784 (3.5287) grad_norm 1.3880 (1.4869) [2022-10-02 07:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1000/1251] eta 0:01:13 lr 0.000537 time 0.2885 (0.2928) loss 2.2375 (3.5277) grad_norm 1.3067 (1.4861) [2022-10-02 07:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1100/1251] eta 0:00:44 lr 0.000537 time 0.2929 (0.2926) loss 3.8654 (3.5252) grad_norm 1.2970 (1.4885) [2022-10-02 08:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1200/1251] eta 0:00:14 lr 0.000536 time 0.2938 (0.2923) loss 4.2366 (3.5346) grad_norm 1.2588 (1.4876) [2022-10-02 08:00:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 143 training takes 0:06:05 [2022-10-02 08:00:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.393 (2.393) Loss 1.0733 (1.0733) Acc@1 73.340 (73.340) Acc@5 92.188 (92.188) [2022-10-02 08:00:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.760 Acc@5 93.186 [2022-10-02 08:00:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-02 08:00:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-10-02 08:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][0/1251] eta 0:54:53 lr 0.000536 time 2.6323 (2.6323) loss 3.7194 (3.7194) grad_norm 1.4439 (1.4439) [2022-10-02 08:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][100/1251] eta 0:06:04 lr 0.000536 time 0.2853 (0.3170) loss 4.0745 (3.5404) grad_norm 1.3255 (1.5116) [2022-10-02 08:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][200/1251] eta 0:05:17 lr 0.000535 time 0.2860 (0.3025) loss 3.2923 (3.4954) grad_norm 1.3755 (1.5214) [2022-10-02 08:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][300/1251] eta 0:04:42 lr 0.000535 time 0.2962 (0.2975) loss 3.8900 (3.5074) grad_norm 1.3868 (1.5083) [2022-10-02 08:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][400/1251] eta 0:04:11 lr 0.000534 time 0.2921 (0.2950) loss 3.9570 (3.5196) grad_norm 1.6234 (1.4992) [2022-10-02 08:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][500/1251] eta 0:03:40 lr 0.000534 time 0.2853 (0.2936) loss 3.0401 (3.5212) grad_norm 1.5429 (1.5002) [2022-10-02 08:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][600/1251] eta 0:03:10 lr 0.000534 time 0.2904 (0.2925) loss 3.2978 (3.5357) grad_norm 1.4227 (1.5136) [2022-10-02 08:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][700/1251] eta 0:02:40 lr 0.000533 time 0.2858 (0.2919) loss 3.8853 (3.5271) grad_norm 1.4447 (1.5077) [2022-10-02 08:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][800/1251] eta 0:02:11 lr 0.000533 time 0.2898 (0.2913) loss 4.1430 (3.5309) grad_norm 1.2336 (1.5052) [2022-10-02 08:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][900/1251] eta 0:01:42 lr 0.000532 time 0.2849 (0.2909) loss 4.1518 (3.5272) grad_norm 1.5673 (1.5041) [2022-10-02 08:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1000/1251] eta 0:01:12 lr 0.000532 time 0.2836 (0.2905) loss 3.8614 (3.5283) grad_norm 2.0072 (1.5036) [2022-10-02 08:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1100/1251] eta 0:00:43 lr 0.000532 time 0.2852 (0.2902) loss 3.4545 (3.5330) grad_norm 1.3537 (1.5014) [2022-10-02 08:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1200/1251] eta 0:00:14 lr 0.000531 time 0.2894 (0.2899) loss 4.2508 (3.5356) grad_norm 1.4740 (1.5009) [2022-10-02 08:06:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 144 training takes 0:06:02 [2022-10-02 08:06:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.624 (2.624) Loss 1.0775 (1.0775) Acc@1 74.707 (74.707) Acc@5 92.188 (92.188) [2022-10-02 08:06:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.644 Acc@5 93.150 [2022-10-02 08:06:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-10-02 08:06:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.76% [2022-10-02 08:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][0/1251] eta 1:08:20 lr 0.000531 time 3.2781 (3.2781) loss 2.5798 (2.5798) grad_norm 1.3707 (1.3707) [2022-10-02 08:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][100/1251] eta 0:06:07 lr 0.000530 time 0.2884 (0.3194) loss 2.8930 (3.5126) grad_norm 1.8022 (1.4857) [2022-10-02 08:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][200/1251] eta 0:05:19 lr 0.000530 time 0.2909 (0.3040) loss 2.7713 (3.5600) grad_norm 1.8578 (1.5070) [2022-10-02 08:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][300/1251] eta 0:04:44 lr 0.000530 time 0.2854 (0.2989) loss 4.2272 (3.5831) grad_norm 1.4378 (1.5019) [2022-10-02 08:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][400/1251] eta 0:04:12 lr 0.000529 time 0.2908 (0.2964) loss 3.6495 (3.5764) grad_norm 1.4746 (1.5039) [2022-10-02 08:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][500/1251] eta 0:03:41 lr 0.000529 time 0.2873 (0.2949) loss 2.8636 (3.5566) grad_norm 1.7054 (1.5010) [2022-10-02 08:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][600/1251] eta 0:03:11 lr 0.000528 time 0.2898 (0.2938) loss 3.8262 (3.5651) grad_norm 1.6584 (1.5025) [2022-10-02 08:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][700/1251] eta 0:02:41 lr 0.000528 time 0.2871 (0.2932) loss 3.6132 (3.5593) grad_norm 1.3472 (1.5002) [2022-10-02 08:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][800/1251] eta 0:02:12 lr 0.000528 time 0.2912 (0.2928) loss 2.8714 (3.5579) grad_norm 1.4012 (1.5016) [2022-10-02 08:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][900/1251] eta 0:01:42 lr 0.000527 time 0.2891 (0.2924) loss 3.9143 (3.5509) grad_norm 1.2468 (1.5019) [2022-10-02 08:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1000/1251] eta 0:01:13 lr 0.000527 time 0.2902 (0.2920) loss 2.8157 (3.5495) grad_norm 1.5528 (1.5006) [2022-10-02 08:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1100/1251] eta 0:00:44 lr 0.000526 time 0.2871 (0.2917) loss 3.7218 (3.5470) grad_norm 1.4073 (1.4982) [2022-10-02 08:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1200/1251] eta 0:00:14 lr 0.000526 time 0.2904 (0.2914) loss 4.0026 (3.5498) grad_norm 1.6146 (1.5006) [2022-10-02 08:12:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 145 training takes 0:06:04 [2022-10-02 08:12:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.382 (2.382) Loss 1.0766 (1.0766) Acc@1 73.926 (73.926) Acc@5 93.750 (93.750) [2022-10-02 08:13:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.796 Acc@5 93.286 [2022-10-02 08:13:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-02 08:13:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.80% [2022-10-02 08:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][0/1251] eta 0:54:04 lr 0.000526 time 2.5937 (2.5937) loss 4.2762 (4.2762) grad_norm 1.6706 (1.6706) [2022-10-02 08:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][100/1251] eta 0:06:07 lr 0.000525 time 0.2867 (0.3189) loss 2.7870 (3.4440) grad_norm 1.4682 (1.5574) [2022-10-02 08:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][200/1251] eta 0:05:20 lr 0.000525 time 0.2894 (0.3047) loss 2.5373 (3.4815) grad_norm 1.5502 (1.5388) [2022-10-02 08:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][300/1251] eta 0:04:45 lr 0.000524 time 0.2884 (0.3000) loss 3.2201 (3.4896) grad_norm 1.2817 (1.5192) [2022-10-02 08:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][400/1251] eta 0:04:13 lr 0.000524 time 0.2896 (0.2976) loss 4.4078 (3.4993) grad_norm 1.4752 (1.5166) [2022-10-02 08:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][500/1251] eta 0:03:42 lr 0.000524 time 0.2890 (0.2961) loss 2.5513 (3.5195) grad_norm 1.3526 (1.5150) [2022-10-02 08:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][600/1251] eta 0:03:12 lr 0.000523 time 0.2887 (0.2952) loss 4.0882 (3.5214) grad_norm 1.3953 (1.5162) [2022-10-02 08:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][700/1251] eta 0:02:42 lr 0.000523 time 0.2871 (0.2944) loss 2.3548 (3.5222) grad_norm 1.6100 (1.5154) [2022-10-02 08:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][800/1251] eta 0:02:12 lr 0.000522 time 0.2866 (0.2938) loss 3.6528 (3.5216) grad_norm 1.6947 (1.5135) [2022-10-02 08:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][900/1251] eta 0:01:42 lr 0.000522 time 0.2872 (0.2933) loss 2.9726 (3.5254) grad_norm 1.3915 (1.5111) [2022-10-02 08:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1000/1251] eta 0:01:13 lr 0.000522 time 0.2873 (0.2929) loss 4.0072 (3.5314) grad_norm 1.3771 (1.5116) [2022-10-02 08:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1100/1251] eta 0:00:44 lr 0.000521 time 0.2872 (0.2925) loss 2.2256 (3.5298) grad_norm 1.4286 (1.5143) [2022-10-02 08:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1200/1251] eta 0:00:14 lr 0.000521 time 0.2855 (0.2922) loss 4.2184 (3.5306) grad_norm 1.3449 (1.5132) [2022-10-02 08:19:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 146 training takes 0:06:05 [2022-10-02 08:19:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.510 (2.510) Loss 1.1028 (1.1028) Acc@1 74.707 (74.707) Acc@5 92.676 (92.676) [2022-10-02 08:19:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.780 Acc@5 93.266 [2022-10-02 08:19:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-02 08:19:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.80% [2022-10-02 08:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][0/1251] eta 1:09:01 lr 0.000521 time 3.3104 (3.3104) loss 3.7701 (3.7701) grad_norm 1.5720 (1.5720) [2022-10-02 08:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][100/1251] eta 0:06:09 lr 0.000520 time 0.2901 (0.3212) loss 4.0128 (3.5312) grad_norm 1.2496 (1.5338) [2022-10-02 08:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][200/1251] eta 0:05:21 lr 0.000520 time 0.2918 (0.3059) loss 2.8808 (3.5031) grad_norm 1.6275 (1.5263) [2022-10-02 08:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][300/1251] eta 0:04:45 lr 0.000519 time 0.2904 (0.3005) loss 3.5786 (3.4971) grad_norm 1.4041 (1.5396) [2022-10-02 08:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][400/1251] eta 0:04:13 lr 0.000519 time 0.2914 (0.2980) loss 2.3677 (3.4875) grad_norm 1.9263 (1.5342) [2022-10-02 08:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][500/1251] eta 0:03:42 lr 0.000518 time 0.2877 (0.2964) loss 3.5054 (3.5044) grad_norm 1.5123 (1.5270) [2022-10-02 08:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][600/1251] eta 0:03:12 lr 0.000518 time 0.2906 (0.2954) loss 2.9064 (3.5138) grad_norm 1.2998 (1.5227) [2022-10-02 08:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][700/1251] eta 0:02:42 lr 0.000518 time 0.2916 (0.2945) loss 4.2375 (3.5072) grad_norm 1.4247 (1.5185) [2022-10-02 08:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][800/1251] eta 0:02:12 lr 0.000517 time 0.2852 (0.2939) loss 4.1500 (3.5271) grad_norm 1.6397 (1.5152) [2022-10-02 08:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][900/1251] eta 0:01:42 lr 0.000517 time 0.2915 (0.2933) loss 3.8634 (3.5358) grad_norm 1.3149 (1.5126) [2022-10-02 08:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1000/1251] eta 0:01:13 lr 0.000516 time 0.2869 (0.2929) loss 3.2188 (3.5263) grad_norm 1.7234 (1.5130) [2022-10-02 08:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1100/1251] eta 0:00:44 lr 0.000516 time 0.2872 (0.2925) loss 3.8028 (3.5269) grad_norm 1.7452 (1.5132) [2022-10-02 08:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1200/1251] eta 0:00:14 lr 0.000516 time 0.2851 (0.2922) loss 3.5465 (3.5276) grad_norm 1.6350 (1.5133) [2022-10-02 08:25:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 147 training takes 0:06:05 [2022-10-02 08:25:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.800 (2.800) Loss 1.0852 (1.0852) Acc@1 75.195 (75.195) Acc@5 92.871 (92.871) [2022-10-02 08:25:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.942 Acc@5 93.324 [2022-10-02 08:25:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-10-02 08:25:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 75.94% [2022-10-02 08:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][0/1251] eta 1:08:22 lr 0.000515 time 3.2791 (3.2791) loss 3.5652 (3.5652) grad_norm 1.2710 (1.2710) [2022-10-02 08:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][100/1251] eta 0:06:09 lr 0.000515 time 0.2875 (0.3213) loss 2.4768 (3.5925) grad_norm 1.3484 (1.5444) [2022-10-02 08:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][200/1251] eta 0:05:21 lr 0.000515 time 0.2945 (0.3059) loss 4.0555 (3.5236) grad_norm 1.4063 (1.5067) [2022-10-02 08:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][300/1251] eta 0:04:46 lr 0.000514 time 0.2858 (0.3010) loss 3.9456 (3.5192) grad_norm 1.4260 (1.5014) [2022-10-02 08:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][400/1251] eta 0:04:13 lr 0.000514 time 0.2938 (0.2984) loss 4.3860 (3.5352) grad_norm 1.4296 (1.5059) [2022-10-02 08:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][500/1251] eta 0:03:42 lr 0.000513 time 0.2848 (0.2968) loss 3.5523 (3.5411) grad_norm 1.5594 (1.5074) [2022-10-02 08:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][600/1251] eta 0:03:12 lr 0.000513 time 0.2974 (0.2958) loss 4.3163 (3.5521) grad_norm 1.5140 (1.5081) [2022-10-02 08:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][700/1251] eta 0:02:42 lr 0.000512 time 0.2869 (0.2950) loss 3.8046 (3.5479) grad_norm 1.6751 (1.5064) [2022-10-02 08:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][800/1251] eta 0:02:12 lr 0.000512 time 0.2889 (0.2945) loss 3.9735 (3.5409) grad_norm 1.2109 (1.5050) [2022-10-02 08:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][900/1251] eta 0:01:43 lr 0.000512 time 0.2870 (0.2941) loss 3.0757 (3.5482) grad_norm 1.4418 (1.5094) [2022-10-02 08:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1000/1251] eta 0:01:13 lr 0.000511 time 0.2872 (0.2937) loss 2.1747 (3.5547) grad_norm 1.3068 (1.5134) [2022-10-02 08:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1100/1251] eta 0:00:44 lr 0.000511 time 0.2870 (0.2934) loss 3.7242 (3.5550) grad_norm 1.3528 (1.5128) [2022-10-02 08:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1200/1251] eta 0:00:14 lr 0.000510 time 0.2897 (0.2932) loss 3.4022 (3.5535) grad_norm 1.5666 (1.5154) [2022-10-02 08:31:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 148 training takes 0:06:07 [2022-10-02 08:31:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.229 (3.229) Loss 0.9967 (0.9967) Acc@1 76.855 (76.855) Acc@5 94.531 (94.531) [2022-10-02 08:31:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.080 Acc@5 93.346 [2022-10-02 08:31:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-02 08:31:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.08% [2022-10-02 08:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][0/1251] eta 1:02:50 lr 0.000510 time 3.0143 (3.0143) loss 3.7611 (3.7611) grad_norm 1.8519 (1.8519) [2022-10-02 08:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][100/1251] eta 0:06:08 lr 0.000510 time 0.2857 (0.3199) loss 3.5864 (3.5124) grad_norm 1.4173 (1.5525) [2022-10-02 08:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][200/1251] eta 0:05:20 lr 0.000509 time 0.2909 (0.3053) loss 3.2021 (3.5281) grad_norm 1.5682 (1.5355) [2022-10-02 08:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][300/1251] eta 0:04:45 lr 0.000509 time 0.2889 (0.3004) loss 2.8920 (3.5442) grad_norm 1.4250 (1.5257) [2022-10-02 08:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][400/1251] eta 0:04:13 lr 0.000509 time 0.2911 (0.2980) loss 4.0336 (3.5497) grad_norm 1.5010 (1.5251) [2022-10-02 08:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][500/1251] eta 0:03:42 lr 0.000508 time 0.2969 (0.2965) loss 4.2010 (3.5489) grad_norm 1.7712 (1.5278) [2022-10-02 08:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][600/1251] eta 0:03:12 lr 0.000508 time 0.2892 (0.2954) loss 3.7087 (3.5421) grad_norm 1.6202 (1.5227) [2022-10-02 08:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][700/1251] eta 0:02:42 lr 0.000507 time 0.2858 (0.2946) loss 2.5273 (3.5337) grad_norm 1.6885 (1.5176) [2022-10-02 08:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][800/1251] eta 0:02:12 lr 0.000507 time 0.2874 (0.2940) loss 3.7361 (3.5236) grad_norm 1.4683 (1.5212) [2022-10-02 08:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][900/1251] eta 0:01:42 lr 0.000506 time 0.2881 (0.2934) loss 3.9741 (3.5208) grad_norm 1.4388 (1.5255) [2022-10-02 08:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1000/1251] eta 0:01:13 lr 0.000506 time 0.2890 (0.2929) loss 3.0334 (3.5263) grad_norm 1.6621 (1.5241) [2022-10-02 08:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1100/1251] eta 0:00:44 lr 0.000506 time 0.2864 (0.2925) loss 3.4930 (3.5276) grad_norm 1.2439 (1.5205) [2022-10-02 08:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1200/1251] eta 0:00:14 lr 0.000505 time 0.2921 (0.2921) loss 3.7268 (3.5262) grad_norm 1.5032 (1.5234) [2022-10-02 08:38:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 149 training takes 0:06:05 [2022-10-02 08:38:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.163 (3.163) Loss 1.0109 (1.0109) Acc@1 76.660 (76.660) Acc@5 94.727 (94.727) [2022-10-02 08:38:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.998 Acc@5 93.414 [2022-10-02 08:38:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-02 08:38:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.08% [2022-10-02 08:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][0/1251] eta 1:08:56 lr 0.000505 time 3.3064 (3.3064) loss 4.0739 (4.0739) grad_norm 1.4343 (1.4343) [2022-10-02 08:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][100/1251] eta 0:06:09 lr 0.000505 time 0.2858 (0.3209) loss 2.5792 (3.5447) grad_norm 1.5580 (1.5145) [2022-10-02 08:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][200/1251] eta 0:05:21 lr 0.000504 time 0.2975 (0.3055) loss 2.8563 (3.5046) grad_norm 1.4681 (1.5377) [2022-10-02 08:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][300/1251] eta 0:04:45 lr 0.000504 time 0.2928 (0.3006) loss 4.0468 (3.5004) grad_norm 1.4685 (1.5422) [2022-10-02 08:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][400/1251] eta 0:04:13 lr 0.000503 time 0.2938 (0.2980) loss 2.9826 (3.4929) grad_norm 1.5835 (1.5493) [2022-10-02 08:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][500/1251] eta 0:03:42 lr 0.000503 time 0.2881 (0.2964) loss 4.2189 (3.4744) grad_norm 1.4006 (1.5426) [2022-10-02 08:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][600/1251] eta 0:03:12 lr 0.000503 time 0.2914 (0.2953) loss 2.6407 (3.4833) grad_norm 1.5682 (1.5470) [2022-10-02 08:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][700/1251] eta 0:02:42 lr 0.000502 time 0.2811 (0.2945) loss 3.1780 (3.4930) grad_norm 2.3058 (1.5452) [2022-10-02 08:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][800/1251] eta 0:02:12 lr 0.000502 time 0.2928 (0.2940) loss 3.2072 (3.5004) grad_norm 1.6206 (1.5448) [2022-10-02 08:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][900/1251] eta 0:01:43 lr 0.000501 time 0.2887 (0.2936) loss 3.6861 (3.5068) grad_norm 1.3701 (1.5430) [2022-10-02 08:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1000/1251] eta 0:01:13 lr 0.000501 time 0.2890 (0.2933) loss 3.8134 (3.5038) grad_norm 1.7583 (1.5431) [2022-10-02 08:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1100/1251] eta 0:00:44 lr 0.000500 time 0.2885 (0.2930) loss 3.6241 (3.5109) grad_norm 1.8125 (1.5418) [2022-10-02 08:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1200/1251] eta 0:00:14 lr 0.000500 time 0.2901 (0.2927) loss 3.8988 (3.5063) grad_norm 2.0196 (1.5416) [2022-10-02 08:44:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 150 training takes 0:06:06 [2022-10-02 08:44:23 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saving...... [2022-10-02 08:44:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saved !!! [2022-10-02 08:44:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.563 (2.563) Loss 0.9523 (0.9523) Acc@1 77.246 (77.246) Acc@5 94.727 (94.727) [2022-10-02 08:44:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.826 Acc@5 93.300 [2022-10-02 08:44:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 75.8% [2022-10-02 08:44:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.08% [2022-10-02 08:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][0/1251] eta 0:53:01 lr 0.000500 time 2.5435 (2.5435) loss 3.9170 (3.9170) grad_norm 1.8991 (1.8991) [2022-10-02 08:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][100/1251] eta 0:05:58 lr 0.000499 time 0.2868 (0.3119) loss 3.0677 (3.5627) grad_norm 1.5182 (1.5629) [2022-10-02 08:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][200/1251] eta 0:05:15 lr 0.000499 time 0.2898 (0.2997) loss 3.0395 (3.5608) grad_norm 1.6395 (1.5705) [2022-10-02 08:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][300/1251] eta 0:04:41 lr 0.000499 time 0.2886 (0.2956) loss 3.8941 (3.5542) grad_norm 1.8662 (1.5588) [2022-10-02 08:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][400/1251] eta 0:04:09 lr 0.000498 time 0.2865 (0.2936) loss 2.5907 (3.5569) grad_norm 1.3908 (1.5479) [2022-10-02 08:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][500/1251] eta 0:03:39 lr 0.000498 time 0.2894 (0.2927) loss 3.7299 (3.5526) grad_norm 1.3791 (1.5388) [2022-10-02 08:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][600/1251] eta 0:03:10 lr 0.000497 time 0.2889 (0.2919) loss 2.8832 (3.5592) grad_norm 1.4099 (1.5385) [2022-10-02 08:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][700/1251] eta 0:02:40 lr 0.000497 time 0.2875 (0.2914) loss 3.1318 (3.5501) grad_norm 1.5545 (1.5371) [2022-10-02 08:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][800/1251] eta 0:02:11 lr 0.000497 time 0.2852 (0.2909) loss 3.8479 (3.5588) grad_norm 1.4363 (1.5356) [2022-10-02 08:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][900/1251] eta 0:01:41 lr 0.000496 time 0.2891 (0.2906) loss 3.9981 (3.5585) grad_norm 1.6255 (1.5367) [2022-10-02 08:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1000/1251] eta 0:01:12 lr 0.000496 time 0.2842 (0.2902) loss 3.7888 (3.5546) grad_norm 1.7648 (1.5384) [2022-10-02 08:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1100/1251] eta 0:00:43 lr 0.000495 time 0.2888 (0.2900) loss 3.4431 (3.5626) grad_norm 1.2695 (1.5386) [2022-10-02 08:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1200/1251] eta 0:00:14 lr 0.000495 time 0.2878 (0.2898) loss 2.5139 (3.5597) grad_norm 1.9362 (1.5382) [2022-10-02 08:50:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 151 training takes 0:06:02 [2022-10-02 08:50:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.746 (2.746) Loss 0.9855 (0.9855) Acc@1 77.246 (77.246) Acc@5 94.336 (94.336) [2022-10-02 08:50:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 75.992 Acc@5 93.344 [2022-10-02 08:50:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-10-02 08:50:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.08% [2022-10-02 08:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][0/1251] eta 1:05:34 lr 0.000495 time 3.1453 (3.1453) loss 4.2453 (4.2453) grad_norm 1.3636 (1.3636) [2022-10-02 08:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][100/1251] eta 0:06:04 lr 0.000494 time 0.2899 (0.3170) loss 4.0943 (3.5338) grad_norm 1.4674 (1.5821) [2022-10-02 08:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][200/1251] eta 0:05:18 lr 0.000494 time 0.2883 (0.3029) loss 2.3633 (3.5632) grad_norm 1.4970 (1.5735) [2022-10-02 08:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][300/1251] eta 0:04:43 lr 0.000493 time 0.2888 (0.2982) loss 2.5046 (3.5577) grad_norm 1.6331 (1.5739) [2022-10-02 08:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][400/1251] eta 0:04:11 lr 0.000493 time 0.2865 (0.2957) loss 3.6623 (3.5422) grad_norm 1.8406 (1.5633) [2022-10-02 08:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][500/1251] eta 0:03:41 lr 0.000493 time 0.2883 (0.2944) loss 3.6961 (3.5359) grad_norm 1.5278 (1.5542) [2022-10-02 08:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][600/1251] eta 0:03:11 lr 0.000492 time 0.2878 (0.2934) loss 3.5946 (3.5174) grad_norm 1.2400 (1.5529) [2022-10-02 08:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][700/1251] eta 0:02:41 lr 0.000492 time 0.2910 (0.2927) loss 3.0001 (3.5252) grad_norm 1.6979 (1.5525) [2022-10-02 08:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][800/1251] eta 0:02:11 lr 0.000491 time 0.2871 (0.2922) loss 3.9151 (3.5205) grad_norm 2.8252 (1.5567) [2022-10-02 08:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][900/1251] eta 0:01:42 lr 0.000491 time 0.2894 (0.2918) loss 3.6776 (3.5353) grad_norm 1.4457 (1.5595) [2022-10-02 08:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1000/1251] eta 0:01:13 lr 0.000490 time 0.2865 (0.2915) loss 3.8076 (3.5370) grad_norm 1.5556 (1.5604) [2022-10-02 08:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1100/1251] eta 0:00:43 lr 0.000490 time 0.2856 (0.2912) loss 4.0235 (3.5294) grad_norm 1.4234 (1.5601) [2022-10-02 08:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1200/1251] eta 0:00:14 lr 0.000490 time 0.2868 (0.2909) loss 4.2681 (3.5366) grad_norm 1.9492 (1.5612) [2022-10-02 08:56:56 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 152 training takes 0:06:04 [2022-10-02 08:56:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.324 (2.324) Loss 1.0914 (1.0914) Acc@1 75.098 (75.098) Acc@5 93.750 (93.750) [2022-10-02 08:57:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.152 Acc@5 93.412 [2022-10-02 08:57:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-02 08:57:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.15% [2022-10-02 08:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][0/1251] eta 1:06:16 lr 0.000489 time 3.1790 (3.1790) loss 3.3881 (3.3881) grad_norm 1.4661 (1.4661) [2022-10-02 08:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][100/1251] eta 0:06:06 lr 0.000489 time 0.2887 (0.3182) loss 3.1290 (3.4321) grad_norm 1.5504 (1.5305) [2022-10-02 08:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][200/1251] eta 0:05:19 lr 0.000489 time 0.2857 (0.3036) loss 3.2214 (3.4596) grad_norm 1.8693 (1.5547) [2022-10-02 08:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][300/1251] eta 0:04:44 lr 0.000488 time 0.2871 (0.2989) loss 3.7539 (3.4950) grad_norm 1.7651 (1.5665) [2022-10-02 08:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][400/1251] eta 0:04:12 lr 0.000488 time 0.2932 (0.2964) loss 3.5061 (3.5049) grad_norm 1.8269 (1.5758) [2022-10-02 08:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][500/1251] eta 0:03:41 lr 0.000487 time 0.2836 (0.2947) loss 3.5605 (3.5070) grad_norm 1.4914 (1.5678) [2022-10-02 09:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][600/1251] eta 0:03:11 lr 0.000487 time 0.2887 (0.2935) loss 3.1748 (3.5112) grad_norm 1.7361 (1.5640) [2022-10-02 09:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][700/1251] eta 0:02:41 lr 0.000487 time 0.2909 (0.2926) loss 3.5796 (3.5065) grad_norm 1.5784 (1.5597) [2022-10-02 09:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][800/1251] eta 0:02:11 lr 0.000486 time 0.2947 (0.2919) loss 3.6576 (3.5091) grad_norm 1.5051 (1.5607) [2022-10-02 09:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][900/1251] eta 0:01:42 lr 0.000486 time 0.2849 (0.2914) loss 3.5886 (3.5016) grad_norm 1.5895 (1.5588) [2022-10-02 09:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1000/1251] eta 0:01:13 lr 0.000485 time 0.2877 (0.2910) loss 3.4404 (3.5066) grad_norm 1.4552 (1.5583) [2022-10-02 09:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1100/1251] eta 0:00:43 lr 0.000485 time 0.2865 (0.2906) loss 4.0536 (3.5085) grad_norm 1.3349 (1.5587) [2022-10-02 09:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1200/1251] eta 0:00:14 lr 0.000484 time 0.2853 (0.2903) loss 3.3551 (3.5014) grad_norm 1.6408 (1.5586) [2022-10-02 09:03:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 153 training takes 0:06:03 [2022-10-02 09:03:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.978 (2.978) Loss 0.9766 (0.9766) Acc@1 77.344 (77.344) Acc@5 92.578 (92.578) [2022-10-02 09:03:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.328 Acc@5 93.414 [2022-10-02 09:03:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-02 09:03:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.33% [2022-10-02 09:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][0/1251] eta 0:54:46 lr 0.000484 time 2.6270 (2.6270) loss 3.1625 (3.1625) grad_norm 1.8120 (1.8120) [2022-10-02 09:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][100/1251] eta 0:06:06 lr 0.000484 time 0.2860 (0.3181) loss 4.1384 (3.4458) grad_norm 1.3124 (1.5345) [2022-10-02 09:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][200/1251] eta 0:05:19 lr 0.000483 time 0.2886 (0.3042) loss 3.8495 (3.5098) grad_norm 1.6038 (1.5536) [2022-10-02 09:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][300/1251] eta 0:04:45 lr 0.000483 time 0.2874 (0.2997) loss 3.7183 (3.4982) grad_norm 1.3738 (1.5526) [2022-10-02 09:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][400/1251] eta 0:04:13 lr 0.000483 time 0.2878 (0.2979) loss 3.9031 (3.4950) grad_norm 1.6431 (1.5415) [2022-10-02 09:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][500/1251] eta 0:03:42 lr 0.000482 time 0.2894 (0.2964) loss 3.7052 (3.4911) grad_norm 1.6318 (1.5388) [2022-10-02 09:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][600/1251] eta 0:03:12 lr 0.000482 time 0.2912 (0.2954) loss 2.6219 (3.4900) grad_norm 1.3714 (1.5471) [2022-10-02 09:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][700/1251] eta 0:02:42 lr 0.000481 time 0.2896 (0.2948) loss 3.9143 (3.5013) grad_norm 1.5237 (1.5455) [2022-10-02 09:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][800/1251] eta 0:02:12 lr 0.000481 time 0.2877 (0.2942) loss 4.1660 (3.5022) grad_norm 1.4566 (1.5486) [2022-10-02 09:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][900/1251] eta 0:01:43 lr 0.000481 time 0.2897 (0.2936) loss 2.9060 (3.4949) grad_norm 1.7270 (1.5500) [2022-10-02 09:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1000/1251] eta 0:01:13 lr 0.000480 time 0.2872 (0.2932) loss 4.1593 (3.4911) grad_norm 1.5988 (1.5517) [2022-10-02 09:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1100/1251] eta 0:00:44 lr 0.000480 time 0.2931 (0.2928) loss 4.0841 (3.4952) grad_norm 1.6747 (1.5537) [2022-10-02 09:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1200/1251] eta 0:00:14 lr 0.000479 time 0.2934 (0.2925) loss 2.9176 (3.5000) grad_norm 1.5197 (1.5573) [2022-10-02 09:09:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 154 training takes 0:06:06 [2022-10-02 09:09:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.763 (2.763) Loss 0.9848 (0.9848) Acc@1 78.223 (78.223) Acc@5 94.434 (94.434) [2022-10-02 09:09:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.060 Acc@5 93.266 [2022-10-02 09:09:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.1% [2022-10-02 09:09:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.33% [2022-10-02 09:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][0/1251] eta 1:02:59 lr 0.000479 time 3.0215 (3.0215) loss 3.3801 (3.3801) grad_norm 1.4685 (1.4685) [2022-10-02 09:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][100/1251] eta 0:06:05 lr 0.000479 time 0.2910 (0.3176) loss 3.9559 (3.4781) grad_norm 1.4782 (1.5366) [2022-10-02 09:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][200/1251] eta 0:05:18 lr 0.000478 time 0.2866 (0.3035) loss 3.3313 (3.5034) grad_norm 1.4194 (1.5426) [2022-10-02 09:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][300/1251] eta 0:04:43 lr 0.000478 time 0.2847 (0.2986) loss 3.4190 (3.5116) grad_norm 1.3786 (1.5575) [2022-10-02 09:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][400/1251] eta 0:04:11 lr 0.000477 time 0.2862 (0.2961) loss 3.9431 (3.5055) grad_norm 1.3791 (1.5472) [2022-10-02 09:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][500/1251] eta 0:03:41 lr 0.000477 time 0.2885 (0.2946) loss 2.8871 (3.5041) grad_norm 1.4197 (1.5465) [2022-10-02 09:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][600/1251] eta 0:03:11 lr 0.000477 time 0.2883 (0.2935) loss 3.8799 (3.4998) grad_norm 1.7691 (1.5477) [2022-10-02 09:13:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][700/1251] eta 0:02:41 lr 0.000476 time 0.2864 (0.2928) loss 2.8486 (3.4980) grad_norm 1.4495 (1.5518) [2022-10-02 09:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][800/1251] eta 0:02:11 lr 0.000476 time 0.2879 (0.2922) loss 3.3957 (3.5064) grad_norm 1.4052 (1.5481) [2022-10-02 09:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][900/1251] eta 0:01:42 lr 0.000475 time 0.2861 (0.2917) loss 2.2851 (3.5173) grad_norm 1.4419 (1.5503) [2022-10-02 09:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1000/1251] eta 0:01:13 lr 0.000475 time 0.2859 (0.2913) loss 3.9940 (3.5185) grad_norm 1.5285 (1.5517) [2022-10-02 09:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1100/1251] eta 0:00:43 lr 0.000475 time 0.2862 (0.2910) loss 2.9002 (3.5165) grad_norm 1.5934 (1.5545) [2022-10-02 09:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1200/1251] eta 0:00:14 lr 0.000474 time 0.2873 (0.2908) loss 2.9809 (3.5159) grad_norm 2.2367 (1.5585) [2022-10-02 09:15:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 155 training takes 0:06:04 [2022-10-02 09:15:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.314 (2.314) Loss 0.9241 (0.9241) Acc@1 77.344 (77.344) Acc@5 93.848 (93.848) [2022-10-02 09:16:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.396 Acc@5 93.462 [2022-10-02 09:16:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-02 09:16:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.40% [2022-10-02 09:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][0/1251] eta 1:02:23 lr 0.000474 time 2.9923 (2.9923) loss 3.9916 (3.9916) grad_norm 1.5354 (1.5354) [2022-10-02 09:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][100/1251] eta 0:06:04 lr 0.000474 time 0.2895 (0.3167) loss 4.4558 (3.5741) grad_norm 1.6563 (1.5644) [2022-10-02 09:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][200/1251] eta 0:05:18 lr 0.000473 time 0.2939 (0.3030) loss 3.6293 (3.5225) grad_norm 1.2531 (1.5639) [2022-10-02 09:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][300/1251] eta 0:04:43 lr 0.000473 time 0.2903 (0.2985) loss 3.6378 (3.5064) grad_norm 1.3383 (1.5763) [2022-10-02 09:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][400/1251] eta 0:04:12 lr 0.000472 time 0.2888 (0.2962) loss 3.9023 (3.4922) grad_norm 1.5760 (1.5830) [2022-10-02 09:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][500/1251] eta 0:03:41 lr 0.000472 time 0.2880 (0.2947) loss 3.6405 (3.5081) grad_norm 1.6890 (1.5852) [2022-10-02 09:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][600/1251] eta 0:03:11 lr 0.000471 time 0.2843 (0.2938) loss 3.6538 (3.5101) grad_norm 1.4901 (1.5825) [2022-10-02 09:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][700/1251] eta 0:02:41 lr 0.000471 time 0.2884 (0.2930) loss 3.7452 (3.5140) grad_norm 1.4623 (1.5813) [2022-10-02 09:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][800/1251] eta 0:02:11 lr 0.000471 time 0.2882 (0.2924) loss 3.7510 (3.5144) grad_norm 1.4725 (1.5820) [2022-10-02 09:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][900/1251] eta 0:01:42 lr 0.000470 time 0.2869 (0.2919) loss 3.8852 (3.5243) grad_norm 1.8002 (1.5805) [2022-10-02 09:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1000/1251] eta 0:01:13 lr 0.000470 time 0.2913 (0.2916) loss 4.1479 (3.5229) grad_norm 1.5433 (1.5800) [2022-10-02 09:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1100/1251] eta 0:00:43 lr 0.000469 time 0.2873 (0.2912) loss 3.4247 (3.5253) grad_norm 1.7951 (1.5836) [2022-10-02 09:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1200/1251] eta 0:00:14 lr 0.000469 time 0.2903 (0.2910) loss 2.6822 (3.5234) grad_norm 1.4876 (1.5804) [2022-10-02 09:22:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 156 training takes 0:06:04 [2022-10-02 09:22:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.349 (3.349) Loss 1.0627 (1.0627) Acc@1 74.902 (74.902) Acc@5 92.578 (92.578) [2022-10-02 09:22:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.374 Acc@5 93.592 [2022-10-02 09:22:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-10-02 09:22:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.40% [2022-10-02 09:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][0/1251] eta 0:51:27 lr 0.000469 time 2.4680 (2.4680) loss 2.0247 (2.0247) grad_norm 1.4211 (1.4211) [2022-10-02 09:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][100/1251] eta 0:06:02 lr 0.000468 time 0.2903 (0.3146) loss 3.4120 (3.5211) grad_norm 1.5537 (1.5767) [2022-10-02 09:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][200/1251] eta 0:05:17 lr 0.000468 time 0.3524 (0.3023) loss 3.8542 (3.5011) grad_norm 1.6722 (1.5803) [2022-10-02 09:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][300/1251] eta 0:04:43 lr 0.000468 time 0.2902 (0.2982) loss 2.3246 (3.5131) grad_norm 1.3835 (1.5789) [2022-10-02 09:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][400/1251] eta 0:04:11 lr 0.000467 time 0.2906 (0.2961) loss 3.5491 (3.5201) grad_norm 2.0060 (1.5812) [2022-10-02 09:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][500/1251] eta 0:03:41 lr 0.000467 time 0.2874 (0.2948) loss 3.7680 (3.5081) grad_norm 1.4335 (1.5846) [2022-10-02 09:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][600/1251] eta 0:03:11 lr 0.000466 time 0.2938 (0.2940) loss 3.9654 (3.5049) grad_norm 1.4169 (1.5887) [2022-10-02 09:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][700/1251] eta 0:02:41 lr 0.000466 time 0.2873 (0.2934) loss 3.5800 (3.5121) grad_norm 1.7370 (1.5974) [2022-10-02 09:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][800/1251] eta 0:02:12 lr 0.000465 time 0.2905 (0.2929) loss 2.3690 (3.5031) grad_norm 1.4262 (1.5954) [2022-10-02 09:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][900/1251] eta 0:01:42 lr 0.000465 time 0.2908 (0.2925) loss 3.7649 (3.5108) grad_norm 1.6458 (1.5925) [2022-10-02 09:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1000/1251] eta 0:01:13 lr 0.000465 time 0.2899 (0.2922) loss 3.8286 (3.5048) grad_norm 1.4654 (1.5899) [2022-10-02 09:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1100/1251] eta 0:00:44 lr 0.000464 time 0.2903 (0.2919) loss 4.2620 (3.4991) grad_norm 1.4615 (1.5884) [2022-10-02 09:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1200/1251] eta 0:00:14 lr 0.000464 time 0.2904 (0.2917) loss 3.6798 (3.4987) grad_norm 2.1428 (1.5935) [2022-10-02 09:28:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 157 training takes 0:06:05 [2022-10-02 09:28:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.318 (2.318) Loss 0.9562 (0.9562) Acc@1 76.758 (76.758) Acc@5 94.141 (94.141) [2022-10-02 09:28:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.180 Acc@5 93.514 [2022-10-02 09:28:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-02 09:28:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.40% [2022-10-02 09:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][0/1251] eta 1:10:27 lr 0.000464 time 3.3793 (3.3793) loss 2.8491 (2.8491) grad_norm 1.3889 (1.3889) [2022-10-02 09:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][100/1251] eta 0:06:08 lr 0.000463 time 0.2961 (0.3205) loss 4.0904 (3.5825) grad_norm 1.6253 (1.6383) [2022-10-02 09:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][200/1251] eta 0:05:20 lr 0.000463 time 0.2875 (0.3049) loss 3.3261 (3.5492) grad_norm 1.5275 (1.6552) [2022-10-02 09:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][300/1251] eta 0:04:45 lr 0.000462 time 0.2919 (0.2997) loss 3.9531 (3.5100) grad_norm 1.4850 (1.6468) [2022-10-02 09:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][400/1251] eta 0:04:12 lr 0.000462 time 0.2893 (0.2970) loss 3.4793 (3.5089) grad_norm 1.7936 (1.6296) [2022-10-02 09:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][500/1251] eta 0:03:41 lr 0.000462 time 0.2895 (0.2954) loss 3.6533 (3.5126) grad_norm 1.3866 (1.6261) [2022-10-02 09:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][600/1251] eta 0:03:11 lr 0.000461 time 0.2879 (0.2943) loss 3.8887 (3.5228) grad_norm 1.5199 (1.6182) [2022-10-02 09:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][700/1251] eta 0:02:41 lr 0.000461 time 0.2889 (0.2935) loss 2.9380 (3.5331) grad_norm 1.6468 (1.6143) [2022-10-02 09:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][800/1251] eta 0:02:12 lr 0.000460 time 0.2880 (0.2928) loss 2.4125 (3.5273) grad_norm 1.6842 (1.6102) [2022-10-02 09:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][900/1251] eta 0:01:42 lr 0.000460 time 0.2899 (0.2922) loss 4.2091 (3.5312) grad_norm 1.5707 (1.6033) [2022-10-02 09:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1000/1251] eta 0:01:13 lr 0.000459 time 0.2877 (0.2918) loss 3.9273 (3.5319) grad_norm 1.4013 (1.6005) [2022-10-02 09:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1100/1251] eta 0:00:43 lr 0.000459 time 0.2860 (0.2914) loss 4.3400 (3.5347) grad_norm 1.6414 (1.6024) [2022-10-02 09:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1200/1251] eta 0:00:14 lr 0.000459 time 0.2867 (0.2911) loss 3.8590 (3.5306) grad_norm 1.6758 (1.6027) [2022-10-02 09:34:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 158 training takes 0:06:04 [2022-10-02 09:34:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.013 (3.013) Loss 0.9590 (0.9590) Acc@1 76.367 (76.367) Acc@5 93.457 (93.457) [2022-10-02 09:34:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.226 Acc@5 93.522 [2022-10-02 09:34:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-02 09:34:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.40% [2022-10-02 09:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][0/1251] eta 1:09:45 lr 0.000458 time 3.3459 (3.3459) loss 3.7660 (3.7660) grad_norm 1.4243 (1.4243) [2022-10-02 09:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][100/1251] eta 0:06:09 lr 0.000458 time 0.2874 (0.3207) loss 4.2686 (3.5138) grad_norm 1.7136 (1.5804) [2022-10-02 09:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][200/1251] eta 0:05:20 lr 0.000458 time 0.2926 (0.3054) loss 3.4219 (3.5016) grad_norm 1.4629 (1.5686) [2022-10-02 09:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][300/1251] eta 0:04:45 lr 0.000457 time 0.2870 (0.3003) loss 4.1525 (3.5159) grad_norm 1.4625 (1.5763) [2022-10-02 09:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][400/1251] eta 0:04:13 lr 0.000457 time 0.2912 (0.2976) loss 3.0986 (3.5045) grad_norm 1.5518 (1.5815) [2022-10-02 09:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][500/1251] eta 0:03:42 lr 0.000456 time 0.2883 (0.2959) loss 2.5022 (3.4827) grad_norm 1.7944 (1.5850) [2022-10-02 09:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][600/1251] eta 0:03:11 lr 0.000456 time 0.2924 (0.2947) loss 2.9569 (3.4783) grad_norm 1.4367 (1.5858) [2022-10-02 09:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][700/1251] eta 0:02:41 lr 0.000456 time 0.2897 (0.2939) loss 2.9794 (3.4835) grad_norm 1.5597 (1.5795) [2022-10-02 09:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][800/1251] eta 0:02:12 lr 0.000455 time 0.2882 (0.2932) loss 3.7798 (3.4931) grad_norm 1.6811 (1.5826) [2022-10-02 09:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][900/1251] eta 0:01:42 lr 0.000455 time 0.2888 (0.2927) loss 4.4624 (3.4941) grad_norm 1.6627 (1.5837) [2022-10-02 09:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1000/1251] eta 0:01:13 lr 0.000454 time 0.2889 (0.2922) loss 3.9006 (3.4955) grad_norm 1.6945 (1.5817) [2022-10-02 09:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1100/1251] eta 0:00:44 lr 0.000454 time 0.2900 (0.2919) loss 3.7518 (3.4983) grad_norm 1.5014 (1.5838) [2022-10-02 09:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1200/1251] eta 0:00:14 lr 0.000453 time 0.2877 (0.2915) loss 3.0650 (3.4964) grad_norm 1.7133 (1.5840) [2022-10-02 09:40:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 159 training takes 0:06:04 [2022-10-02 09:41:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.881 (2.881) Loss 1.0463 (1.0463) Acc@1 75.195 (75.195) Acc@5 92.969 (92.969) [2022-10-02 09:41:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.292 Acc@5 93.552 [2022-10-02 09:41:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-02 09:41:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.40% [2022-10-02 09:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][0/1251] eta 0:55:27 lr 0.000453 time 2.6600 (2.6600) loss 4.4146 (4.4146) grad_norm 1.6649 (1.6649) [2022-10-02 09:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][100/1251] eta 0:06:00 lr 0.000453 time 0.2891 (0.3129) loss 4.0549 (3.5156) grad_norm 1.6385 (1.6102) [2022-10-02 09:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][200/1251] eta 0:05:15 lr 0.000452 time 0.2924 (0.3003) loss 4.0253 (3.4987) grad_norm 1.4944 (1.6054) [2022-10-02 09:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][300/1251] eta 0:04:41 lr 0.000452 time 0.2872 (0.2961) loss 2.4835 (3.4955) grad_norm 1.4474 (1.6180) [2022-10-02 09:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][400/1251] eta 0:04:10 lr 0.000452 time 0.2904 (0.2941) loss 4.2284 (3.4931) grad_norm 1.5271 (1.6197) [2022-10-02 09:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][500/1251] eta 0:03:39 lr 0.000451 time 0.2881 (0.2926) loss 3.0372 (3.4791) grad_norm 1.4421 (1.6072) [2022-10-02 09:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][600/1251] eta 0:03:09 lr 0.000451 time 0.2888 (0.2918) loss 3.9816 (3.4751) grad_norm 1.5926 (1.6036) [2022-10-02 09:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][700/1251] eta 0:02:40 lr 0.000450 time 0.2898 (0.2911) loss 2.3259 (3.4720) grad_norm 1.4938 (1.6103) [2022-10-02 09:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][800/1251] eta 0:02:11 lr 0.000450 time 0.2858 (0.2906) loss 3.7736 (3.4663) grad_norm 1.7937 (1.6078) [2022-10-02 09:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][900/1251] eta 0:01:41 lr 0.000450 time 0.2871 (0.2902) loss 4.0926 (3.4662) grad_norm 1.6431 (1.6041) [2022-10-02 09:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1000/1251] eta 0:01:12 lr 0.000449 time 0.2874 (0.2899) loss 4.2892 (3.4679) grad_norm 1.7169 (1.6067) [2022-10-02 09:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1100/1251] eta 0:00:43 lr 0.000449 time 0.2852 (0.2897) loss 4.0258 (3.4588) grad_norm 1.6654 (1.6084) [2022-10-02 09:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1200/1251] eta 0:00:14 lr 0.000448 time 0.2881 (0.2895) loss 3.9080 (3.4633) grad_norm 1.3890 (1.6103) [2022-10-02 09:47:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 160 training takes 0:06:02 [2022-10-02 09:47:12 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saving...... [2022-10-02 09:47:13 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saved !!! [2022-10-02 09:47:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.090 (3.090) Loss 0.9211 (0.9211) Acc@1 79.395 (79.395) Acc@5 94.531 (94.531) [2022-10-02 09:47:25 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.482 Acc@5 93.608 [2022-10-02 09:47:25 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-10-02 09:47:25 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.48% [2022-10-02 09:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][0/1251] eta 1:10:36 lr 0.000448 time 3.3863 (3.3863) loss 3.6653 (3.6653) grad_norm 1.4394 (1.4394) [2022-10-02 09:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][100/1251] eta 0:06:09 lr 0.000448 time 0.2905 (0.3213) loss 3.5763 (3.4186) grad_norm 1.8133 (1.6057) [2022-10-02 09:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][200/1251] eta 0:05:20 lr 0.000447 time 0.2902 (0.3052) loss 3.4274 (3.4168) grad_norm 1.6701 (1.6065) [2022-10-02 09:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][300/1251] eta 0:04:45 lr 0.000447 time 0.2896 (0.3000) loss 4.0320 (3.4495) grad_norm 1.9465 (1.6175) [2022-10-02 09:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][400/1251] eta 0:04:12 lr 0.000446 time 0.2883 (0.2972) loss 3.9826 (3.4730) grad_norm 1.4640 (1.6186) [2022-10-02 09:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][500/1251] eta 0:03:42 lr 0.000446 time 0.2932 (0.2956) loss 3.0227 (3.4792) grad_norm 1.4979 (1.6225) [2022-10-02 09:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][600/1251] eta 0:03:11 lr 0.000446 time 0.2907 (0.2944) loss 4.2132 (3.4791) grad_norm 1.5032 (1.6200) [2022-10-02 09:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][700/1251] eta 0:02:41 lr 0.000445 time 0.2890 (0.2936) loss 4.4462 (3.4852) grad_norm 1.7831 (1.6176) [2022-10-02 09:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][800/1251] eta 0:02:12 lr 0.000445 time 0.2887 (0.2929) loss 3.9418 (3.4916) grad_norm 1.7370 (1.6132) [2022-10-02 09:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][900/1251] eta 0:01:42 lr 0.000444 time 0.2849 (0.2924) loss 3.8898 (3.4952) grad_norm 1.4725 (1.6128) [2022-10-02 09:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1000/1251] eta 0:01:13 lr 0.000444 time 0.2864 (0.2919) loss 3.9605 (3.4974) grad_norm 1.3524 (1.6118) [2022-10-02 09:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1100/1251] eta 0:00:44 lr 0.000444 time 0.2844 (0.2915) loss 4.1468 (3.4954) grad_norm 1.8278 (1.6087) [2022-10-02 09:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1200/1251] eta 0:00:14 lr 0.000443 time 0.2884 (0.2911) loss 3.9135 (3.4960) grad_norm 2.1143 (1.6120) [2022-10-02 09:53:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 161 training takes 0:06:04 [2022-10-02 09:53:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.154 (3.154) Loss 1.0010 (1.0010) Acc@1 77.539 (77.539) Acc@5 93.066 (93.066) [2022-10-02 09:53:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.178 Acc@5 93.482 [2022-10-02 09:53:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-10-02 09:53:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.48% [2022-10-02 09:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][0/1251] eta 1:05:21 lr 0.000443 time 3.1346 (3.1346) loss 2.9196 (2.9196) grad_norm 1.5625 (1.5625) [2022-10-02 09:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][100/1251] eta 0:06:06 lr 0.000443 time 0.2902 (0.3181) loss 3.1599 (3.4170) grad_norm 1.5113 (1.6047) [2022-10-02 09:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][200/1251] eta 0:05:18 lr 0.000442 time 0.2855 (0.3033) loss 3.6000 (3.4340) grad_norm 1.6373 (1.6046) [2022-10-02 09:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][300/1251] eta 0:04:43 lr 0.000442 time 0.2881 (0.2984) loss 3.7298 (3.4294) grad_norm 1.5413 (1.6063) [2022-10-02 09:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][400/1251] eta 0:04:11 lr 0.000441 time 0.2903 (0.2957) loss 3.2745 (3.4524) grad_norm 1.7317 (1.6089) [2022-10-02 09:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][500/1251] eta 0:03:40 lr 0.000441 time 0.2868 (0.2941) loss 3.7654 (3.4593) grad_norm 1.7854 (1.6170) [2022-10-02 09:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][600/1251] eta 0:03:10 lr 0.000440 time 0.2888 (0.2931) loss 2.9208 (3.4721) grad_norm 1.4816 (1.6166) [2022-10-02 09:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][700/1251] eta 0:02:41 lr 0.000440 time 0.2847 (0.2923) loss 3.5916 (3.4841) grad_norm 1.7988 (1.6166) [2022-10-02 09:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][800/1251] eta 0:02:11 lr 0.000440 time 0.2847 (0.2916) loss 3.7013 (3.4887) grad_norm 1.5517 (1.6219) [2022-10-02 09:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][900/1251] eta 0:01:42 lr 0.000439 time 0.2915 (0.2911) loss 3.5858 (3.4898) grad_norm 1.5778 (1.6186) [2022-10-02 09:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1000/1251] eta 0:01:12 lr 0.000439 time 0.2872 (0.2907) loss 2.5214 (3.4881) grad_norm 1.9262 (1.6149) [2022-10-02 09:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1100/1251] eta 0:00:43 lr 0.000438 time 0.2879 (0.2904) loss 4.1672 (3.4924) grad_norm 1.9754 (1.6164) [2022-10-02 09:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1200/1251] eta 0:00:14 lr 0.000438 time 0.2854 (0.2901) loss 4.1443 (3.5045) grad_norm 1.6152 (1.6146) [2022-10-02 09:59:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 162 training takes 0:06:03 [2022-10-02 09:59:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.185 (3.185) Loss 0.8703 (0.8703) Acc@1 77.441 (77.441) Acc@5 94.238 (94.238) [2022-10-02 09:59:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.330 Acc@5 93.638 [2022-10-02 09:59:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-10-02 09:59:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.48% [2022-10-02 10:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][0/1251] eta 1:08:58 lr 0.000438 time 3.3085 (3.3085) loss 3.8062 (3.8062) grad_norm 1.7707 (1.7707) [2022-10-02 10:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][100/1251] eta 0:06:07 lr 0.000437 time 0.2930 (0.3193) loss 2.3441 (3.4887) grad_norm 1.6729 (1.5979) [2022-10-02 10:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][200/1251] eta 0:05:20 lr 0.000437 time 0.2930 (0.3045) loss 3.7828 (3.4833) grad_norm 1.6322 (1.6278) [2022-10-02 10:01:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][300/1251] eta 0:04:44 lr 0.000437 time 0.2880 (0.2995) loss 2.4139 (3.4700) grad_norm 1.7798 (1.6279) [2022-10-02 10:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][400/1251] eta 0:04:12 lr 0.000436 time 0.2913 (0.2970) loss 3.3229 (3.4887) grad_norm 3.1564 (1.6313) [2022-10-02 10:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][500/1251] eta 0:03:41 lr 0.000436 time 0.2877 (0.2954) loss 2.4430 (3.4789) grad_norm 1.3886 (1.6275) [2022-10-02 10:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][600/1251] eta 0:03:11 lr 0.000435 time 0.2907 (0.2942) loss 3.3144 (3.4851) grad_norm 1.5613 (1.6347) [2022-10-02 10:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][700/1251] eta 0:02:41 lr 0.000435 time 0.2888 (0.2934) loss 3.9735 (3.4887) grad_norm 1.4576 (1.6299) [2022-10-02 10:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][800/1251] eta 0:02:12 lr 0.000435 time 0.2890 (0.2927) loss 3.6169 (3.4827) grad_norm 1.7249 (1.6312) [2022-10-02 10:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][900/1251] eta 0:01:42 lr 0.000434 time 0.2912 (0.2923) loss 4.1125 (3.4902) grad_norm 2.0586 (1.6312) [2022-10-02 10:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1000/1251] eta 0:01:13 lr 0.000434 time 0.2877 (0.2919) loss 2.4116 (3.4910) grad_norm 1.6413 (1.6360) [2022-10-02 10:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1100/1251] eta 0:00:44 lr 0.000433 time 0.2886 (0.2916) loss 4.1696 (3.4873) grad_norm 1.4866 (1.6369) [2022-10-02 10:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1200/1251] eta 0:00:14 lr 0.000433 time 0.2894 (0.2913) loss 3.7938 (3.4855) grad_norm 1.4892 (1.6376) [2022-10-02 10:06:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 163 training takes 0:06:04 [2022-10-02 10:06:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.225 (2.225) Loss 0.9774 (0.9774) Acc@1 78.906 (78.906) Acc@5 93.164 (93.164) [2022-10-02 10:06:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.682 Acc@5 93.644 [2022-10-02 10:06:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-10-02 10:06:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.68% [2022-10-02 10:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][0/1251] eta 0:46:08 lr 0.000433 time 2.2127 (2.2127) loss 3.8805 (3.8805) grad_norm 1.6049 (1.6049) [2022-10-02 10:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][100/1251] eta 0:06:01 lr 0.000432 time 0.2906 (0.3140) loss 2.7835 (3.5657) grad_norm 1.5684 (1.6639) [2022-10-02 10:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][200/1251] eta 0:05:16 lr 0.000432 time 0.2918 (0.3016) loss 3.4097 (3.5391) grad_norm 1.9026 (1.6439) [2022-10-02 10:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][300/1251] eta 0:04:42 lr 0.000431 time 0.2896 (0.2973) loss 2.6319 (3.5032) grad_norm 1.8257 (1.6373) [2022-10-02 10:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][400/1251] eta 0:04:11 lr 0.000431 time 0.2918 (0.2953) loss 3.8449 (3.4916) grad_norm 1.8263 (1.6375) [2022-10-02 10:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][500/1251] eta 0:03:40 lr 0.000431 time 0.2860 (0.2939) loss 3.0257 (3.4896) grad_norm 1.8189 (1.6389) [2022-10-02 10:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][600/1251] eta 0:03:10 lr 0.000430 time 0.2870 (0.2930) loss 4.2892 (3.4908) grad_norm 1.6164 (1.6442) [2022-10-02 10:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][700/1251] eta 0:02:41 lr 0.000430 time 0.2878 (0.2923) loss 3.8609 (3.4900) grad_norm 1.4377 (1.6427) [2022-10-02 10:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][800/1251] eta 0:02:11 lr 0.000429 time 0.2919 (0.2918) loss 3.3118 (3.4989) grad_norm 1.5815 (1.6403) [2022-10-02 10:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][900/1251] eta 0:01:42 lr 0.000429 time 0.2862 (0.2914) loss 3.9244 (3.5017) grad_norm 1.3984 (1.6371) [2022-10-02 10:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1000/1251] eta 0:01:13 lr 0.000429 time 0.2898 (0.2910) loss 2.2074 (3.4927) grad_norm 1.5685 (1.6396) [2022-10-02 10:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1100/1251] eta 0:00:43 lr 0.000428 time 0.2878 (0.2907) loss 4.0345 (3.4915) grad_norm 1.9503 (1.6414) [2022-10-02 10:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1200/1251] eta 0:00:14 lr 0.000428 time 0.2893 (0.2904) loss 3.0993 (3.4946) grad_norm 1.5487 (1.6458) [2022-10-02 10:12:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 164 training takes 0:06:03 [2022-10-02 10:12:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.244 (3.244) Loss 0.9571 (0.9571) Acc@1 75.684 (75.684) Acc@5 94.727 (94.727) [2022-10-02 10:12:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.588 Acc@5 93.688 [2022-10-02 10:12:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-02 10:12:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.68% [2022-10-02 10:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][0/1251] eta 1:03:07 lr 0.000428 time 3.0276 (3.0276) loss 3.1942 (3.1942) grad_norm 1.7222 (1.7222) [2022-10-02 10:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][100/1251] eta 0:06:03 lr 0.000427 time 0.2909 (0.3160) loss 3.8574 (3.5160) grad_norm 1.5868 (1.6312) [2022-10-02 10:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][200/1251] eta 0:05:17 lr 0.000427 time 0.2853 (0.3024) loss 3.2925 (3.4843) grad_norm 1.4042 (1.6475) [2022-10-02 10:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][300/1251] eta 0:04:43 lr 0.000426 time 0.2857 (0.2978) loss 2.7016 (3.4960) grad_norm 1.4673 (1.6304) [2022-10-02 10:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][400/1251] eta 0:04:11 lr 0.000426 time 0.2871 (0.2953) loss 3.0615 (3.4875) grad_norm 1.6808 (1.6365) [2022-10-02 10:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][500/1251] eta 0:03:40 lr 0.000426 time 0.2863 (0.2938) loss 3.9165 (3.4921) grad_norm 1.5265 (1.6345) [2022-10-02 10:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][600/1251] eta 0:03:10 lr 0.000425 time 0.2879 (0.2929) loss 3.9722 (3.4941) grad_norm 1.5459 (1.6388) [2022-10-02 10:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][700/1251] eta 0:02:41 lr 0.000425 time 0.2851 (0.2922) loss 3.8139 (3.5021) grad_norm 1.6311 (1.6404) [2022-10-02 10:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][800/1251] eta 0:02:11 lr 0.000424 time 0.2881 (0.2917) loss 3.4718 (3.5020) grad_norm 2.1540 (1.6444) [2022-10-02 10:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][900/1251] eta 0:01:42 lr 0.000424 time 0.2858 (0.2912) loss 3.1546 (3.4952) grad_norm 1.6139 (1.6425) [2022-10-02 10:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1000/1251] eta 0:01:13 lr 0.000423 time 0.2880 (0.2909) loss 3.9847 (3.4850) grad_norm 1.9538 (1.6463) [2022-10-02 10:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1100/1251] eta 0:00:43 lr 0.000423 time 0.2872 (0.2906) loss 3.4828 (3.4818) grad_norm 1.3991 (1.6484) [2022-10-02 10:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1200/1251] eta 0:00:14 lr 0.000423 time 0.2865 (0.2905) loss 4.2237 (3.4881) grad_norm 1.5484 (1.6469) [2022-10-02 10:18:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 165 training takes 0:06:03 [2022-10-02 10:18:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.270 (2.270) Loss 1.0586 (1.0586) Acc@1 75.391 (75.391) Acc@5 92.090 (92.090) [2022-10-02 10:18:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.568 Acc@5 93.652 [2022-10-02 10:18:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-02 10:18:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.68% [2022-10-02 10:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][0/1251] eta 1:06:58 lr 0.000422 time 3.2125 (3.2125) loss 4.0319 (4.0319) grad_norm 1.8205 (1.8205) [2022-10-02 10:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][100/1251] eta 0:06:07 lr 0.000422 time 0.2876 (0.3189) loss 3.7128 (3.4962) grad_norm 1.6141 (1.6979) [2022-10-02 10:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][200/1251] eta 0:05:19 lr 0.000422 time 0.2895 (0.3041) loss 3.7096 (3.4650) grad_norm 1.4106 (1.6610) [2022-10-02 10:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][300/1251] eta 0:04:44 lr 0.000421 time 0.2890 (0.2991) loss 3.9175 (3.4682) grad_norm 1.9175 (1.6537) [2022-10-02 10:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][400/1251] eta 0:04:12 lr 0.000421 time 0.2889 (0.2966) loss 3.0895 (3.4650) grad_norm 1.5613 (1.6599) [2022-10-02 10:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][500/1251] eta 0:03:41 lr 0.000420 time 0.2896 (0.2950) loss 3.9861 (3.4617) grad_norm 1.4454 (1.6528) [2022-10-02 10:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][600/1251] eta 0:03:11 lr 0.000420 time 0.2875 (0.2941) loss 3.5462 (3.4649) grad_norm 1.7300 (1.6492) [2022-10-02 10:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][700/1251] eta 0:02:41 lr 0.000420 time 0.2870 (0.2934) loss 3.8974 (3.4712) grad_norm 1.7152 (1.6532) [2022-10-02 10:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][800/1251] eta 0:02:12 lr 0.000419 time 0.2883 (0.2930) loss 2.8106 (3.4654) grad_norm 1.5603 (1.6534) [2022-10-02 10:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][900/1251] eta 0:01:42 lr 0.000419 time 0.2877 (0.2927) loss 4.0061 (3.4632) grad_norm 1.5036 (1.6520) [2022-10-02 10:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1000/1251] eta 0:01:13 lr 0.000418 time 0.2914 (0.2923) loss 4.0380 (3.4691) grad_norm 1.5300 (1.6552) [2022-10-02 10:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1100/1251] eta 0:00:44 lr 0.000418 time 0.2866 (0.2921) loss 3.8082 (3.4675) grad_norm 1.5957 (1.6611) [2022-10-02 10:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1200/1251] eta 0:00:14 lr 0.000418 time 0.2890 (0.2919) loss 3.8565 (3.4759) grad_norm 1.6658 (1.6617) [2022-10-02 10:24:53 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 166 training takes 0:06:05 [2022-10-02 10:24:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.930 (2.930) Loss 0.9974 (0.9974) Acc@1 76.465 (76.465) Acc@5 93.262 (93.262) [2022-10-02 10:25:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.798 Acc@5 93.702 [2022-10-02 10:25:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-02 10:25:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.80% [2022-10-02 10:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][0/1251] eta 1:05:41 lr 0.000417 time 3.1506 (3.1506) loss 3.8988 (3.8988) grad_norm 1.6016 (1.6016) [2022-10-02 10:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][100/1251] eta 0:06:06 lr 0.000417 time 0.2890 (0.3182) loss 3.8210 (3.4485) grad_norm 1.8544 (1.6806) [2022-10-02 10:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][200/1251] eta 0:05:19 lr 0.000417 time 0.2886 (0.3042) loss 3.8807 (3.5315) grad_norm 1.6069 (1.6714) [2022-10-02 10:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][300/1251] eta 0:04:44 lr 0.000416 time 0.2932 (0.2994) loss 3.9527 (3.5385) grad_norm 1.7893 (1.6720) [2022-10-02 10:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][400/1251] eta 0:04:12 lr 0.000416 time 0.2878 (0.2971) loss 4.0672 (3.5033) grad_norm 1.6274 (1.6633) [2022-10-02 10:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][500/1251] eta 0:03:42 lr 0.000415 time 0.2889 (0.2956) loss 4.1265 (3.5013) grad_norm 1.7644 (1.6685) [2022-10-02 10:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][600/1251] eta 0:03:11 lr 0.000415 time 0.2854 (0.2946) loss 4.1152 (3.5030) grad_norm 1.7490 (1.6709) [2022-10-02 10:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][700/1251] eta 0:02:41 lr 0.000414 time 0.2897 (0.2939) loss 4.1290 (3.5067) grad_norm 1.7316 (1.6679) [2022-10-02 10:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][800/1251] eta 0:02:12 lr 0.000414 time 0.2879 (0.2934) loss 3.4717 (3.5023) grad_norm 1.5791 (1.6645) [2022-10-02 10:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][900/1251] eta 0:01:42 lr 0.000414 time 0.2888 (0.2929) loss 2.5293 (3.4928) grad_norm 1.6097 (1.6691) [2022-10-02 10:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1000/1251] eta 0:01:13 lr 0.000413 time 0.2865 (0.2924) loss 2.3657 (3.4902) grad_norm 1.5667 (1.6689) [2022-10-02 10:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1100/1251] eta 0:00:44 lr 0.000413 time 0.2896 (0.2921) loss 3.3998 (3.4869) grad_norm 1.5514 (1.6703) [2022-10-02 10:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1200/1251] eta 0:00:14 lr 0.000412 time 0.2844 (0.2919) loss 4.2263 (3.4819) grad_norm 1.7851 (1.6699) [2022-10-02 10:31:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 167 training takes 0:06:05 [2022-10-02 10:31:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.026 (3.026) Loss 1.0000 (1.0000) Acc@1 77.344 (77.344) Acc@5 92.969 (92.969) [2022-10-02 10:31:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.646 Acc@5 93.726 [2022-10-02 10:31:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-10-02 10:31:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.80% [2022-10-02 10:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][0/1251] eta 0:58:13 lr 0.000412 time 2.7925 (2.7925) loss 3.7652 (3.7652) grad_norm 1.6053 (1.6053) [2022-10-02 10:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][100/1251] eta 0:06:03 lr 0.000412 time 0.2904 (0.3162) loss 3.9579 (3.5259) grad_norm 1.7709 (1.6609) [2022-10-02 10:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][200/1251] eta 0:05:18 lr 0.000411 time 0.2889 (0.3030) loss 3.0029 (3.4527) grad_norm 1.7710 (1.6853) [2022-10-02 10:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][300/1251] eta 0:04:43 lr 0.000411 time 0.2876 (0.2986) loss 3.1283 (3.4465) grad_norm 1.4627 (1.6685) [2022-10-02 10:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][400/1251] eta 0:04:12 lr 0.000411 time 0.2873 (0.2964) loss 3.6500 (3.4636) grad_norm 1.4895 (1.6547) [2022-10-02 10:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][500/1251] eta 0:03:41 lr 0.000410 time 0.2920 (0.2952) loss 3.2515 (3.4541) grad_norm 1.5775 (1.6582) [2022-10-02 10:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][600/1251] eta 0:03:11 lr 0.000410 time 0.2888 (0.2942) loss 3.7272 (3.4493) grad_norm 1.7887 (1.6578) [2022-10-02 10:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][700/1251] eta 0:02:41 lr 0.000409 time 0.2877 (0.2934) loss 2.6267 (3.4382) grad_norm 1.4452 (1.6622) [2022-10-02 10:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][800/1251] eta 0:02:12 lr 0.000409 time 0.2860 (0.2928) loss 2.7851 (3.4438) grad_norm 1.4957 (1.6594) [2022-10-02 10:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][900/1251] eta 0:01:42 lr 0.000409 time 0.2929 (0.2923) loss 3.6892 (3.4629) grad_norm 1.4778 (1.6644) [2022-10-02 10:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1000/1251] eta 0:01:13 lr 0.000408 time 0.2898 (0.2919) loss 2.4242 (3.4632) grad_norm 1.5497 (1.6610) [2022-10-02 10:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1100/1251] eta 0:00:44 lr 0.000408 time 0.2877 (0.2918) loss 3.4319 (3.4636) grad_norm 1.4574 (1.6605) [2022-10-02 10:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1200/1251] eta 0:00:14 lr 0.000407 time 0.2875 (0.2916) loss 3.5014 (3.4488) grad_norm 1.4072 (1.6636) [2022-10-02 10:37:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 168 training takes 0:06:05 [2022-10-02 10:37:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.204 (3.204) Loss 1.0013 (1.0013) Acc@1 77.051 (77.051) Acc@5 94.043 (94.043) [2022-10-02 10:37:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.844 Acc@5 93.790 [2022-10-02 10:37:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-02 10:37:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.84% [2022-10-02 10:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][0/1251] eta 1:03:13 lr 0.000407 time 3.0323 (3.0323) loss 3.6070 (3.6070) grad_norm 1.7487 (1.7487) [2022-10-02 10:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][100/1251] eta 0:06:06 lr 0.000407 time 0.2921 (0.3181) loss 3.8877 (3.4078) grad_norm 1.6327 (1.6654) [2022-10-02 10:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][200/1251] eta 0:05:19 lr 0.000406 time 0.2915 (0.3042) loss 3.9955 (3.3977) grad_norm 1.6555 (1.6618) [2022-10-02 10:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][300/1251] eta 0:04:44 lr 0.000406 time 0.2876 (0.2995) loss 3.5665 (3.4194) grad_norm 1.6251 (1.6681) [2022-10-02 10:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][400/1251] eta 0:04:12 lr 0.000406 time 0.2881 (0.2972) loss 3.3046 (3.4322) grad_norm 1.6741 (1.6648) [2022-10-02 10:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][500/1251] eta 0:03:42 lr 0.000405 time 0.2908 (0.2957) loss 3.9299 (3.4496) grad_norm 1.9039 (1.6705) [2022-10-02 10:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][600/1251] eta 0:03:11 lr 0.000405 time 0.2893 (0.2948) loss 2.8374 (3.4644) grad_norm 1.7116 (1.6661) [2022-10-02 10:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][700/1251] eta 0:02:42 lr 0.000404 time 0.2918 (0.2941) loss 3.9257 (3.4653) grad_norm 1.5395 (1.6743) [2022-10-02 10:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][800/1251] eta 0:02:12 lr 0.000404 time 0.2878 (0.2936) loss 4.1822 (3.4684) grad_norm 1.8426 (1.6768) [2022-10-02 10:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][900/1251] eta 0:01:42 lr 0.000404 time 0.2899 (0.2932) loss 4.0586 (3.4621) grad_norm 1.8019 (1.6783) [2022-10-02 10:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1000/1251] eta 0:01:13 lr 0.000403 time 0.2879 (0.2929) loss 4.3032 (3.4680) grad_norm 1.7687 (1.6803) [2022-10-02 10:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1100/1251] eta 0:00:44 lr 0.000403 time 0.2897 (0.2926) loss 3.9409 (3.4692) grad_norm 1.7026 (1.6835) [2022-10-02 10:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1200/1251] eta 0:00:14 lr 0.000402 time 0.2920 (0.2923) loss 2.6447 (3.4673) grad_norm 1.5721 (1.6802) [2022-10-02 10:43:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 169 training takes 0:06:05 [2022-10-02 10:43:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.752 (2.752) Loss 0.9210 (0.9210) Acc@1 76.855 (76.855) Acc@5 93.848 (93.848) [2022-10-02 10:44:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.770 Acc@5 93.792 [2022-10-02 10:44:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-02 10:44:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.84% [2022-10-02 10:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][0/1251] eta 1:06:19 lr 0.000402 time 3.1813 (3.1813) loss 2.5971 (2.5971) grad_norm 1.7283 (1.7283) [2022-10-02 10:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][100/1251] eta 0:06:05 lr 0.000402 time 0.2886 (0.3172) loss 4.1605 (3.4675) grad_norm 1.4146 (1.7108) [2022-10-02 10:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][200/1251] eta 0:05:18 lr 0.000401 time 0.2905 (0.3028) loss 3.0028 (3.4846) grad_norm 1.5615 (1.6790) [2022-10-02 10:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][300/1251] eta 0:04:43 lr 0.000401 time 0.2869 (0.2980) loss 3.7025 (3.4751) grad_norm 1.6440 (1.6923) [2022-10-02 10:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][400/1251] eta 0:04:11 lr 0.000400 time 0.2907 (0.2956) loss 4.2679 (3.4654) grad_norm 1.5829 (1.6877) [2022-10-02 10:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][500/1251] eta 0:03:40 lr 0.000400 time 0.2885 (0.2941) loss 3.7386 (3.4649) grad_norm 1.9487 (1.6789) [2022-10-02 10:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][600/1251] eta 0:03:10 lr 0.000400 time 0.2890 (0.2931) loss 3.5165 (3.4634) grad_norm 1.4168 (1.6703) [2022-10-02 10:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][700/1251] eta 0:02:41 lr 0.000399 time 0.2892 (0.2924) loss 3.8405 (3.4644) grad_norm 1.4979 (1.6756) [2022-10-02 10:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][800/1251] eta 0:02:11 lr 0.000399 time 0.2903 (0.2919) loss 3.8666 (3.4675) grad_norm 1.5089 (1.6762) [2022-10-02 10:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][900/1251] eta 0:01:42 lr 0.000398 time 0.2928 (0.2915) loss 2.7589 (3.4769) grad_norm 1.5910 (1.6771) [2022-10-02 10:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1000/1251] eta 0:01:13 lr 0.000398 time 0.2922 (0.2911) loss 3.8474 (3.4734) grad_norm 1.6841 (1.6814) [2022-10-02 10:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1100/1251] eta 0:00:43 lr 0.000398 time 0.2879 (0.2908) loss 3.9640 (3.4652) grad_norm 2.0190 (1.6806) [2022-10-02 10:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1200/1251] eta 0:00:14 lr 0.000397 time 0.2884 (0.2905) loss 3.7893 (3.4671) grad_norm 1.5360 (1.6788) [2022-10-02 10:50:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 170 training takes 0:06:03 [2022-10-02 10:50:04 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saving...... [2022-10-02 10:50:05 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saved !!! [2022-10-02 10:50:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.606 (2.606) Loss 1.0250 (1.0250) Acc@1 75.684 (75.684) Acc@5 93.164 (93.164) [2022-10-02 10:50:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.950 Acc@5 93.862 [2022-10-02 10:50:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-02 10:50:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 76.95% [2022-10-02 10:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][0/1251] eta 0:47:48 lr 0.000397 time 2.2932 (2.2932) loss 4.1699 (4.1699) grad_norm 1.6091 (1.6091) [2022-10-02 10:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][100/1251] eta 0:06:00 lr 0.000397 time 0.2881 (0.3135) loss 2.4874 (3.3775) grad_norm 1.6369 (1.6862) [2022-10-02 10:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][200/1251] eta 0:05:16 lr 0.000396 time 0.2873 (0.3013) loss 3.2285 (3.4291) grad_norm 1.7671 (1.6808) [2022-10-02 10:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][300/1251] eta 0:04:42 lr 0.000396 time 0.2868 (0.2971) loss 3.6020 (3.4114) grad_norm 1.8857 (1.6741) [2022-10-02 10:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][400/1251] eta 0:04:10 lr 0.000395 time 0.2847 (0.2949) loss 3.8567 (3.4309) grad_norm 1.8428 (1.6795) [2022-10-02 10:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][500/1251] eta 0:03:40 lr 0.000395 time 0.2872 (0.2936) loss 3.5358 (3.4292) grad_norm 1.8227 (1.6783) [2022-10-02 10:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][600/1251] eta 0:03:10 lr 0.000395 time 0.2868 (0.2928) loss 2.3516 (3.4455) grad_norm 1.7045 (1.6902) [2022-10-02 10:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][700/1251] eta 0:02:40 lr 0.000394 time 0.2851 (0.2922) loss 3.7803 (3.4554) grad_norm 1.5885 (1.6880) [2022-10-02 10:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][800/1251] eta 0:02:11 lr 0.000394 time 0.2846 (0.2916) loss 3.8723 (3.4547) grad_norm 1.5669 (1.6955) [2022-10-02 10:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][900/1251] eta 0:01:42 lr 0.000393 time 0.2871 (0.2912) loss 3.8989 (3.4589) grad_norm 1.7012 (1.6944) [2022-10-02 10:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1000/1251] eta 0:01:13 lr 0.000393 time 0.2866 (0.2909) loss 3.0685 (3.4533) grad_norm 1.4430 (1.6941) [2022-10-02 10:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1100/1251] eta 0:00:43 lr 0.000393 time 0.2859 (0.2907) loss 2.8365 (3.4515) grad_norm 1.7747 (1.6974) [2022-10-02 10:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1200/1251] eta 0:00:14 lr 0.000392 time 0.2863 (0.2905) loss 4.2323 (3.4550) grad_norm 1.4802 (1.6971) [2022-10-02 10:56:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 171 training takes 0:06:03 [2022-10-02 10:56:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.630 (2.630) Loss 0.9551 (0.9551) Acc@1 76.270 (76.270) Acc@5 93.848 (93.848) [2022-10-02 10:56:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.018 Acc@5 93.780 [2022-10-02 10:56:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-02 10:56:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.02% [2022-10-02 10:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][0/1251] eta 1:10:55 lr 0.000392 time 3.4015 (3.4015) loss 2.5342 (2.5342) grad_norm 1.6393 (1.6393) [2022-10-02 10:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][100/1251] eta 0:06:09 lr 0.000392 time 0.2896 (0.3213) loss 3.4387 (3.3729) grad_norm 1.5313 (1.6992) [2022-10-02 10:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][200/1251] eta 0:05:21 lr 0.000391 time 0.2909 (0.3056) loss 3.1700 (3.3688) grad_norm 1.6925 (1.7143) [2022-10-02 10:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][300/1251] eta 0:04:45 lr 0.000391 time 0.2892 (0.3006) loss 4.0251 (3.3775) grad_norm 1.5753 (1.7097) [2022-10-02 10:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][400/1251] eta 0:04:13 lr 0.000390 time 0.2862 (0.2980) loss 3.9442 (3.3785) grad_norm 1.4235 (1.7024) [2022-10-02 10:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][500/1251] eta 0:03:42 lr 0.000390 time 0.2909 (0.2963) loss 3.5300 (3.3950) grad_norm 1.5558 (1.6992) [2022-10-02 10:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][600/1251] eta 0:03:12 lr 0.000390 time 0.2888 (0.2953) loss 3.5907 (3.4022) grad_norm 1.8281 (1.7036) [2022-10-02 11:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][700/1251] eta 0:02:42 lr 0.000389 time 0.2913 (0.2945) loss 4.1702 (3.4052) grad_norm 1.9313 (1.6961) [2022-10-02 11:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][800/1251] eta 0:02:12 lr 0.000389 time 0.2894 (0.2940) loss 3.9655 (3.4133) grad_norm 1.6007 (1.6977) [2022-10-02 11:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][900/1251] eta 0:01:43 lr 0.000388 time 0.2927 (0.2935) loss 3.8685 (3.4149) grad_norm 1.5896 (1.6984) [2022-10-02 11:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1000/1251] eta 0:01:13 lr 0.000388 time 0.2874 (0.2931) loss 3.5846 (3.4214) grad_norm 1.6497 (1.6990) [2022-10-02 11:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1100/1251] eta 0:00:44 lr 0.000388 time 0.2897 (0.2928) loss 3.7233 (3.4226) grad_norm 1.5247 (1.7000) [2022-10-02 11:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1200/1251] eta 0:00:14 lr 0.000387 time 0.2915 (0.2926) loss 4.2588 (3.4212) grad_norm 1.8578 (1.7005) [2022-10-02 11:02:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 172 training takes 0:06:06 [2022-10-02 11:02:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.088 (2.088) Loss 0.8942 (0.8942) Acc@1 79.297 (79.297) Acc@5 93.164 (93.164) [2022-10-02 11:02:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.752 Acc@5 93.844 [2022-10-02 11:02:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-10-02 11:02:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.02% [2022-10-02 11:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][0/1251] eta 0:59:39 lr 0.000387 time 2.8617 (2.8617) loss 4.0024 (4.0024) grad_norm 1.6550 (1.6550) [2022-10-02 11:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][100/1251] eta 0:06:02 lr 0.000387 time 0.2907 (0.3145) loss 3.0013 (3.4299) grad_norm 1.5108 (1.7056) [2022-10-02 11:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][200/1251] eta 0:05:16 lr 0.000386 time 0.2918 (0.3014) loss 2.6090 (3.4687) grad_norm 1.7700 (1.7090) [2022-10-02 11:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][300/1251] eta 0:04:42 lr 0.000386 time 0.2861 (0.2971) loss 3.5677 (3.4716) grad_norm 1.8498 (1.7056) [2022-10-02 11:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][400/1251] eta 0:04:10 lr 0.000385 time 0.2925 (0.2949) loss 3.7531 (3.4572) grad_norm 2.3314 (1.7142) [2022-10-02 11:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][500/1251] eta 0:03:40 lr 0.000385 time 0.2863 (0.2935) loss 3.5096 (3.4609) grad_norm 2.0380 (1.7174) [2022-10-02 11:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][600/1251] eta 0:03:10 lr 0.000385 time 0.2921 (0.2927) loss 3.3900 (3.4591) grad_norm 1.5970 (1.7179) [2022-10-02 11:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][700/1251] eta 0:02:40 lr 0.000384 time 0.2886 (0.2920) loss 2.7647 (3.4688) grad_norm 1.7957 (1.7246) [2022-10-02 11:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][800/1251] eta 0:02:11 lr 0.000384 time 0.2933 (0.2914) loss 3.6504 (3.4620) grad_norm 1.5930 (1.7189) [2022-10-02 11:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][900/1251] eta 0:01:42 lr 0.000383 time 0.2860 (0.2909) loss 2.4049 (3.4684) grad_norm 1.4420 (1.7194) [2022-10-02 11:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1000/1251] eta 0:01:12 lr 0.000383 time 0.2896 (0.2905) loss 3.8850 (3.4746) grad_norm 1.6399 (1.7216) [2022-10-02 11:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1100/1251] eta 0:00:43 lr 0.000383 time 0.2906 (0.2902) loss 2.8403 (3.4688) grad_norm 1.9187 (1.7171) [2022-10-02 11:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1200/1251] eta 0:00:14 lr 0.000382 time 0.2900 (0.2900) loss 3.4089 (3.4644) grad_norm 1.8461 (1.7146) [2022-10-02 11:08:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 173 training takes 0:06:02 [2022-10-02 11:08:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.779 (2.779) Loss 0.9754 (0.9754) Acc@1 76.562 (76.562) Acc@5 94.043 (94.043) [2022-10-02 11:09:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.172 Acc@5 93.968 [2022-10-02 11:09:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-10-02 11:09:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-02 11:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][0/1251] eta 0:59:36 lr 0.000382 time 2.8588 (2.8588) loss 3.0141 (3.0141) grad_norm 1.4845 (1.4845) [2022-10-02 11:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][100/1251] eta 0:06:03 lr 0.000381 time 0.2834 (0.3161) loss 3.4357 (3.4193) grad_norm 1.6624 (1.6554) [2022-10-02 11:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][200/1251] eta 0:05:18 lr 0.000381 time 0.2879 (0.3029) loss 3.9868 (3.4514) grad_norm 1.5754 (1.6805) [2022-10-02 11:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][300/1251] eta 0:04:43 lr 0.000381 time 0.2880 (0.2985) loss 2.5565 (3.4411) grad_norm 1.7702 (1.7071) [2022-10-02 11:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][400/1251] eta 0:04:12 lr 0.000380 time 0.2868 (0.2963) loss 3.8369 (3.4539) grad_norm 1.5964 (1.7031) [2022-10-02 11:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][500/1251] eta 0:03:41 lr 0.000380 time 0.2864 (0.2950) loss 3.3143 (3.4617) grad_norm 1.5312 (1.7092) [2022-10-02 11:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][600/1251] eta 0:03:11 lr 0.000379 time 0.2924 (0.2942) loss 3.3064 (3.4529) grad_norm 1.6228 (1.7080) [2022-10-02 11:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][700/1251] eta 0:02:41 lr 0.000379 time 0.2879 (0.2935) loss 2.3079 (3.4420) grad_norm 1.6583 (1.7085) [2022-10-02 11:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][800/1251] eta 0:02:12 lr 0.000379 time 0.2882 (0.2932) loss 2.8847 (3.4405) grad_norm 1.7466 (1.7112) [2022-10-02 11:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][900/1251] eta 0:01:42 lr 0.000378 time 0.2894 (0.2929) loss 3.9359 (3.4474) grad_norm 3.0804 (1.7144) [2022-10-02 11:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1000/1251] eta 0:01:13 lr 0.000378 time 0.2889 (0.2927) loss 3.4684 (3.4456) grad_norm 1.5669 (1.7129) [2022-10-02 11:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1100/1251] eta 0:00:44 lr 0.000377 time 0.2906 (0.2925) loss 2.6333 (3.4464) grad_norm 1.7286 (1.7132) [2022-10-02 11:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1200/1251] eta 0:00:14 lr 0.000377 time 0.2885 (0.2923) loss 3.3849 (3.4497) grad_norm 1.8113 (1.7167) [2022-10-02 11:15:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 174 training takes 0:06:06 [2022-10-02 11:15:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.087 (3.087) Loss 0.9820 (0.9820) Acc@1 75.098 (75.098) Acc@5 94.531 (94.531) [2022-10-02 11:15:27 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 76.998 Acc@5 93.878 [2022-10-02 11:15:27 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-02 11:15:27 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-02 11:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][0/1251] eta 0:45:43 lr 0.000377 time 2.1931 (2.1931) loss 4.1132 (4.1132) grad_norm 1.5497 (1.5497) [2022-10-02 11:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][100/1251] eta 0:06:02 lr 0.000376 time 0.2878 (0.3152) loss 3.7605 (3.5088) grad_norm 1.6116 (1.7194) [2022-10-02 11:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][200/1251] eta 0:05:17 lr 0.000376 time 0.2909 (0.3023) loss 3.8291 (3.4546) grad_norm 1.8504 (1.7107) [2022-10-02 11:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][300/1251] eta 0:04:43 lr 0.000376 time 0.2885 (0.2979) loss 3.2845 (3.4452) grad_norm 2.0285 (1.7095) [2022-10-02 11:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][400/1251] eta 0:04:11 lr 0.000375 time 0.2882 (0.2957) loss 2.7234 (3.4334) grad_norm 1.6972 (1.6981) [2022-10-02 11:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][500/1251] eta 0:03:41 lr 0.000375 time 0.2892 (0.2943) loss 3.6119 (3.4414) grad_norm 1.7777 (1.7113) [2022-10-02 11:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][600/1251] eta 0:03:11 lr 0.000374 time 0.2879 (0.2935) loss 3.2527 (3.4502) grad_norm 1.6145 (1.7157) [2022-10-02 11:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][700/1251] eta 0:02:41 lr 0.000374 time 0.2902 (0.2928) loss 3.3795 (3.4361) grad_norm 1.7901 (1.7177) [2022-10-02 11:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][800/1251] eta 0:02:11 lr 0.000374 time 0.2867 (0.2922) loss 3.5216 (3.4315) grad_norm 1.6496 (1.7185) [2022-10-02 11:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][900/1251] eta 0:01:42 lr 0.000373 time 0.2885 (0.2918) loss 3.9721 (3.4343) grad_norm 1.8434 (1.7221) [2022-10-02 11:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1000/1251] eta 0:01:13 lr 0.000373 time 0.2863 (0.2914) loss 3.7920 (3.4282) grad_norm 1.6454 (1.7216) [2022-10-02 11:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1100/1251] eta 0:00:43 lr 0.000372 time 0.2882 (0.2911) loss 3.6075 (3.4303) grad_norm 1.8334 (1.7183) [2022-10-02 11:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1200/1251] eta 0:00:14 lr 0.000372 time 0.2847 (0.2909) loss 3.6837 (3.4266) grad_norm 1.5543 (1.7199) [2022-10-02 11:21:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 175 training takes 0:06:04 [2022-10-02 11:21:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.227 (3.227) Loss 1.0383 (1.0383) Acc@1 76.270 (76.270) Acc@5 93.555 (93.555) [2022-10-02 11:21:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.026 Acc@5 93.866 [2022-10-02 11:21:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-10-02 11:21:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.17% [2022-10-02 11:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][0/1251] eta 0:51:46 lr 0.000372 time 2.4830 (2.4830) loss 3.3295 (3.3295) grad_norm 1.7229 (1.7229) [2022-10-02 11:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][100/1251] eta 0:06:03 lr 0.000371 time 0.2947 (0.3157) loss 4.0356 (3.4690) grad_norm 2.0795 (1.7389) [2022-10-02 11:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][200/1251] eta 0:05:18 lr 0.000371 time 0.2893 (0.3029) loss 3.7610 (3.4114) grad_norm 1.7012 (1.7122) [2022-10-02 11:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][300/1251] eta 0:04:44 lr 0.000371 time 0.2938 (0.2986) loss 3.5345 (3.4196) grad_norm 1.8331 (1.7097) [2022-10-02 11:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][400/1251] eta 0:04:12 lr 0.000370 time 0.2931 (0.2966) loss 3.1420 (3.4468) grad_norm 1.8404 (1.7352) [2022-10-02 11:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][500/1251] eta 0:03:41 lr 0.000370 time 0.2918 (0.2952) loss 3.2364 (3.4626) grad_norm 2.0079 (1.7341) [2022-10-02 11:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][600/1251] eta 0:03:11 lr 0.000369 time 0.2877 (0.2943) loss 3.0875 (3.4542) grad_norm 1.8664 (1.7401) [2022-10-02 11:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][700/1251] eta 0:02:41 lr 0.000369 time 0.2938 (0.2935) loss 2.4272 (3.4450) grad_norm 1.9504 (1.7397) [2022-10-02 11:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][800/1251] eta 0:02:12 lr 0.000369 time 0.2865 (0.2928) loss 2.3803 (3.4422) grad_norm 2.3155 (1.7399) [2022-10-02 11:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][900/1251] eta 0:01:42 lr 0.000368 time 0.2890 (0.2923) loss 3.2852 (3.4329) grad_norm 1.6805 (1.7466) [2022-10-02 11:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1000/1251] eta 0:01:13 lr 0.000368 time 0.2859 (0.2918) loss 3.7776 (3.4350) grad_norm 1.8786 (1.7464) [2022-10-02 11:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1100/1251] eta 0:00:44 lr 0.000368 time 0.2890 (0.2915) loss 3.3043 (3.4352) grad_norm 1.6226 (1.7506) [2022-10-02 11:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1200/1251] eta 0:00:14 lr 0.000367 time 0.2854 (0.2912) loss 2.7985 (3.4314) grad_norm 2.1018 (1.7491) [2022-10-02 11:27:48 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 176 training takes 0:06:04 [2022-10-02 11:27:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.752 (2.752) Loss 1.0228 (1.0228) Acc@1 77.734 (77.734) Acc@5 93.262 (93.262) [2022-10-02 11:28:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.272 Acc@5 93.906 [2022-10-02 11:28:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-02 11:28:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.27% [2022-10-02 11:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][0/1251] eta 1:08:42 lr 0.000367 time 3.2950 (3.2950) loss 4.2768 (4.2768) grad_norm 1.7058 (1.7058) [2022-10-02 11:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][100/1251] eta 0:06:08 lr 0.000367 time 0.2901 (0.3203) loss 3.2624 (3.3770) grad_norm 1.8218 (1.7757) [2022-10-02 11:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][200/1251] eta 0:05:21 lr 0.000366 time 0.2877 (0.3054) loss 3.5512 (3.3990) grad_norm 1.6230 (1.7569) [2022-10-02 11:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][300/1251] eta 0:04:45 lr 0.000366 time 0.2911 (0.3005) loss 3.7342 (3.4419) grad_norm 1.4173 (1.7577) [2022-10-02 11:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][400/1251] eta 0:04:13 lr 0.000365 time 0.2907 (0.2981) loss 3.8831 (3.3951) grad_norm 1.7107 (1.7709) [2022-10-02 11:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][500/1251] eta 0:03:42 lr 0.000365 time 0.2903 (0.2965) loss 4.0371 (3.3943) grad_norm 1.6557 (1.7718) [2022-10-02 11:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][600/1251] eta 0:03:12 lr 0.000365 time 0.2873 (0.2954) loss 4.2564 (3.4229) grad_norm 1.7684 (1.7698) [2022-10-02 11:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][700/1251] eta 0:02:42 lr 0.000364 time 0.2936 (0.2948) loss 2.3252 (3.4288) grad_norm 1.7977 (1.7769) [2022-10-02 11:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][800/1251] eta 0:02:12 lr 0.000364 time 0.2922 (0.2942) loss 3.6286 (3.4263) grad_norm 1.6162 (1.7735) [2022-10-02 11:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][900/1251] eta 0:01:43 lr 0.000363 time 0.2984 (0.2938) loss 4.1291 (3.4229) grad_norm 1.7389 (1.7679) [2022-10-02 11:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1000/1251] eta 0:01:13 lr 0.000363 time 0.2885 (0.2934) loss 4.3444 (3.4257) grad_norm 1.5392 (1.7669) [2022-10-02 11:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1100/1251] eta 0:00:44 lr 0.000363 time 0.2879 (0.2931) loss 2.7487 (3.4251) grad_norm 1.5479 (1.7659) [2022-10-02 11:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1200/1251] eta 0:00:14 lr 0.000362 time 0.2909 (0.2929) loss 3.5355 (3.4191) grad_norm 1.5864 (1.7649) [2022-10-02 11:34:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 177 training takes 0:06:06 [2022-10-02 11:34:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.102 (3.102) Loss 0.9518 (0.9518) Acc@1 77.148 (77.148) Acc@5 93.652 (93.652) [2022-10-02 11:34:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.306 Acc@5 93.942 [2022-10-02 11:34:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-10-02 11:34:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.31% [2022-10-02 11:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][0/1251] eta 0:49:25 lr 0.000362 time 2.3703 (2.3703) loss 2.5590 (2.5590) grad_norm 1.8146 (1.8146) [2022-10-02 11:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][100/1251] eta 0:06:05 lr 0.000362 time 0.2846 (0.3171) loss 4.2573 (3.4369) grad_norm 1.7399 (1.7482) [2022-10-02 11:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][200/1251] eta 0:05:18 lr 0.000361 time 0.2871 (0.3031) loss 3.6417 (3.4567) grad_norm 1.6798 (1.7525) [2022-10-02 11:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][300/1251] eta 0:04:43 lr 0.000361 time 0.2892 (0.2982) loss 3.8036 (3.4415) grad_norm 1.7801 (1.7653) [2022-10-02 11:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][400/1251] eta 0:04:11 lr 0.000360 time 0.2861 (0.2957) loss 3.0669 (3.4348) grad_norm 1.7052 (1.7691) [2022-10-02 11:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][500/1251] eta 0:03:40 lr 0.000360 time 0.2882 (0.2942) loss 2.4203 (3.4314) grad_norm 1.6359 (1.7660) [2022-10-02 11:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][600/1251] eta 0:03:10 lr 0.000360 time 0.2900 (0.2932) loss 3.6488 (3.4237) grad_norm 1.8092 (1.7660) [2022-10-02 11:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][700/1251] eta 0:02:41 lr 0.000359 time 0.2876 (0.2925) loss 3.0312 (3.4268) grad_norm 1.9567 (1.7599) [2022-10-02 11:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][800/1251] eta 0:02:11 lr 0.000359 time 0.2887 (0.2919) loss 3.1078 (3.4364) grad_norm 1.6826 (1.7565) [2022-10-02 11:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][900/1251] eta 0:01:42 lr 0.000358 time 0.2874 (0.2916) loss 3.6819 (3.4431) grad_norm 1.5296 (1.7602) [2022-10-02 11:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1000/1251] eta 0:01:13 lr 0.000358 time 0.2881 (0.2912) loss 2.6251 (3.4327) grad_norm 1.4657 (1.7628) [2022-10-02 11:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1100/1251] eta 0:00:43 lr 0.000358 time 0.2836 (0.2909) loss 3.8601 (3.4367) grad_norm 1.5906 (1.7637) [2022-10-02 11:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1200/1251] eta 0:00:14 lr 0.000357 time 0.2863 (0.2907) loss 3.6422 (3.4336) grad_norm 1.8869 (1.7640) [2022-10-02 11:40:24 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 178 training takes 0:06:03 [2022-10-02 11:40:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.119 (3.119) Loss 0.9197 (0.9197) Acc@1 76.270 (76.270) Acc@5 95.117 (95.117) [2022-10-02 11:40:37 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.224 Acc@5 94.026 [2022-10-02 11:40:37 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-10-02 11:40:37 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.31% [2022-10-02 11:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][0/1251] eta 1:07:34 lr 0.000357 time 3.2412 (3.2412) loss 3.6301 (3.6301) grad_norm 1.6455 (1.6455) [2022-10-02 11:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][100/1251] eta 0:06:08 lr 0.000357 time 0.2863 (0.3197) loss 3.1675 (3.3983) grad_norm 1.7014 (1.7534) [2022-10-02 11:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][200/1251] eta 0:05:20 lr 0.000356 time 0.2872 (0.3051) loss 2.6960 (3.4198) grad_norm 2.0395 (1.7608) [2022-10-02 11:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][300/1251] eta 0:04:45 lr 0.000356 time 0.2871 (0.3002) loss 2.6943 (3.4068) grad_norm 1.5495 (1.7636) [2022-10-02 11:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][400/1251] eta 0:04:13 lr 0.000355 time 0.2883 (0.2979) loss 2.9581 (3.4036) grad_norm 1.7302 (1.7681) [2022-10-02 11:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][500/1251] eta 0:03:42 lr 0.000355 time 0.2863 (0.2964) loss 3.7200 (3.4025) grad_norm 1.8544 (1.7652) [2022-10-02 11:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][600/1251] eta 0:03:12 lr 0.000355 time 0.2858 (0.2954) loss 3.4943 (3.4115) grad_norm 1.5307 (1.7704) [2022-10-02 11:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][700/1251] eta 0:02:42 lr 0.000354 time 0.2873 (0.2947) loss 3.3324 (3.4145) grad_norm 1.6798 (1.7684) [2022-10-02 11:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][800/1251] eta 0:02:12 lr 0.000354 time 0.2853 (0.2942) loss 3.7805 (3.4107) grad_norm 1.6631 (1.7678) [2022-10-02 11:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][900/1251] eta 0:01:43 lr 0.000353 time 0.2864 (0.2937) loss 3.9859 (3.4036) grad_norm 1.9751 (1.7711) [2022-10-02 11:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1000/1251] eta 0:01:13 lr 0.000353 time 0.2865 (0.2933) loss 3.6150 (3.4036) grad_norm 1.6870 (1.7697) [2022-10-02 11:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1100/1251] eta 0:00:44 lr 0.000353 time 0.2898 (0.2929) loss 4.0194 (3.4008) grad_norm 1.8098 (1.7690) [2022-10-02 11:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1200/1251] eta 0:00:14 lr 0.000352 time 0.2850 (0.2926) loss 3.7256 (3.4002) grad_norm 1.9339 (1.7664) [2022-10-02 11:46:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 179 training takes 0:06:06 [2022-10-02 11:46:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.162 (3.162) Loss 1.0272 (1.0272) Acc@1 75.684 (75.684) Acc@5 93.652 (93.652) [2022-10-02 11:46:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.368 Acc@5 93.974 [2022-10-02 11:46:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-02 11:46:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.37% [2022-10-02 11:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][0/1251] eta 1:10:41 lr 0.000352 time 3.3907 (3.3907) loss 3.1898 (3.1898) grad_norm 1.5872 (1.5872) [2022-10-02 11:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][100/1251] eta 0:06:08 lr 0.000352 time 0.2892 (0.3205) loss 2.9409 (3.3780) grad_norm 2.2923 (1.7756) [2022-10-02 11:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][200/1251] eta 0:05:20 lr 0.000351 time 0.2851 (0.3048) loss 3.2093 (3.3759) grad_norm 1.8599 (1.7570) [2022-10-02 11:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][300/1251] eta 0:04:44 lr 0.000351 time 0.2917 (0.2995) loss 3.6698 (3.3735) grad_norm 1.8171 (1.7622) [2022-10-02 11:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][400/1251] eta 0:04:12 lr 0.000350 time 0.2906 (0.2970) loss 3.8246 (3.3870) grad_norm 1.7018 (1.7630) [2022-10-02 11:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][500/1251] eta 0:03:41 lr 0.000350 time 0.2895 (0.2955) loss 2.4184 (3.3852) grad_norm 2.2262 (1.7685) [2022-10-02 11:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][600/1251] eta 0:03:11 lr 0.000350 time 0.2915 (0.2948) loss 2.9652 (3.3918) grad_norm 1.5993 (1.7774) [2022-10-02 11:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][700/1251] eta 0:02:42 lr 0.000349 time 0.2916 (0.2941) loss 3.9176 (3.3992) grad_norm 1.8583 (1.7699) [2022-10-02 11:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][800/1251] eta 0:02:12 lr 0.000349 time 0.2879 (0.2936) loss 3.2936 (3.4045) grad_norm 1.8516 (1.7730) [2022-10-02 11:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][900/1251] eta 0:01:42 lr 0.000348 time 0.2900 (0.2932) loss 3.9789 (3.4100) grad_norm 1.6183 (1.7701) [2022-10-02 11:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1000/1251] eta 0:01:13 lr 0.000348 time 0.2892 (0.2928) loss 3.3540 (3.4046) grad_norm 1.9164 (1.7666) [2022-10-02 11:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1100/1251] eta 0:00:44 lr 0.000348 time 0.2915 (0.2925) loss 3.6841 (3.4087) grad_norm 1.6595 (1.7698) [2022-10-02 11:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1200/1251] eta 0:00:14 lr 0.000347 time 0.2873 (0.2923) loss 4.2517 (3.4084) grad_norm 1.5109 (1.7705) [2022-10-02 11:53:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 180 training takes 0:06:05 [2022-10-02 11:53:01 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saving...... [2022-10-02 11:53:02 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saved !!! [2022-10-02 11:53:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.126 (2.126) Loss 0.8201 (0.8201) Acc@1 80.859 (80.859) Acc@5 95.215 (95.215) [2022-10-02 11:53:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.360 Acc@5 94.000 [2022-10-02 11:53:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-02 11:53:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.37% [2022-10-02 11:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][0/1251] eta 0:52:44 lr 0.000347 time 2.5294 (2.5294) loss 3.7496 (3.7496) grad_norm 1.9133 (1.9133) [2022-10-02 11:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][100/1251] eta 0:06:02 lr 0.000347 time 0.2944 (0.3153) loss 2.7916 (3.5079) grad_norm 1.8597 (1.7606) [2022-10-02 11:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][200/1251] eta 0:05:18 lr 0.000346 time 0.2903 (0.3031) loss 3.9928 (3.4875) grad_norm 2.2902 (1.7765) [2022-10-02 11:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][300/1251] eta 0:04:44 lr 0.000346 time 0.2935 (0.2991) loss 2.3763 (3.4621) grad_norm 1.9823 (1.7803) [2022-10-02 11:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][400/1251] eta 0:04:12 lr 0.000346 time 0.2862 (0.2969) loss 3.0902 (3.4297) grad_norm 1.8262 (1.7819) [2022-10-02 11:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][500/1251] eta 0:03:42 lr 0.000345 time 0.2906 (0.2956) loss 3.4552 (3.4436) grad_norm 1.6564 (1.7837) [2022-10-02 11:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][600/1251] eta 0:03:11 lr 0.000345 time 0.2867 (0.2946) loss 3.6710 (3.4394) grad_norm 2.0893 (1.8007) [2022-10-02 11:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][700/1251] eta 0:02:42 lr 0.000344 time 0.2894 (0.2940) loss 3.7977 (3.4378) grad_norm 2.1941 (1.8018) [2022-10-02 11:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][800/1251] eta 0:02:12 lr 0.000344 time 0.2885 (0.2936) loss 3.6368 (3.4270) grad_norm 1.7869 (1.7973) [2022-10-02 11:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][900/1251] eta 0:01:42 lr 0.000344 time 0.2858 (0.2932) loss 3.6466 (3.4215) grad_norm 1.8024 (1.7976) [2022-10-02 11:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1000/1251] eta 0:01:13 lr 0.000343 time 0.2890 (0.2928) loss 2.9583 (3.4153) grad_norm 1.8999 (1.7987) [2022-10-02 11:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1100/1251] eta 0:00:44 lr 0.000343 time 0.2893 (0.2925) loss 2.9351 (3.4181) grad_norm 1.5915 (1.7948) [2022-10-02 11:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1200/1251] eta 0:00:14 lr 0.000342 time 0.2891 (0.2923) loss 3.0260 (3.4180) grad_norm 1.7458 (1.7950) [2022-10-02 11:59:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 181 training takes 0:06:05 [2022-10-02 11:59:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.139 (3.139) Loss 0.9530 (0.9530) Acc@1 77.051 (77.051) Acc@5 94.531 (94.531) [2022-10-02 11:59:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.494 Acc@5 94.132 [2022-10-02 11:59:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-10-02 11:59:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.49% [2022-10-02 11:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][0/1251] eta 0:45:36 lr 0.000342 time 2.1873 (2.1873) loss 2.6525 (2.6525) grad_norm 2.0400 (2.0400) [2022-10-02 12:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][100/1251] eta 0:06:01 lr 0.000342 time 0.2901 (0.3141) loss 3.5932 (3.3583) grad_norm 1.8836 (1.8013) [2022-10-02 12:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][200/1251] eta 0:05:17 lr 0.000341 time 0.2909 (0.3021) loss 3.6703 (3.3744) grad_norm 1.6020 (1.7904) [2022-10-02 12:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][300/1251] eta 0:04:43 lr 0.000341 time 0.2933 (0.2979) loss 4.1091 (3.3984) grad_norm 1.6602 (1.8050) [2022-10-02 12:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][400/1251] eta 0:04:11 lr 0.000341 time 0.2908 (0.2957) loss 3.8878 (3.4376) grad_norm 2.1677 (1.8070) [2022-10-02 12:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][500/1251] eta 0:03:41 lr 0.000340 time 0.2886 (0.2943) loss 3.6327 (3.4310) grad_norm 1.6773 (1.8042) [2022-10-02 12:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][600/1251] eta 0:03:11 lr 0.000340 time 0.2919 (0.2934) loss 3.3841 (3.4306) grad_norm 1.8814 (1.8012) [2022-10-02 12:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][700/1251] eta 0:02:41 lr 0.000339 time 0.2864 (0.2927) loss 3.5454 (3.4292) grad_norm 1.7266 (1.7952) [2022-10-02 12:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][800/1251] eta 0:02:11 lr 0.000339 time 0.2892 (0.2922) loss 2.5647 (3.4105) grad_norm 1.8161 (1.7995) [2022-10-02 12:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][900/1251] eta 0:01:42 lr 0.000339 time 0.2862 (0.2918) loss 3.5506 (3.4087) grad_norm 1.6160 (1.7967) [2022-10-02 12:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1000/1251] eta 0:01:13 lr 0.000338 time 0.2888 (0.2915) loss 3.5878 (3.4142) grad_norm 1.5272 (1.7979) [2022-10-02 12:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1100/1251] eta 0:00:43 lr 0.000338 time 0.2852 (0.2911) loss 3.4104 (3.4225) grad_norm 2.1343 (1.8014) [2022-10-02 12:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1200/1251] eta 0:00:14 lr 0.000338 time 0.2914 (0.2909) loss 3.4125 (3.4176) grad_norm 1.8740 (1.7962) [2022-10-02 12:05:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 182 training takes 0:06:04 [2022-10-02 12:05:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.057 (3.057) Loss 1.1079 (1.1079) Acc@1 73.242 (73.242) Acc@5 92.676 (92.676) [2022-10-02 12:05:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.576 Acc@5 93.996 [2022-10-02 12:05:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-02 12:05:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.58% [2022-10-02 12:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][0/1251] eta 1:05:26 lr 0.000337 time 3.1383 (3.1383) loss 3.1826 (3.1826) grad_norm 1.6305 (1.6305) [2022-10-02 12:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][100/1251] eta 0:06:06 lr 0.000337 time 0.2890 (0.3181) loss 3.0832 (3.3543) grad_norm 1.7899 (1.7726) [2022-10-02 12:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][200/1251] eta 0:05:19 lr 0.000337 time 0.2913 (0.3036) loss 3.7621 (3.4480) grad_norm 2.6393 (1.7763) [2022-10-02 12:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][300/1251] eta 0:04:44 lr 0.000336 time 0.2870 (0.2987) loss 3.2585 (3.4416) grad_norm 1.4983 (1.7990) [2022-10-02 12:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][400/1251] eta 0:04:12 lr 0.000336 time 0.2868 (0.2963) loss 3.8714 (3.4260) grad_norm 1.8035 (1.8094) [2022-10-02 12:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][500/1251] eta 0:03:41 lr 0.000335 time 0.2892 (0.2952) loss 3.6615 (3.4036) grad_norm 1.7954 (1.8090) [2022-10-02 12:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][600/1251] eta 0:03:11 lr 0.000335 time 0.2954 (0.2943) loss 3.7206 (3.4087) grad_norm 1.7546 (1.8048) [2022-10-02 12:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][700/1251] eta 0:02:41 lr 0.000335 time 0.2897 (0.2936) loss 3.5878 (3.4169) grad_norm 1.5783 (1.8018) [2022-10-02 12:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][800/1251] eta 0:02:12 lr 0.000334 time 0.2921 (0.2930) loss 3.7705 (3.4125) grad_norm 2.0151 (1.7981) [2022-10-02 12:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][900/1251] eta 0:01:42 lr 0.000334 time 0.2908 (0.2925) loss 3.3184 (3.4107) grad_norm 1.5206 (1.7994) [2022-10-02 12:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1000/1251] eta 0:01:13 lr 0.000333 time 0.2873 (0.2921) loss 2.4161 (3.4122) grad_norm 1.5998 (1.7996) [2022-10-02 12:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1100/1251] eta 0:00:44 lr 0.000333 time 0.2908 (0.2917) loss 3.4198 (3.4128) grad_norm 1.7516 (1.8037) [2022-10-02 12:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1200/1251] eta 0:00:14 lr 0.000333 time 0.2875 (0.2914) loss 3.6762 (3.4034) grad_norm 1.8663 (1.8053) [2022-10-02 12:11:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 183 training takes 0:06:04 [2022-10-02 12:11:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.849 (2.849) Loss 0.9282 (0.9282) Acc@1 77.246 (77.246) Acc@5 94.922 (94.922) [2022-10-02 12:12:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.434 Acc@5 94.122 [2022-10-02 12:12:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-10-02 12:12:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.58% [2022-10-02 12:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][0/1251] eta 1:05:36 lr 0.000332 time 3.1463 (3.1463) loss 2.6413 (2.6413) grad_norm 1.6735 (1.6735) [2022-10-02 12:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][100/1251] eta 0:06:08 lr 0.000332 time 0.2918 (0.3206) loss 2.1434 (3.3581) grad_norm 1.5829 (1.7625) [2022-10-02 12:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][200/1251] eta 0:05:21 lr 0.000332 time 0.2908 (0.3057) loss 2.9770 (3.3805) grad_norm 1.7423 (1.8152) [2022-10-02 12:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][300/1251] eta 0:04:45 lr 0.000331 time 0.2879 (0.3005) loss 3.1326 (3.4090) grad_norm 1.8652 (1.8270) [2022-10-02 12:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][400/1251] eta 0:04:13 lr 0.000331 time 0.2874 (0.2979) loss 3.6605 (3.4166) grad_norm 1.9046 (1.8170) [2022-10-02 12:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][500/1251] eta 0:03:42 lr 0.000331 time 0.2902 (0.2964) loss 3.3866 (3.4185) grad_norm 1.7242 (1.8193) [2022-10-02 12:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][600/1251] eta 0:03:12 lr 0.000330 time 0.2898 (0.2953) loss 3.5371 (3.4324) grad_norm 2.0063 (1.8220) [2022-10-02 12:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][700/1251] eta 0:02:42 lr 0.000330 time 0.2889 (0.2945) loss 3.8774 (3.4310) grad_norm 1.9795 (1.8206) [2022-10-02 12:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][800/1251] eta 0:02:12 lr 0.000329 time 0.2893 (0.2938) loss 3.6562 (3.4271) grad_norm 1.8288 (1.8146) [2022-10-02 12:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][900/1251] eta 0:01:42 lr 0.000329 time 0.2889 (0.2933) loss 3.8201 (3.4239) grad_norm 1.8521 (1.8172) [2022-10-02 12:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1000/1251] eta 0:01:13 lr 0.000329 time 0.2889 (0.2929) loss 2.4665 (3.4144) grad_norm 1.6117 (1.8159) [2022-10-02 12:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1100/1251] eta 0:00:44 lr 0.000328 time 0.2997 (0.2925) loss 3.1673 (3.4145) grad_norm 1.9292 (1.8154) [2022-10-02 12:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1200/1251] eta 0:00:14 lr 0.000328 time 0.2894 (0.2922) loss 3.4734 (3.4179) grad_norm 1.8141 (1.8174) [2022-10-02 12:18:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 184 training takes 0:06:05 [2022-10-02 12:18:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.972 (2.972) Loss 0.9134 (0.9134) Acc@1 79.102 (79.102) Acc@5 93.750 (93.750) [2022-10-02 12:18:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.614 Acc@5 94.160 [2022-10-02 12:18:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-02 12:18:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.61% [2022-10-02 12:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][0/1251] eta 0:50:51 lr 0.000328 time 2.4392 (2.4392) loss 3.2647 (3.2647) grad_norm 1.7923 (1.7923) [2022-10-02 12:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][100/1251] eta 0:06:06 lr 0.000327 time 0.2874 (0.3182) loss 2.3160 (3.3546) grad_norm 1.8864 (1.8672) [2022-10-02 12:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][200/1251] eta 0:05:18 lr 0.000327 time 0.2886 (0.3034) loss 3.3936 (3.3580) grad_norm 1.7890 (1.8387) [2022-10-02 12:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][300/1251] eta 0:04:43 lr 0.000326 time 0.2845 (0.2984) loss 3.1365 (3.3805) grad_norm 1.9318 (1.8333) [2022-10-02 12:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][400/1251] eta 0:04:11 lr 0.000326 time 0.2875 (0.2958) loss 3.6899 (3.3798) grad_norm 1.6737 (1.8383) [2022-10-02 12:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][500/1251] eta 0:03:41 lr 0.000326 time 0.2860 (0.2943) loss 3.4811 (3.3773) grad_norm 1.6961 (1.8353) [2022-10-02 12:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][600/1251] eta 0:03:10 lr 0.000325 time 0.2874 (0.2932) loss 3.3950 (3.3784) grad_norm 1.6067 (1.8383) [2022-10-02 12:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][700/1251] eta 0:02:41 lr 0.000325 time 0.2857 (0.2927) loss 3.2324 (3.3937) grad_norm 1.7678 (1.8396) [2022-10-02 12:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][800/1251] eta 0:02:12 lr 0.000325 time 0.2871 (0.2928) loss 3.6423 (3.3975) grad_norm 1.8231 (1.8402) [2022-10-02 12:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][900/1251] eta 0:01:42 lr 0.000324 time 0.2867 (0.2922) loss 3.2690 (3.3940) grad_norm 1.5015 (1.8418) [2022-10-02 12:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1000/1251] eta 0:01:13 lr 0.000324 time 0.2844 (0.2917) loss 3.3915 (3.3985) grad_norm 1.7668 (1.8403) [2022-10-02 12:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1100/1251] eta 0:00:44 lr 0.000323 time 0.2865 (0.2915) loss 2.2405 (3.3980) grad_norm 1.9413 (1.8357) [2022-10-02 12:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1200/1251] eta 0:00:14 lr 0.000323 time 0.2883 (0.2912) loss 3.7955 (3.3993) grad_norm 1.6212 (1.8335) [2022-10-02 12:24:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 185 training takes 0:06:04 [2022-10-02 12:24:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.874 (2.874) Loss 0.9475 (0.9475) Acc@1 76.855 (76.855) Acc@5 94.629 (94.629) [2022-10-02 12:24:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.504 Acc@5 94.114 [2022-10-02 12:24:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-10-02 12:24:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.61% [2022-10-02 12:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][0/1251] eta 0:47:34 lr 0.000323 time 2.2820 (2.2820) loss 2.6534 (2.6534) grad_norm 1.6102 (1.6102) [2022-10-02 12:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][100/1251] eta 0:06:04 lr 0.000322 time 0.2881 (0.3168) loss 2.2026 (3.4060) grad_norm 2.0663 (1.8591) [2022-10-02 12:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][200/1251] eta 0:05:18 lr 0.000322 time 0.2870 (0.3030) loss 3.4188 (3.3937) grad_norm 1.5622 (1.8735) [2022-10-02 12:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][300/1251] eta 0:04:43 lr 0.000322 time 0.2909 (0.2983) loss 3.5969 (3.3955) grad_norm 1.7196 (1.8762) [2022-10-02 12:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][400/1251] eta 0:04:11 lr 0.000321 time 0.2884 (0.2959) loss 3.7408 (3.4137) grad_norm 1.7569 (1.8670) [2022-10-02 12:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][500/1251] eta 0:03:41 lr 0.000321 time 0.2938 (0.2944) loss 2.9896 (3.4024) grad_norm 1.7881 (1.8521) [2022-10-02 12:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][600/1251] eta 0:03:11 lr 0.000320 time 0.2890 (0.2935) loss 3.8739 (3.3960) grad_norm 1.6610 (1.8479) [2022-10-02 12:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][700/1251] eta 0:02:41 lr 0.000320 time 0.2914 (0.2927) loss 3.7557 (3.3932) grad_norm 1.6984 (1.8501) [2022-10-02 12:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][800/1251] eta 0:02:11 lr 0.000320 time 0.2894 (0.2922) loss 3.9127 (3.3978) grad_norm 2.0394 (1.8466) [2022-10-02 12:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][900/1251] eta 0:01:42 lr 0.000319 time 0.2908 (0.2917) loss 3.5115 (3.3983) grad_norm 1.8716 (1.8445) [2022-10-02 12:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1000/1251] eta 0:01:13 lr 0.000319 time 0.2868 (0.2913) loss 2.5564 (3.3893) grad_norm 1.8678 (1.8533) [2022-10-02 12:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1100/1251] eta 0:00:43 lr 0.000319 time 0.2894 (0.2910) loss 3.7150 (3.3833) grad_norm 1.6859 (1.8467) [2022-10-02 12:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1200/1251] eta 0:00:14 lr 0.000318 time 0.2956 (0.2907) loss 3.8264 (3.3789) grad_norm 1.7762 (1.8456) [2022-10-02 12:30:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 186 training takes 0:06:03 [2022-10-02 12:30:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.884 (2.884) Loss 0.9188 (0.9188) Acc@1 77.734 (77.734) Acc@5 94.531 (94.531) [2022-10-02 12:31:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.562 Acc@5 94.170 [2022-10-02 12:31:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-02 12:31:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.61% [2022-10-02 12:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][0/1251] eta 1:06:01 lr 0.000318 time 3.1667 (3.1667) loss 3.2094 (3.2094) grad_norm 2.0733 (2.0733) [2022-10-02 12:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][100/1251] eta 0:06:05 lr 0.000318 time 0.2892 (0.3175) loss 3.9483 (3.3912) grad_norm 1.8583 (1.8777) [2022-10-02 12:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][200/1251] eta 0:05:18 lr 0.000317 time 0.2886 (0.3028) loss 3.9759 (3.4001) grad_norm 1.9507 (1.8450) [2022-10-02 12:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][300/1251] eta 0:04:43 lr 0.000317 time 0.2944 (0.2980) loss 3.6188 (3.4108) grad_norm 1.7849 (1.8319) [2022-10-02 12:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][400/1251] eta 0:04:11 lr 0.000316 time 0.2868 (0.2955) loss 4.1232 (3.3899) grad_norm 1.9377 (1.8358) [2022-10-02 12:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][500/1251] eta 0:03:40 lr 0.000316 time 0.2845 (0.2941) loss 2.7082 (3.3952) grad_norm 1.6887 (1.8400) [2022-10-02 12:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][600/1251] eta 0:03:10 lr 0.000316 time 0.2880 (0.2930) loss 4.0285 (3.4091) grad_norm 2.5211 (1.8442) [2022-10-02 12:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][700/1251] eta 0:02:41 lr 0.000315 time 0.2872 (0.2924) loss 2.9466 (3.3927) grad_norm 2.5218 (1.8417) [2022-10-02 12:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][800/1251] eta 0:02:11 lr 0.000315 time 0.2880 (0.2918) loss 2.5493 (3.3796) grad_norm 1.8399 (1.8416) [2022-10-02 12:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][900/1251] eta 0:01:42 lr 0.000315 time 0.2903 (0.2913) loss 2.3593 (3.3680) grad_norm 1.8216 (1.8434) [2022-10-02 12:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1000/1251] eta 0:01:13 lr 0.000314 time 0.2866 (0.2909) loss 3.7696 (3.3788) grad_norm 1.6954 (1.8477) [2022-10-02 12:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1100/1251] eta 0:00:43 lr 0.000314 time 0.2885 (0.2905) loss 3.7852 (3.3793) grad_norm 1.8859 (1.8458) [2022-10-02 12:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1200/1251] eta 0:00:14 lr 0.000313 time 0.2896 (0.2902) loss 3.7035 (3.3833) grad_norm 1.6773 (1.8453) [2022-10-02 12:37:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 187 training takes 0:06:03 [2022-10-02 12:37:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.229 (3.229) Loss 0.8218 (0.8218) Acc@1 82.129 (82.129) Acc@5 95.898 (95.898) [2022-10-02 12:37:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.680 Acc@5 94.162 [2022-10-02 12:37:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-10-02 12:37:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.68% [2022-10-02 12:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][0/1251] eta 1:10:22 lr 0.000313 time 3.3757 (3.3757) loss 2.8618 (2.8618) grad_norm 1.7136 (1.7136) [2022-10-02 12:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][100/1251] eta 0:06:10 lr 0.000313 time 0.2935 (0.3223) loss 3.4261 (3.3571) grad_norm 1.6075 (1.8191) [2022-10-02 12:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][200/1251] eta 0:05:22 lr 0.000312 time 0.2889 (0.3067) loss 3.1591 (3.3948) grad_norm 1.9083 (1.8382) [2022-10-02 12:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][300/1251] eta 0:04:46 lr 0.000312 time 0.2908 (0.3014) loss 3.1473 (3.3876) grad_norm 1.9501 (1.8551) [2022-10-02 12:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][400/1251] eta 0:04:14 lr 0.000312 time 0.2915 (0.2986) loss 4.0757 (3.4051) grad_norm 1.9038 (1.8597) [2022-10-02 12:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][500/1251] eta 0:03:42 lr 0.000311 time 0.2892 (0.2969) loss 3.1224 (3.4005) grad_norm 2.2986 (1.8630) [2022-10-02 12:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][600/1251] eta 0:03:12 lr 0.000311 time 0.2928 (0.2959) loss 3.5545 (3.3814) grad_norm 1.8075 (1.8570) [2022-10-02 12:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][700/1251] eta 0:02:42 lr 0.000311 time 0.2924 (0.2951) loss 2.7884 (3.3921) grad_norm 1.9116 (1.8506) [2022-10-02 12:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][800/1251] eta 0:02:12 lr 0.000310 time 0.2931 (0.2945) loss 4.0085 (3.3950) grad_norm 1.9067 (1.8496) [2022-10-02 12:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][900/1251] eta 0:01:43 lr 0.000310 time 0.2862 (0.2940) loss 3.9688 (3.3941) grad_norm 1.7818 (1.8481) [2022-10-02 12:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1000/1251] eta 0:01:13 lr 0.000309 time 0.2900 (0.2935) loss 2.6493 (3.3870) grad_norm 1.8511 (1.8514) [2022-10-02 12:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1100/1251] eta 0:00:44 lr 0.000309 time 0.2884 (0.2931) loss 2.7220 (3.3789) grad_norm 1.7512 (1.8510) [2022-10-02 12:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1200/1251] eta 0:00:14 lr 0.000309 time 0.2868 (0.2926) loss 3.9380 (3.3806) grad_norm 2.1825 (1.8499) [2022-10-02 12:43:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 188 training takes 0:06:06 [2022-10-02 12:43:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.302 (3.302) Loss 0.9640 (0.9640) Acc@1 76.855 (76.855) Acc@5 93.262 (93.262) [2022-10-02 12:43:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.968 Acc@5 94.234 [2022-10-02 12:43:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-02 12:43:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.97% [2022-10-02 12:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][0/1251] eta 1:07:46 lr 0.000308 time 3.2506 (3.2506) loss 3.3620 (3.3620) grad_norm 1.5604 (1.5604) [2022-10-02 12:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][100/1251] eta 0:06:10 lr 0.000308 time 0.2888 (0.3217) loss 3.5811 (3.3512) grad_norm 1.7355 (1.7963) [2022-10-02 12:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][200/1251] eta 0:05:22 lr 0.000308 time 0.2918 (0.3068) loss 2.9123 (3.3995) grad_norm 1.5666 (1.8253) [2022-10-02 12:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][300/1251] eta 0:04:47 lr 0.000307 time 0.2854 (0.3023) loss 3.9282 (3.3925) grad_norm 1.9501 (1.8389) [2022-10-02 12:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][400/1251] eta 0:04:15 lr 0.000307 time 0.2930 (0.2997) loss 4.1272 (3.3762) grad_norm 1.6311 (1.8418) [2022-10-02 12:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][500/1251] eta 0:03:43 lr 0.000307 time 0.2897 (0.2980) loss 3.7772 (3.3621) grad_norm 1.6615 (1.8430) [2022-10-02 12:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][600/1251] eta 0:03:13 lr 0.000306 time 0.2884 (0.2970) loss 3.7657 (3.3497) grad_norm 1.8448 (1.8410) [2022-10-02 12:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][700/1251] eta 0:02:43 lr 0.000306 time 0.2904 (0.2960) loss 3.7350 (3.3443) grad_norm 2.0556 (1.8491) [2022-10-02 12:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][800/1251] eta 0:02:13 lr 0.000305 time 0.2932 (0.2953) loss 3.5732 (3.3530) grad_norm 1.7205 (1.8467) [2022-10-02 12:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][900/1251] eta 0:01:43 lr 0.000305 time 0.2879 (0.2948) loss 3.5966 (3.3497) grad_norm 2.1140 (1.8521) [2022-10-02 12:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1000/1251] eta 0:01:13 lr 0.000305 time 0.2919 (0.2943) loss 3.4220 (3.3560) grad_norm 1.5881 (1.8565) [2022-10-02 12:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1100/1251] eta 0:00:44 lr 0.000304 time 0.2878 (0.2940) loss 3.4198 (3.3512) grad_norm 2.0741 (1.8555) [2022-10-02 12:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1200/1251] eta 0:00:14 lr 0.000304 time 0.3005 (0.2936) loss 3.6996 (3.3560) grad_norm 1.6698 (1.8533) [2022-10-02 12:49:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 189 training takes 0:06:07 [2022-10-02 12:49:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.774 (2.774) Loss 0.9599 (0.9599) Acc@1 78.223 (78.223) Acc@5 94.141 (94.141) [2022-10-02 12:49:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.800 Acc@5 94.170 [2022-10-02 12:49:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-10-02 12:49:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.97% [2022-10-02 12:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][0/1251] eta 1:07:41 lr 0.000304 time 3.2466 (3.2466) loss 3.3059 (3.3059) grad_norm 2.0313 (2.0313) [2022-10-02 12:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][100/1251] eta 0:06:08 lr 0.000303 time 0.2888 (0.3199) loss 3.6086 (3.3298) grad_norm 1.9855 (1.8917) [2022-10-02 12:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][200/1251] eta 0:05:20 lr 0.000303 time 0.2875 (0.3049) loss 2.5801 (3.3802) grad_norm 1.7223 (1.8579) [2022-10-02 12:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][300/1251] eta 0:04:45 lr 0.000303 time 0.2906 (0.2997) loss 3.5556 (3.3507) grad_norm 1.9460 (1.8647) [2022-10-02 12:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][400/1251] eta 0:04:12 lr 0.000302 time 0.2872 (0.2971) loss 2.9759 (3.3573) grad_norm 1.6029 (1.8649) [2022-10-02 12:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][500/1251] eta 0:03:42 lr 0.000302 time 0.2884 (0.2956) loss 3.9892 (3.3688) grad_norm 2.1737 (1.8666) [2022-10-02 12:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][600/1251] eta 0:03:11 lr 0.000301 time 0.2881 (0.2946) loss 3.5439 (3.3617) grad_norm 2.4215 (1.8759) [2022-10-02 12:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][700/1251] eta 0:02:41 lr 0.000301 time 0.2895 (0.2939) loss 3.7968 (3.3694) grad_norm 2.3044 (1.8725) [2022-10-02 12:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][800/1251] eta 0:02:12 lr 0.000301 time 0.2892 (0.2932) loss 3.5806 (3.3661) grad_norm 1.8934 (1.8708) [2022-10-02 12:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][900/1251] eta 0:01:42 lr 0.000300 time 0.2889 (0.2927) loss 2.4653 (3.3676) grad_norm 1.6826 (1.8722) [2022-10-02 12:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1000/1251] eta 0:01:13 lr 0.000300 time 0.2888 (0.2923) loss 3.5013 (3.3666) grad_norm 1.9234 (1.8680) [2022-10-02 12:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1100/1251] eta 0:00:44 lr 0.000300 time 0.2870 (0.2919) loss 3.7786 (3.3710) grad_norm 1.9156 (1.8674) [2022-10-02 12:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1200/1251] eta 0:00:14 lr 0.000299 time 0.2927 (0.2916) loss 2.1763 (3.3655) grad_norm 1.8737 (1.8661) [2022-10-02 12:56:01 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 190 training takes 0:06:04 [2022-10-02 12:56:01 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saving...... [2022-10-02 12:56:01 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saved !!! [2022-10-02 12:56:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.872 (2.872) Loss 0.9184 (0.9184) Acc@1 77.637 (77.637) Acc@5 94.531 (94.531) [2022-10-02 12:56:14 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.590 Acc@5 94.242 [2022-10-02 12:56:14 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-10-02 12:56:14 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 77.97% [2022-10-02 12:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][0/1251] eta 1:02:25 lr 0.000299 time 2.9942 (2.9942) loss 3.5121 (3.5121) grad_norm 1.7216 (1.7216) [2022-10-02 12:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][100/1251] eta 0:06:06 lr 0.000299 time 0.2892 (0.3186) loss 3.5613 (3.3385) grad_norm 2.2207 (1.8646) [2022-10-02 12:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][200/1251] eta 0:05:20 lr 0.000298 time 0.2908 (0.3048) loss 3.1570 (3.3629) grad_norm 1.9998 (1.8950) [2022-10-02 12:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][300/1251] eta 0:04:45 lr 0.000298 time 0.2911 (0.3002) loss 3.1600 (3.3733) grad_norm 1.6256 (1.8979) [2022-10-02 12:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][400/1251] eta 0:04:13 lr 0.000297 time 0.2911 (0.2981) loss 3.8321 (3.3974) grad_norm 2.0934 (1.8968) [2022-10-02 12:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][500/1251] eta 0:03:42 lr 0.000297 time 0.2895 (0.2968) loss 3.8532 (3.3836) grad_norm 2.0522 (1.8925) [2022-10-02 12:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][600/1251] eta 0:03:12 lr 0.000297 time 0.2908 (0.2959) loss 3.6479 (3.3924) grad_norm 2.0724 (1.8894) [2022-10-02 12:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][700/1251] eta 0:02:42 lr 0.000296 time 0.2895 (0.2950) loss 2.9428 (3.3893) grad_norm 1.6902 (1.8918) [2022-10-02 13:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][800/1251] eta 0:02:12 lr 0.000296 time 0.2861 (0.2944) loss 3.4443 (3.3921) grad_norm 1.9903 (1.8900) [2022-10-02 13:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][900/1251] eta 0:01:43 lr 0.000296 time 0.2893 (0.2938) loss 3.4150 (3.3856) grad_norm 1.7879 (1.8904) [2022-10-02 13:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1000/1251] eta 0:01:13 lr 0.000295 time 0.2868 (0.2934) loss 3.8489 (3.3701) grad_norm 1.9791 (1.8860) [2022-10-02 13:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1100/1251] eta 0:00:44 lr 0.000295 time 0.2897 (0.2930) loss 2.9909 (3.3719) grad_norm 1.9419 (1.8844) [2022-10-02 13:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1200/1251] eta 0:00:14 lr 0.000294 time 0.2878 (0.2927) loss 2.4540 (3.3728) grad_norm 1.6516 (1.8864) [2022-10-02 13:02:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 191 training takes 0:06:06 [2022-10-02 13:02:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.140 (3.140) Loss 0.9538 (0.9538) Acc@1 77.051 (77.051) Acc@5 94.336 (94.336) [2022-10-02 13:02:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.052 Acc@5 94.250 [2022-10-02 13:02:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-02 13:02:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.05% [2022-10-02 13:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][0/1251] eta 1:10:12 lr 0.000294 time 3.3677 (3.3677) loss 2.0615 (2.0615) grad_norm 1.9864 (1.9864) [2022-10-02 13:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][100/1251] eta 0:06:07 lr 0.000294 time 0.2878 (0.3193) loss 4.0312 (3.3879) grad_norm 1.9572 (1.9040) [2022-10-02 13:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][200/1251] eta 0:05:19 lr 0.000293 time 0.2880 (0.3044) loss 3.8293 (3.3504) grad_norm 1.6719 (1.9168) [2022-10-02 13:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][300/1251] eta 0:04:44 lr 0.000293 time 0.2881 (0.2994) loss 2.3087 (3.3433) grad_norm 1.9994 (1.9470) [2022-10-02 13:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][400/1251] eta 0:04:12 lr 0.000293 time 0.2893 (0.2968) loss 3.7488 (3.3513) grad_norm 1.9397 (1.9295) [2022-10-02 13:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][500/1251] eta 0:03:41 lr 0.000292 time 0.2880 (0.2952) loss 3.6942 (3.3485) grad_norm 1.9496 (1.9252) [2022-10-02 13:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][600/1251] eta 0:03:11 lr 0.000292 time 0.2894 (0.2942) loss 3.9996 (3.3509) grad_norm 1.7038 (1.9127) [2022-10-02 13:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][700/1251] eta 0:02:41 lr 0.000292 time 0.2871 (0.2934) loss 2.7543 (3.3503) grad_norm 1.7933 (1.9148) [2022-10-02 13:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][800/1251] eta 0:02:12 lr 0.000291 time 0.2908 (0.2928) loss 2.9267 (3.3555) grad_norm 2.2665 (1.9183) [2022-10-02 13:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][900/1251] eta 0:01:42 lr 0.000291 time 0.2889 (0.2922) loss 4.0795 (3.3723) grad_norm 1.7979 (1.9215) [2022-10-02 13:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1000/1251] eta 0:01:13 lr 0.000290 time 0.2885 (0.2918) loss 3.3182 (3.3708) grad_norm 1.5718 (1.9216) [2022-10-02 13:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1100/1251] eta 0:00:44 lr 0.000290 time 0.2892 (0.2915) loss 3.3919 (3.3705) grad_norm 1.8914 (1.9185) [2022-10-02 13:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1200/1251] eta 0:00:14 lr 0.000290 time 0.2885 (0.2913) loss 2.2449 (3.3717) grad_norm 1.6135 (1.9148) [2022-10-02 13:08:37 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 192 training takes 0:06:04 [2022-10-02 13:08:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.897 (2.897) Loss 0.9215 (0.9215) Acc@1 79.492 (79.492) Acc@5 93.359 (93.359) [2022-10-02 13:08:50 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.000 Acc@5 94.316 [2022-10-02 13:08:50 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-02 13:08:50 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.05% [2022-10-02 13:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][0/1251] eta 0:46:59 lr 0.000290 time 2.2536 (2.2536) loss 3.8952 (3.8952) grad_norm 1.7338 (1.7338) [2022-10-02 13:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][100/1251] eta 0:06:01 lr 0.000289 time 0.2930 (0.3141) loss 3.0872 (3.3928) grad_norm 1.9370 (1.8825) [2022-10-02 13:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][200/1251] eta 0:05:16 lr 0.000289 time 0.2869 (0.3014) loss 2.7377 (3.3774) grad_norm 1.7342 (1.9310) [2022-10-02 13:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][300/1251] eta 0:04:42 lr 0.000288 time 0.2875 (0.2970) loss 2.3237 (3.3614) grad_norm 2.0362 (1.9224) [2022-10-02 13:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][400/1251] eta 0:04:10 lr 0.000288 time 0.2873 (0.2949) loss 3.7157 (3.3355) grad_norm 1.8828 (1.9149) [2022-10-02 13:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][500/1251] eta 0:03:40 lr 0.000288 time 0.2869 (0.2937) loss 3.1670 (3.3291) grad_norm 1.7212 (1.9054) [2022-10-02 13:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][600/1251] eta 0:03:10 lr 0.000287 time 0.2855 (0.2928) loss 3.8405 (3.3396) grad_norm 1.6541 (1.9024) [2022-10-02 13:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][700/1251] eta 0:02:40 lr 0.000287 time 0.2846 (0.2922) loss 3.5561 (3.3429) grad_norm 2.0543 (1.9046) [2022-10-02 13:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][800/1251] eta 0:02:11 lr 0.000287 time 0.2865 (0.2917) loss 3.6540 (3.3470) grad_norm 1.7818 (1.9066) [2022-10-02 13:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][900/1251] eta 0:01:42 lr 0.000286 time 0.2875 (0.2913) loss 3.8263 (3.3429) grad_norm 1.9305 (1.9114) [2022-10-02 13:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1000/1251] eta 0:01:13 lr 0.000286 time 0.2887 (0.2910) loss 3.4786 (3.3429) grad_norm 2.0008 (1.9214) [2022-10-02 13:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1100/1251] eta 0:00:43 lr 0.000285 time 0.2860 (0.2907) loss 2.7639 (3.3399) grad_norm 1.9421 (1.9209) [2022-10-02 13:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1200/1251] eta 0:00:14 lr 0.000285 time 0.2863 (0.2904) loss 2.9690 (3.3337) grad_norm 2.2289 (1.9161) [2022-10-02 13:14:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 193 training takes 0:06:03 [2022-10-02 13:14:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.616 (2.616) Loss 0.9334 (0.9334) Acc@1 79.785 (79.785) Acc@5 94.629 (94.629) [2022-10-02 13:15:06 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.860 Acc@5 94.248 [2022-10-02 13:15:06 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-10-02 13:15:06 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.05% [2022-10-02 13:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][0/1251] eta 1:02:33 lr 0.000285 time 3.0007 (3.0007) loss 3.0642 (3.0642) grad_norm 1.8729 (1.8729) [2022-10-02 13:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][100/1251] eta 0:06:07 lr 0.000285 time 0.2920 (0.3190) loss 3.9026 (3.3949) grad_norm 2.4131 (1.9333) [2022-10-02 13:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][200/1251] eta 0:05:20 lr 0.000284 time 0.2888 (0.3050) loss 3.6274 (3.3469) grad_norm 1.9944 (1.9191) [2022-10-02 13:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][300/1251] eta 0:04:45 lr 0.000284 time 0.2967 (0.3003) loss 3.3803 (3.3424) grad_norm 2.5287 (1.9225) [2022-10-02 13:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][400/1251] eta 0:04:13 lr 0.000283 time 0.2914 (0.2980) loss 3.5493 (3.3385) grad_norm 1.8394 (1.9174) [2022-10-02 13:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][500/1251] eta 0:03:42 lr 0.000283 time 0.2942 (0.2964) loss 3.1971 (3.3366) grad_norm 1.8454 (1.9174) [2022-10-02 13:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][600/1251] eta 0:03:12 lr 0.000283 time 0.2886 (0.2954) loss 3.5823 (3.3495) grad_norm 2.1493 (1.9233) [2022-10-02 13:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][700/1251] eta 0:02:42 lr 0.000282 time 0.2921 (0.2946) loss 3.7246 (3.3360) grad_norm 2.1293 (1.9217) [2022-10-02 13:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][800/1251] eta 0:02:12 lr 0.000282 time 0.2899 (0.2940) loss 3.8296 (3.3458) grad_norm 1.7501 (1.9233) [2022-10-02 13:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][900/1251] eta 0:01:43 lr 0.000282 time 0.2900 (0.2935) loss 3.7851 (3.3343) grad_norm 1.9837 (1.9221) [2022-10-02 13:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1000/1251] eta 0:01:13 lr 0.000281 time 0.2923 (0.2932) loss 2.4122 (3.3410) grad_norm 1.6910 (1.9219) [2022-10-02 13:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1100/1251] eta 0:00:44 lr 0.000281 time 0.2900 (0.2928) loss 3.5807 (3.3344) grad_norm 2.1634 (1.9270) [2022-10-02 13:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1200/1251] eta 0:00:14 lr 0.000280 time 0.2916 (0.2925) loss 2.3325 (3.3362) grad_norm 2.0343 (1.9272) [2022-10-02 13:21:13 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 194 training takes 0:06:06 [2022-10-02 13:21:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.181 (3.181) Loss 0.9332 (0.9332) Acc@1 79.004 (79.004) Acc@5 93.652 (93.652) [2022-10-02 13:21:25 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 77.984 Acc@5 94.284 [2022-10-02 13:21:25 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-10-02 13:21:25 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.05% [2022-10-02 13:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][0/1251] eta 0:46:06 lr 0.000280 time 2.2118 (2.2118) loss 3.5784 (3.5784) grad_norm 1.9982 (1.9982) [2022-10-02 13:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][100/1251] eta 0:06:01 lr 0.000280 time 0.2945 (0.3145) loss 4.3206 (3.3032) grad_norm 2.7327 (1.9397) [2022-10-02 13:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][200/1251] eta 0:05:16 lr 0.000280 time 0.2888 (0.3016) loss 3.3502 (3.3429) grad_norm 1.6309 (1.9354) [2022-10-02 13:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][300/1251] eta 0:04:42 lr 0.000279 time 0.2908 (0.2973) loss 2.5290 (3.3577) grad_norm 1.9847 (1.9288) [2022-10-02 13:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][400/1251] eta 0:04:11 lr 0.000279 time 0.2868 (0.2952) loss 3.6585 (3.3542) grad_norm 2.5306 (1.9198) [2022-10-02 13:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][500/1251] eta 0:03:40 lr 0.000278 time 0.2943 (0.2940) loss 3.2239 (3.3482) grad_norm 1.6965 (1.9208) [2022-10-02 13:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][600/1251] eta 0:03:10 lr 0.000278 time 0.2856 (0.2930) loss 3.7277 (3.3435) grad_norm 1.8275 (1.9175) [2022-10-02 13:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][700/1251] eta 0:02:41 lr 0.000278 time 0.2896 (0.2924) loss 2.4891 (3.3478) grad_norm 1.6074 (1.9225) [2022-10-02 13:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][800/1251] eta 0:02:11 lr 0.000277 time 0.2868 (0.2919) loss 3.0576 (3.3543) grad_norm 1.6892 (1.9213) [2022-10-02 13:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][900/1251] eta 0:01:42 lr 0.000277 time 0.2857 (0.2916) loss 2.7320 (3.3443) grad_norm 1.9285 (1.9195) [2022-10-02 13:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1000/1251] eta 0:01:13 lr 0.000277 time 0.2841 (0.2912) loss 3.8668 (3.3558) grad_norm 1.6489 (1.9248) [2022-10-02 13:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1100/1251] eta 0:00:43 lr 0.000276 time 0.2893 (0.2909) loss 2.8027 (3.3547) grad_norm 1.7701 (1.9251) [2022-10-02 13:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1200/1251] eta 0:00:14 lr 0.000276 time 0.2926 (0.2906) loss 3.3378 (3.3501) grad_norm 1.9356 (1.9257) [2022-10-02 13:27:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 195 training takes 0:06:03 [2022-10-02 13:27:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.348 (3.348) Loss 0.9623 (0.9623) Acc@1 78.027 (78.027) Acc@5 94.043 (94.043) [2022-10-02 13:27:42 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.098 Acc@5 94.322 [2022-10-02 13:27:42 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-02 13:27:42 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.10% [2022-10-02 13:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][0/1251] eta 0:49:59 lr 0.000276 time 2.3979 (2.3979) loss 3.7458 (3.7458) grad_norm 1.6687 (1.6687) [2022-10-02 13:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][100/1251] eta 0:06:03 lr 0.000275 time 0.2890 (0.3157) loss 2.5848 (3.3039) grad_norm 2.1078 (1.9166) [2022-10-02 13:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][200/1251] eta 0:05:17 lr 0.000275 time 0.2887 (0.3020) loss 2.3273 (3.2846) grad_norm 1.8358 (1.9284) [2022-10-02 13:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][300/1251] eta 0:04:42 lr 0.000275 time 0.2883 (0.2975) loss 3.7756 (3.2583) grad_norm 1.9010 (1.9251) [2022-10-02 13:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][400/1251] eta 0:04:11 lr 0.000274 time 0.2867 (0.2952) loss 3.4778 (3.2686) grad_norm 2.0378 (1.9252) [2022-10-02 13:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][500/1251] eta 0:03:40 lr 0.000274 time 0.2873 (0.2938) loss 3.9036 (3.2975) grad_norm 1.7369 (1.9334) [2022-10-02 13:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][600/1251] eta 0:03:10 lr 0.000273 time 0.2895 (0.2928) loss 3.6372 (3.3082) grad_norm 1.8744 (1.9240) [2022-10-02 13:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][700/1251] eta 0:02:40 lr 0.000273 time 0.2888 (0.2921) loss 3.6603 (3.3266) grad_norm 2.1070 (1.9219) [2022-10-02 13:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][800/1251] eta 0:02:11 lr 0.000273 time 0.2863 (0.2915) loss 3.8740 (3.3284) grad_norm 1.7597 (1.9162) [2022-10-02 13:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][900/1251] eta 0:01:42 lr 0.000272 time 0.2879 (0.2911) loss 2.5330 (3.3192) grad_norm 1.8575 (1.9230) [2022-10-02 13:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1000/1251] eta 0:01:12 lr 0.000272 time 0.2858 (0.2908) loss 2.9424 (3.3236) grad_norm 1.5606 (1.9280) [2022-10-02 13:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1100/1251] eta 0:00:43 lr 0.000272 time 0.2897 (0.2905) loss 2.7560 (3.3287) grad_norm 2.0912 (1.9299) [2022-10-02 13:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1200/1251] eta 0:00:14 lr 0.000271 time 0.2871 (0.2902) loss 3.6623 (3.3346) grad_norm 2.0689 (1.9312) [2022-10-02 13:33:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 196 training takes 0:06:03 [2022-10-02 13:33:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.332 (3.332) Loss 0.9071 (0.9071) Acc@1 78.711 (78.711) Acc@5 94.727 (94.727) [2022-10-02 13:33:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.102 Acc@5 94.426 [2022-10-02 13:33:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-10-02 13:33:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.10% [2022-10-02 13:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][0/1251] eta 0:57:12 lr 0.000271 time 2.7440 (2.7440) loss 3.5502 (3.5502) grad_norm 2.0438 (2.0438) [2022-10-02 13:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][100/1251] eta 0:06:01 lr 0.000271 time 0.2885 (0.3145) loss 3.5869 (3.3193) grad_norm 1.7808 (1.9247) [2022-10-02 13:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][200/1251] eta 0:05:17 lr 0.000270 time 0.2877 (0.3017) loss 3.4386 (3.3337) grad_norm 2.2650 (1.9132) [2022-10-02 13:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][300/1251] eta 0:04:42 lr 0.000270 time 0.2853 (0.2975) loss 3.6043 (3.3457) grad_norm 2.1562 (1.9349) [2022-10-02 13:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][400/1251] eta 0:04:11 lr 0.000270 time 0.2888 (0.2952) loss 2.7912 (3.3284) grad_norm 1.6570 (1.9363) [2022-10-02 13:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][500/1251] eta 0:03:40 lr 0.000269 time 0.2931 (0.2939) loss 3.4830 (3.3279) grad_norm 2.2202 (1.9449) [2022-10-02 13:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][600/1251] eta 0:03:10 lr 0.000269 time 0.2904 (0.2931) loss 3.7926 (3.3219) grad_norm 1.8095 (1.9455) [2022-10-02 13:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][700/1251] eta 0:02:41 lr 0.000269 time 0.2873 (0.2924) loss 3.4254 (3.3300) grad_norm 1.8036 (1.9457) [2022-10-02 13:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][800/1251] eta 0:02:11 lr 0.000268 time 0.2928 (0.2920) loss 3.0338 (3.3378) grad_norm 2.0135 (1.9418) [2022-10-02 13:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][900/1251] eta 0:01:42 lr 0.000268 time 0.2890 (0.2916) loss 3.6380 (3.3315) grad_norm 1.8276 (1.9371) [2022-10-02 13:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1000/1251] eta 0:01:13 lr 0.000267 time 0.2878 (0.2912) loss 3.5475 (3.3347) grad_norm 2.0201 (1.9329) [2022-10-02 13:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1100/1251] eta 0:00:43 lr 0.000267 time 0.2877 (0.2909) loss 3.6683 (3.3378) grad_norm 1.9350 (1.9335) [2022-10-02 13:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1200/1251] eta 0:00:14 lr 0.000267 time 0.2864 (0.2908) loss 3.6076 (3.3431) grad_norm 1.8669 (1.9375) [2022-10-02 13:40:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 197 training takes 0:06:04 [2022-10-02 13:40:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.661 (2.661) Loss 0.9271 (0.9271) Acc@1 77.734 (77.734) Acc@5 94.238 (94.238) [2022-10-02 13:40:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.158 Acc@5 94.302 [2022-10-02 13:40:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-02 13:40:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.16% [2022-10-02 13:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][0/1251] eta 1:06:39 lr 0.000267 time 3.1974 (3.1974) loss 3.4669 (3.4669) grad_norm 1.7090 (1.7090) [2022-10-02 13:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][100/1251] eta 0:06:08 lr 0.000266 time 0.2916 (0.3202) loss 3.2200 (3.3622) grad_norm 1.9186 (1.9059) [2022-10-02 13:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][200/1251] eta 0:05:21 lr 0.000266 time 0.2909 (0.3055) loss 3.9557 (3.3348) grad_norm 1.7439 (1.9086) [2022-10-02 13:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][300/1251] eta 0:04:45 lr 0.000265 time 0.2889 (0.3004) loss 2.9861 (3.3145) grad_norm 1.9814 (1.9037) [2022-10-02 13:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][400/1251] eta 0:04:13 lr 0.000265 time 0.2913 (0.2979) loss 3.3801 (3.3211) grad_norm 1.6916 (1.9219) [2022-10-02 13:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][500/1251] eta 0:03:42 lr 0.000265 time 0.2868 (0.2964) loss 3.2675 (3.3187) grad_norm 1.7330 (1.9377) [2022-10-02 13:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][600/1251] eta 0:03:12 lr 0.000264 time 0.2929 (0.2954) loss 2.9247 (3.3318) grad_norm 1.6306 (1.9430) [2022-10-02 13:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][700/1251] eta 0:02:42 lr 0.000264 time 0.2881 (0.2945) loss 3.1402 (3.3237) grad_norm 1.8948 (1.9413) [2022-10-02 13:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][800/1251] eta 0:02:12 lr 0.000264 time 0.2887 (0.2938) loss 2.7096 (3.3246) grad_norm 1.6347 (1.9408) [2022-10-02 13:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][900/1251] eta 0:01:42 lr 0.000263 time 0.2859 (0.2932) loss 2.8342 (3.3199) grad_norm 1.7372 (1.9447) [2022-10-02 13:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1000/1251] eta 0:01:13 lr 0.000263 time 0.2923 (0.2928) loss 2.9195 (3.3174) grad_norm 1.7854 (1.9455) [2022-10-02 13:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1100/1251] eta 0:00:44 lr 0.000263 time 0.2852 (0.2924) loss 3.6824 (3.3165) grad_norm 2.2016 (1.9428) [2022-10-02 13:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1200/1251] eta 0:00:14 lr 0.000262 time 0.2875 (0.2921) loss 3.2584 (3.3237) grad_norm 1.8480 (1.9434) [2022-10-02 13:46:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 198 training takes 0:06:05 [2022-10-02 13:46:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.453 (2.453) Loss 0.9245 (0.9245) Acc@1 77.051 (77.051) Acc@5 94.922 (94.922) [2022-10-02 13:46:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.244 Acc@5 94.396 [2022-10-02 13:46:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-10-02 13:46:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.24% [2022-10-02 13:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][0/1251] eta 0:50:48 lr 0.000262 time 2.4371 (2.4371) loss 2.4110 (2.4110) grad_norm 1.8839 (1.8839) [2022-10-02 13:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][100/1251] eta 0:06:08 lr 0.000262 time 0.2929 (0.3201) loss 3.8762 (3.2608) grad_norm 1.6609 (1.9794) [2022-10-02 13:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][200/1251] eta 0:05:21 lr 0.000261 time 0.2907 (0.3056) loss 2.2503 (3.3046) grad_norm 1.9309 (1.9525) [2022-10-02 13:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][300/1251] eta 0:04:45 lr 0.000261 time 0.2929 (0.3007) loss 2.7741 (3.3159) grad_norm 1.7407 (1.9576) [2022-10-02 13:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][400/1251] eta 0:04:13 lr 0.000261 time 0.2964 (0.2982) loss 3.2132 (3.3091) grad_norm 1.8238 (1.9543) [2022-10-02 13:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][500/1251] eta 0:03:42 lr 0.000260 time 0.2927 (0.2967) loss 2.7551 (3.3031) grad_norm 1.6712 (1.9469) [2022-10-02 13:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][600/1251] eta 0:03:12 lr 0.000260 time 0.2955 (0.2957) loss 3.6132 (3.3039) grad_norm 1.8031 (1.9444) [2022-10-02 13:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][700/1251] eta 0:02:42 lr 0.000259 time 0.2945 (0.2950) loss 3.1601 (3.3089) grad_norm 1.6832 (1.9498) [2022-10-02 13:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][800/1251] eta 0:02:12 lr 0.000259 time 0.2910 (0.2944) loss 3.9911 (3.3151) grad_norm 2.2019 (1.9483) [2022-10-02 13:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][900/1251] eta 0:01:43 lr 0.000259 time 0.2905 (0.2939) loss 2.6781 (3.3249) grad_norm 1.8084 (1.9479) [2022-10-02 13:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1000/1251] eta 0:01:13 lr 0.000258 time 0.2902 (0.2935) loss 3.7002 (3.3235) grad_norm 1.9872 (1.9484) [2022-10-02 13:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1100/1251] eta 0:00:44 lr 0.000258 time 0.2885 (0.2932) loss 3.6119 (3.3166) grad_norm 2.8053 (1.9501) [2022-10-02 13:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1200/1251] eta 0:00:14 lr 0.000258 time 0.2913 (0.2929) loss 3.3298 (3.3052) grad_norm 1.7286 (1.9522) [2022-10-02 13:52:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 199 training takes 0:06:06 [2022-10-02 13:52:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.872 (2.872) Loss 0.9310 (0.9310) Acc@1 78.516 (78.516) Acc@5 94.043 (94.043) [2022-10-02 13:52:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.386 Acc@5 94.380 [2022-10-02 13:52:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-02 13:52:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.39% [2022-10-02 13:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][0/1251] eta 1:01:13 lr 0.000258 time 2.9363 (2.9363) loss 3.1311 (3.1311) grad_norm 2.1606 (2.1606) [2022-10-02 13:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][100/1251] eta 0:06:04 lr 0.000257 time 0.2889 (0.3165) loss 3.6338 (3.2945) grad_norm 1.8191 (1.9620) [2022-10-02 13:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][200/1251] eta 0:05:18 lr 0.000257 time 0.2913 (0.3028) loss 3.0424 (3.2921) grad_norm 1.9781 (1.9541) [2022-10-02 13:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][300/1251] eta 0:04:43 lr 0.000256 time 0.2859 (0.2982) loss 3.2490 (3.2919) grad_norm 2.1559 (1.9475) [2022-10-02 13:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][400/1251] eta 0:04:11 lr 0.000256 time 0.2876 (0.2958) loss 3.2385 (3.2951) grad_norm 2.0358 (1.9617) [2022-10-02 13:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][500/1251] eta 0:03:41 lr 0.000256 time 0.2890 (0.2944) loss 3.5945 (3.2909) grad_norm 1.7585 (1.9657) [2022-10-02 13:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][600/1251] eta 0:03:11 lr 0.000255 time 0.2858 (0.2935) loss 3.4815 (3.2980) grad_norm 1.7506 (1.9728) [2022-10-02 13:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][700/1251] eta 0:02:41 lr 0.000255 time 0.2911 (0.2927) loss 2.9270 (3.3012) grad_norm 1.9424 (1.9718) [2022-10-02 13:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][800/1251] eta 0:02:11 lr 0.000255 time 0.2868 (0.2922) loss 3.5030 (3.3088) grad_norm 2.1965 (1.9773) [2022-10-02 13:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][900/1251] eta 0:01:42 lr 0.000254 time 0.2867 (0.2917) loss 3.6765 (3.3119) grad_norm 1.7842 (1.9774) [2022-10-02 13:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1000/1251] eta 0:01:13 lr 0.000254 time 0.2885 (0.2913) loss 2.7164 (3.3140) grad_norm 1.8295 (1.9812) [2022-10-02 13:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1100/1251] eta 0:00:43 lr 0.000254 time 0.2890 (0.2912) loss 3.8634 (3.3131) grad_norm 2.1023 (1.9853) [2022-10-02 13:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1200/1251] eta 0:00:14 lr 0.000253 time 0.2867 (0.2910) loss 3.7299 (3.3204) grad_norm 2.3359 (1.9851) [2022-10-02 13:58:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 200 training takes 0:06:04 [2022-10-02 13:58:57 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saving...... [2022-10-02 13:58:57 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saved !!! [2022-10-02 13:59:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.970 (2.970) Loss 0.8691 (0.8691) Acc@1 79.883 (79.883) Acc@5 94.824 (94.824) [2022-10-02 13:59:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.354 Acc@5 94.540 [2022-10-02 13:59:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-02 13:59:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.39% [2022-10-02 13:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][0/1251] eta 0:45:56 lr 0.000253 time 2.2036 (2.2036) loss 2.4784 (2.4784) grad_norm 1.8755 (1.8755) [2022-10-02 13:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][100/1251] eta 0:06:01 lr 0.000253 time 0.2860 (0.3137) loss 3.2720 (3.3309) grad_norm 2.0281 (2.0080) [2022-10-02 14:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][200/1251] eta 0:05:15 lr 0.000252 time 0.2874 (0.3003) loss 4.1358 (3.3803) grad_norm 2.1742 (1.9893) [2022-10-02 14:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][300/1251] eta 0:04:41 lr 0.000252 time 0.2863 (0.2958) loss 3.7675 (3.3338) grad_norm 2.0315 (1.9857) [2022-10-02 14:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][400/1251] eta 0:04:09 lr 0.000252 time 0.2854 (0.2936) loss 3.4698 (3.3322) grad_norm 2.1168 (1.9768) [2022-10-02 14:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][500/1251] eta 0:03:39 lr 0.000251 time 0.2860 (0.2924) loss 3.4906 (3.3459) grad_norm 1.8942 (1.9877) [2022-10-02 14:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][600/1251] eta 0:03:09 lr 0.000251 time 0.2859 (0.2915) loss 3.5188 (3.3363) grad_norm 2.8226 (2.0000) [2022-10-02 14:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][700/1251] eta 0:02:40 lr 0.000251 time 0.2867 (0.2909) loss 3.6408 (3.3294) grad_norm 1.8700 (1.9970) [2022-10-02 14:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][800/1251] eta 0:02:10 lr 0.000250 time 0.2848 (0.2904) loss 3.9116 (3.3346) grad_norm 2.0315 (1.9969) [2022-10-02 14:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][900/1251] eta 0:01:41 lr 0.000250 time 0.2879 (0.2900) loss 4.0205 (3.3355) grad_norm 2.0039 (1.9956) [2022-10-02 14:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1000/1251] eta 0:01:12 lr 0.000249 time 0.2859 (0.2897) loss 3.0823 (3.3362) grad_norm 1.8296 (1.9924) [2022-10-02 14:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1100/1251] eta 0:00:43 lr 0.000249 time 0.2884 (0.2895) loss 3.8876 (3.3317) grad_norm 2.0613 (1.9903) [2022-10-02 14:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1200/1251] eta 0:00:14 lr 0.000249 time 0.2863 (0.2893) loss 3.2979 (3.3309) grad_norm 2.2107 (1.9918) [2022-10-02 14:05:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 201 training takes 0:06:02 [2022-10-02 14:05:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.581 (2.581) Loss 0.9347 (0.9347) Acc@1 78.418 (78.418) Acc@5 94.531 (94.531) [2022-10-02 14:05:25 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.426 Acc@5 94.404 [2022-10-02 14:05:25 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-10-02 14:05:25 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.43% [2022-10-02 14:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][0/1251] eta 0:59:16 lr 0.000249 time 2.8426 (2.8426) loss 3.8187 (3.8187) grad_norm 1.8322 (1.8322) [2022-10-02 14:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][100/1251] eta 0:06:06 lr 0.000248 time 0.2910 (0.3180) loss 3.9147 (3.4027) grad_norm 1.8129 (1.9928) [2022-10-02 14:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][200/1251] eta 0:05:20 lr 0.000248 time 0.2885 (0.3046) loss 3.4347 (3.3498) grad_norm 1.8192 (2.0068) [2022-10-02 14:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][300/1251] eta 0:04:45 lr 0.000248 time 0.2954 (0.3001) loss 3.5146 (3.3326) grad_norm 2.0544 (2.0015) [2022-10-02 14:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][400/1251] eta 0:04:13 lr 0.000247 time 0.2876 (0.2979) loss 4.0638 (3.3228) grad_norm 2.0109 (1.9986) [2022-10-02 14:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][500/1251] eta 0:03:42 lr 0.000247 time 0.2892 (0.2964) loss 3.7836 (3.3335) grad_norm 1.9524 (1.9992) [2022-10-02 14:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][600/1251] eta 0:03:12 lr 0.000246 time 0.2903 (0.2954) loss 3.8782 (3.3410) grad_norm 1.9251 (1.9925) [2022-10-02 14:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][700/1251] eta 0:02:42 lr 0.000246 time 0.2940 (0.2945) loss 3.8839 (3.3378) grad_norm 1.9947 (1.9864) [2022-10-02 14:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][800/1251] eta 0:02:12 lr 0.000246 time 0.2906 (0.2937) loss 4.1909 (3.3281) grad_norm 2.0145 (1.9900) [2022-10-02 14:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][900/1251] eta 0:01:42 lr 0.000245 time 0.2886 (0.2931) loss 3.1418 (3.3160) grad_norm 2.1841 (1.9929) [2022-10-02 14:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1000/1251] eta 0:01:13 lr 0.000245 time 0.2897 (0.2927) loss 3.6112 (3.3120) grad_norm 1.7786 (1.9965) [2022-10-02 14:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1100/1251] eta 0:00:44 lr 0.000245 time 0.2941 (0.2924) loss 3.2854 (3.3127) grad_norm 1.9537 (1.9920) [2022-10-02 14:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1200/1251] eta 0:00:14 lr 0.000244 time 0.2889 (0.2920) loss 3.2589 (3.3141) grad_norm 2.0391 (1.9956) [2022-10-02 14:11:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 202 training takes 0:06:05 [2022-10-02 14:11:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.404 (2.404) Loss 0.9022 (0.9022) Acc@1 78.027 (78.027) Acc@5 94.141 (94.141) [2022-10-02 14:11:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.670 Acc@5 94.546 [2022-10-02 14:11:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-10-02 14:11:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.67% [2022-10-02 14:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][0/1251] eta 0:50:11 lr 0.000244 time 2.4076 (2.4076) loss 2.8953 (2.8953) grad_norm 2.0809 (2.0809) [2022-10-02 14:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][100/1251] eta 0:06:01 lr 0.000244 time 0.2946 (0.3142) loss 3.8162 (3.3300) grad_norm 1.8030 (1.9660) [2022-10-02 14:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][200/1251] eta 0:05:16 lr 0.000243 time 0.2854 (0.3007) loss 3.9822 (3.3564) grad_norm 1.7898 (1.9797) [2022-10-02 14:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][300/1251] eta 0:04:41 lr 0.000243 time 0.2878 (0.2961) loss 3.3675 (3.3252) grad_norm 2.0785 (1.9751) [2022-10-02 14:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][400/1251] eta 0:04:10 lr 0.000243 time 0.2882 (0.2939) loss 3.0993 (3.3112) grad_norm 2.1584 (1.9852) [2022-10-02 14:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][500/1251] eta 0:03:39 lr 0.000242 time 0.2876 (0.2925) loss 2.1616 (3.3021) grad_norm 1.9969 (1.9840) [2022-10-02 14:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][600/1251] eta 0:03:09 lr 0.000242 time 0.2839 (0.2916) loss 4.0151 (3.3144) grad_norm 2.2092 (2.0025) [2022-10-02 14:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][700/1251] eta 0:02:40 lr 0.000242 time 0.2867 (0.2909) loss 2.9976 (3.3130) grad_norm 1.9580 (2.0000) [2022-10-02 14:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][800/1251] eta 0:02:10 lr 0.000241 time 0.2857 (0.2903) loss 2.9203 (3.3259) grad_norm 1.8498 (1.9943) [2022-10-02 14:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][900/1251] eta 0:01:41 lr 0.000241 time 0.2862 (0.2899) loss 3.0697 (3.3262) grad_norm 2.5626 (2.0032) [2022-10-02 14:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1000/1251] eta 0:01:12 lr 0.000241 time 0.2908 (0.2898) loss 2.5082 (3.3311) grad_norm 1.9387 (2.0014) [2022-10-02 14:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1100/1251] eta 0:00:43 lr 0.000240 time 0.2864 (0.2896) loss 2.5703 (3.3279) grad_norm 1.9513 (2.0028) [2022-10-02 14:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1200/1251] eta 0:00:14 lr 0.000240 time 0.2833 (0.2894) loss 3.7134 (3.3303) grad_norm 1.8224 (2.0023) [2022-10-02 14:17:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 203 training takes 0:06:02 [2022-10-02 14:17:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.215 (3.215) Loss 0.8825 (0.8825) Acc@1 76.953 (76.953) Acc@5 95.605 (95.605) [2022-10-02 14:17:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.546 Acc@5 94.560 [2022-10-02 14:17:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-02 14:17:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.67% [2022-10-02 14:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][0/1251] eta 0:49:04 lr 0.000240 time 2.3535 (2.3535) loss 2.5798 (2.5798) grad_norm 1.8826 (1.8826) [2022-10-02 14:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][100/1251] eta 0:06:05 lr 0.000239 time 0.2868 (0.3174) loss 3.3014 (3.3648) grad_norm 2.0496 (1.9902) [2022-10-02 14:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][200/1251] eta 0:05:19 lr 0.000239 time 0.2918 (0.3040) loss 3.5656 (3.3100) grad_norm 1.8871 (1.9834) [2022-10-02 14:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][300/1251] eta 0:04:44 lr 0.000239 time 0.2921 (0.2994) loss 2.3292 (3.3246) grad_norm 1.7915 (1.9940) [2022-10-02 14:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][400/1251] eta 0:04:12 lr 0.000238 time 0.2904 (0.2969) loss 3.1846 (3.3447) grad_norm 2.1460 (2.0012) [2022-10-02 14:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][500/1251] eta 0:03:41 lr 0.000238 time 0.2893 (0.2955) loss 3.5882 (3.3373) grad_norm 1.8252 (2.0094) [2022-10-02 14:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][600/1251] eta 0:03:11 lr 0.000238 time 0.2892 (0.2945) loss 2.4325 (3.3512) grad_norm 2.0732 (2.0113) [2022-10-02 14:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][700/1251] eta 0:02:41 lr 0.000237 time 0.2901 (0.2938) loss 3.4881 (3.3459) grad_norm 2.1206 (2.0154) [2022-10-02 14:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][800/1251] eta 0:02:12 lr 0.000237 time 0.2889 (0.2932) loss 3.3481 (3.3480) grad_norm 1.9133 (2.0155) [2022-10-02 14:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][900/1251] eta 0:01:42 lr 0.000237 time 0.2899 (0.2927) loss 2.7246 (3.3377) grad_norm 1.8674 (2.0118) [2022-10-02 14:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1000/1251] eta 0:01:13 lr 0.000236 time 0.2897 (0.2923) loss 2.8587 (3.3306) grad_norm 2.8825 (2.0148) [2022-10-02 14:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1100/1251] eta 0:00:44 lr 0.000236 time 0.2902 (0.2919) loss 2.7825 (3.3378) grad_norm 1.9803 (2.0209) [2022-10-02 14:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1200/1251] eta 0:00:14 lr 0.000236 time 0.2889 (0.2916) loss 2.9501 (3.3377) grad_norm 1.7712 (2.0244) [2022-10-02 14:24:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 204 training takes 0:06:05 [2022-10-02 14:24:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.751 (2.751) Loss 0.9934 (0.9934) Acc@1 78.516 (78.516) Acc@5 93.555 (93.555) [2022-10-02 14:24:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.476 Acc@5 94.526 [2022-10-02 14:24:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-02 14:24:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.67% [2022-10-02 14:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][0/1251] eta 0:48:11 lr 0.000235 time 2.3110 (2.3110) loss 3.4362 (3.4362) grad_norm 2.0203 (2.0203) [2022-10-02 14:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][100/1251] eta 0:06:03 lr 0.000235 time 0.2893 (0.3161) loss 3.2906 (3.4108) grad_norm 1.9814 (2.0253) [2022-10-02 14:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][200/1251] eta 0:05:18 lr 0.000235 time 0.2955 (0.3028) loss 3.9414 (3.3486) grad_norm 2.3878 (2.0378) [2022-10-02 14:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][300/1251] eta 0:04:43 lr 0.000234 time 0.2859 (0.2981) loss 3.5859 (3.3014) grad_norm 2.0943 (2.0206) [2022-10-02 14:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][400/1251] eta 0:04:11 lr 0.000234 time 0.2869 (0.2959) loss 3.5736 (3.2885) grad_norm 1.9663 (2.0237) [2022-10-02 14:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][500/1251] eta 0:03:41 lr 0.000234 time 0.2895 (0.2944) loss 3.9521 (3.3119) grad_norm 1.8761 (2.0251) [2022-10-02 14:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][600/1251] eta 0:03:11 lr 0.000233 time 0.2891 (0.2935) loss 3.2225 (3.3124) grad_norm 1.8978 (2.0263) [2022-10-02 14:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][700/1251] eta 0:02:41 lr 0.000233 time 0.2886 (0.2929) loss 3.2702 (3.2989) grad_norm 2.2595 (2.0246) [2022-10-02 14:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][800/1251] eta 0:02:11 lr 0.000233 time 0.2888 (0.2923) loss 3.5182 (3.2964) grad_norm 1.7974 (2.0348) [2022-10-02 14:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][900/1251] eta 0:01:42 lr 0.000232 time 0.2933 (0.2919) loss 3.6441 (3.2939) grad_norm 2.0379 (2.0306) [2022-10-02 14:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1000/1251] eta 0:01:13 lr 0.000232 time 0.2927 (0.2915) loss 2.5653 (3.2950) grad_norm 2.0412 (2.0310) [2022-10-02 14:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1100/1251] eta 0:00:43 lr 0.000232 time 0.2891 (0.2913) loss 2.1848 (3.2968) grad_norm 2.2837 (2.0327) [2022-10-02 14:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1200/1251] eta 0:00:14 lr 0.000231 time 0.2894 (0.2910) loss 3.7265 (3.2982) grad_norm 2.3265 (2.0310) [2022-10-02 14:30:20 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 205 training takes 0:06:04 [2022-10-02 14:30:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.713 (2.713) Loss 0.8356 (0.8356) Acc@1 81.055 (81.055) Acc@5 95.703 (95.703) [2022-10-02 14:30:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.490 Acc@5 94.572 [2022-10-02 14:30:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-10-02 14:30:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.67% [2022-10-02 14:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][0/1251] eta 0:51:06 lr 0.000231 time 2.4513 (2.4513) loss 3.6743 (3.6743) grad_norm 1.8113 (1.8113) [2022-10-02 14:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][100/1251] eta 0:06:03 lr 0.000231 time 0.2919 (0.3156) loss 2.5104 (3.2889) grad_norm 1.9895 (2.0058) [2022-10-02 14:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][200/1251] eta 0:05:18 lr 0.000230 time 0.2890 (0.3030) loss 3.9172 (3.3129) grad_norm 2.2581 (2.0634) [2022-10-02 14:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][300/1251] eta 0:04:43 lr 0.000230 time 0.2883 (0.2986) loss 2.2425 (3.3171) grad_norm 1.6660 (2.0550) [2022-10-02 14:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][400/1251] eta 0:04:12 lr 0.000230 time 0.2893 (0.2964) loss 3.3535 (3.3026) grad_norm 2.2530 (2.0583) [2022-10-02 14:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][500/1251] eta 0:03:41 lr 0.000229 time 0.2901 (0.2952) loss 1.8571 (3.2747) grad_norm 1.8003 (2.0515) [2022-10-02 14:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][600/1251] eta 0:03:11 lr 0.000229 time 0.2906 (0.2942) loss 3.4333 (3.2764) grad_norm 2.0567 (2.0537) [2022-10-02 14:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][700/1251] eta 0:02:41 lr 0.000229 time 0.2907 (0.2937) loss 3.3179 (3.2848) grad_norm 1.8786 (2.0473) [2022-10-02 14:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][800/1251] eta 0:02:12 lr 0.000228 time 0.2903 (0.2933) loss 3.5028 (3.2752) grad_norm 1.7735 (2.0474) [2022-10-02 14:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][900/1251] eta 0:01:42 lr 0.000228 time 0.2927 (0.2930) loss 3.3108 (3.2871) grad_norm 2.6933 (2.0546) [2022-10-02 14:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1000/1251] eta 0:01:13 lr 0.000228 time 0.2909 (0.2927) loss 3.2948 (3.2850) grad_norm 1.8795 (2.0560) [2022-10-02 14:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1100/1251] eta 0:00:44 lr 0.000227 time 0.2882 (0.2925) loss 3.1646 (3.2952) grad_norm 2.2804 (2.0534) [2022-10-02 14:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1200/1251] eta 0:00:14 lr 0.000227 time 0.2886 (0.2922) loss 3.6514 (3.2977) grad_norm 1.7801 (2.0550) [2022-10-02 14:36:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 206 training takes 0:06:05 [2022-10-02 14:36:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.357 (2.357) Loss 0.9382 (0.9382) Acc@1 78.223 (78.223) Acc@5 93.652 (93.652) [2022-10-02 14:36:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.762 Acc@5 94.610 [2022-10-02 14:36:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 14:36:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.76% [2022-10-02 14:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][0/1251] eta 0:57:31 lr 0.000227 time 2.7592 (2.7592) loss 3.4188 (3.4188) grad_norm 2.0412 (2.0412) [2022-10-02 14:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][100/1251] eta 0:06:04 lr 0.000226 time 0.2866 (0.3163) loss 3.3601 (3.2444) grad_norm 2.2490 (2.0651) [2022-10-02 14:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][200/1251] eta 0:05:18 lr 0.000226 time 0.2883 (0.3026) loss 3.5497 (3.2887) grad_norm 1.8529 (2.0669) [2022-10-02 14:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][300/1251] eta 0:04:43 lr 0.000226 time 0.2908 (0.2982) loss 3.6196 (3.2988) grad_norm 2.3304 (2.0781) [2022-10-02 14:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][400/1251] eta 0:04:11 lr 0.000225 time 0.2888 (0.2960) loss 2.8180 (3.2973) grad_norm 1.8848 (2.0818) [2022-10-02 14:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][500/1251] eta 0:03:41 lr 0.000225 time 0.2881 (0.2946) loss 3.5754 (3.3067) grad_norm 1.9734 (2.0745) [2022-10-02 14:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][600/1251] eta 0:03:11 lr 0.000225 time 0.2882 (0.2936) loss 3.4353 (3.2932) grad_norm 2.3218 (2.0729) [2022-10-02 14:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][700/1251] eta 0:02:41 lr 0.000224 time 0.2879 (0.2928) loss 3.6713 (3.2850) grad_norm 2.6854 (2.0751) [2022-10-02 14:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][800/1251] eta 0:02:11 lr 0.000224 time 0.2870 (0.2923) loss 3.6454 (3.2833) grad_norm 2.1676 (2.0806) [2022-10-02 14:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][900/1251] eta 0:01:42 lr 0.000224 time 0.2849 (0.2918) loss 3.3505 (3.2791) grad_norm 2.0849 (2.0807) [2022-10-02 14:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1000/1251] eta 0:01:13 lr 0.000223 time 0.2880 (0.2914) loss 3.8978 (3.2867) grad_norm 1.8635 (2.0845) [2022-10-02 14:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1100/1251] eta 0:00:43 lr 0.000223 time 0.2864 (0.2911) loss 4.1302 (3.2934) grad_norm 2.0735 (2.0766) [2022-10-02 14:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1200/1251] eta 0:00:14 lr 0.000223 time 0.2886 (0.2908) loss 2.6477 (3.2867) grad_norm 2.2185 (2.0738) [2022-10-02 14:42:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 207 training takes 0:06:04 [2022-10-02 14:42:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.694 (2.694) Loss 0.8861 (0.8861) Acc@1 78.809 (78.809) Acc@5 93.945 (93.945) [2022-10-02 14:43:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.774 Acc@5 94.592 [2022-10-02 14:43:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 14:43:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.77% [2022-10-02 14:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][0/1251] eta 0:48:42 lr 0.000222 time 2.3361 (2.3361) loss 4.1920 (4.1920) grad_norm 2.0020 (2.0020) [2022-10-02 14:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][100/1251] eta 0:06:02 lr 0.000222 time 0.2867 (0.3147) loss 3.2523 (3.2617) grad_norm 1.7930 (2.1078) [2022-10-02 14:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][200/1251] eta 0:05:16 lr 0.000222 time 0.2878 (0.3013) loss 2.4590 (3.2781) grad_norm 1.9720 (2.0823) [2022-10-02 14:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][300/1251] eta 0:04:42 lr 0.000221 time 0.2858 (0.2968) loss 2.5014 (3.2643) grad_norm 2.0011 (2.0782) [2022-10-02 14:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][400/1251] eta 0:04:10 lr 0.000221 time 0.2929 (0.2945) loss 2.9459 (3.2640) grad_norm 2.0386 (2.0822) [2022-10-02 14:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][500/1251] eta 0:03:40 lr 0.000221 time 0.2841 (0.2932) loss 3.7668 (3.2757) grad_norm 1.6866 (2.0722) [2022-10-02 14:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][600/1251] eta 0:03:10 lr 0.000220 time 0.2902 (0.2922) loss 3.5583 (3.2781) grad_norm 1.7658 (2.0773) [2022-10-02 14:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][700/1251] eta 0:02:40 lr 0.000220 time 0.2868 (0.2915) loss 4.0794 (3.2786) grad_norm 1.9413 (2.0757) [2022-10-02 14:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][800/1251] eta 0:02:11 lr 0.000220 time 0.2873 (0.2910) loss 2.6155 (3.2710) grad_norm 1.8462 (2.0740) [2022-10-02 14:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][900/1251] eta 0:01:41 lr 0.000219 time 0.2864 (0.2905) loss 3.9225 (3.2672) grad_norm 2.1054 (2.0744) [2022-10-02 14:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1000/1251] eta 0:01:12 lr 0.000219 time 0.2845 (0.2901) loss 3.7025 (3.2681) grad_norm 1.9444 (2.0752) [2022-10-02 14:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1100/1251] eta 0:00:43 lr 0.000219 time 0.2858 (0.2898) loss 2.1661 (3.2697) grad_norm 2.0200 (2.0760) [2022-10-02 14:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1200/1251] eta 0:00:14 lr 0.000218 time 0.2875 (0.2896) loss 3.2557 (3.2712) grad_norm 2.0767 (2.0764) [2022-10-02 14:49:11 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 208 training takes 0:06:02 [2022-10-02 14:49:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.147 (2.147) Loss 0.8884 (0.8884) Acc@1 78.320 (78.320) Acc@5 95.020 (95.020) [2022-10-02 14:49:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.628 Acc@5 94.602 [2022-10-02 14:49:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-10-02 14:49:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.77% [2022-10-02 14:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][0/1251] eta 1:09:32 lr 0.000218 time 3.3351 (3.3351) loss 3.0668 (3.0668) grad_norm 2.2585 (2.2585) [2022-10-02 14:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][100/1251] eta 0:06:08 lr 0.000218 time 0.2882 (0.3205) loss 2.5913 (3.3410) grad_norm 1.7441 (2.1282) [2022-10-02 14:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][200/1251] eta 0:05:21 lr 0.000218 time 0.2928 (0.3055) loss 2.6329 (3.2942) grad_norm 2.0407 (2.1286) [2022-10-02 14:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][300/1251] eta 0:04:45 lr 0.000217 time 0.2876 (0.3004) loss 3.2655 (3.2792) grad_norm 2.1965 (2.1068) [2022-10-02 14:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][400/1251] eta 0:04:13 lr 0.000217 time 0.2890 (0.2979) loss 3.6996 (3.2721) grad_norm 2.5936 (2.0922) [2022-10-02 14:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][500/1251] eta 0:03:43 lr 0.000217 time 0.2907 (0.2980) loss 3.6643 (3.2755) grad_norm 2.3311 (2.1004) [2022-10-02 14:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][600/1251] eta 0:03:13 lr 0.000216 time 0.2888 (0.2967) loss 2.4165 (3.2611) grad_norm 1.8386 (2.1179) [2022-10-02 14:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][700/1251] eta 0:02:43 lr 0.000216 time 0.2867 (0.2962) loss 2.6812 (3.2663) grad_norm 1.8600 (2.1305) [2022-10-02 14:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][800/1251] eta 0:02:13 lr 0.000216 time 0.2864 (0.2954) loss 3.2345 (3.2715) grad_norm 2.0656 (2.1297) [2022-10-02 14:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][900/1251] eta 0:01:43 lr 0.000215 time 0.2896 (0.2948) loss 4.0937 (3.2710) grad_norm 2.5843 (2.1214) [2022-10-02 14:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1000/1251] eta 0:01:13 lr 0.000215 time 0.2885 (0.2946) loss 3.6969 (3.2818) grad_norm 1.7268 (2.1173) [2022-10-02 14:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1100/1251] eta 0:00:44 lr 0.000215 time 0.2888 (0.2942) loss 3.1109 (3.2839) grad_norm 1.9174 (2.1161) [2022-10-02 14:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1200/1251] eta 0:00:14 lr 0.000214 time 0.2880 (0.2939) loss 3.2119 (3.2841) grad_norm 2.0077 (2.1135) [2022-10-02 14:55:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 209 training takes 0:06:07 [2022-10-02 14:55:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.919 (2.919) Loss 0.9212 (0.9212) Acc@1 77.930 (77.930) Acc@5 94.336 (94.336) [2022-10-02 14:55:44 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.780 Acc@5 94.526 [2022-10-02 14:55:44 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 14:55:44 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.78% [2022-10-02 14:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][0/1251] eta 0:57:38 lr 0.000214 time 2.7642 (2.7642) loss 3.9790 (3.9790) grad_norm 2.0716 (2.0716) [2022-10-02 14:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][100/1251] eta 0:06:08 lr 0.000214 time 0.2931 (0.3203) loss 2.5788 (3.1944) grad_norm 2.0844 (2.1733) [2022-10-02 14:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][200/1251] eta 0:05:22 lr 0.000213 time 0.2905 (0.3065) loss 3.3491 (3.2290) grad_norm 2.4285 (2.1220) [2022-10-02 14:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][300/1251] eta 0:04:46 lr 0.000213 time 0.2942 (0.3011) loss 4.1175 (3.2657) grad_norm 1.7577 (2.1206) [2022-10-02 14:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][400/1251] eta 0:04:13 lr 0.000213 time 0.2910 (0.2983) loss 3.6905 (3.2808) grad_norm 2.0911 (2.1255) [2022-10-02 14:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][500/1251] eta 0:03:42 lr 0.000212 time 0.2960 (0.2968) loss 3.4870 (3.2708) grad_norm 2.0361 (2.1192) [2022-10-02 14:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][600/1251] eta 0:03:12 lr 0.000212 time 0.2908 (0.2956) loss 3.6109 (3.2709) grad_norm 1.9757 (2.1150) [2022-10-02 14:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][700/1251] eta 0:02:42 lr 0.000212 time 0.2897 (0.2948) loss 3.6751 (3.2829) grad_norm 2.2076 (2.1203) [2022-10-02 14:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][800/1251] eta 0:02:12 lr 0.000211 time 0.2926 (0.2941) loss 2.6720 (3.2800) grad_norm 2.1269 (2.1166) [2022-10-02 15:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][900/1251] eta 0:01:43 lr 0.000211 time 0.2911 (0.2936) loss 3.3717 (3.2892) grad_norm 1.9358 (2.1130) [2022-10-02 15:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1000/1251] eta 0:01:13 lr 0.000211 time 0.2938 (0.2931) loss 3.5529 (3.2866) grad_norm 2.3995 (2.1114) [2022-10-02 15:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1100/1251] eta 0:00:44 lr 0.000210 time 0.2926 (0.2927) loss 3.4692 (3.2836) grad_norm 2.7700 (2.1094) [2022-10-02 15:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1200/1251] eta 0:00:14 lr 0.000210 time 0.2875 (0.2923) loss 3.2032 (3.2818) grad_norm 2.0451 (2.1142) [2022-10-02 15:01:50 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 210 training takes 0:06:06 [2022-10-02 15:01:50 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saving...... [2022-10-02 15:01:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saved !!! [2022-10-02 15:01:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.953 (2.953) Loss 0.9319 (0.9319) Acc@1 78.223 (78.223) Acc@5 94.238 (94.238) [2022-10-02 15:02:03 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.808 Acc@5 94.730 [2022-10-02 15:02:03 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 15:02:03 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.81% [2022-10-02 15:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][0/1251] eta 1:08:13 lr 0.000210 time 3.2718 (3.2718) loss 3.0765 (3.0765) grad_norm 1.9634 (1.9634) [2022-10-02 15:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][100/1251] eta 0:06:05 lr 0.000210 time 0.2886 (0.3176) loss 2.3756 (3.1780) grad_norm 2.0520 (2.1147) [2022-10-02 15:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][200/1251] eta 0:05:17 lr 0.000209 time 0.2871 (0.3022) loss 2.7483 (3.1718) grad_norm 1.9308 (2.0877) [2022-10-02 15:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][300/1251] eta 0:04:42 lr 0.000209 time 0.2892 (0.2971) loss 3.9805 (3.2187) grad_norm 1.8716 (2.0820) [2022-10-02 15:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][400/1251] eta 0:04:10 lr 0.000209 time 0.2851 (0.2946) loss 3.2598 (3.2201) grad_norm 2.2328 (2.1059) [2022-10-02 15:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][500/1251] eta 0:03:40 lr 0.000208 time 0.2860 (0.2931) loss 3.9627 (3.2330) grad_norm 2.8212 (2.1161) [2022-10-02 15:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][600/1251] eta 0:03:10 lr 0.000208 time 0.2887 (0.2922) loss 2.9844 (3.2418) grad_norm 2.0021 (2.1140) [2022-10-02 15:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][700/1251] eta 0:02:40 lr 0.000208 time 0.2863 (0.2914) loss 2.9269 (3.2511) grad_norm 2.2036 (2.1188) [2022-10-02 15:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][800/1251] eta 0:02:11 lr 0.000207 time 0.2872 (0.2909) loss 2.8462 (3.2563) grad_norm 1.9114 (2.1215) [2022-10-02 15:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][900/1251] eta 0:01:41 lr 0.000207 time 0.2867 (0.2905) loss 2.5494 (3.2638) grad_norm 2.1161 (2.1202) [2022-10-02 15:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1000/1251] eta 0:01:12 lr 0.000207 time 0.2868 (0.2902) loss 3.3377 (3.2615) grad_norm 1.9189 (2.1209) [2022-10-02 15:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1100/1251] eta 0:00:43 lr 0.000206 time 0.2887 (0.2899) loss 3.5361 (3.2574) grad_norm 1.9046 (2.1217) [2022-10-02 15:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1200/1251] eta 0:00:14 lr 0.000206 time 0.2873 (0.2896) loss 2.1992 (3.2526) grad_norm 1.8650 (2.1209) [2022-10-02 15:08:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 211 training takes 0:06:02 [2022-10-02 15:08:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.213 (3.213) Loss 0.7964 (0.7964) Acc@1 80.762 (80.762) Acc@5 95.508 (95.508) [2022-10-02 15:08:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.918 Acc@5 94.714 [2022-10-02 15:08:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-02 15:08:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-02 15:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][0/1251] eta 1:05:41 lr 0.000206 time 3.1505 (3.1505) loss 2.5475 (2.5475) grad_norm 2.3618 (2.3618) [2022-10-02 15:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][100/1251] eta 0:06:09 lr 0.000205 time 0.3024 (0.3207) loss 2.3826 (3.1963) grad_norm 2.4232 (2.0925) [2022-10-02 15:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][200/1251] eta 0:05:21 lr 0.000205 time 0.2915 (0.3059) loss 3.4379 (3.2709) grad_norm 3.0807 (2.1024) [2022-10-02 15:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][300/1251] eta 0:04:46 lr 0.000205 time 0.2960 (0.3009) loss 2.3885 (3.2324) grad_norm 2.1390 (2.1136) [2022-10-02 15:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][400/1251] eta 0:04:13 lr 0.000204 time 0.2906 (0.2984) loss 3.6380 (3.2249) grad_norm 2.1012 (2.1164) [2022-10-02 15:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][500/1251] eta 0:03:42 lr 0.000204 time 0.2972 (0.2968) loss 3.6389 (3.2242) grad_norm 2.2740 (2.1195) [2022-10-02 15:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][600/1251] eta 0:03:12 lr 0.000204 time 0.2890 (0.2959) loss 2.8806 (3.2324) grad_norm 2.2231 (2.1243) [2022-10-02 15:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][700/1251] eta 0:02:42 lr 0.000203 time 0.2971 (0.2952) loss 3.3261 (3.2387) grad_norm 2.0746 (2.1396) [2022-10-02 15:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][800/1251] eta 0:02:12 lr 0.000203 time 0.2906 (0.2946) loss 3.3021 (3.2408) grad_norm 2.0770 (2.1306) [2022-10-02 15:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][900/1251] eta 0:01:43 lr 0.000203 time 0.2907 (0.2940) loss 3.7727 (3.2423) grad_norm 2.1350 (2.1284) [2022-10-02 15:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1000/1251] eta 0:01:13 lr 0.000202 time 0.2905 (0.2936) loss 3.9178 (3.2417) grad_norm 2.3490 (2.1302) [2022-10-02 15:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1100/1251] eta 0:00:44 lr 0.000202 time 0.2899 (0.2933) loss 2.7233 (3.2414) grad_norm 2.6907 (2.1347) [2022-10-02 15:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1200/1251] eta 0:00:14 lr 0.000202 time 0.2911 (0.2930) loss 3.2148 (3.2427) grad_norm 1.8905 (2.1337) [2022-10-02 15:14:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 212 training takes 0:06:06 [2022-10-02 15:14:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.476 (2.476) Loss 0.9397 (0.9397) Acc@1 78.027 (78.027) Acc@5 94.141 (94.141) [2022-10-02 15:14:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.878 Acc@5 94.728 [2022-10-02 15:14:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.9% [2022-10-02 15:14:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-02 15:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][0/1251] eta 1:08:01 lr 0.000202 time 3.2625 (3.2625) loss 3.8253 (3.8253) grad_norm 2.1886 (2.1886) [2022-10-02 15:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][100/1251] eta 0:06:09 lr 0.000201 time 0.2897 (0.3213) loss 3.7986 (3.2372) grad_norm 1.9575 (2.1361) [2022-10-02 15:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][200/1251] eta 0:05:21 lr 0.000201 time 0.2945 (0.3063) loss 3.7474 (3.2606) grad_norm 2.1419 (2.1908) [2022-10-02 15:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][300/1251] eta 0:04:46 lr 0.000201 time 0.2941 (0.3013) loss 3.6541 (3.2752) grad_norm 2.2121 (2.1875) [2022-10-02 15:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][400/1251] eta 0:04:14 lr 0.000200 time 0.2997 (0.2988) loss 2.5730 (3.2632) grad_norm 2.1224 (2.1641) [2022-10-02 15:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][500/1251] eta 0:03:43 lr 0.000200 time 0.2905 (0.2972) loss 2.4203 (3.2462) grad_norm 1.7640 (2.1638) [2022-10-02 15:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][600/1251] eta 0:03:12 lr 0.000200 time 0.2948 (0.2963) loss 3.5613 (3.2551) grad_norm 2.2208 (2.1567) [2022-10-02 15:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][700/1251] eta 0:02:42 lr 0.000199 time 0.2930 (0.2954) loss 2.2771 (3.2613) grad_norm 2.4042 (2.1632) [2022-10-02 15:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][800/1251] eta 0:02:12 lr 0.000199 time 0.2929 (0.2948) loss 3.9526 (3.2564) grad_norm 2.0253 (2.1620) [2022-10-02 15:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][900/1251] eta 0:01:43 lr 0.000199 time 0.2898 (0.2942) loss 3.9980 (3.2576) grad_norm 2.3644 (2.1607) [2022-10-02 15:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1000/1251] eta 0:01:13 lr 0.000198 time 0.2902 (0.2938) loss 3.0333 (3.2536) grad_norm 2.3552 (2.1587) [2022-10-02 15:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1100/1251] eta 0:00:44 lr 0.000198 time 0.2881 (0.2934) loss 3.5339 (3.2622) grad_norm 2.4042 (2.1577) [2022-10-02 15:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1200/1251] eta 0:00:14 lr 0.000198 time 0.2907 (0.2930) loss 4.1459 (3.2598) grad_norm 2.0383 (2.1547) [2022-10-02 15:20:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 213 training takes 0:06:06 [2022-10-02 15:20:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.287 (3.287) Loss 0.9075 (0.9075) Acc@1 79.102 (79.102) Acc@5 93.750 (93.750) [2022-10-02 15:20:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.836 Acc@5 94.726 [2022-10-02 15:20:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 15:20:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-02 15:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][0/1251] eta 0:50:44 lr 0.000198 time 2.4339 (2.4339) loss 3.3960 (3.3960) grad_norm 1.9063 (1.9063) [2022-10-02 15:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][100/1251] eta 0:06:06 lr 0.000197 time 0.2922 (0.3187) loss 3.8419 (3.2641) grad_norm 1.9373 (2.1526) [2022-10-02 15:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][200/1251] eta 0:05:19 lr 0.000197 time 0.2890 (0.3044) loss 3.3882 (3.2549) grad_norm 2.0882 (2.1404) [2022-10-02 15:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][300/1251] eta 0:04:45 lr 0.000197 time 0.2900 (0.2998) loss 3.3090 (3.2771) grad_norm 2.2143 (2.1380) [2022-10-02 15:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][400/1251] eta 0:04:13 lr 0.000196 time 0.2898 (0.2975) loss 3.7781 (3.2579) grad_norm 2.2926 (2.1536) [2022-10-02 15:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][500/1251] eta 0:03:42 lr 0.000196 time 0.2898 (0.2962) loss 2.9160 (3.2617) grad_norm 2.4073 (2.1596) [2022-10-02 15:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][600/1251] eta 0:03:12 lr 0.000196 time 0.2882 (0.2953) loss 3.3354 (3.2756) grad_norm 1.9176 (2.1700) [2022-10-02 15:24:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][700/1251] eta 0:02:42 lr 0.000195 time 0.2899 (0.2947) loss 4.1574 (3.2662) grad_norm 1.8548 (2.1785) [2022-10-02 15:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][800/1251] eta 0:02:12 lr 0.000195 time 0.2868 (0.2942) loss 3.1260 (3.2702) grad_norm 2.3196 (2.1736) [2022-10-02 15:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][900/1251] eta 0:01:43 lr 0.000195 time 0.2878 (0.2938) loss 3.8641 (3.2736) grad_norm 1.9180 (2.1770) [2022-10-02 15:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1000/1251] eta 0:01:13 lr 0.000194 time 0.2902 (0.2935) loss 3.3552 (3.2747) grad_norm 2.2490 (2.1726) [2022-10-02 15:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1100/1251] eta 0:00:44 lr 0.000194 time 0.2911 (0.2933) loss 2.4325 (3.2716) grad_norm 2.2046 (2.1665) [2022-10-02 15:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1200/1251] eta 0:00:14 lr 0.000194 time 0.2898 (0.2930) loss 3.5262 (3.2702) grad_norm 2.4703 (2.1630) [2022-10-02 15:27:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 214 training takes 0:06:06 [2022-10-02 15:27:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.173 (3.173) Loss 0.9044 (0.9044) Acc@1 78.809 (78.809) Acc@5 95.117 (95.117) [2022-10-02 15:27:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 78.808 Acc@5 94.704 [2022-10-02 15:27:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-10-02 15:27:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 78.92% [2022-10-02 15:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][0/1251] eta 1:05:04 lr 0.000193 time 3.1212 (3.1212) loss 2.5043 (2.5043) grad_norm 2.0422 (2.0422) [2022-10-02 15:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][100/1251] eta 0:06:07 lr 0.000193 time 0.2936 (0.3189) loss 3.7426 (3.2816) grad_norm 2.4004 (2.1898) [2022-10-02 15:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][200/1251] eta 0:05:20 lr 0.000193 time 0.2897 (0.3047) loss 3.8207 (3.2629) grad_norm 1.9877 (2.1539) [2022-10-02 15:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][300/1251] eta 0:04:45 lr 0.000193 time 0.2905 (0.2999) loss 3.6612 (3.2585) grad_norm 2.3260 (2.1889) [2022-10-02 15:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][400/1251] eta 0:04:13 lr 0.000192 time 0.2874 (0.2974) loss 3.3059 (3.2679) grad_norm 1.9156 (2.1803) [2022-10-02 15:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][500/1251] eta 0:03:42 lr 0.000192 time 0.2902 (0.2960) loss 3.3919 (3.2709) grad_norm 2.0097 (2.1872) [2022-10-02 15:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][600/1251] eta 0:03:12 lr 0.000192 time 0.2933 (0.2950) loss 2.3656 (3.2668) grad_norm 10.5052 (2.1981) [2022-10-02 15:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][700/1251] eta 0:02:42 lr 0.000191 time 0.2876 (0.2941) loss 2.3470 (3.2646) grad_norm 2.0290 (2.2029) [2022-10-02 15:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][800/1251] eta 0:02:12 lr 0.000191 time 0.2851 (0.2935) loss 3.4690 (3.2644) grad_norm 1.8020 (2.1944) [2022-10-02 15:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][900/1251] eta 0:01:42 lr 0.000191 time 0.2894 (0.2930) loss 3.4344 (3.2693) grad_norm 2.1529 (2.1918) [2022-10-02 15:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1000/1251] eta 0:01:13 lr 0.000190 time 0.2891 (0.2927) loss 3.5036 (3.2650) grad_norm 2.6296 (2.1865) [2022-10-02 15:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1100/1251] eta 0:00:44 lr 0.000190 time 0.2851 (0.2923) loss 2.1894 (3.2670) grad_norm 2.0431 (2.1860) [2022-10-02 15:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1200/1251] eta 0:00:14 lr 0.000190 time 0.2939 (0.2921) loss 3.6799 (3.2624) grad_norm 1.7531 (2.1903) [2022-10-02 15:33:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 215 training takes 0:06:05 [2022-10-02 15:33:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.887 (2.887) Loss 0.9645 (0.9645) Acc@1 76.660 (76.660) Acc@5 94.043 (94.043) [2022-10-02 15:33:36 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.112 Acc@5 94.826 [2022-10-02 15:33:36 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-02 15:33:36 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.11% [2022-10-02 15:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][0/1251] eta 0:53:40 lr 0.000189 time 2.5743 (2.5743) loss 3.2286 (3.2286) grad_norm 2.1133 (2.1133) [2022-10-02 15:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][100/1251] eta 0:06:07 lr 0.000189 time 0.2874 (0.3190) loss 3.7583 (3.2448) grad_norm 2.2157 (2.1309) [2022-10-02 15:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][200/1251] eta 0:05:20 lr 0.000189 time 0.2968 (0.3052) loss 3.3142 (3.2178) grad_norm 2.6304 (2.1341) [2022-10-02 15:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][300/1251] eta 0:04:45 lr 0.000189 time 0.2888 (0.3005) loss 3.4995 (3.2082) grad_norm 2.0433 (2.1459) [2022-10-02 15:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][400/1251] eta 0:04:13 lr 0.000188 time 0.2891 (0.2981) loss 3.4824 (3.2221) grad_norm 1.7437 (2.1543) [2022-10-02 15:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][500/1251] eta 0:03:42 lr 0.000188 time 0.2907 (0.2966) loss 3.0448 (3.2246) grad_norm 2.0601 (2.1636) [2022-10-02 15:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][600/1251] eta 0:03:12 lr 0.000188 time 0.2908 (0.2956) loss 3.2878 (3.2048) grad_norm 1.8858 (2.1675) [2022-10-02 15:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][700/1251] eta 0:02:42 lr 0.000187 time 0.2907 (0.2949) loss 3.5409 (3.2040) grad_norm 2.4330 (2.1693) [2022-10-02 15:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][800/1251] eta 0:02:12 lr 0.000187 time 0.2868 (0.2943) loss 2.7850 (3.2030) grad_norm 2.0927 (2.1808) [2022-10-02 15:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][900/1251] eta 0:01:43 lr 0.000187 time 0.2903 (0.2938) loss 3.2066 (3.2098) grad_norm 2.5804 (2.1788) [2022-10-02 15:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1000/1251] eta 0:01:13 lr 0.000186 time 0.2907 (0.2934) loss 2.9193 (3.2125) grad_norm 1.9959 (2.1861) [2022-10-02 15:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1100/1251] eta 0:00:44 lr 0.000186 time 0.2898 (0.2930) loss 3.3058 (3.2277) grad_norm 2.4404 (2.1877) [2022-10-02 15:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1200/1251] eta 0:00:14 lr 0.000186 time 0.2898 (0.2926) loss 3.5324 (3.2287) grad_norm 2.0208 (2.1921) [2022-10-02 15:39:42 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 216 training takes 0:06:06 [2022-10-02 15:39:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.555 (2.555) Loss 0.8327 (0.8327) Acc@1 79.297 (79.297) Acc@5 95.215 (95.215) [2022-10-02 15:39:55 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.028 Acc@5 94.716 [2022-10-02 15:39:55 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-10-02 15:39:55 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.11% [2022-10-02 15:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][0/1251] eta 1:08:04 lr 0.000185 time 3.2653 (3.2653) loss 3.1178 (3.1178) grad_norm 2.3416 (2.3416) [2022-10-02 15:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][100/1251] eta 0:06:14 lr 0.000185 time 0.2875 (0.3251) loss 3.2876 (3.2076) grad_norm 2.2862 (2.1503) [2022-10-02 15:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][200/1251] eta 0:05:26 lr 0.000185 time 0.2946 (0.3103) loss 3.5262 (3.2286) grad_norm 2.5410 (2.1724) [2022-10-02 15:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][300/1251] eta 0:04:50 lr 0.000185 time 0.2903 (0.3052) loss 3.2044 (3.2139) grad_norm 1.9228 (2.1854) [2022-10-02 15:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][400/1251] eta 0:04:17 lr 0.000184 time 0.2888 (0.3023) loss 3.5092 (3.2262) grad_norm 1.8994 (2.1870) [2022-10-02 15:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][500/1251] eta 0:03:45 lr 0.000184 time 0.2876 (0.3007) loss 2.9153 (3.2341) grad_norm 2.1981 (2.1899) [2022-10-02 15:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][600/1251] eta 0:03:14 lr 0.000184 time 0.2910 (0.2995) loss 3.0600 (3.2452) grad_norm 2.4659 (2.2000) [2022-10-02 15:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][700/1251] eta 0:02:44 lr 0.000183 time 0.2928 (0.2987) loss 3.3962 (3.2449) grad_norm 2.0942 (2.2068) [2022-10-02 15:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][800/1251] eta 0:02:14 lr 0.000183 time 0.2899 (0.2980) loss 2.6314 (3.2466) grad_norm 2.2922 (2.2024) [2022-10-02 15:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][900/1251] eta 0:01:44 lr 0.000183 time 0.2873 (0.2975) loss 3.4129 (3.2378) grad_norm 2.2247 (2.1984) [2022-10-02 15:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1000/1251] eta 0:01:14 lr 0.000182 time 0.2906 (0.2970) loss 3.6378 (3.2365) grad_norm 2.1348 (2.1966) [2022-10-02 15:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1100/1251] eta 0:00:44 lr 0.000182 time 0.2886 (0.2965) loss 3.2949 (3.2217) grad_norm 2.4560 (2.1964) [2022-10-02 15:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1200/1251] eta 0:00:15 lr 0.000182 time 0.2898 (0.2961) loss 3.2324 (3.2187) grad_norm 2.2996 (2.1977) [2022-10-02 15:46:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 217 training takes 0:06:10 [2022-10-02 15:46:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.799 (2.799) Loss 0.9836 (0.9836) Acc@1 75.684 (75.684) Acc@5 94.531 (94.531) [2022-10-02 15:46:18 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.368 Acc@5 94.810 [2022-10-02 15:46:18 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-02 15:46:18 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 15:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][0/1251] eta 0:50:30 lr 0.000182 time 2.4226 (2.4226) loss 3.3648 (3.3648) grad_norm 2.0114 (2.0114) [2022-10-02 15:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][100/1251] eta 0:06:00 lr 0.000181 time 0.2875 (0.3129) loss 2.5492 (3.2317) grad_norm 2.1177 (2.2176) [2022-10-02 15:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][200/1251] eta 0:05:16 lr 0.000181 time 0.2924 (0.3007) loss 3.3302 (3.2555) grad_norm 2.2369 (2.2166) [2022-10-02 15:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][300/1251] eta 0:04:42 lr 0.000181 time 0.2858 (0.2967) loss 3.5918 (3.2395) grad_norm 2.0721 (2.1893) [2022-10-02 15:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][400/1251] eta 0:04:10 lr 0.000180 time 0.2860 (0.2947) loss 3.4692 (3.2363) grad_norm 2.1452 (2.1834) [2022-10-02 15:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][500/1251] eta 0:03:40 lr 0.000180 time 0.2862 (0.2934) loss 3.2258 (3.2356) grad_norm 1.9703 (2.1844) [2022-10-02 15:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][600/1251] eta 0:03:10 lr 0.000180 time 0.2897 (0.2925) loss 3.8887 (3.2414) grad_norm 2.6037 (2.1862) [2022-10-02 15:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][700/1251] eta 0:02:40 lr 0.000179 time 0.2878 (0.2918) loss 3.5873 (3.2432) grad_norm 2.0360 (2.1903) [2022-10-02 15:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][800/1251] eta 0:02:11 lr 0.000179 time 0.2881 (0.2913) loss 3.4270 (3.2444) grad_norm 2.1715 (2.1962) [2022-10-02 15:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][900/1251] eta 0:01:42 lr 0.000179 time 0.2925 (0.2909) loss 3.5257 (3.2449) grad_norm 2.2743 (2.2005) [2022-10-02 15:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1000/1251] eta 0:01:12 lr 0.000178 time 0.2912 (0.2905) loss 2.8472 (3.2410) grad_norm 3.7516 (2.1993) [2022-10-02 15:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1100/1251] eta 0:00:43 lr 0.000178 time 0.2862 (0.2903) loss 2.5319 (3.2422) grad_norm 2.5087 (2.1992) [2022-10-02 15:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1200/1251] eta 0:00:14 lr 0.000178 time 0.2900 (0.2901) loss 3.4863 (3.2436) grad_norm 2.1075 (2.2020) [2022-10-02 15:52:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 218 training takes 0:06:03 [2022-10-02 15:52:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.621 (2.621) Loss 0.8356 (0.8356) Acc@1 79.199 (79.199) Acc@5 95.508 (95.508) [2022-10-02 15:52:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.344 Acc@5 94.756 [2022-10-02 15:52:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-02 15:52:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 15:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][0/1251] eta 1:13:43 lr 0.000178 time 3.5359 (3.5359) loss 3.4978 (3.4978) grad_norm 1.9590 (1.9590) [2022-10-02 15:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][100/1251] eta 0:06:13 lr 0.000177 time 0.2899 (0.3242) loss 2.5570 (3.2036) grad_norm 2.1970 (2.2178) [2022-10-02 15:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][200/1251] eta 0:05:23 lr 0.000177 time 0.2902 (0.3076) loss 3.5240 (3.2372) grad_norm 2.4566 (2.2336) [2022-10-02 15:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][300/1251] eta 0:04:47 lr 0.000177 time 0.2905 (0.3020) loss 2.9063 (3.2415) grad_norm 2.1361 (2.2292) [2022-10-02 15:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][400/1251] eta 0:04:14 lr 0.000176 time 0.2882 (0.2992) loss 3.9301 (3.2349) grad_norm 2.1304 (2.2330) [2022-10-02 15:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][500/1251] eta 0:03:43 lr 0.000176 time 0.2946 (0.2975) loss 2.3989 (3.2247) grad_norm 2.4141 (2.2455) [2022-10-02 15:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][600/1251] eta 0:03:12 lr 0.000176 time 0.2911 (0.2963) loss 3.4058 (3.2321) grad_norm 2.6735 (2.2591) [2022-10-02 15:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][700/1251] eta 0:02:42 lr 0.000175 time 0.2930 (0.2954) loss 3.0980 (3.2330) grad_norm 2.3951 (2.2559) [2022-10-02 15:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][800/1251] eta 0:02:12 lr 0.000175 time 0.2901 (0.2948) loss 3.2711 (3.2290) grad_norm 2.6897 (2.2520) [2022-10-02 15:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][900/1251] eta 0:01:43 lr 0.000175 time 0.2958 (0.2943) loss 3.4363 (3.2332) grad_norm 2.0047 (2.2532) [2022-10-02 15:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1000/1251] eta 0:01:13 lr 0.000175 time 0.2946 (0.2938) loss 2.7614 (3.2339) grad_norm 3.1815 (2.2532) [2022-10-02 15:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1100/1251] eta 0:00:44 lr 0.000174 time 0.2881 (0.2935) loss 2.5718 (3.2347) grad_norm 2.2577 (2.2500) [2022-10-02 15:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1200/1251] eta 0:00:14 lr 0.000174 time 0.2910 (0.2932) loss 2.8285 (3.2277) grad_norm 2.7331 (2.2472) [2022-10-02 15:58:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 219 training takes 0:06:07 [2022-10-02 15:58:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.085 (3.085) Loss 0.8206 (0.8206) Acc@1 79.980 (79.980) Acc@5 95.801 (95.801) [2022-10-02 15:58:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.156 Acc@5 94.820 [2022-10-02 15:58:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-02 15:58:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 15:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][0/1251] eta 1:12:05 lr 0.000174 time 3.4573 (3.4573) loss 3.5488 (3.5488) grad_norm 1.9543 (1.9543) [2022-10-02 15:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][100/1251] eta 0:06:09 lr 0.000173 time 0.2879 (0.3209) loss 3.2374 (3.2059) grad_norm 2.2480 (2.2601) [2022-10-02 15:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][200/1251] eta 0:05:20 lr 0.000173 time 0.2926 (0.3049) loss 3.4074 (3.2456) grad_norm 2.0172 (2.2398) [2022-10-02 16:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][300/1251] eta 0:04:45 lr 0.000173 time 0.2884 (0.2997) loss 2.9770 (3.2348) grad_norm 2.0284 (2.2411) [2022-10-02 16:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][400/1251] eta 0:04:12 lr 0.000173 time 0.2915 (0.2971) loss 2.2554 (3.2319) grad_norm 1.9307 (2.2655) [2022-10-02 16:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][500/1251] eta 0:03:41 lr 0.000172 time 0.2864 (0.2954) loss 3.3869 (3.2214) grad_norm 2.3106 (2.2641) [2022-10-02 16:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][600/1251] eta 0:03:11 lr 0.000172 time 0.2925 (0.2942) loss 3.1125 (3.2174) grad_norm 2.1693 (2.2619) [2022-10-02 16:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][700/1251] eta 0:02:41 lr 0.000172 time 0.2922 (0.2934) loss 3.7082 (3.2303) grad_norm 2.5280 (2.2615) [2022-10-02 16:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][800/1251] eta 0:02:12 lr 0.000171 time 0.2888 (0.2927) loss 2.9998 (3.2310) grad_norm 2.2073 (2.2674) [2022-10-02 16:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][900/1251] eta 0:01:42 lr 0.000171 time 0.2915 (0.2922) loss 2.8483 (3.2338) grad_norm 2.3659 (2.2654) [2022-10-02 16:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1000/1251] eta 0:01:13 lr 0.000171 time 0.2900 (0.2917) loss 2.4957 (3.2400) grad_norm 2.6965 (2.2786) [2022-10-02 16:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1100/1251] eta 0:00:43 lr 0.000170 time 0.2858 (0.2913) loss 2.5938 (3.2292) grad_norm 2.0643 (2.2795) [2022-10-02 16:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1200/1251] eta 0:00:14 lr 0.000170 time 0.2917 (0.2910) loss 3.7381 (3.2311) grad_norm 2.1006 (2.2790) [2022-10-02 16:04:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 220 training takes 0:06:04 [2022-10-02 16:04:58 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saving...... [2022-10-02 16:04:58 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saved !!! [2022-10-02 16:05:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.454 (2.454) Loss 0.8767 (0.8767) Acc@1 78.809 (78.809) Acc@5 94.824 (94.824) [2022-10-02 16:05:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.108 Acc@5 94.818 [2022-10-02 16:05:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-10-02 16:05:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 16:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][0/1251] eta 0:50:38 lr 0.000170 time 2.4285 (2.4285) loss 2.6690 (2.6690) grad_norm 2.1208 (2.1208) [2022-10-02 16:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][100/1251] eta 0:06:01 lr 0.000170 time 0.2893 (0.3143) loss 2.1144 (3.2230) grad_norm 2.2910 (2.2817) [2022-10-02 16:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][200/1251] eta 0:05:17 lr 0.000169 time 0.2894 (0.3025) loss 2.8894 (3.2308) grad_norm 2.1076 (2.2707) [2022-10-02 16:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][300/1251] eta 0:04:44 lr 0.000169 time 0.2905 (0.2989) loss 3.5094 (3.2178) grad_norm 2.0897 (2.2696) [2022-10-02 16:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][400/1251] eta 0:04:12 lr 0.000169 time 0.2907 (0.2968) loss 3.7311 (3.2128) grad_norm 2.5398 (2.2722) [2022-10-02 16:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][500/1251] eta 0:03:41 lr 0.000168 time 0.2895 (0.2956) loss 3.7305 (3.2063) grad_norm 2.4181 (2.2946) [2022-10-02 16:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][600/1251] eta 0:03:11 lr 0.000168 time 0.2918 (0.2947) loss 3.5267 (3.2194) grad_norm 2.1773 (2.2939) [2022-10-02 16:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][700/1251] eta 0:02:42 lr 0.000168 time 0.2877 (0.2941) loss 3.5680 (3.2282) grad_norm 2.3136 (2.3064) [2022-10-02 16:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][800/1251] eta 0:02:12 lr 0.000168 time 0.2893 (0.2937) loss 3.7074 (3.2241) grad_norm 2.1341 (2.3056) [2022-10-02 16:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][900/1251] eta 0:01:42 lr 0.000167 time 0.2881 (0.2933) loss 3.0469 (3.2136) grad_norm 2.1430 (2.3025) [2022-10-02 16:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1000/1251] eta 0:01:13 lr 0.000167 time 0.2893 (0.2930) loss 3.5788 (3.2172) grad_norm 2.1244 (2.2935) [2022-10-02 16:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1100/1251] eta 0:00:44 lr 0.000167 time 0.2881 (0.2927) loss 2.2357 (3.2148) grad_norm 2.1906 (2.2909) [2022-10-02 16:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1200/1251] eta 0:00:14 lr 0.000166 time 0.2929 (0.2924) loss 3.7178 (3.2128) grad_norm 2.1276 (2.2888) [2022-10-02 16:11:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 221 training takes 0:06:06 [2022-10-02 16:11:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.532 (2.532) Loss 0.8858 (0.8858) Acc@1 79.102 (79.102) Acc@5 95.312 (95.312) [2022-10-02 16:11:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.298 Acc@5 94.834 [2022-10-02 16:11:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-10-02 16:11:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 16:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][0/1251] eta 0:50:24 lr 0.000166 time 2.4173 (2.4173) loss 3.6580 (3.6580) grad_norm 2.6129 (2.6129) [2022-10-02 16:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][100/1251] eta 0:05:59 lr 0.000166 time 0.2874 (0.3124) loss 3.4782 (3.1337) grad_norm 3.9795 (2.3320) [2022-10-02 16:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][200/1251] eta 0:05:15 lr 0.000166 time 0.2874 (0.3000) loss 3.5150 (3.1489) grad_norm 2.0889 (2.2762) [2022-10-02 16:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][300/1251] eta 0:04:41 lr 0.000165 time 0.2869 (0.2960) loss 2.2730 (3.1817) grad_norm 1.8502 (2.2848) [2022-10-02 16:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][400/1251] eta 0:04:10 lr 0.000165 time 0.2891 (0.2939) loss 3.0154 (3.1803) grad_norm 2.3564 (2.2709) [2022-10-02 16:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][500/1251] eta 0:03:39 lr 0.000165 time 0.2884 (0.2927) loss 2.0930 (3.1754) grad_norm 2.0145 (2.2759) [2022-10-02 16:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][600/1251] eta 0:03:09 lr 0.000164 time 0.2887 (0.2918) loss 3.4148 (3.1784) grad_norm 2.6691 (2.2769) [2022-10-02 16:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][700/1251] eta 0:02:40 lr 0.000164 time 0.2885 (0.2912) loss 3.4997 (3.1713) grad_norm 1.9833 (2.2774) [2022-10-02 16:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][800/1251] eta 0:02:11 lr 0.000164 time 0.2852 (0.2907) loss 3.8584 (3.1693) grad_norm 2.2440 (2.2729) [2022-10-02 16:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][900/1251] eta 0:01:41 lr 0.000163 time 0.2890 (0.2903) loss 2.1817 (3.1805) grad_norm 2.1531 (2.2777) [2022-10-02 16:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1000/1251] eta 0:01:12 lr 0.000163 time 0.2857 (0.2900) loss 3.6910 (3.1867) grad_norm 1.8485 (2.2734) [2022-10-02 16:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1100/1251] eta 0:00:43 lr 0.000163 time 0.2844 (0.2898) loss 2.9440 (3.1850) grad_norm 2.7454 (2.2709) [2022-10-02 16:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1200/1251] eta 0:00:14 lr 0.000163 time 0.2880 (0.2896) loss 3.2501 (3.1885) grad_norm 2.1212 (2.2758) [2022-10-02 16:17:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 222 training takes 0:06:02 [2022-10-02 16:17:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.494 (2.494) Loss 0.9447 (0.9447) Acc@1 77.930 (77.930) Acc@5 93.164 (93.164) [2022-10-02 16:17:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.188 Acc@5 94.904 [2022-10-02 16:17:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-10-02 16:17:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.37% [2022-10-02 16:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][0/1251] eta 1:08:01 lr 0.000162 time 3.2623 (3.2623) loss 3.5315 (3.5315) grad_norm 2.1658 (2.1658) [2022-10-02 16:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][100/1251] eta 0:06:07 lr 0.000162 time 0.2871 (0.3189) loss 2.7517 (3.2159) grad_norm 1.9523 (2.3082) [2022-10-02 16:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][200/1251] eta 0:05:19 lr 0.000162 time 0.2928 (0.3040) loss 3.6334 (3.2146) grad_norm 2.0546 (2.2869) [2022-10-02 16:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][300/1251] eta 0:04:44 lr 0.000161 time 0.2867 (0.2992) loss 3.4998 (3.1960) grad_norm 1.9723 (2.2956) [2022-10-02 16:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][400/1251] eta 0:04:12 lr 0.000161 time 0.2901 (0.2968) loss 3.4920 (3.2133) grad_norm 2.1024 (2.3015) [2022-10-02 16:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][500/1251] eta 0:03:41 lr 0.000161 time 0.2861 (0.2953) loss 3.5322 (3.2267) grad_norm 2.2769 (2.2913) [2022-10-02 16:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][600/1251] eta 0:03:11 lr 0.000161 time 0.2867 (0.2943) loss 3.2959 (3.2355) grad_norm 1.7359 (2.2900) [2022-10-02 16:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][700/1251] eta 0:02:41 lr 0.000160 time 0.2852 (0.2935) loss 3.8501 (3.2445) grad_norm 2.2347 (2.2842) [2022-10-02 16:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][800/1251] eta 0:02:12 lr 0.000160 time 0.2856 (0.2930) loss 3.1042 (3.2312) grad_norm 2.1126 (2.2879) [2022-10-02 16:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][900/1251] eta 0:01:42 lr 0.000160 time 0.2858 (0.2925) loss 3.2876 (3.2231) grad_norm 3.0837 (2.2930) [2022-10-02 16:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1000/1251] eta 0:01:13 lr 0.000159 time 0.2908 (0.2921) loss 3.7997 (3.2214) grad_norm 2.2737 (2.2978) [2022-10-02 16:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1100/1251] eta 0:00:44 lr 0.000159 time 0.2859 (0.2918) loss 1.9118 (3.2240) grad_norm 2.1260 (2.2907) [2022-10-02 16:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1200/1251] eta 0:00:14 lr 0.000159 time 0.2865 (0.2916) loss 2.1277 (3.2171) grad_norm 2.1972 (2.2895) [2022-10-02 16:23:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 223 training takes 0:06:05 [2022-10-02 16:23:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.230 (3.230) Loss 0.9007 (0.9007) Acc@1 78.223 (78.223) Acc@5 95.117 (95.117) [2022-10-02 16:24:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.450 Acc@5 94.966 [2022-10-02 16:24:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-02 16:24:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.45% [2022-10-02 16:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][0/1251] eta 0:57:22 lr 0.000159 time 2.7516 (2.7516) loss 3.1935 (3.1935) grad_norm 1.9847 (1.9847) [2022-10-02 16:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][100/1251] eta 0:06:06 lr 0.000158 time 0.2895 (0.3182) loss 3.8301 (3.2737) grad_norm 2.3337 (2.3262) [2022-10-02 16:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][200/1251] eta 0:05:20 lr 0.000158 time 0.2912 (0.3046) loss 3.1175 (3.1703) grad_norm 2.5526 (2.3171) [2022-10-02 16:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][300/1251] eta 0:04:45 lr 0.000158 time 0.2855 (0.2997) loss 3.7113 (3.1987) grad_norm 2.2820 (2.3194) [2022-10-02 16:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][400/1251] eta 0:04:12 lr 0.000157 time 0.2901 (0.2972) loss 3.0068 (3.2049) grad_norm 2.1132 (2.3206) [2022-10-02 16:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][500/1251] eta 0:03:42 lr 0.000157 time 0.2858 (0.2958) loss 3.0677 (3.2198) grad_norm 2.0093 (2.3226) [2022-10-02 16:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][600/1251] eta 0:03:11 lr 0.000157 time 0.2904 (0.2948) loss 3.7507 (3.2217) grad_norm 2.2411 (2.3271) [2022-10-02 16:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][700/1251] eta 0:02:42 lr 0.000157 time 0.2845 (0.2941) loss 3.2475 (3.2148) grad_norm 2.1440 (2.3223) [2022-10-02 16:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][800/1251] eta 0:02:12 lr 0.000156 time 0.2925 (0.2937) loss 2.4934 (3.2015) grad_norm 2.6790 (2.3219) [2022-10-02 16:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][900/1251] eta 0:01:42 lr 0.000156 time 0.2875 (0.2933) loss 3.4785 (3.2115) grad_norm 2.0958 (2.3182) [2022-10-02 16:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1000/1251] eta 0:01:13 lr 0.000156 time 0.2925 (0.2929) loss 3.5047 (3.2132) grad_norm 1.9099 (2.3229) [2022-10-02 16:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1100/1251] eta 0:00:44 lr 0.000155 time 0.2869 (0.2926) loss 3.5636 (3.2108) grad_norm 2.5344 (2.3154) [2022-10-02 16:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1200/1251] eta 0:00:14 lr 0.000155 time 0.2898 (0.2924) loss 3.7369 (3.2080) grad_norm 2.0868 (2.3191) [2022-10-02 16:30:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 224 training takes 0:06:06 [2022-10-02 16:30:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.379 (3.379) Loss 0.8557 (0.8557) Acc@1 80.078 (80.078) Acc@5 94.629 (94.629) [2022-10-02 16:30:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.438 Acc@5 94.956 [2022-10-02 16:30:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-02 16:30:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.45% [2022-10-02 16:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][0/1251] eta 1:05:36 lr 0.000155 time 3.1468 (3.1468) loss 2.3453 (2.3453) grad_norm 2.0853 (2.0853) [2022-10-02 16:30:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][100/1251] eta 0:06:07 lr 0.000155 time 0.2936 (0.3189) loss 3.2758 (3.0923) grad_norm 2.0217 (2.2642) [2022-10-02 16:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][200/1251] eta 0:05:19 lr 0.000154 time 0.2916 (0.3043) loss 3.1429 (3.1176) grad_norm 2.0573 (2.3009) [2022-10-02 16:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][300/1251] eta 0:04:44 lr 0.000154 time 0.2901 (0.2992) loss 2.1824 (3.1288) grad_norm 2.0606 (2.3107) [2022-10-02 16:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][400/1251] eta 0:04:12 lr 0.000154 time 0.2883 (0.2966) loss 3.5235 (3.1089) grad_norm 2.5898 (2.3124) [2022-10-02 16:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][500/1251] eta 0:03:41 lr 0.000154 time 0.2905 (0.2950) loss 3.3100 (3.1289) grad_norm 2.3059 (2.3057) [2022-10-02 16:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][600/1251] eta 0:03:11 lr 0.000153 time 0.2865 (0.2939) loss 2.2651 (3.1360) grad_norm 2.3687 (2.2999) [2022-10-02 16:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][700/1251] eta 0:02:41 lr 0.000153 time 0.2941 (0.2931) loss 1.9950 (3.1457) grad_norm 2.2029 (2.3005) [2022-10-02 16:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][800/1251] eta 0:02:11 lr 0.000153 time 0.2878 (0.2925) loss 3.1693 (3.1498) grad_norm 2.4064 (2.3021) [2022-10-02 16:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][900/1251] eta 0:01:42 lr 0.000152 time 0.2913 (0.2920) loss 2.4804 (3.1560) grad_norm 2.1670 (2.3011) [2022-10-02 16:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1000/1251] eta 0:01:13 lr 0.000152 time 0.2921 (0.2916) loss 2.7417 (3.1632) grad_norm 2.0498 (2.3026) [2022-10-02 16:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1100/1251] eta 0:00:43 lr 0.000152 time 0.2923 (0.2914) loss 3.1873 (3.1756) grad_norm 2.9569 (2.3074) [2022-10-02 16:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1200/1251] eta 0:00:14 lr 0.000151 time 0.2903 (0.2911) loss 3.5557 (3.1761) grad_norm 2.0295 (2.3089) [2022-10-02 16:36:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 225 training takes 0:06:04 [2022-10-02 16:36:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.391 (2.391) Loss 0.8728 (0.8728) Acc@1 79.785 (79.785) Acc@5 94.629 (94.629) [2022-10-02 16:36:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.410 Acc@5 94.966 [2022-10-02 16:36:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-02 16:36:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.45% [2022-10-02 16:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][0/1251] eta 1:09:04 lr 0.000151 time 3.3126 (3.3126) loss 3.6818 (3.6818) grad_norm 1.9600 (1.9600) [2022-10-02 16:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][100/1251] eta 0:06:08 lr 0.000151 time 0.2862 (0.3198) loss 2.4550 (3.1733) grad_norm 2.1823 (2.3267) [2022-10-02 16:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][200/1251] eta 0:05:20 lr 0.000151 time 0.2918 (0.3048) loss 3.6800 (3.1912) grad_norm 2.4462 (2.3056) [2022-10-02 16:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][300/1251] eta 0:04:44 lr 0.000150 time 0.2875 (0.2997) loss 3.2940 (3.1811) grad_norm 2.4180 (2.3124) [2022-10-02 16:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][400/1251] eta 0:04:12 lr 0.000150 time 0.2891 (0.2971) loss 3.2981 (3.1863) grad_norm 2.7021 (2.3087) [2022-10-02 16:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][500/1251] eta 0:03:41 lr 0.000150 time 0.2885 (0.2955) loss 3.2633 (3.1857) grad_norm 2.1184 (2.3106) [2022-10-02 16:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][600/1251] eta 0:03:11 lr 0.000150 time 0.2938 (0.2945) loss 3.2378 (3.1730) grad_norm 2.0639 (2.3159) [2022-10-02 16:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][700/1251] eta 0:02:41 lr 0.000149 time 0.2888 (0.2937) loss 3.5082 (3.1839) grad_norm 1.9395 (2.3180) [2022-10-02 16:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][800/1251] eta 0:02:12 lr 0.000149 time 0.2921 (0.2931) loss 3.7484 (3.1957) grad_norm 2.4829 (2.3172) [2022-10-02 16:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][900/1251] eta 0:01:42 lr 0.000149 time 0.2861 (0.2927) loss 3.2142 (3.2090) grad_norm 2.0597 (2.3172) [2022-10-02 16:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1000/1251] eta 0:01:13 lr 0.000148 time 0.2922 (0.2923) loss 2.8230 (3.2067) grad_norm 2.2144 (2.3179) [2022-10-02 16:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1100/1251] eta 0:00:44 lr 0.000148 time 0.2874 (0.2920) loss 3.7269 (3.2051) grad_norm 2.2683 (2.3191) [2022-10-02 16:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1200/1251] eta 0:00:14 lr 0.000148 time 0.2902 (0.2917) loss 3.5381 (3.2046) grad_norm 2.0885 (2.3225) [2022-10-02 16:42:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 226 training takes 0:06:05 [2022-10-02 16:42:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.202 (3.202) Loss 0.8049 (0.8049) Acc@1 81.445 (81.445) Acc@5 95.801 (95.801) [2022-10-02 16:42:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.482 Acc@5 94.934 [2022-10-02 16:42:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-02 16:42:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.48% [2022-10-02 16:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][0/1251] eta 1:04:47 lr 0.000148 time 3.1078 (3.1078) loss 2.3649 (2.3649) grad_norm 2.6887 (2.6887) [2022-10-02 16:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][100/1251] eta 0:06:07 lr 0.000147 time 0.2860 (0.3192) loss 2.7526 (3.2110) grad_norm 2.2970 (2.3713) [2022-10-02 16:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][200/1251] eta 0:05:19 lr 0.000147 time 0.2913 (0.3042) loss 2.9600 (3.2158) grad_norm 2.6496 (2.3775) [2022-10-02 16:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][300/1251] eta 0:04:44 lr 0.000147 time 0.2881 (0.2991) loss 3.2473 (3.1992) grad_norm 2.7792 (2.3706) [2022-10-02 16:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][400/1251] eta 0:04:12 lr 0.000147 time 0.2854 (0.2964) loss 2.7358 (3.1844) grad_norm 2.1411 (2.3515) [2022-10-02 16:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][500/1251] eta 0:03:41 lr 0.000146 time 0.2877 (0.2949) loss 3.6059 (3.1883) grad_norm 2.0325 (2.3517) [2022-10-02 16:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][600/1251] eta 0:03:11 lr 0.000146 time 0.2850 (0.2938) loss 3.7197 (3.1825) grad_norm 2.3923 (2.3516) [2022-10-02 16:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][700/1251] eta 0:02:41 lr 0.000146 time 0.2879 (0.2930) loss 3.7890 (3.1888) grad_norm 2.1717 (2.3471) [2022-10-02 16:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][800/1251] eta 0:02:11 lr 0.000145 time 0.2872 (0.2925) loss 3.1781 (3.1941) grad_norm 2.2522 (2.3442) [2022-10-02 16:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][900/1251] eta 0:01:42 lr 0.000145 time 0.2887 (0.2920) loss 3.2242 (3.1991) grad_norm 2.5088 (2.3513) [2022-10-02 16:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1000/1251] eta 0:01:13 lr 0.000145 time 0.2875 (0.2917) loss 3.5767 (3.1882) grad_norm 2.1885 (2.3526) [2022-10-02 16:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1100/1251] eta 0:00:43 lr 0.000145 time 0.2885 (0.2914) loss 2.6568 (3.1909) grad_norm 2.4078 (2.3512) [2022-10-02 16:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1200/1251] eta 0:00:14 lr 0.000144 time 0.2879 (0.2911) loss 2.1548 (3.1912) grad_norm 2.3159 (2.3507) [2022-10-02 16:49:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 227 training takes 0:06:04 [2022-10-02 16:49:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.252 (2.252) Loss 0.8416 (0.8416) Acc@1 79.590 (79.590) Acc@5 95.508 (95.508) [2022-10-02 16:49:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.538 Acc@5 95.010 [2022-10-02 16:49:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-10-02 16:49:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.54% [2022-10-02 16:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][0/1251] eta 0:54:11 lr 0.000144 time 2.5992 (2.5992) loss 3.8916 (3.8916) grad_norm 2.5635 (2.5635) [2022-10-02 16:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][100/1251] eta 0:06:04 lr 0.000144 time 0.2864 (0.3169) loss 3.6158 (3.2289) grad_norm 2.1370 (2.3968) [2022-10-02 16:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][200/1251] eta 0:05:17 lr 0.000144 time 0.2888 (0.3023) loss 2.5091 (3.1859) grad_norm 2.1994 (2.3741) [2022-10-02 16:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][300/1251] eta 0:04:42 lr 0.000143 time 0.2942 (0.2975) loss 4.0474 (3.2212) grad_norm 2.2540 (2.3686) [2022-10-02 16:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][400/1251] eta 0:04:11 lr 0.000143 time 0.2889 (0.2951) loss 3.6543 (3.2039) grad_norm 2.3459 (2.3640) [2022-10-02 16:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][500/1251] eta 0:03:40 lr 0.000143 time 0.2887 (0.2937) loss 3.1439 (3.2020) grad_norm 2.1883 (2.3703) [2022-10-02 16:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][600/1251] eta 0:03:10 lr 0.000142 time 0.2873 (0.2927) loss 3.4642 (3.2024) grad_norm 2.2989 (2.3812) [2022-10-02 16:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][700/1251] eta 0:02:40 lr 0.000142 time 0.2865 (0.2920) loss 3.5573 (3.2082) grad_norm 2.0939 (2.3795) [2022-10-02 16:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][800/1251] eta 0:02:11 lr 0.000142 time 0.2868 (0.2915) loss 3.5645 (3.1940) grad_norm 2.4411 (2.3789) [2022-10-02 16:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][900/1251] eta 0:01:42 lr 0.000142 time 0.2878 (0.2911) loss 2.0803 (3.1925) grad_norm 2.4824 (2.3732) [2022-10-02 16:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1000/1251] eta 0:01:12 lr 0.000141 time 0.2873 (0.2907) loss 2.1254 (3.1972) grad_norm 2.3021 (2.3705) [2022-10-02 16:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1100/1251] eta 0:00:43 lr 0.000141 time 0.2870 (0.2905) loss 3.4939 (3.2028) grad_norm 2.0648 (2.3688) [2022-10-02 16:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1200/1251] eta 0:00:14 lr 0.000141 time 0.2875 (0.2903) loss 3.7141 (3.2032) grad_norm 2.6859 (2.3712) [2022-10-02 16:55:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 228 training takes 0:06:03 [2022-10-02 16:55:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.478 (2.478) Loss 0.9326 (0.9326) Acc@1 76.758 (76.758) Acc@5 94.434 (94.434) [2022-10-02 16:55:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.390 Acc@5 95.050 [2022-10-02 16:55:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-10-02 16:55:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.54% [2022-10-02 16:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][0/1251] eta 0:51:42 lr 0.000141 time 2.4796 (2.4796) loss 2.4266 (2.4266) grad_norm 2.5524 (2.5524) [2022-10-02 16:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][100/1251] eta 0:06:05 lr 0.000140 time 0.2916 (0.3171) loss 3.4268 (3.1392) grad_norm 2.2062 (2.3112) [2022-10-02 16:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][200/1251] eta 0:05:18 lr 0.000140 time 0.2918 (0.3031) loss 3.0415 (3.1520) grad_norm 2.4830 (2.3550) [2022-10-02 16:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][300/1251] eta 0:04:43 lr 0.000140 time 0.2912 (0.2984) loss 3.9305 (3.1648) grad_norm 2.1868 (2.3569) [2022-10-02 16:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][400/1251] eta 0:04:11 lr 0.000140 time 0.2935 (0.2960) loss 3.1036 (3.1494) grad_norm 2.8287 (2.3592) [2022-10-02 16:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][500/1251] eta 0:03:41 lr 0.000139 time 0.2864 (0.2946) loss 2.9897 (3.1426) grad_norm 2.4062 (2.3583) [2022-10-02 16:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][600/1251] eta 0:03:11 lr 0.000139 time 0.2924 (0.2936) loss 3.5115 (3.1521) grad_norm 2.1134 (2.3618) [2022-10-02 16:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][700/1251] eta 0:02:41 lr 0.000139 time 0.2883 (0.2929) loss 3.4512 (3.1617) grad_norm 2.1480 (2.3603) [2022-10-02 16:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][800/1251] eta 0:02:11 lr 0.000138 time 0.2906 (0.2923) loss 2.8563 (3.1478) grad_norm 2.3172 (2.3637) [2022-10-02 16:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][900/1251] eta 0:01:42 lr 0.000138 time 0.2857 (0.2918) loss 3.2332 (3.1503) grad_norm 2.3576 (2.3758) [2022-10-02 17:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1000/1251] eta 0:01:13 lr 0.000138 time 0.2885 (0.2914) loss 2.6775 (3.1503) grad_norm 2.5517 (2.3903) [2022-10-02 17:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1100/1251] eta 0:00:43 lr 0.000138 time 0.2875 (0.2911) loss 3.1645 (3.1596) grad_norm 2.3446 (2.3873) [2022-10-02 17:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1200/1251] eta 0:00:14 lr 0.000137 time 0.2889 (0.2909) loss 3.4770 (3.1663) grad_norm 2.4083 (2.3895) [2022-10-02 17:01:35 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 229 training takes 0:06:04 [2022-10-02 17:01:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.556 (2.556) Loss 0.8546 (0.8546) Acc@1 80.664 (80.664) Acc@5 95.117 (95.117) [2022-10-02 17:01:48 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.618 Acc@5 94.956 [2022-10-02 17:01:48 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-02 17:01:48 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.62% [2022-10-02 17:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][0/1251] eta 0:48:53 lr 0.000137 time 2.3450 (2.3450) loss 3.4852 (3.4852) grad_norm 2.6197 (2.6197) [2022-10-02 17:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][100/1251] eta 0:06:01 lr 0.000137 time 0.2869 (0.3141) loss 2.6276 (3.1217) grad_norm 2.4510 (2.3532) [2022-10-02 17:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][200/1251] eta 0:05:16 lr 0.000137 time 0.2883 (0.3007) loss 3.7535 (3.1268) grad_norm 2.5245 (2.3658) [2022-10-02 17:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][300/1251] eta 0:04:41 lr 0.000136 time 0.2887 (0.2964) loss 3.3791 (3.1148) grad_norm 2.3691 (2.3664) [2022-10-02 17:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][400/1251] eta 0:04:10 lr 0.000136 time 0.2876 (0.2942) loss 3.5538 (3.1212) grad_norm 2.2962 (2.3758) [2022-10-02 17:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][500/1251] eta 0:03:39 lr 0.000136 time 0.2880 (0.2929) loss 3.5154 (3.1213) grad_norm 2.5197 (2.3783) [2022-10-02 17:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][600/1251] eta 0:03:10 lr 0.000135 time 0.2881 (0.2920) loss 3.3597 (3.1376) grad_norm 2.2744 (2.3801) [2022-10-02 17:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][700/1251] eta 0:02:40 lr 0.000135 time 0.2874 (0.2914) loss 3.3790 (3.1318) grad_norm 2.5350 (2.3787) [2022-10-02 17:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][800/1251] eta 0:02:11 lr 0.000135 time 0.2901 (0.2909) loss 2.8917 (3.1378) grad_norm 2.3611 (2.3865) [2022-10-02 17:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][900/1251] eta 0:01:42 lr 0.000135 time 0.2861 (0.2906) loss 3.7099 (3.1325) grad_norm 2.2270 (2.3888) [2022-10-02 17:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1000/1251] eta 0:01:12 lr 0.000134 time 0.2880 (0.2903) loss 3.4469 (3.1418) grad_norm 2.1785 (2.3885) [2022-10-02 17:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1100/1251] eta 0:00:43 lr 0.000134 time 0.2854 (0.2901) loss 2.9861 (3.1439) grad_norm 2.6257 (2.3887) [2022-10-02 17:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1200/1251] eta 0:00:14 lr 0.000134 time 0.2947 (0.2899) loss 3.4847 (3.1461) grad_norm 2.2583 (2.3852) [2022-10-02 17:07:51 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 230 training takes 0:06:02 [2022-10-02 17:07:51 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saving...... [2022-10-02 17:07:51 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saved !!! [2022-10-02 17:07:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.638 (2.638) Loss 0.9389 (0.9389) Acc@1 77.734 (77.734) Acc@5 93.457 (93.457) [2022-10-02 17:08:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.610 Acc@5 94.944 [2022-10-02 17:08:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-02 17:08:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.62% [2022-10-02 17:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][0/1251] eta 0:59:41 lr 0.000134 time 2.8633 (2.8633) loss 2.8466 (2.8466) grad_norm 2.4516 (2.4516) [2022-10-02 17:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][100/1251] eta 0:06:06 lr 0.000133 time 0.2884 (0.3181) loss 3.4114 (3.2477) grad_norm 2.2078 (2.3894) [2022-10-02 17:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][200/1251] eta 0:05:19 lr 0.000133 time 0.2897 (0.3043) loss 2.8862 (3.2425) grad_norm 2.6931 (2.3953) [2022-10-02 17:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][300/1251] eta 0:04:45 lr 0.000133 time 0.2953 (0.2998) loss 3.6419 (3.1990) grad_norm 2.4815 (2.3871) [2022-10-02 17:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][400/1251] eta 0:04:13 lr 0.000133 time 0.2883 (0.2973) loss 3.5122 (3.1912) grad_norm 2.4485 (2.3721) [2022-10-02 17:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][500/1251] eta 0:03:42 lr 0.000132 time 0.2908 (0.2958) loss 3.8665 (3.1903) grad_norm 2.9139 (2.3893) [2022-10-02 17:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][600/1251] eta 0:03:11 lr 0.000132 time 0.2866 (0.2949) loss 3.3434 (3.1842) grad_norm 2.2784 (2.3881) [2022-10-02 17:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][700/1251] eta 0:02:42 lr 0.000132 time 0.2940 (0.2942) loss 2.5520 (3.1792) grad_norm 2.6909 (2.3959) [2022-10-02 17:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][800/1251] eta 0:02:12 lr 0.000132 time 0.2874 (0.2937) loss 3.3163 (3.1845) grad_norm 2.2987 (2.3952) [2022-10-02 17:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][900/1251] eta 0:01:42 lr 0.000131 time 0.2871 (0.2932) loss 3.3494 (3.1746) grad_norm 2.5007 (2.3991) [2022-10-02 17:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1000/1251] eta 0:01:13 lr 0.000131 time 0.2874 (0.2928) loss 3.3447 (3.1702) grad_norm 2.5604 (2.4032) [2022-10-02 17:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1100/1251] eta 0:00:44 lr 0.000131 time 0.2907 (0.2925) loss 2.2850 (3.1642) grad_norm 2.2699 (2.4035) [2022-10-02 17:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1200/1251] eta 0:00:14 lr 0.000130 time 0.2909 (0.2922) loss 3.4144 (3.1689) grad_norm 2.3348 (2.4019) [2022-10-02 17:14:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 231 training takes 0:06:05 [2022-10-02 17:14:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.725 (2.725) Loss 0.8601 (0.8601) Acc@1 79.395 (79.395) Acc@5 95.215 (95.215) [2022-10-02 17:14:23 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.644 Acc@5 95.032 [2022-10-02 17:14:23 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.6% [2022-10-02 17:14:23 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.64% [2022-10-02 17:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][0/1251] eta 1:08:54 lr 0.000130 time 3.3048 (3.3048) loss 3.5559 (3.5559) grad_norm 2.1591 (2.1591) [2022-10-02 17:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][100/1251] eta 0:06:12 lr 0.000130 time 0.2955 (0.3232) loss 3.0829 (3.2260) grad_norm 2.0538 (2.3895) [2022-10-02 17:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][200/1251] eta 0:05:23 lr 0.000130 time 0.2889 (0.3074) loss 3.6330 (3.1572) grad_norm 2.9129 (2.3988) [2022-10-02 17:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][300/1251] eta 0:04:47 lr 0.000129 time 0.2933 (0.3023) loss 2.2500 (3.1652) grad_norm 2.3452 (2.4100) [2022-10-02 17:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][400/1251] eta 0:04:14 lr 0.000129 time 0.2897 (0.2996) loss 2.9303 (3.1636) grad_norm 2.6484 (2.4120) [2022-10-02 17:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][500/1251] eta 0:03:43 lr 0.000129 time 0.2947 (0.2980) loss 3.1943 (3.1662) grad_norm 2.2539 (2.4019) [2022-10-02 17:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][600/1251] eta 0:03:13 lr 0.000129 time 0.2911 (0.2969) loss 3.5840 (3.1670) grad_norm 3.1078 (2.4071) [2022-10-02 17:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][700/1251] eta 0:02:43 lr 0.000128 time 0.2949 (0.2961) loss 3.4067 (3.1742) grad_norm 2.1149 (2.4078) [2022-10-02 17:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][800/1251] eta 0:02:13 lr 0.000128 time 0.2901 (0.2955) loss 3.4998 (3.1652) grad_norm 2.2670 (2.4101) [2022-10-02 17:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][900/1251] eta 0:01:43 lr 0.000128 time 0.2903 (0.2949) loss 2.3863 (3.1521) grad_norm 2.4897 (2.4050) [2022-10-02 17:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1000/1251] eta 0:01:13 lr 0.000128 time 0.2877 (0.2943) loss 3.6542 (3.1581) grad_norm 2.1728 (2.4057) [2022-10-02 17:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1100/1251] eta 0:00:44 lr 0.000127 time 0.2927 (0.2939) loss 3.1271 (3.1587) grad_norm 2.3037 (2.4098) [2022-10-02 17:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1200/1251] eta 0:00:14 lr 0.000127 time 0.2908 (0.2936) loss 3.5067 (3.1534) grad_norm 2.2634 (2.4106) [2022-10-02 17:20:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 232 training takes 0:06:07 [2022-10-02 17:20:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.166 (3.166) Loss 0.9075 (0.9075) Acc@1 78.613 (78.613) Acc@5 95.703 (95.703) [2022-10-02 17:20:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.738 Acc@5 95.102 [2022-10-02 17:20:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-02 17:20:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.74% [2022-10-02 17:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][0/1251] eta 1:07:50 lr 0.000127 time 3.2540 (3.2540) loss 3.3019 (3.3019) grad_norm 2.3759 (2.3759) [2022-10-02 17:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][100/1251] eta 0:06:06 lr 0.000127 time 0.2876 (0.3186) loss 2.2401 (3.2314) grad_norm 2.3041 (2.3456) [2022-10-02 17:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][200/1251] eta 0:05:19 lr 0.000126 time 0.2908 (0.3036) loss 3.6877 (3.2194) grad_norm 2.5548 (2.3796) [2022-10-02 17:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][300/1251] eta 0:04:43 lr 0.000126 time 0.2859 (0.2985) loss 3.4695 (3.1945) grad_norm 2.5756 (2.4407) [2022-10-02 17:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][400/1251] eta 0:04:11 lr 0.000126 time 0.2900 (0.2959) loss 3.5039 (3.1993) grad_norm 4.4330 (2.4559) [2022-10-02 17:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][500/1251] eta 0:03:41 lr 0.000126 time 0.2860 (0.2944) loss 3.2007 (3.2072) grad_norm 2.3060 (2.4568) [2022-10-02 17:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][600/1251] eta 0:03:11 lr 0.000125 time 0.2895 (0.2935) loss 2.9867 (3.2038) grad_norm 2.3229 (2.4632) [2022-10-02 17:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][700/1251] eta 0:02:41 lr 0.000125 time 0.2882 (0.2928) loss 2.9977 (3.1960) grad_norm 2.7141 (2.4546) [2022-10-02 17:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][800/1251] eta 0:02:11 lr 0.000125 time 0.2920 (0.2922) loss 3.3638 (3.1873) grad_norm 3.1578 (2.4621) [2022-10-02 17:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][900/1251] eta 0:01:42 lr 0.000125 time 0.2856 (0.2917) loss 3.4678 (3.1845) grad_norm 2.5388 (2.4685) [2022-10-02 17:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1000/1251] eta 0:01:13 lr 0.000124 time 0.2916 (0.2914) loss 2.5787 (3.1917) grad_norm 2.6327 (2.4754) [2022-10-02 17:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1100/1251] eta 0:00:43 lr 0.000124 time 0.2872 (0.2911) loss 3.5164 (3.1982) grad_norm 2.9595 (2.4776) [2022-10-02 17:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1200/1251] eta 0:00:14 lr 0.000124 time 0.2898 (0.2908) loss 3.6210 (3.1997) grad_norm 2.1792 (2.4815) [2022-10-02 17:26:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 233 training takes 0:06:04 [2022-10-02 17:26:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.674 (2.674) Loss 0.7826 (0.7826) Acc@1 80.273 (80.273) Acc@5 95.801 (95.801) [2022-10-02 17:27:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.762 Acc@5 95.124 [2022-10-02 17:27:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-02 17:27:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.76% [2022-10-02 17:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][0/1251] eta 1:09:19 lr 0.000124 time 3.3246 (3.3246) loss 3.5945 (3.5945) grad_norm 2.3855 (2.3855) [2022-10-02 17:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][100/1251] eta 0:06:13 lr 0.000123 time 0.2937 (0.3245) loss 3.8001 (3.2931) grad_norm 2.1321 (2.4332) [2022-10-02 17:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][200/1251] eta 0:05:25 lr 0.000123 time 0.2903 (0.3093) loss 2.5768 (3.1952) grad_norm 2.3740 (2.4255) [2022-10-02 17:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][300/1251] eta 0:04:49 lr 0.000123 time 0.2927 (0.3040) loss 3.5951 (3.1682) grad_norm 2.2578 (2.4398) [2022-10-02 17:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][400/1251] eta 0:04:16 lr 0.000123 time 0.2915 (0.3011) loss 2.8274 (3.1744) grad_norm 2.6254 (2.4236) [2022-10-02 17:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][500/1251] eta 0:03:44 lr 0.000122 time 0.2933 (0.2992) loss 3.4476 (3.1660) grad_norm 2.3750 (2.4244) [2022-10-02 17:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][600/1251] eta 0:03:13 lr 0.000122 time 0.2921 (0.2978) loss 3.7994 (3.1692) grad_norm 2.5571 (2.4142) [2022-10-02 17:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][700/1251] eta 0:02:43 lr 0.000122 time 0.2920 (0.2967) loss 3.4155 (3.1647) grad_norm 3.1721 (2.4248) [2022-10-02 17:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][800/1251] eta 0:02:13 lr 0.000121 time 0.2903 (0.2960) loss 2.5743 (3.1670) grad_norm 2.4350 (2.4240) [2022-10-02 17:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][900/1251] eta 0:01:43 lr 0.000121 time 0.2935 (0.2953) loss 3.6431 (3.1699) grad_norm 2.2120 (2.4388) [2022-10-02 17:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1000/1251] eta 0:01:14 lr 0.000121 time 0.2886 (0.2948) loss 3.5716 (3.1666) grad_norm 2.3247 (2.4398) [2022-10-02 17:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1100/1251] eta 0:00:44 lr 0.000121 time 0.2891 (0.2944) loss 3.4720 (3.1688) grad_norm 2.6513 (2.4456) [2022-10-02 17:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1200/1251] eta 0:00:14 lr 0.000120 time 0.2908 (0.2939) loss 3.3860 (3.1652) grad_norm 2.1732 (2.4475) [2022-10-02 17:33:08 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 234 training takes 0:06:07 [2022-10-02 17:33:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.149 (2.149) Loss 0.8780 (0.8780) Acc@1 78.809 (78.809) Acc@5 95.410 (95.410) [2022-10-02 17:33:21 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.694 Acc@5 95.110 [2022-10-02 17:33:21 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-10-02 17:33:21 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.76% [2022-10-02 17:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][0/1251] eta 1:08:46 lr 0.000120 time 3.2987 (3.2987) loss 3.3510 (3.3510) grad_norm 2.2950 (2.2950) [2022-10-02 17:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][100/1251] eta 0:06:10 lr 0.000120 time 0.2892 (0.3221) loss 3.5726 (3.1388) grad_norm 2.3670 (2.5080) [2022-10-02 17:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][200/1251] eta 0:05:22 lr 0.000120 time 0.2896 (0.3065) loss 2.7146 (3.1503) grad_norm 2.5476 (2.4736) [2022-10-02 17:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][300/1251] eta 0:04:46 lr 0.000120 time 0.2913 (0.3014) loss 3.6613 (3.1605) grad_norm 2.5261 (2.4547) [2022-10-02 17:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][400/1251] eta 0:04:14 lr 0.000119 time 0.2904 (0.2988) loss 2.9700 (3.1439) grad_norm 2.4599 (2.4659) [2022-10-02 17:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][500/1251] eta 0:03:43 lr 0.000119 time 0.2889 (0.2972) loss 2.7686 (3.1377) grad_norm 2.5627 (2.4565) [2022-10-02 17:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][600/1251] eta 0:03:12 lr 0.000119 time 0.2902 (0.2961) loss 3.2724 (3.1488) grad_norm 2.7289 (2.4654) [2022-10-02 17:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][700/1251] eta 0:02:42 lr 0.000118 time 0.2896 (0.2953) loss 3.3949 (3.1503) grad_norm 2.5161 (2.4625) [2022-10-02 17:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][800/1251] eta 0:02:12 lr 0.000118 time 0.2909 (0.2946) loss 3.4367 (3.1498) grad_norm 2.5256 (2.4653) [2022-10-02 17:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][900/1251] eta 0:01:43 lr 0.000118 time 0.2907 (0.2941) loss 3.7036 (3.1565) grad_norm 2.3887 (2.4723) [2022-10-02 17:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1000/1251] eta 0:01:13 lr 0.000118 time 0.2987 (0.2939) loss 3.0636 (3.1523) grad_norm 2.5324 (2.4757) [2022-10-02 17:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1100/1251] eta 0:00:44 lr 0.000117 time 0.2905 (0.2935) loss 3.0779 (3.1563) grad_norm 2.1990 (2.4752) [2022-10-02 17:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1200/1251] eta 0:00:14 lr 0.000117 time 0.2866 (0.2931) loss 2.2837 (3.1565) grad_norm 2.5949 (2.4774) [2022-10-02 17:39:28 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 235 training takes 0:06:06 [2022-10-02 17:39:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.661 (2.661) Loss 0.7985 (0.7985) Acc@1 82.324 (82.324) Acc@5 95.703 (95.703) [2022-10-02 17:39:41 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.768 Acc@5 95.054 [2022-10-02 17:39:41 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-02 17:39:41 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 79.77% [2022-10-02 17:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][0/1251] eta 0:59:47 lr 0.000117 time 2.8676 (2.8676) loss 3.8378 (3.8378) grad_norm 2.3726 (2.3726) [2022-10-02 17:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][100/1251] eta 0:06:03 lr 0.000117 time 0.2886 (0.3161) loss 2.7281 (3.1808) grad_norm 2.1810 (2.5001) [2022-10-02 17:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][200/1251] eta 0:05:17 lr 0.000117 time 0.2850 (0.3023) loss 3.5975 (3.1564) grad_norm 2.6846 (2.5103) [2022-10-02 17:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][300/1251] eta 0:04:43 lr 0.000116 time 0.2891 (0.2978) loss 3.5013 (3.1464) grad_norm 2.5532 (2.5172) [2022-10-02 17:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][400/1251] eta 0:04:11 lr 0.000116 time 0.2886 (0.2956) loss 3.3134 (3.1480) grad_norm 2.7831 (2.5195) [2022-10-02 17:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][500/1251] eta 0:03:41 lr 0.000116 time 0.2866 (0.2943) loss 2.5337 (3.1443) grad_norm 2.4713 (2.5171) [2022-10-02 17:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][600/1251] eta 0:03:11 lr 0.000116 time 0.2865 (0.2934) loss 3.3976 (3.1400) grad_norm 2.2962 (2.5232) [2022-10-02 17:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][700/1251] eta 0:02:41 lr 0.000115 time 0.2873 (0.2928) loss 3.4814 (3.1344) grad_norm 2.2107 (2.5218) [2022-10-02 17:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][800/1251] eta 0:02:11 lr 0.000115 time 0.2880 (0.2923) loss 3.1703 (3.1358) grad_norm 2.4492 (2.5126) [2022-10-02 17:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][900/1251] eta 0:01:42 lr 0.000115 time 0.2876 (0.2920) loss 2.9416 (3.1325) grad_norm 2.6098 (2.5120) [2022-10-02 17:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1000/1251] eta 0:01:13 lr 0.000115 time 0.2883 (0.2916) loss 3.7217 (3.1376) grad_norm 2.5609 (2.5069) [2022-10-02 17:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1100/1251] eta 0:00:43 lr 0.000114 time 0.2897 (0.2913) loss 3.1059 (3.1400) grad_norm 2.5796 (2.5067) [2022-10-02 17:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1200/1251] eta 0:00:14 lr 0.000114 time 0.2908 (0.2911) loss 3.5688 (3.1468) grad_norm 2.2381 (2.5061) [2022-10-02 17:45:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 236 training takes 0:06:04 [2022-10-02 17:45:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.818 (2.818) Loss 0.9423 (0.9423) Acc@1 78.613 (78.613) Acc@5 93.652 (93.652) [2022-10-02 17:45:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.014 Acc@5 95.112 [2022-10-02 17:45:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-02 17:45:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.01% [2022-10-02 17:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][0/1251] eta 1:12:06 lr 0.000114 time 3.4582 (3.4582) loss 3.1774 (3.1774) grad_norm 2.4744 (2.4744) [2022-10-02 17:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][100/1251] eta 0:06:10 lr 0.000114 time 0.2879 (0.3219) loss 2.6824 (3.0714) grad_norm 2.7168 (2.5707) [2022-10-02 17:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][200/1251] eta 0:05:22 lr 0.000113 time 0.2921 (0.3065) loss 3.1520 (3.0883) grad_norm 2.1361 (2.5858) [2022-10-02 17:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][300/1251] eta 0:04:46 lr 0.000113 time 0.2879 (0.3013) loss 3.7149 (3.1124) grad_norm 3.0118 (2.5848) [2022-10-02 17:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][400/1251] eta 0:04:14 lr 0.000113 time 0.2916 (0.2987) loss 2.9570 (3.1260) grad_norm 2.3615 (2.5854) [2022-10-02 17:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][500/1251] eta 0:03:43 lr 0.000113 time 0.2864 (0.2970) loss 3.0377 (3.1283) grad_norm 2.2252 (2.5822) [2022-10-02 17:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][600/1251] eta 0:03:12 lr 0.000112 time 0.2902 (0.2958) loss 3.4467 (3.1389) grad_norm 2.1656 (2.5767) [2022-10-02 17:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][700/1251] eta 0:02:42 lr 0.000112 time 0.2890 (0.2950) loss 3.7062 (3.1374) grad_norm 2.5759 (2.5711) [2022-10-02 17:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][800/1251] eta 0:02:12 lr 0.000112 time 0.2922 (0.2943) loss 3.7090 (3.1343) grad_norm 2.8658 (2.5687) [2022-10-02 17:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][900/1251] eta 0:01:43 lr 0.000112 time 0.2907 (0.2937) loss 2.3437 (3.1358) grad_norm 2.2128 (2.5646) [2022-10-02 17:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1000/1251] eta 0:01:13 lr 0.000111 time 0.2917 (0.2933) loss 2.8444 (3.1375) grad_norm 3.0253 (2.5612) [2022-10-02 17:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1100/1251] eta 0:00:44 lr 0.000111 time 0.2872 (0.2929) loss 2.9107 (3.1337) grad_norm 5.4783 (2.5607) [2022-10-02 17:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1200/1251] eta 0:00:14 lr 0.000111 time 0.2934 (0.2926) loss 3.4946 (3.1350) grad_norm 2.2871 (2.5593) [2022-10-02 17:52:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 237 training takes 0:06:06 [2022-10-02 17:52:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.336 (3.336) Loss 0.8679 (0.8679) Acc@1 80.762 (80.762) Acc@5 94.629 (94.629) [2022-10-02 17:52:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.828 Acc@5 95.084 [2022-10-02 17:52:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-10-02 17:52:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.01% [2022-10-02 17:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][0/1251] eta 1:09:53 lr 0.000111 time 3.3521 (3.3521) loss 3.4211 (3.4211) grad_norm 2.4693 (2.4693) [2022-10-02 17:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][100/1251] eta 0:06:07 lr 0.000110 time 0.2915 (0.3192) loss 3.3757 (3.2130) grad_norm 2.4677 (2.5195) [2022-10-02 17:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][200/1251] eta 0:05:19 lr 0.000110 time 0.2845 (0.3037) loss 3.2233 (3.1732) grad_norm 2.4261 (2.5558) [2022-10-02 17:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][300/1251] eta 0:04:44 lr 0.000110 time 0.2886 (0.2987) loss 3.4404 (3.1631) grad_norm 2.5857 (2.5428) [2022-10-02 17:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][400/1251] eta 0:04:11 lr 0.000110 time 0.2877 (0.2961) loss 3.3960 (3.1834) grad_norm 2.5610 (2.5287) [2022-10-02 17:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][500/1251] eta 0:03:41 lr 0.000109 time 0.2857 (0.2945) loss 3.4085 (3.1487) grad_norm 2.2609 (2.5268) [2022-10-02 17:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][600/1251] eta 0:03:11 lr 0.000109 time 0.2908 (0.2935) loss 2.5406 (3.1502) grad_norm 2.6279 (2.5196) [2022-10-02 17:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][700/1251] eta 0:02:41 lr 0.000109 time 0.2862 (0.2927) loss 3.2962 (3.1451) grad_norm 2.1675 (2.5283) [2022-10-02 17:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][800/1251] eta 0:02:11 lr 0.000109 time 0.2868 (0.2921) loss 3.5634 (3.1525) grad_norm 2.5957 (2.5279) [2022-10-02 17:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][900/1251] eta 0:01:42 lr 0.000108 time 0.2891 (0.2918) loss 3.2064 (3.1506) grad_norm 2.1878 (2.5352) [2022-10-02 17:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1000/1251] eta 0:01:13 lr 0.000108 time 0.2861 (0.2915) loss 3.5084 (3.1560) grad_norm 2.4516 (2.5482) [2022-10-02 17:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1100/1251] eta 0:00:43 lr 0.000108 time 0.2853 (0.2912) loss 3.4528 (3.1571) grad_norm 2.7903 (2.5570) [2022-10-02 17:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1200/1251] eta 0:00:14 lr 0.000108 time 0.2890 (0.2909) loss 3.3222 (3.1560) grad_norm 2.2700 (2.5603) [2022-10-02 17:58:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 238 training takes 0:06:04 [2022-10-02 17:58:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.189 (3.189) Loss 0.8167 (0.8167) Acc@1 81.738 (81.738) Acc@5 95.117 (95.117) [2022-10-02 17:58:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.868 Acc@5 94.992 [2022-10-02 17:58:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-02 17:58:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.01% [2022-10-02 17:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][0/1251] eta 1:09:20 lr 0.000108 time 3.3258 (3.3258) loss 3.4275 (3.4275) grad_norm 3.1243 (3.1243) [2022-10-02 17:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][100/1251] eta 0:06:07 lr 0.000107 time 0.2877 (0.3191) loss 2.4495 (3.0645) grad_norm 2.3119 (2.5722) [2022-10-02 17:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][200/1251] eta 0:05:19 lr 0.000107 time 0.2882 (0.3040) loss 2.4588 (3.1465) grad_norm 2.2605 (2.6165) [2022-10-02 18:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][300/1251] eta 0:04:44 lr 0.000107 time 0.2895 (0.2988) loss 3.3664 (3.1572) grad_norm 2.5043 (2.5964) [2022-10-02 18:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][400/1251] eta 0:04:12 lr 0.000107 time 0.2988 (0.2962) loss 3.4562 (3.1637) grad_norm 2.5088 (2.5826) [2022-10-02 18:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][500/1251] eta 0:03:41 lr 0.000106 time 0.2884 (0.2946) loss 2.9156 (3.1591) grad_norm 2.6431 (2.5883) [2022-10-02 18:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][600/1251] eta 0:03:11 lr 0.000106 time 0.2877 (0.2935) loss 3.5159 (3.1644) grad_norm 2.5670 (2.5771) [2022-10-02 18:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][700/1251] eta 0:02:41 lr 0.000106 time 0.2913 (0.2927) loss 3.7238 (3.1606) grad_norm 2.4608 (2.5666) [2022-10-02 18:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][800/1251] eta 0:02:11 lr 0.000106 time 0.2877 (0.2922) loss 3.8979 (3.1590) grad_norm 2.3282 (2.5629) [2022-10-02 18:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][900/1251] eta 0:01:42 lr 0.000105 time 0.2926 (0.2918) loss 2.1845 (3.1547) grad_norm 4.1940 (2.5647) [2022-10-02 18:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1000/1251] eta 0:01:13 lr 0.000105 time 0.2877 (0.2914) loss 3.1620 (3.1390) grad_norm 2.5976 (2.5648) [2022-10-02 18:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1100/1251] eta 0:00:43 lr 0.000105 time 0.2866 (0.2911) loss 3.3538 (3.1407) grad_norm 2.6581 (2.5670) [2022-10-02 18:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1200/1251] eta 0:00:14 lr 0.000105 time 0.2885 (0.2909) loss 3.5824 (3.1441) grad_norm 2.3458 (2.5705) [2022-10-02 18:04:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 239 training takes 0:06:04 [2022-10-02 18:04:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.564 (2.564) Loss 0.8389 (0.8389) Acc@1 79.883 (79.883) Acc@5 94.824 (94.824) [2022-10-02 18:04:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.980 Acc@5 95.104 [2022-10-02 18:04:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-02 18:04:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.01% [2022-10-02 18:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][0/1251] eta 1:07:45 lr 0.000105 time 3.2495 (3.2495) loss 2.7907 (2.7907) grad_norm 2.4008 (2.4008) [2022-10-02 18:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][100/1251] eta 0:06:06 lr 0.000104 time 0.2863 (0.3181) loss 3.5164 (3.1917) grad_norm 2.6646 (2.5954) [2022-10-02 18:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][200/1251] eta 0:05:18 lr 0.000104 time 0.2907 (0.3031) loss 3.3314 (3.1365) grad_norm 2.2642 (2.5908) [2022-10-02 18:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][300/1251] eta 0:04:43 lr 0.000104 time 0.2845 (0.2980) loss 2.9895 (3.1478) grad_norm 2.7611 (2.6063) [2022-10-02 18:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][400/1251] eta 0:04:11 lr 0.000104 time 0.2925 (0.2956) loss 3.7885 (3.1450) grad_norm 2.3898 (2.6044) [2022-10-02 18:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][500/1251] eta 0:03:40 lr 0.000103 time 0.2856 (0.2941) loss 2.5758 (3.1573) grad_norm 2.4919 (2.6199) [2022-10-02 18:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][600/1251] eta 0:03:10 lr 0.000103 time 0.2904 (0.2930) loss 2.9069 (3.1442) grad_norm 2.5143 (2.6051) [2022-10-02 18:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][700/1251] eta 0:02:41 lr 0.000103 time 0.2887 (0.2922) loss 2.5078 (3.1477) grad_norm 2.4472 (2.6017) [2022-10-02 18:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][800/1251] eta 0:02:11 lr 0.000103 time 0.2880 (0.2916) loss 2.1797 (3.1435) grad_norm 2.8408 (2.5973) [2022-10-02 18:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][900/1251] eta 0:01:42 lr 0.000102 time 0.2849 (0.2912) loss 3.7306 (3.1516) grad_norm 2.6349 (2.6062) [2022-10-02 18:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1000/1251] eta 0:01:12 lr 0.000102 time 0.2888 (0.2908) loss 3.4485 (3.1481) grad_norm 2.3862 (2.5990) [2022-10-02 18:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1100/1251] eta 0:00:43 lr 0.000102 time 0.2818 (0.2905) loss 3.3799 (3.1441) grad_norm 2.3227 (2.6024) [2022-10-02 18:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1200/1251] eta 0:00:14 lr 0.000102 time 0.2883 (0.2902) loss 2.6397 (3.1398) grad_norm 2.4577 (2.5926) [2022-10-02 18:10:54 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 240 training takes 0:06:03 [2022-10-02 18:10:54 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saving...... [2022-10-02 18:10:54 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saved !!! [2022-10-02 18:10:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.894 (2.894) Loss 0.8376 (0.8376) Acc@1 80.957 (80.957) Acc@5 94.727 (94.727) [2022-10-02 18:11:07 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.034 Acc@5 95.110 [2022-10-02 18:11:07 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-10-02 18:11:07 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.03% [2022-10-02 18:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][0/1251] eta 0:48:07 lr 0.000102 time 2.3082 (2.3082) loss 3.0412 (3.0412) grad_norm 2.2467 (2.2467) [2022-10-02 18:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][100/1251] eta 0:06:05 lr 0.000101 time 0.2866 (0.3172) loss 3.5795 (3.1316) grad_norm 2.5775 (2.5392) [2022-10-02 18:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][200/1251] eta 0:05:18 lr 0.000101 time 0.2862 (0.3029) loss 3.4151 (3.1431) grad_norm 2.7279 (2.5580) [2022-10-02 18:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][300/1251] eta 0:04:43 lr 0.000101 time 0.2852 (0.2981) loss 3.2056 (3.1540) grad_norm 2.9192 (2.5583) [2022-10-02 18:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][400/1251] eta 0:04:11 lr 0.000101 time 0.2897 (0.2959) loss 3.6370 (3.1453) grad_norm 2.4437 (2.5922) [2022-10-02 18:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][500/1251] eta 0:03:41 lr 0.000100 time 0.2886 (0.2944) loss 2.4813 (3.1574) grad_norm 3.1328 (2.5788) [2022-10-02 18:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][600/1251] eta 0:03:11 lr 0.000100 time 0.2885 (0.2935) loss 3.4481 (3.1453) grad_norm 2.8036 (2.5608) [2022-10-02 18:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][700/1251] eta 0:02:41 lr 0.000100 time 0.2849 (0.2927) loss 2.6784 (3.1338) grad_norm 2.4472 (2.5628) [2022-10-02 18:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][800/1251] eta 0:02:11 lr 0.000100 time 0.2892 (0.2924) loss 3.3514 (3.1405) grad_norm 1.9269 (2.5584) [2022-10-02 18:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][900/1251] eta 0:01:42 lr 0.000099 time 0.2903 (0.2921) loss 3.2207 (3.1441) grad_norm 2.4748 (2.5628) [2022-10-02 18:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1000/1251] eta 0:01:13 lr 0.000099 time 0.2931 (0.2917) loss 2.5314 (3.1431) grad_norm 2.5590 (2.5685) [2022-10-02 18:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1100/1251] eta 0:00:44 lr 0.000099 time 0.2892 (0.2915) loss 3.2944 (3.1399) grad_norm 2.6912 (2.5699) [2022-10-02 18:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1200/1251] eta 0:00:14 lr 0.000099 time 0.2890 (0.2913) loss 3.0968 (3.1407) grad_norm 2.6904 (2.5743) [2022-10-02 18:17:12 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 241 training takes 0:06:04 [2022-10-02 18:17:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.640 (2.640) Loss 0.7483 (0.7483) Acc@1 83.398 (83.398) Acc@5 96.387 (96.387) [2022-10-02 18:17:24 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 79.948 Acc@5 95.168 [2022-10-02 18:17:24 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-10-02 18:17:24 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.03% [2022-10-02 18:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][0/1251] eta 1:10:13 lr 0.000099 time 3.3682 (3.3682) loss 3.4855 (3.4855) grad_norm 2.8297 (2.8297) [2022-10-02 18:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][100/1251] eta 0:06:06 lr 0.000098 time 0.2865 (0.3185) loss 3.4392 (3.1043) grad_norm 2.6139 (2.6087) [2022-10-02 18:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][200/1251] eta 0:05:18 lr 0.000098 time 0.2890 (0.3032) loss 3.2776 (3.1063) grad_norm 2.8286 (2.6089) [2022-10-02 18:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][300/1251] eta 0:04:43 lr 0.000098 time 0.2859 (0.2980) loss 3.3694 (3.1076) grad_norm 2.2666 (2.6068) [2022-10-02 18:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][400/1251] eta 0:04:11 lr 0.000098 time 0.2918 (0.2954) loss 3.5893 (3.1038) grad_norm 2.4987 (2.6108) [2022-10-02 18:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][500/1251] eta 0:03:40 lr 0.000097 time 0.2874 (0.2939) loss 3.4773 (3.1196) grad_norm 2.6703 (2.6141) [2022-10-02 18:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][600/1251] eta 0:03:10 lr 0.000097 time 0.2885 (0.2929) loss 3.5322 (3.1311) grad_norm 2.3546 (2.6088) [2022-10-02 18:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][700/1251] eta 0:02:40 lr 0.000097 time 0.2922 (0.2921) loss 3.5698 (3.1321) grad_norm 2.4354 (2.6025) [2022-10-02 18:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][800/1251] eta 0:02:11 lr 0.000097 time 0.2887 (0.2916) loss 3.7075 (3.1314) grad_norm 3.0917 (2.6035) [2022-10-02 18:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][900/1251] eta 0:01:42 lr 0.000096 time 0.2861 (0.2912) loss 2.5522 (3.1260) grad_norm 2.6569 (2.6073) [2022-10-02 18:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1000/1251] eta 0:01:12 lr 0.000096 time 0.2879 (0.2908) loss 3.2357 (3.1265) grad_norm 2.6805 (2.6085) [2022-10-02 18:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1100/1251] eta 0:00:43 lr 0.000096 time 0.2875 (0.2904) loss 1.8634 (3.1178) grad_norm 2.5510 (2.6101) [2022-10-02 18:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1200/1251] eta 0:00:14 lr 0.000096 time 0.2909 (0.2902) loss 3.2634 (3.1089) grad_norm 2.4282 (2.6077) [2022-10-02 18:23:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 242 training takes 0:06:03 [2022-10-02 18:23:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.779 (2.779) Loss 0.8214 (0.8214) Acc@1 80.762 (80.762) Acc@5 95.996 (95.996) [2022-10-02 18:23:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.076 Acc@5 95.236 [2022-10-02 18:23:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-02 18:23:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.08% [2022-10-02 18:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][0/1251] eta 0:51:08 lr 0.000096 time 2.4529 (2.4529) loss 3.0248 (3.0248) grad_norm 2.4945 (2.4945) [2022-10-02 18:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][100/1251] eta 0:06:01 lr 0.000095 time 0.2903 (0.3144) loss 3.1446 (3.2147) grad_norm 2.5319 (2.6101) [2022-10-02 18:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][200/1251] eta 0:05:17 lr 0.000095 time 0.2868 (0.3020) loss 2.4537 (3.1550) grad_norm 2.6746 (2.5874) [2022-10-02 18:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][300/1251] eta 0:04:43 lr 0.000095 time 0.2967 (0.2979) loss 3.6238 (3.1461) grad_norm 2.7036 (2.5917) [2022-10-02 18:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][400/1251] eta 0:04:11 lr 0.000095 time 0.2867 (0.2958) loss 3.1674 (3.1559) grad_norm 2.5308 (2.5963) [2022-10-02 18:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][500/1251] eta 0:03:41 lr 0.000094 time 0.2853 (0.2944) loss 3.4432 (3.1395) grad_norm 2.9144 (2.6092) [2022-10-02 18:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][600/1251] eta 0:03:11 lr 0.000094 time 0.2885 (0.2934) loss 2.9038 (3.1390) grad_norm 2.3187 (2.6032) [2022-10-02 18:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][700/1251] eta 0:02:41 lr 0.000094 time 0.2893 (0.2927) loss 1.8812 (3.1342) grad_norm 2.4524 (2.6035) [2022-10-02 18:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][800/1251] eta 0:02:11 lr 0.000094 time 0.2877 (0.2922) loss 3.5267 (3.1256) grad_norm 2.5259 (2.6018) [2022-10-02 18:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][900/1251] eta 0:01:42 lr 0.000094 time 0.2901 (0.2918) loss 3.0037 (3.1213) grad_norm 2.6825 (2.6035) [2022-10-02 18:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1000/1251] eta 0:01:13 lr 0.000093 time 0.2867 (0.2915) loss 3.3328 (3.1314) grad_norm 3.9702 (2.6095) [2022-10-02 18:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1100/1251] eta 0:00:43 lr 0.000093 time 0.2881 (0.2912) loss 3.2174 (3.1347) grad_norm 3.1796 (2.6089) [2022-10-02 18:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1200/1251] eta 0:00:14 lr 0.000093 time 0.2896 (0.2909) loss 2.8930 (3.1202) grad_norm 2.8895 (2.6133) [2022-10-02 18:29:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 243 training takes 0:06:04 [2022-10-02 18:29:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.272 (3.272) Loss 0.8351 (0.8351) Acc@1 80.859 (80.859) Acc@5 95.215 (95.215) [2022-10-02 18:29:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.134 Acc@5 95.148 [2022-10-02 18:29:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-02 18:29:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.13% [2022-10-02 18:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][0/1251] eta 1:09:49 lr 0.000093 time 3.3489 (3.3489) loss 2.6164 (2.6164) grad_norm 2.4457 (2.4457) [2022-10-02 18:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][100/1251] eta 0:06:08 lr 0.000092 time 0.2911 (0.3205) loss 2.7588 (3.1759) grad_norm 2.4015 (2.6348) [2022-10-02 18:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][200/1251] eta 0:05:21 lr 0.000092 time 0.2930 (0.3056) loss 2.0997 (3.1447) grad_norm 2.1712 (2.6655) [2022-10-02 18:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][300/1251] eta 0:04:45 lr 0.000092 time 0.2901 (0.3006) loss 3.2416 (3.1344) grad_norm 2.3750 (2.6729) [2022-10-02 18:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][400/1251] eta 0:04:13 lr 0.000092 time 0.2910 (0.2981) loss 3.0122 (3.1299) grad_norm 2.7881 (2.6613) [2022-10-02 18:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][500/1251] eta 0:03:42 lr 0.000092 time 0.2904 (0.2965) loss 2.8057 (3.1296) grad_norm 2.7559 (2.6644) [2022-10-02 18:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][600/1251] eta 0:03:12 lr 0.000091 time 0.2917 (0.2956) loss 3.6436 (3.1231) grad_norm 2.6484 (2.6447) [2022-10-02 18:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][700/1251] eta 0:02:42 lr 0.000091 time 0.2898 (0.2948) loss 3.1272 (3.1153) grad_norm 2.2393 (2.6456) [2022-10-02 18:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][800/1251] eta 0:02:12 lr 0.000091 time 0.2921 (0.2942) loss 2.5906 (3.1089) grad_norm 2.4994 (2.6393) [2022-10-02 18:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][900/1251] eta 0:01:43 lr 0.000091 time 0.2884 (0.2937) loss 3.4022 (3.1116) grad_norm 2.8745 (2.6434) [2022-10-02 18:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1000/1251] eta 0:01:13 lr 0.000090 time 0.2879 (0.2934) loss 3.3208 (3.1185) grad_norm 2.6301 (2.6447) [2022-10-02 18:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1100/1251] eta 0:00:44 lr 0.000090 time 0.2866 (0.2930) loss 3.7238 (3.1156) grad_norm 2.7207 (2.6381) [2022-10-02 18:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1200/1251] eta 0:00:14 lr 0.000090 time 0.2960 (0.2927) loss 3.2743 (3.1199) grad_norm 2.4982 (2.6405) [2022-10-02 18:36:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 244 training takes 0:06:06 [2022-10-02 18:36:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.834 (2.834) Loss 0.8378 (0.8378) Acc@1 80.273 (80.273) Acc@5 95.312 (95.312) [2022-10-02 18:36:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.248 Acc@5 95.356 [2022-10-02 18:36:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-10-02 18:36:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.25% [2022-10-02 18:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][0/1251] eta 1:08:17 lr 0.000090 time 3.2755 (3.2755) loss 2.6261 (2.6261) grad_norm 2.3517 (2.3517) [2022-10-02 18:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][100/1251] eta 0:06:09 lr 0.000090 time 0.2872 (0.3213) loss 2.9461 (3.1578) grad_norm 2.7721 (2.6340) [2022-10-02 18:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][200/1251] eta 0:05:21 lr 0.000089 time 0.2919 (0.3063) loss 3.7492 (3.1092) grad_norm 2.5918 (2.6610) [2022-10-02 18:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][300/1251] eta 0:04:46 lr 0.000089 time 0.2901 (0.3011) loss 3.5826 (3.1024) grad_norm 2.4326 (2.6317) [2022-10-02 18:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][400/1251] eta 0:04:14 lr 0.000089 time 0.2903 (0.2985) loss 3.6519 (3.1009) grad_norm 2.1090 (2.6264) [2022-10-02 18:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][500/1251] eta 0:03:42 lr 0.000089 time 0.2855 (0.2969) loss 3.2793 (3.0956) grad_norm 2.7758 (2.6309) [2022-10-02 18:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][600/1251] eta 0:03:12 lr 0.000089 time 0.2926 (0.2958) loss 2.6208 (3.0924) grad_norm 2.7525 (2.6196) [2022-10-02 18:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][700/1251] eta 0:02:42 lr 0.000088 time 0.2911 (0.2950) loss 3.5343 (3.1101) grad_norm 2.7367 (2.6139) [2022-10-02 18:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][800/1251] eta 0:02:12 lr 0.000088 time 0.2886 (0.2944) loss 2.7067 (3.0987) grad_norm 4.1110 (2.6195) [2022-10-02 18:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][900/1251] eta 0:01:43 lr 0.000088 time 0.2861 (0.2938) loss 2.6117 (3.1024) grad_norm 2.3880 (2.6264) [2022-10-02 18:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1000/1251] eta 0:01:13 lr 0.000088 time 0.2907 (0.2933) loss 3.3605 (3.0983) grad_norm 2.6147 (2.6310) [2022-10-02 18:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1100/1251] eta 0:00:44 lr 0.000087 time 0.2887 (0.2929) loss 3.6354 (3.1002) grad_norm 2.2728 (2.6332) [2022-10-02 18:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1200/1251] eta 0:00:14 lr 0.000087 time 0.2891 (0.2925) loss 3.3606 (3.1021) grad_norm 2.4265 (2.6455) [2022-10-02 18:42:23 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 245 training takes 0:06:06 [2022-10-02 18:42:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.738 (2.738) Loss 0.8306 (0.8306) Acc@1 80.469 (80.469) Acc@5 94.922 (94.922) [2022-10-02 18:42:35 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.148 Acc@5 95.196 [2022-10-02 18:42:35 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-10-02 18:42:35 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.25% [2022-10-02 18:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][0/1251] eta 0:59:28 lr 0.000087 time 2.8525 (2.8525) loss 3.4339 (3.4339) grad_norm 3.4277 (3.4277) [2022-10-02 18:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][100/1251] eta 0:06:02 lr 0.000087 time 0.2861 (0.3152) loss 3.6797 (3.0988) grad_norm 2.6281 (2.6464) [2022-10-02 18:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][200/1251] eta 0:05:17 lr 0.000087 time 0.2903 (0.3017) loss 3.4343 (3.0774) grad_norm 2.7084 (2.6884) [2022-10-02 18:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][300/1251] eta 0:04:42 lr 0.000086 time 0.2843 (0.2971) loss 2.8045 (3.0926) grad_norm 2.2133 (2.6735) [2022-10-02 18:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][400/1251] eta 0:04:10 lr 0.000086 time 0.2880 (0.2948) loss 2.2860 (3.1105) grad_norm 2.7302 (2.6563) [2022-10-02 18:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][500/1251] eta 0:03:40 lr 0.000086 time 0.2849 (0.2935) loss 2.2502 (3.1294) grad_norm 2.7365 (2.6478) [2022-10-02 18:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][600/1251] eta 0:03:10 lr 0.000086 time 0.2925 (0.2925) loss 3.8869 (3.1053) grad_norm 2.8076 (2.6463) [2022-10-02 18:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][700/1251] eta 0:02:40 lr 0.000086 time 0.2890 (0.2919) loss 2.3741 (3.0919) grad_norm 2.2079 (2.6540) [2022-10-02 18:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][800/1251] eta 0:02:11 lr 0.000085 time 0.2865 (0.2913) loss 3.5422 (3.0933) grad_norm 2.7965 (2.6530) [2022-10-02 18:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][900/1251] eta 0:01:42 lr 0.000085 time 0.2848 (0.2909) loss 3.4223 (3.0941) grad_norm 2.3458 (2.6553) [2022-10-02 18:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1000/1251] eta 0:01:12 lr 0.000085 time 0.2901 (0.2906) loss 3.4184 (3.1015) grad_norm 2.9048 (2.6537) [2022-10-02 18:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1100/1251] eta 0:00:43 lr 0.000085 time 0.2880 (0.2905) loss 2.4081 (3.0971) grad_norm 3.0577 (2.6610) [2022-10-02 18:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1200/1251] eta 0:00:14 lr 0.000084 time 0.2874 (0.2903) loss 3.3163 (3.1004) grad_norm 2.7000 (2.6607) [2022-10-02 18:48:39 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 246 training takes 0:06:03 [2022-10-02 18:48:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.770 (2.770) Loss 0.8669 (0.8669) Acc@1 79.492 (79.492) Acc@5 95.215 (95.215) [2022-10-02 18:48:51 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.262 Acc@5 95.248 [2022-10-02 18:48:51 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-02 18:48:51 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.26% [2022-10-02 18:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][0/1251] eta 0:52:52 lr 0.000084 time 2.5363 (2.5363) loss 2.4257 (2.4257) grad_norm 2.5461 (2.5461) [2022-10-02 18:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][100/1251] eta 0:06:06 lr 0.000084 time 0.2897 (0.3180) loss 3.5962 (3.0812) grad_norm 2.7111 (2.6315) [2022-10-02 18:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][200/1251] eta 0:05:21 lr 0.000084 time 0.2959 (0.3059) loss 2.9110 (3.0741) grad_norm 3.0341 (2.6659) [2022-10-02 18:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][300/1251] eta 0:04:46 lr 0.000084 time 0.2904 (0.3018) loss 2.8165 (3.0879) grad_norm 2.5292 (2.6896) [2022-10-02 18:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][400/1251] eta 0:04:15 lr 0.000083 time 0.2961 (0.2997) loss 3.2967 (3.1040) grad_norm 2.5974 (2.6916) [2022-10-02 18:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][500/1251] eta 0:03:43 lr 0.000083 time 0.2904 (0.2981) loss 3.1995 (3.0908) grad_norm 2.9778 (2.6861) [2022-10-02 18:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][600/1251] eta 0:03:13 lr 0.000083 time 0.2889 (0.2968) loss 2.6942 (3.1016) grad_norm 2.8080 (2.6791) [2022-10-02 18:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][700/1251] eta 0:02:43 lr 0.000083 time 0.2960 (0.2961) loss 1.9799 (3.0983) grad_norm 2.6291 (2.6766) [2022-10-02 18:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][800/1251] eta 0:02:13 lr 0.000083 time 0.2946 (0.2957) loss 2.6337 (3.1120) grad_norm 2.5532 (2.6826) [2022-10-02 18:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][900/1251] eta 0:01:43 lr 0.000082 time 0.2987 (0.2951) loss 2.3025 (3.1180) grad_norm 3.0173 (2.6753) [2022-10-02 18:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1000/1251] eta 0:01:13 lr 0.000082 time 0.2924 (0.2946) loss 3.7400 (3.1158) grad_norm 2.3810 (2.6779) [2022-10-02 18:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1100/1251] eta 0:00:44 lr 0.000082 time 0.2908 (0.2942) loss 3.3643 (3.1143) grad_norm 2.9127 (2.6816) [2022-10-02 18:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1200/1251] eta 0:00:14 lr 0.000082 time 0.2921 (0.2940) loss 3.6973 (3.1101) grad_norm 2.9196 (2.6796) [2022-10-02 18:54:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 247 training takes 0:06:08 [2022-10-02 18:55:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.769 (2.769) Loss 0.8398 (0.8398) Acc@1 79.004 (79.004) Acc@5 95.605 (95.605) [2022-10-02 18:55:12 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.342 Acc@5 95.318 [2022-10-02 18:55:12 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-02 18:55:12 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.34% [2022-10-02 18:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][0/1251] eta 1:10:15 lr 0.000082 time 3.3696 (3.3696) loss 2.0794 (2.0794) grad_norm 2.6284 (2.6284) [2022-10-02 18:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][100/1251] eta 0:06:09 lr 0.000081 time 0.2933 (0.3214) loss 2.9421 (3.0719) grad_norm 3.1447 (2.7046) [2022-10-02 18:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][200/1251] eta 0:05:21 lr 0.000081 time 0.2938 (0.3060) loss 3.7010 (3.0995) grad_norm 3.1057 (2.6934) [2022-10-02 18:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][300/1251] eta 0:04:46 lr 0.000081 time 0.2913 (0.3008) loss 2.5484 (3.0967) grad_norm 3.1444 (2.6973) [2022-10-02 18:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][400/1251] eta 0:04:13 lr 0.000081 time 0.2994 (0.2981) loss 3.2262 (3.1044) grad_norm 2.3347 (2.6741) [2022-10-02 18:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][500/1251] eta 0:03:42 lr 0.000081 time 0.2902 (0.2964) loss 3.5753 (3.1095) grad_norm 2.8144 (2.6850) [2022-10-02 18:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][600/1251] eta 0:03:12 lr 0.000080 time 0.2869 (0.2955) loss 3.3781 (3.1113) grad_norm 2.4609 (2.6923) [2022-10-02 18:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][700/1251] eta 0:02:42 lr 0.000080 time 0.2934 (0.2946) loss 3.4206 (3.1052) grad_norm 2.6455 (2.6856) [2022-10-02 18:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][800/1251] eta 0:02:12 lr 0.000080 time 0.2884 (0.2939) loss 3.1386 (3.0984) grad_norm 2.5469 (2.6896) [2022-10-02 18:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][900/1251] eta 0:01:42 lr 0.000080 time 0.2906 (0.2934) loss 3.6420 (3.0984) grad_norm 2.9248 (2.6846) [2022-10-02 19:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1000/1251] eta 0:01:13 lr 0.000079 time 0.2887 (0.2929) loss 3.7352 (3.1034) grad_norm 2.6126 (2.6846) [2022-10-02 19:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1100/1251] eta 0:00:44 lr 0.000079 time 0.2914 (0.2925) loss 3.4675 (3.1161) grad_norm 2.7526 (2.6892) [2022-10-02 19:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1200/1251] eta 0:00:14 lr 0.000079 time 0.2890 (0.2923) loss 3.7538 (3.1098) grad_norm 3.0279 (2.6904) [2022-10-02 19:01:18 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 248 training takes 0:06:05 [2022-10-02 19:01:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.634 (2.634) Loss 0.8580 (0.8580) Acc@1 80.176 (80.176) Acc@5 95.703 (95.703) [2022-10-02 19:01:31 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.274 Acc@5 95.144 [2022-10-02 19:01:31 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-02 19:01:31 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.34% [2022-10-02 19:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][0/1251] eta 0:58:17 lr 0.000079 time 2.7957 (2.7957) loss 2.8310 (2.8310) grad_norm 2.9416 (2.9416) [2022-10-02 19:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][100/1251] eta 0:06:04 lr 0.000079 time 0.2927 (0.3167) loss 3.2546 (2.9730) grad_norm 2.7774 (2.6458) [2022-10-02 19:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][200/1251] eta 0:05:18 lr 0.000079 time 0.2930 (0.3032) loss 3.4095 (3.0059) grad_norm 2.3263 (2.7014) [2022-10-02 19:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][300/1251] eta 0:04:44 lr 0.000078 time 0.2924 (0.2987) loss 3.3172 (3.0520) grad_norm 2.2991 (2.7317) [2022-10-02 19:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][400/1251] eta 0:04:12 lr 0.000078 time 0.2902 (0.2963) loss 2.4946 (3.0433) grad_norm 2.5265 (2.7164) [2022-10-02 19:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][500/1251] eta 0:03:41 lr 0.000078 time 0.2915 (0.2950) loss 2.0634 (3.0508) grad_norm 2.2849 (2.7279) [2022-10-02 19:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][600/1251] eta 0:03:11 lr 0.000078 time 0.2890 (0.2940) loss 3.5588 (3.0727) grad_norm 2.7627 (2.7233) [2022-10-02 19:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][700/1251] eta 0:02:41 lr 0.000077 time 0.2889 (0.2934) loss 3.5074 (3.0791) grad_norm 2.9280 (2.7264) [2022-10-02 19:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][800/1251] eta 0:02:12 lr 0.000077 time 0.2875 (0.2929) loss 2.0059 (3.0828) grad_norm 2.7923 (2.7325) [2022-10-02 19:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][900/1251] eta 0:01:42 lr 0.000077 time 0.2903 (0.2925) loss 3.1538 (3.0882) grad_norm 2.3096 (2.7294) [2022-10-02 19:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1000/1251] eta 0:01:13 lr 0.000077 time 0.2860 (0.2922) loss 2.5045 (3.0858) grad_norm 2.3374 (2.7260) [2022-10-02 19:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1100/1251] eta 0:00:44 lr 0.000077 time 0.2922 (0.2919) loss 3.6466 (3.0886) grad_norm 2.6698 (2.7403) [2022-10-02 19:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1200/1251] eta 0:00:14 lr 0.000076 time 0.2875 (0.2917) loss 1.9754 (3.0859) grad_norm 2.6596 (2.7364) [2022-10-02 19:07:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 249 training takes 0:06:05 [2022-10-02 19:07:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.238 (2.238) Loss 0.8182 (0.8182) Acc@1 80.957 (80.957) Acc@5 94.824 (94.824) [2022-10-02 19:07:49 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.400 Acc@5 95.198 [2022-10-02 19:07:49 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:07:49 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-02 19:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][0/1251] eta 0:54:55 lr 0.000076 time 2.6343 (2.6343) loss 2.1973 (2.1973) grad_norm 2.4709 (2.4709) [2022-10-02 19:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][100/1251] eta 0:06:04 lr 0.000076 time 0.2929 (0.3167) loss 3.6481 (3.0309) grad_norm 2.7565 (2.7405) [2022-10-02 19:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][200/1251] eta 0:05:19 lr 0.000076 time 0.2907 (0.3037) loss 3.3190 (3.0741) grad_norm 2.4916 (2.7617) [2022-10-02 19:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][300/1251] eta 0:04:44 lr 0.000076 time 0.2940 (0.2992) loss 3.5128 (3.0702) grad_norm 3.9284 (2.7718) [2022-10-02 19:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][400/1251] eta 0:04:13 lr 0.000075 time 0.2880 (0.2975) loss 3.2116 (3.0740) grad_norm 2.5034 (2.7580) [2022-10-02 19:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][500/1251] eta 0:03:42 lr 0.000075 time 0.2902 (0.2960) loss 3.5026 (3.0829) grad_norm 2.3905 (2.7678) [2022-10-02 19:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][600/1251] eta 0:03:12 lr 0.000075 time 0.2866 (0.2950) loss 3.0539 (3.0846) grad_norm 2.8815 (2.7688) [2022-10-02 19:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][700/1251] eta 0:02:42 lr 0.000075 time 0.2923 (0.2943) loss 2.1262 (3.0923) grad_norm 2.7095 (2.7663) [2022-10-02 19:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][800/1251] eta 0:02:12 lr 0.000075 time 0.2880 (0.2936) loss 3.3115 (3.0942) grad_norm 2.8463 (2.7680) [2022-10-02 19:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][900/1251] eta 0:01:42 lr 0.000074 time 0.2964 (0.2932) loss 3.3429 (3.0951) grad_norm 2.3456 (2.7705) [2022-10-02 19:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1000/1251] eta 0:01:13 lr 0.000074 time 0.2867 (0.2927) loss 3.2038 (3.0908) grad_norm 2.5156 (2.7665) [2022-10-02 19:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1100/1251] eta 0:00:44 lr 0.000074 time 0.2926 (0.2924) loss 3.7494 (3.0880) grad_norm 2.3460 (2.7640) [2022-10-02 19:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1200/1251] eta 0:00:14 lr 0.000074 time 0.2874 (0.2921) loss 2.8477 (3.0868) grad_norm 2.7480 (2.7618) [2022-10-02 19:13:55 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 250 training takes 0:06:05 [2022-10-02 19:13:55 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saving...... [2022-10-02 19:13:55 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saved !!! [2022-10-02 19:13:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.751 (2.751) Loss 0.8272 (0.8272) Acc@1 81.250 (81.250) Acc@5 95.801 (95.801) [2022-10-02 19:14:08 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.358 Acc@5 95.264 [2022-10-02 19:14:08 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:14:08 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-02 19:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][0/1251] eta 1:04:13 lr 0.000074 time 3.0806 (3.0806) loss 3.1077 (3.1077) grad_norm 2.5033 (2.5033) [2022-10-02 19:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][100/1251] eta 0:06:07 lr 0.000074 time 0.2924 (0.3193) loss 3.1190 (3.0446) grad_norm 3.0862 (2.7490) [2022-10-02 19:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][200/1251] eta 0:05:20 lr 0.000073 time 0.2893 (0.3049) loss 2.7024 (3.0819) grad_norm 2.5241 (2.7609) [2022-10-02 19:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][300/1251] eta 0:04:45 lr 0.000073 time 0.2961 (0.3000) loss 3.1311 (3.0721) grad_norm 2.3490 (2.7417) [2022-10-02 19:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][400/1251] eta 0:04:13 lr 0.000073 time 0.2889 (0.2976) loss 3.4580 (3.0710) grad_norm 2.7648 (2.7306) [2022-10-02 19:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][500/1251] eta 0:03:42 lr 0.000073 time 0.2899 (0.2960) loss 3.3365 (3.0713) grad_norm 2.4381 (2.7267) [2022-10-02 19:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][600/1251] eta 0:03:12 lr 0.000073 time 0.2912 (0.2951) loss 3.2157 (3.0715) grad_norm 2.5725 (2.7255) [2022-10-02 19:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][700/1251] eta 0:02:42 lr 0.000072 time 0.2862 (0.2943) loss 3.6891 (3.0682) grad_norm 2.6485 (2.7289) [2022-10-02 19:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][800/1251] eta 0:02:12 lr 0.000072 time 0.2923 (0.2937) loss 2.3881 (3.0714) grad_norm 2.7652 (2.7359) [2022-10-02 19:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][900/1251] eta 0:01:42 lr 0.000072 time 0.2874 (0.2933) loss 2.2322 (3.0748) grad_norm 2.5422 (2.7345) [2022-10-02 19:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1000/1251] eta 0:01:13 lr 0.000072 time 0.2925 (0.2929) loss 2.5744 (3.0696) grad_norm 2.8546 (2.7347) [2022-10-02 19:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1100/1251] eta 0:00:44 lr 0.000072 time 0.2922 (0.2926) loss 2.6714 (3.0724) grad_norm 2.5063 (2.7322) [2022-10-02 19:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1200/1251] eta 0:00:14 lr 0.000071 time 0.2887 (0.2924) loss 3.0844 (3.0696) grad_norm 2.6370 (2.7376) [2022-10-02 19:20:14 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 251 training takes 0:06:06 [2022-10-02 19:20:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.193 (3.193) Loss 0.9291 (0.9291) Acc@1 78.613 (78.613) Acc@5 94.727 (94.727) [2022-10-02 19:20:26 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.348 Acc@5 95.334 [2022-10-02 19:20:26 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-10-02 19:20:26 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-02 19:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][0/1251] eta 1:05:08 lr 0.000071 time 3.1247 (3.1247) loss 3.3750 (3.3750) grad_norm 3.8723 (3.8723) [2022-10-02 19:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][100/1251] eta 0:06:04 lr 0.000071 time 0.2853 (0.3168) loss 3.3029 (3.0321) grad_norm 3.2441 (2.7631) [2022-10-02 19:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][200/1251] eta 0:05:17 lr 0.000071 time 0.2863 (0.3023) loss 3.7404 (3.0671) grad_norm 2.6051 (2.7533) [2022-10-02 19:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][300/1251] eta 0:04:42 lr 0.000071 time 0.2872 (0.2975) loss 2.7025 (3.0655) grad_norm 2.4224 (2.7578) [2022-10-02 19:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][400/1251] eta 0:04:11 lr 0.000070 time 0.2890 (0.2951) loss 3.3109 (3.0730) grad_norm 2.7518 (2.7726) [2022-10-02 19:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][500/1251] eta 0:03:40 lr 0.000070 time 0.2889 (0.2938) loss 2.6868 (3.0639) grad_norm 3.2309 (2.7783) [2022-10-02 19:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][600/1251] eta 0:03:10 lr 0.000070 time 0.2853 (0.2928) loss 3.0879 (3.0669) grad_norm 2.7427 (2.7652) [2022-10-02 19:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][700/1251] eta 0:02:40 lr 0.000070 time 0.2862 (0.2920) loss 3.2006 (3.0634) grad_norm 2.3513 (2.7693) [2022-10-02 19:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][800/1251] eta 0:02:11 lr 0.000070 time 0.2883 (0.2915) loss 2.7911 (3.0751) grad_norm 4.1214 (2.7804) [2022-10-02 19:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][900/1251] eta 0:01:42 lr 0.000069 time 0.2883 (0.2911) loss 3.2412 (3.0795) grad_norm 2.6541 (2.7726) [2022-10-02 19:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1000/1251] eta 0:01:12 lr 0.000069 time 0.2865 (0.2908) loss 2.6578 (3.0813) grad_norm 2.3335 (2.7764) [2022-10-02 19:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1100/1251] eta 0:00:43 lr 0.000069 time 0.2862 (0.2905) loss 2.2678 (3.0831) grad_norm 2.6260 (2.7750) [2022-10-02 19:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1200/1251] eta 0:00:14 lr 0.000069 time 0.2890 (0.2903) loss 2.6568 (3.0806) grad_norm 2.9084 (2.7702) [2022-10-02 19:26:30 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 252 training takes 0:06:03 [2022-10-02 19:26:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.279 (2.279) Loss 0.8153 (0.8153) Acc@1 80.762 (80.762) Acc@5 95.410 (95.410) [2022-10-02 19:26:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.358 Acc@5 95.272 [2022-10-02 19:26:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:26:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.40% [2022-10-02 19:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][0/1251] eta 1:09:49 lr 0.000069 time 3.3490 (3.3490) loss 3.5910 (3.5910) grad_norm 2.6376 (2.6376) [2022-10-02 19:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][100/1251] eta 0:06:09 lr 0.000069 time 0.2892 (0.3213) loss 3.3360 (3.0314) grad_norm 2.3479 (2.7566) [2022-10-02 19:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][200/1251] eta 0:05:21 lr 0.000068 time 0.2966 (0.3055) loss 2.3253 (3.0432) grad_norm 3.0418 (2.7568) [2022-10-02 19:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][300/1251] eta 0:04:46 lr 0.000068 time 0.2917 (0.3008) loss 3.5494 (3.0669) grad_norm 3.4444 (2.7691) [2022-10-02 19:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][400/1251] eta 0:04:13 lr 0.000068 time 0.2906 (0.2983) loss 3.2630 (3.0628) grad_norm 2.8361 (2.7696) [2022-10-02 19:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][500/1251] eta 0:03:42 lr 0.000068 time 0.2907 (0.2966) loss 3.8397 (3.0754) grad_norm 2.8731 (2.7764) [2022-10-02 19:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][600/1251] eta 0:03:12 lr 0.000068 time 0.2882 (0.2956) loss 3.3274 (3.0840) grad_norm 3.7984 (2.7853) [2022-10-02 19:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][700/1251] eta 0:02:42 lr 0.000067 time 0.2983 (0.2948) loss 3.1533 (3.0898) grad_norm 3.5591 (2.7895) [2022-10-02 19:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][800/1251] eta 0:02:12 lr 0.000067 time 0.2895 (0.2941) loss 2.4349 (3.0800) grad_norm 3.1843 (2.7805) [2022-10-02 19:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][900/1251] eta 0:01:43 lr 0.000067 time 0.2888 (0.2937) loss 2.3576 (3.0813) grad_norm 2.7153 (2.7769) [2022-10-02 19:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1000/1251] eta 0:01:13 lr 0.000067 time 0.2858 (0.2932) loss 1.9442 (3.0752) grad_norm 2.3447 (2.7886) [2022-10-02 19:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1100/1251] eta 0:00:44 lr 0.000067 time 0.2881 (0.2929) loss 3.3926 (3.0767) grad_norm 2.8638 (2.7893) [2022-10-02 19:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1200/1251] eta 0:00:14 lr 0.000066 time 0.2866 (0.2926) loss 3.0590 (3.0691) grad_norm 2.9935 (2.7919) [2022-10-02 19:32:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 253 training takes 0:06:06 [2022-10-02 19:32:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.750 (2.750) Loss 0.8372 (0.8372) Acc@1 81.348 (81.348) Acc@5 95.312 (95.312) [2022-10-02 19:33:01 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.436 Acc@5 95.284 [2022-10-02 19:33:01 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:33:01 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.44% [2022-10-02 19:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][0/1251] eta 1:09:57 lr 0.000066 time 3.3554 (3.3554) loss 3.4386 (3.4386) grad_norm 2.5953 (2.5953) [2022-10-02 19:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][100/1251] eta 0:06:09 lr 0.000066 time 0.2933 (0.3211) loss 3.3976 (3.0965) grad_norm 2.9215 (2.7835) [2022-10-02 19:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][200/1251] eta 0:05:21 lr 0.000066 time 0.2901 (0.3058) loss 3.5456 (3.0464) grad_norm 3.2353 (2.8183) [2022-10-02 19:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][300/1251] eta 0:04:45 lr 0.000066 time 0.2863 (0.3005) loss 2.2508 (3.0339) grad_norm 2.4236 (2.8116) [2022-10-02 19:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][400/1251] eta 0:04:13 lr 0.000066 time 0.2943 (0.2979) loss 2.7842 (3.0320) grad_norm 2.2637 (2.8326) [2022-10-02 19:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][500/1251] eta 0:03:42 lr 0.000065 time 0.2873 (0.2962) loss 2.9805 (3.0293) grad_norm 2.2561 (2.8094) [2022-10-02 19:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][600/1251] eta 0:03:12 lr 0.000065 time 0.2913 (0.2951) loss 3.5880 (3.0257) grad_norm 3.0653 (2.8127) [2022-10-02 19:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][700/1251] eta 0:02:42 lr 0.000065 time 0.2904 (0.2944) loss 3.3232 (3.0328) grad_norm 3.4296 (2.8055) [2022-10-02 19:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][800/1251] eta 0:02:12 lr 0.000065 time 0.2914 (0.2938) loss 3.4422 (3.0274) grad_norm 2.8655 (2.8024) [2022-10-02 19:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][900/1251] eta 0:01:42 lr 0.000065 time 0.2862 (0.2933) loss 3.1670 (3.0390) grad_norm 2.4158 (2.8000) [2022-10-02 19:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1000/1251] eta 0:01:13 lr 0.000064 time 0.2926 (0.2929) loss 3.4509 (3.0522) grad_norm 2.9292 (2.7976) [2022-10-02 19:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1100/1251] eta 0:00:44 lr 0.000064 time 0.2869 (0.2926) loss 2.3672 (3.0513) grad_norm 2.7548 (2.8024) [2022-10-02 19:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1200/1251] eta 0:00:14 lr 0.000064 time 0.2911 (0.2923) loss 2.8471 (3.0465) grad_norm 2.4672 (2.8047) [2022-10-02 19:39:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 254 training takes 0:06:05 [2022-10-02 19:39:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.782 (2.782) Loss 0.8346 (0.8346) Acc@1 80.469 (80.469) Acc@5 95.312 (95.312) [2022-10-02 19:39:20 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.396 Acc@5 95.380 [2022-10-02 19:39:20 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:39:20 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.44% [2022-10-02 19:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][0/1251] eta 1:13:46 lr 0.000064 time 3.5385 (3.5385) loss 3.2512 (3.2512) grad_norm 2.4988 (2.4988) [2022-10-02 19:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][100/1251] eta 0:06:12 lr 0.000064 time 0.2876 (0.3237) loss 3.5344 (3.1069) grad_norm 2.5414 (2.9170) [2022-10-02 19:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][200/1251] eta 0:05:23 lr 0.000064 time 0.2896 (0.3076) loss 2.2154 (3.0792) grad_norm 3.2764 (2.8606) [2022-10-02 19:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][300/1251] eta 0:04:46 lr 0.000063 time 0.2880 (0.3017) loss 2.5595 (3.0595) grad_norm 3.5429 (2.8620) [2022-10-02 19:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][400/1251] eta 0:04:14 lr 0.000063 time 0.2860 (0.2987) loss 3.4382 (3.0706) grad_norm 2.5736 (2.8658) [2022-10-02 19:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][500/1251] eta 0:03:42 lr 0.000063 time 0.2881 (0.2969) loss 3.0629 (3.0462) grad_norm 2.3674 (2.8682) [2022-10-02 19:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][600/1251] eta 0:03:12 lr 0.000063 time 0.2891 (0.2957) loss 3.1392 (3.0612) grad_norm 2.6153 (2.8596) [2022-10-02 19:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][700/1251] eta 0:02:42 lr 0.000063 time 0.2902 (0.2948) loss 3.4250 (3.0584) grad_norm 2.7317 (2.8557) [2022-10-02 19:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][800/1251] eta 0:02:12 lr 0.000062 time 0.2843 (0.2942) loss 3.4635 (3.0622) grad_norm 2.7778 (2.8643) [2022-10-02 19:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][900/1251] eta 0:01:43 lr 0.000062 time 0.2864 (0.2936) loss 3.0305 (3.0614) grad_norm 2.7177 (2.8761) [2022-10-02 19:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1000/1251] eta 0:01:13 lr 0.000062 time 0.2912 (0.2931) loss 2.2696 (3.0703) grad_norm 2.9282 (2.8612) [2022-10-02 19:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1100/1251] eta 0:00:44 lr 0.000062 time 0.2866 (0.2927) loss 3.4742 (3.0717) grad_norm 2.5400 (2.8591) [2022-10-02 19:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1200/1251] eta 0:00:14 lr 0.000062 time 0.2862 (0.2924) loss 3.1163 (3.0771) grad_norm 2.6954 (2.8579) [2022-10-02 19:45:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 255 training takes 0:06:06 [2022-10-02 19:45:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.264 (3.264) Loss 0.8561 (0.8561) Acc@1 78.516 (78.516) Acc@5 95.801 (95.801) [2022-10-02 19:45:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.420 Acc@5 95.306 [2022-10-02 19:45:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-10-02 19:45:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.44% [2022-10-02 19:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][0/1251] eta 1:08:56 lr 0.000062 time 3.3062 (3.3062) loss 3.3844 (3.3844) grad_norm 2.8224 (2.8224) [2022-10-02 19:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][100/1251] eta 0:06:09 lr 0.000061 time 0.2863 (0.3208) loss 3.3864 (3.0063) grad_norm 3.0791 (2.9239) [2022-10-02 19:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][200/1251] eta 0:05:21 lr 0.000061 time 0.2898 (0.3061) loss 2.3915 (3.0331) grad_norm 2.9660 (2.8684) [2022-10-02 19:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][300/1251] eta 0:04:46 lr 0.000061 time 0.2940 (0.3011) loss 3.0575 (3.0319) grad_norm 2.7873 (2.8623) [2022-10-02 19:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][400/1251] eta 0:04:13 lr 0.000061 time 0.2879 (0.2985) loss 2.4459 (3.0568) grad_norm 2.6584 (2.8641) [2022-10-02 19:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][500/1251] eta 0:03:42 lr 0.000061 time 0.2899 (0.2969) loss 3.6054 (3.0611) grad_norm 3.0673 (2.8588) [2022-10-02 19:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][600/1251] eta 0:03:12 lr 0.000061 time 0.2898 (0.2959) loss 3.7797 (3.0484) grad_norm 2.6251 (2.8570) [2022-10-02 19:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][700/1251] eta 0:02:42 lr 0.000060 time 0.2876 (0.2952) loss 3.5949 (3.0492) grad_norm 3.0414 (2.8525) [2022-10-02 19:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][800/1251] eta 0:02:12 lr 0.000060 time 0.2880 (0.2946) loss 2.8605 (3.0504) grad_norm 3.0524 (2.8677) [2022-10-02 19:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][900/1251] eta 0:01:43 lr 0.000060 time 0.2889 (0.2941) loss 3.6155 (3.0503) grad_norm 2.8158 (2.8676) [2022-10-02 19:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1000/1251] eta 0:01:13 lr 0.000060 time 0.2901 (0.2938) loss 3.2748 (3.0517) grad_norm 2.6115 (2.8608) [2022-10-02 19:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1100/1251] eta 0:00:44 lr 0.000060 time 0.2925 (0.2935) loss 3.2619 (3.0600) grad_norm 2.7158 (2.8541) [2022-10-02 19:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1200/1251] eta 0:00:14 lr 0.000059 time 0.2894 (0.2933) loss 3.5850 (3.0591) grad_norm 2.8941 (2.8578) [2022-10-02 19:51:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 256 training takes 0:06:07 [2022-10-02 19:51:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.939 (2.939) Loss 0.8488 (0.8488) Acc@1 80.176 (80.176) Acc@5 95.117 (95.117) [2022-10-02 19:51:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.494 Acc@5 95.328 [2022-10-02 19:51:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-10-02 19:51:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.49% [2022-10-02 19:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][0/1251] eta 1:09:47 lr 0.000059 time 3.3472 (3.3472) loss 3.4417 (3.4417) grad_norm 2.8945 (2.8945) [2022-10-02 19:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][100/1251] eta 0:06:06 lr 0.000059 time 0.2858 (0.3187) loss 3.3578 (3.0644) grad_norm 3.3639 (2.8524) [2022-10-02 19:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][200/1251] eta 0:05:18 lr 0.000059 time 0.2877 (0.3034) loss 2.2386 (3.0355) grad_norm 2.5584 (2.8495) [2022-10-02 19:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][300/1251] eta 0:04:43 lr 0.000059 time 0.2842 (0.2982) loss 3.3691 (3.0463) grad_norm 2.9994 (2.8493) [2022-10-02 19:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][400/1251] eta 0:04:11 lr 0.000059 time 0.2886 (0.2956) loss 3.3559 (3.0351) grad_norm 2.3926 (2.8560) [2022-10-02 19:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][500/1251] eta 0:03:40 lr 0.000058 time 0.2861 (0.2941) loss 2.7993 (3.0477) grad_norm 2.6289 (2.8626) [2022-10-02 19:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][600/1251] eta 0:03:10 lr 0.000058 time 0.2847 (0.2931) loss 3.2724 (3.0568) grad_norm 3.0140 (2.8568) [2022-10-02 19:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][700/1251] eta 0:02:41 lr 0.000058 time 0.2848 (0.2923) loss 3.7678 (3.0619) grad_norm 3.2901 (2.8624) [2022-10-02 19:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][800/1251] eta 0:02:11 lr 0.000058 time 0.2892 (0.2918) loss 3.3407 (3.0665) grad_norm 2.7868 (2.8612) [2022-10-02 19:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][900/1251] eta 0:01:42 lr 0.000058 time 0.2809 (0.2913) loss 3.3136 (3.0700) grad_norm 2.4046 (2.8584) [2022-10-02 19:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1000/1251] eta 0:01:13 lr 0.000058 time 0.2876 (0.2910) loss 3.2981 (3.0689) grad_norm 2.9122 (2.8567) [2022-10-02 19:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1100/1251] eta 0:00:43 lr 0.000057 time 0.2852 (0.2907) loss 3.1274 (3.0658) grad_norm 2.6643 (2.8590) [2022-10-02 19:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1200/1251] eta 0:00:14 lr 0.000057 time 0.2928 (0.2904) loss 3.6597 (3.0683) grad_norm 2.7001 (2.8613) [2022-10-02 19:58:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 257 training takes 0:06:03 [2022-10-02 19:58:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.772 (2.772) Loss 0.8553 (0.8553) Acc@1 80.469 (80.469) Acc@5 94.727 (94.727) [2022-10-02 19:58:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.596 Acc@5 95.360 [2022-10-02 19:58:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-02 19:58:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.60% [2022-10-02 19:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][0/1251] eta 1:13:20 lr 0.000057 time 3.5177 (3.5177) loss 3.3981 (3.3981) grad_norm 3.3093 (3.3093) [2022-10-02 19:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][100/1251] eta 0:06:10 lr 0.000057 time 0.2864 (0.3217) loss 3.3355 (3.0207) grad_norm 2.4062 (2.8633) [2022-10-02 19:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][200/1251] eta 0:05:21 lr 0.000057 time 0.2900 (0.3055) loss 3.4165 (3.0417) grad_norm 2.9769 (2.8506) [2022-10-02 19:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][300/1251] eta 0:04:45 lr 0.000057 time 0.2882 (0.3003) loss 3.1762 (3.0348) grad_norm 3.3301 (2.8404) [2022-10-02 20:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][400/1251] eta 0:04:13 lr 0.000056 time 0.2902 (0.2976) loss 3.2827 (3.0193) grad_norm 2.6598 (2.8518) [2022-10-02 20:00:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][500/1251] eta 0:03:42 lr 0.000056 time 0.2893 (0.2961) loss 2.6652 (3.0267) grad_norm 2.9048 (2.8415) [2022-10-02 20:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][600/1251] eta 0:03:11 lr 0.000056 time 0.2900 (0.2949) loss 3.5469 (3.0318) grad_norm 2.9180 (2.8445) [2022-10-02 20:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][700/1251] eta 0:02:42 lr 0.000056 time 0.2858 (0.2941) loss 3.6178 (3.0330) grad_norm 2.8156 (2.8494) [2022-10-02 20:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][800/1251] eta 0:02:12 lr 0.000056 time 0.2914 (0.2934) loss 3.0748 (3.0320) grad_norm 2.9639 (2.8530) [2022-10-02 20:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][900/1251] eta 0:01:42 lr 0.000056 time 0.2892 (0.2930) loss 3.1770 (3.0348) grad_norm 2.2881 (2.8624) [2022-10-02 20:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1000/1251] eta 0:01:13 lr 0.000055 time 0.2911 (0.2926) loss 3.2760 (3.0387) grad_norm 2.9181 (2.8654) [2022-10-02 20:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1100/1251] eta 0:00:44 lr 0.000055 time 0.2917 (0.2923) loss 3.5560 (3.0409) grad_norm 2.6367 (2.8708) [2022-10-02 20:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1200/1251] eta 0:00:14 lr 0.000055 time 0.2926 (0.2920) loss 3.8287 (3.0379) grad_norm 2.6923 (2.8652) [2022-10-02 20:04:21 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 258 training takes 0:06:05 [2022-10-02 20:04:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.250 (2.250) Loss 0.9178 (0.9178) Acc@1 77.246 (77.246) Acc@5 95.215 (95.215) [2022-10-02 20:04:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.562 Acc@5 95.358 [2022-10-02 20:04:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-02 20:04:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.60% [2022-10-02 20:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][0/1251] eta 1:04:09 lr 0.000055 time 3.0769 (3.0769) loss 2.4838 (2.4838) grad_norm 2.7959 (2.7959) [2022-10-02 20:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][100/1251] eta 0:06:08 lr 0.000055 time 0.2902 (0.3205) loss 3.2789 (3.0321) grad_norm 3.1274 (2.8620) [2022-10-02 20:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][200/1251] eta 0:05:21 lr 0.000055 time 0.2890 (0.3059) loss 2.0013 (3.0529) grad_norm 2.8147 (2.8658) [2022-10-02 20:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][300/1251] eta 0:04:46 lr 0.000054 time 0.2928 (0.3009) loss 3.2727 (3.0269) grad_norm 2.5801 (2.8678) [2022-10-02 20:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][400/1251] eta 0:04:13 lr 0.000054 time 0.2918 (0.2983) loss 3.0344 (3.0337) grad_norm 3.4899 (2.8689) [2022-10-02 20:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][500/1251] eta 0:03:42 lr 0.000054 time 0.2886 (0.2969) loss 3.1950 (3.0332) grad_norm 2.7336 (2.8671) [2022-10-02 20:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][600/1251] eta 0:03:12 lr 0.000054 time 0.2915 (0.2959) loss 3.4297 (3.0257) grad_norm 2.8467 (2.8783) [2022-10-02 20:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][700/1251] eta 0:02:42 lr 0.000054 time 0.2906 (0.2951) loss 3.3605 (3.0148) grad_norm 2.5815 (2.8809) [2022-10-02 20:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][800/1251] eta 0:02:12 lr 0.000054 time 0.2915 (0.2945) loss 2.2871 (3.0163) grad_norm 2.9445 (2.8828) [2022-10-02 20:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][900/1251] eta 0:01:43 lr 0.000053 time 0.2902 (0.2941) loss 2.0271 (3.0150) grad_norm 3.3088 (2.8821) [2022-10-02 20:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1000/1251] eta 0:01:13 lr 0.000053 time 0.2934 (0.2937) loss 2.5750 (3.0221) grad_norm 2.6837 (2.8854) [2022-10-02 20:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1100/1251] eta 0:00:44 lr 0.000053 time 0.2905 (0.2933) loss 2.7983 (3.0236) grad_norm 2.8276 (2.8889) [2022-10-02 20:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1200/1251] eta 0:00:14 lr 0.000053 time 0.2894 (0.2930) loss 2.3374 (3.0296) grad_norm 2.5176 (2.8929) [2022-10-02 20:10:41 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 259 training takes 0:06:06 [2022-10-02 20:10:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.575 (2.575) Loss 0.9156 (0.9156) Acc@1 78.125 (78.125) Acc@5 94.531 (94.531) [2022-10-02 20:10:54 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.584 Acc@5 95.384 [2022-10-02 20:10:54 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-02 20:10:54 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.60% [2022-10-02 20:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][0/1251] eta 0:48:08 lr 0.000053 time 2.3090 (2.3090) loss 3.3409 (3.3409) grad_norm 2.3613 (2.3613) [2022-10-02 20:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][100/1251] eta 0:06:05 lr 0.000053 time 0.2927 (0.3176) loss 3.3628 (3.0715) grad_norm 2.5404 (2.9532) [2022-10-02 20:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][200/1251] eta 0:05:19 lr 0.000052 time 0.2876 (0.3038) loss 3.4894 (3.0454) grad_norm 3.0140 (2.9073) [2022-10-02 20:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][300/1251] eta 0:04:44 lr 0.000052 time 0.2879 (0.2990) loss 2.1925 (3.0498) grad_norm 2.8131 (2.9079) [2022-10-02 20:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][400/1251] eta 0:04:12 lr 0.000052 time 0.2878 (0.2967) loss 3.2380 (3.0541) grad_norm 3.2049 (2.8987) [2022-10-02 20:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][500/1251] eta 0:03:41 lr 0.000052 time 0.2880 (0.2952) loss 2.4671 (3.0488) grad_norm 2.9421 (2.8967) [2022-10-02 20:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][600/1251] eta 0:03:11 lr 0.000052 time 0.2880 (0.2942) loss 3.1329 (3.0497) grad_norm 2.8520 (2.9025) [2022-10-02 20:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][700/1251] eta 0:02:41 lr 0.000052 time 0.2854 (0.2934) loss 3.3182 (3.0545) grad_norm 2.9536 (2.9096) [2022-10-02 20:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][800/1251] eta 0:02:12 lr 0.000051 time 0.2875 (0.2929) loss 2.9939 (3.0588) grad_norm 2.9039 (2.9168) [2022-10-02 20:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][900/1251] eta 0:01:42 lr 0.000051 time 0.2881 (0.2924) loss 2.1883 (3.0505) grad_norm 2.8759 (2.9158) [2022-10-02 20:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1000/1251] eta 0:01:13 lr 0.000051 time 0.2907 (0.2920) loss 2.6132 (3.0461) grad_norm 2.8562 (2.9043) [2022-10-02 20:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1100/1251] eta 0:00:44 lr 0.000051 time 0.2865 (0.2917) loss 2.0852 (3.0476) grad_norm 3.0096 (2.9053) [2022-10-02 20:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1200/1251] eta 0:00:14 lr 0.000051 time 0.2859 (0.2915) loss 2.9611 (3.0485) grad_norm 2.7385 (2.9044) [2022-10-02 20:16:59 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 260 training takes 0:06:04 [2022-10-02 20:16:59 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saving...... [2022-10-02 20:16:59 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saved !!! [2022-10-02 20:17:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.927 (2.927) Loss 0.8349 (0.8349) Acc@1 81.348 (81.348) Acc@5 94.922 (94.922) [2022-10-02 20:17:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.618 Acc@5 95.372 [2022-10-02 20:17:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-10-02 20:17:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.62% [2022-10-02 20:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][0/1251] eta 1:06:51 lr 0.000051 time 3.2063 (3.2063) loss 3.2318 (3.2318) grad_norm 3.3543 (3.3543) [2022-10-02 20:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][100/1251] eta 0:06:08 lr 0.000051 time 0.2869 (0.3203) loss 3.3256 (2.9574) grad_norm 2.7457 (2.9376) [2022-10-02 20:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][200/1251] eta 0:05:21 lr 0.000050 time 0.2878 (0.3055) loss 3.2828 (3.0024) grad_norm 2.9858 (2.9708) [2022-10-02 20:18:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][300/1251] eta 0:04:45 lr 0.000050 time 0.2883 (0.3003) loss 2.8042 (2.9994) grad_norm 2.6069 (2.9419) [2022-10-02 20:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][400/1251] eta 0:04:13 lr 0.000050 time 0.2874 (0.2977) loss 2.7864 (3.0049) grad_norm 2.8073 (2.9460) [2022-10-02 20:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][500/1251] eta 0:03:42 lr 0.000050 time 0.2890 (0.2960) loss 3.2561 (3.0027) grad_norm 2.7141 (2.9420) [2022-10-02 20:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][600/1251] eta 0:03:11 lr 0.000050 time 0.2887 (0.2948) loss 2.5594 (3.0122) grad_norm 2.5743 (2.9496) [2022-10-02 20:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][700/1251] eta 0:02:41 lr 0.000050 time 0.2899 (0.2940) loss 3.3797 (3.0195) grad_norm 3.2913 (2.9512) [2022-10-02 20:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][800/1251] eta 0:02:12 lr 0.000049 time 0.2891 (0.2933) loss 3.1977 (3.0215) grad_norm 3.1542 (2.9435) [2022-10-02 20:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][900/1251] eta 0:01:42 lr 0.000049 time 0.2915 (0.2928) loss 3.4523 (3.0289) grad_norm 2.1719 (2.9393) [2022-10-02 20:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1000/1251] eta 0:01:13 lr 0.000049 time 0.2884 (0.2924) loss 3.2163 (3.0367) grad_norm 2.5928 (2.9416) [2022-10-02 20:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1100/1251] eta 0:00:44 lr 0.000049 time 0.2924 (0.2921) loss 2.9347 (3.0392) grad_norm 2.8519 (2.9372) [2022-10-02 20:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1200/1251] eta 0:00:14 lr 0.000049 time 0.2892 (0.2919) loss 3.2765 (3.0417) grad_norm 3.2270 (2.9368) [2022-10-02 20:23:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 261 training takes 0:06:05 [2022-10-02 20:23:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.452 (2.452) Loss 0.8584 (0.8584) Acc@1 80.859 (80.859) Acc@5 95.898 (95.898) [2022-10-02 20:23:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.656 Acc@5 95.394 [2022-10-02 20:23:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:23:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.66% [2022-10-02 20:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][0/1251] eta 0:51:58 lr 0.000049 time 2.4925 (2.4925) loss 3.0032 (3.0032) grad_norm 2.6036 (2.6036) [2022-10-02 20:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][100/1251] eta 0:06:05 lr 0.000049 time 0.2939 (0.3179) loss 3.1867 (2.9861) grad_norm 2.8468 (2.9679) [2022-10-02 20:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][200/1251] eta 0:05:20 lr 0.000048 time 0.2942 (0.3047) loss 3.5846 (3.0046) grad_norm 2.6604 (2.9148) [2022-10-02 20:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][300/1251] eta 0:04:45 lr 0.000048 time 0.2933 (0.3002) loss 2.0869 (2.9931) grad_norm 3.0919 (2.9226) [2022-10-02 20:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][400/1251] eta 0:04:13 lr 0.000048 time 0.2936 (0.2977) loss 1.9393 (3.0062) grad_norm 2.9680 (2.9337) [2022-10-02 20:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][500/1251] eta 0:03:42 lr 0.000048 time 0.2918 (0.2964) loss 3.3337 (3.0203) grad_norm 3.1132 (2.9546) [2022-10-02 20:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][600/1251] eta 0:03:12 lr 0.000048 time 0.2893 (0.2954) loss 2.9074 (3.0327) grad_norm 3.1133 (2.9755) [2022-10-02 20:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][700/1251] eta 0:02:42 lr 0.000048 time 0.2937 (0.2946) loss 2.8569 (3.0287) grad_norm 2.4569 (2.9713) [2022-10-02 20:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][800/1251] eta 0:02:12 lr 0.000047 time 0.2880 (0.2940) loss 3.7569 (3.0287) grad_norm 2.6759 (2.9776) [2022-10-02 20:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][900/1251] eta 0:01:43 lr 0.000047 time 0.2902 (0.2935) loss 3.6114 (3.0358) grad_norm 3.0153 (2.9742) [2022-10-02 20:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1000/1251] eta 0:01:13 lr 0.000047 time 0.2952 (0.2931) loss 3.0422 (3.0430) grad_norm 2.8451 (2.9686) [2022-10-02 20:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1100/1251] eta 0:00:44 lr 0.000047 time 0.2963 (0.2928) loss 2.4976 (3.0405) grad_norm 3.0045 (2.9707) [2022-10-02 20:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1200/1251] eta 0:00:14 lr 0.000047 time 0.2882 (0.2925) loss 2.3662 (3.0420) grad_norm 2.8523 (2.9700) [2022-10-02 20:29:36 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 262 training takes 0:06:06 [2022-10-02 20:29:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.755 (2.755) Loss 0.8044 (0.8044) Acc@1 80.176 (80.176) Acc@5 95.996 (95.996) [2022-10-02 20:29:49 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.702 Acc@5 95.438 [2022-10-02 20:29:49 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:29:49 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.70% [2022-10-02 20:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][0/1251] eta 0:46:47 lr 0.000047 time 2.2439 (2.2439) loss 3.3963 (3.3963) grad_norm 2.8568 (2.8568) [2022-10-02 20:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][100/1251] eta 0:06:03 lr 0.000047 time 0.2922 (0.3158) loss 3.0455 (3.0341) grad_norm 2.8537 (3.0001) [2022-10-02 20:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][200/1251] eta 0:05:17 lr 0.000046 time 0.2889 (0.3024) loss 3.1626 (3.0190) grad_norm 2.9128 (2.9770) [2022-10-02 20:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][300/1251] eta 0:04:43 lr 0.000046 time 0.2909 (0.2980) loss 2.9200 (3.0045) grad_norm 2.7991 (2.9760) [2022-10-02 20:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][400/1251] eta 0:04:11 lr 0.000046 time 0.2875 (0.2955) loss 3.4752 (3.0226) grad_norm 2.6401 (2.9592) [2022-10-02 20:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][500/1251] eta 0:03:40 lr 0.000046 time 0.2884 (0.2941) loss 3.0255 (3.0234) grad_norm 4.4192 (2.9663) [2022-10-02 20:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][600/1251] eta 0:03:10 lr 0.000046 time 0.2941 (0.2931) loss 2.5845 (3.0260) grad_norm 2.6298 (2.9706) [2022-10-02 20:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][700/1251] eta 0:02:41 lr 0.000046 time 0.2898 (0.2924) loss 2.4154 (3.0370) grad_norm 3.0001 (2.9665) [2022-10-02 20:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][800/1251] eta 0:02:11 lr 0.000045 time 0.2858 (0.2919) loss 3.3179 (3.0385) grad_norm 3.0438 (2.9671) [2022-10-02 20:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][900/1251] eta 0:01:42 lr 0.000045 time 0.2909 (0.2914) loss 2.7883 (3.0287) grad_norm 2.5563 (2.9637) [2022-10-02 20:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1000/1251] eta 0:01:13 lr 0.000045 time 0.2859 (0.2910) loss 3.3885 (3.0312) grad_norm 2.5667 (2.9658) [2022-10-02 20:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1100/1251] eta 0:00:43 lr 0.000045 time 0.2891 (0.2908) loss 2.7730 (3.0278) grad_norm 3.0037 (2.9653) [2022-10-02 20:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1200/1251] eta 0:00:14 lr 0.000045 time 0.2872 (0.2905) loss 3.4926 (3.0269) grad_norm 3.1491 (2.9710) [2022-10-02 20:35:52 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 263 training takes 0:06:03 [2022-10-02 20:35:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.116 (3.116) Loss 0.8519 (0.8519) Acc@1 80.078 (80.078) Acc@5 95.898 (95.898) [2022-10-02 20:36:05 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.696 Acc@5 95.420 [2022-10-02 20:36:05 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:36:05 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.70% [2022-10-02 20:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][0/1251] eta 1:11:22 lr 0.000045 time 3.4232 (3.4232) loss 3.4811 (3.4811) grad_norm 2.8185 (2.8185) [2022-10-02 20:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][100/1251] eta 0:06:10 lr 0.000045 time 0.2920 (0.3216) loss 2.6982 (3.0266) grad_norm 2.5353 (3.0032) [2022-10-02 20:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][200/1251] eta 0:05:21 lr 0.000044 time 0.2898 (0.3060) loss 3.2152 (3.0344) grad_norm 2.7599 (2.9543) [2022-10-02 20:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][300/1251] eta 0:04:45 lr 0.000044 time 0.2869 (0.3001) loss 2.7065 (3.0231) grad_norm 2.5808 (2.9511) [2022-10-02 20:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][400/1251] eta 0:04:12 lr 0.000044 time 0.2913 (0.2973) loss 3.0842 (3.0327) grad_norm 2.8850 (2.9662) [2022-10-02 20:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][500/1251] eta 0:03:41 lr 0.000044 time 0.2878 (0.2956) loss 3.2297 (3.0245) grad_norm 2.9567 (2.9637) [2022-10-02 20:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][600/1251] eta 0:03:11 lr 0.000044 time 0.2892 (0.2943) loss 2.2134 (3.0273) grad_norm 2.7692 (2.9777) [2022-10-02 20:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][700/1251] eta 0:02:41 lr 0.000044 time 0.2907 (0.2935) loss 3.2723 (3.0283) grad_norm 2.6725 (2.9668) [2022-10-02 20:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][800/1251] eta 0:02:12 lr 0.000044 time 0.2923 (0.2928) loss 2.7653 (3.0182) grad_norm 2.7322 (2.9663) [2022-10-02 20:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][900/1251] eta 0:01:42 lr 0.000043 time 0.2863 (0.2922) loss 3.5820 (3.0208) grad_norm 3.0262 (2.9675) [2022-10-02 20:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1000/1251] eta 0:01:13 lr 0.000043 time 0.2925 (0.2917) loss 3.1454 (3.0232) grad_norm 2.8531 (2.9752) [2022-10-02 20:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1100/1251] eta 0:00:44 lr 0.000043 time 0.2879 (0.2914) loss 3.5457 (3.0260) grad_norm 2.5503 (2.9773) [2022-10-02 20:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1200/1251] eta 0:00:14 lr 0.000043 time 0.2951 (0.2912) loss 3.4535 (3.0278) grad_norm 2.9786 (2.9820) [2022-10-02 20:42:10 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 264 training takes 0:06:04 [2022-10-02 20:42:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.696 (2.696) Loss 0.8244 (0.8244) Acc@1 80.664 (80.664) Acc@5 95.703 (95.703) [2022-10-02 20:42:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.748 Acc@5 95.398 [2022-10-02 20:42:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:42:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.75% [2022-10-02 20:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][0/1251] eta 1:06:11 lr 0.000043 time 3.1743 (3.1743) loss 3.4465 (3.4465) grad_norm 2.8971 (2.8971) [2022-10-02 20:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][100/1251] eta 0:06:08 lr 0.000043 time 0.2944 (0.3204) loss 3.3666 (3.0881) grad_norm 2.5938 (3.0003) [2022-10-02 20:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][200/1251] eta 0:05:21 lr 0.000043 time 0.2901 (0.3059) loss 3.1025 (3.0365) grad_norm 2.8146 (3.0139) [2022-10-02 20:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][300/1251] eta 0:04:46 lr 0.000042 time 0.2868 (0.3010) loss 2.8513 (3.0303) grad_norm 2.6777 (3.0143) [2022-10-02 20:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][400/1251] eta 0:04:13 lr 0.000042 time 0.2921 (0.2983) loss 3.2204 (3.0421) grad_norm 2.7933 (3.0019) [2022-10-02 20:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][500/1251] eta 0:03:42 lr 0.000042 time 0.2902 (0.2969) loss 2.5054 (3.0284) grad_norm 3.3503 (2.9988) [2022-10-02 20:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][600/1251] eta 0:03:12 lr 0.000042 time 0.2882 (0.2958) loss 3.4828 (3.0190) grad_norm 2.5294 (3.0048) [2022-10-02 20:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][700/1251] eta 0:02:42 lr 0.000042 time 0.2845 (0.2951) loss 3.2914 (3.0167) grad_norm 3.0353 (2.9962) [2022-10-02 20:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][800/1251] eta 0:02:12 lr 0.000042 time 0.2922 (0.2945) loss 3.0874 (3.0094) grad_norm 2.6394 (3.0051) [2022-10-02 20:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][900/1251] eta 0:01:43 lr 0.000042 time 0.2877 (0.2941) loss 2.9351 (3.0107) grad_norm 3.0175 (3.0000) [2022-10-02 20:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1000/1251] eta 0:01:13 lr 0.000041 time 0.2951 (0.2938) loss 3.0596 (3.0114) grad_norm 3.5272 (3.0009) [2022-10-02 20:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1100/1251] eta 0:00:44 lr 0.000041 time 0.2870 (0.2934) loss 3.2317 (3.0134) grad_norm 2.6717 (2.9974) [2022-10-02 20:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1200/1251] eta 0:00:14 lr 0.000041 time 0.2911 (0.2931) loss 3.7350 (3.0159) grad_norm 3.4550 (3.0039) [2022-10-02 20:48:29 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 265 training takes 0:06:06 [2022-10-02 20:48:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.140 (3.140) Loss 0.7995 (0.7995) Acc@1 82.031 (82.031) Acc@5 95.703 (95.703) [2022-10-02 20:48:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.692 Acc@5 95.392 [2022-10-02 20:48:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:48:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.75% [2022-10-02 20:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][0/1251] eta 0:45:41 lr 0.000041 time 2.1917 (2.1917) loss 3.5127 (3.5127) grad_norm 2.8368 (2.8368) [2022-10-02 20:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][100/1251] eta 0:06:05 lr 0.000041 time 0.2868 (0.3175) loss 3.1645 (3.0267) grad_norm 2.8033 (3.1480) [2022-10-02 20:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][200/1251] eta 0:05:17 lr 0.000041 time 0.2863 (0.3025) loss 2.7867 (3.0237) grad_norm 3.4214 (3.0782) [2022-10-02 20:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][300/1251] eta 0:04:42 lr 0.000041 time 0.2863 (0.2975) loss 2.9551 (3.0294) grad_norm 3.2108 (3.0606) [2022-10-02 20:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][400/1251] eta 0:04:12 lr 0.000040 time 0.2870 (0.2963) loss 3.1112 (3.0354) grad_norm 2.7968 (3.0494) [2022-10-02 20:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][500/1251] eta 0:03:41 lr 0.000040 time 0.2867 (0.2944) loss 3.0940 (3.0178) grad_norm 2.7574 (3.0393) [2022-10-02 20:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][600/1251] eta 0:03:10 lr 0.000040 time 0.2859 (0.2933) loss 3.2651 (3.0002) grad_norm 3.1206 (3.0310) [2022-10-02 20:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][700/1251] eta 0:02:41 lr 0.000040 time 0.2868 (0.2924) loss 3.5865 (2.9946) grad_norm 2.7592 (3.0377) [2022-10-02 20:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][800/1251] eta 0:02:11 lr 0.000040 time 0.2890 (0.2918) loss 3.1113 (2.9880) grad_norm 3.6072 (3.0373) [2022-10-02 20:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][900/1251] eta 0:01:42 lr 0.000040 time 0.2850 (0.2913) loss 3.1863 (3.0020) grad_norm 2.7160 (3.0335) [2022-10-02 20:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1000/1251] eta 0:01:13 lr 0.000040 time 0.2867 (0.2909) loss 2.0867 (3.0038) grad_norm 3.0362 (3.0294) [2022-10-02 20:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1100/1251] eta 0:00:43 lr 0.000039 time 0.2884 (0.2906) loss 3.1081 (3.0101) grad_norm 3.1096 (3.0355) [2022-10-02 20:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1200/1251] eta 0:00:14 lr 0.000039 time 0.2873 (0.2903) loss 3.2000 (3.0155) grad_norm 2.7836 (3.0310) [2022-10-02 20:54:46 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 266 training takes 0:06:03 [2022-10-02 20:54:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.160 (3.160) Loss 0.8436 (0.8436) Acc@1 80.469 (80.469) Acc@5 95.020 (95.020) [2022-10-02 20:54:59 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.698 Acc@5 95.406 [2022-10-02 20:54:59 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 20:54:59 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.75% [2022-10-02 20:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][0/1251] eta 1:02:36 lr 0.000039 time 3.0029 (3.0029) loss 3.0961 (3.0961) grad_norm 3.1951 (3.1951) [2022-10-02 20:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][100/1251] eta 0:06:05 lr 0.000039 time 0.2877 (0.3178) loss 3.4651 (3.0583) grad_norm 3.0860 (3.0592) [2022-10-02 20:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][200/1251] eta 0:05:20 lr 0.000039 time 0.2905 (0.3045) loss 3.4826 (3.0636) grad_norm 2.8672 (3.0229) [2022-10-02 20:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][300/1251] eta 0:04:45 lr 0.000039 time 0.2868 (0.2999) loss 2.9905 (3.0618) grad_norm 2.7535 (3.0480) [2022-10-02 20:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][400/1251] eta 0:04:13 lr 0.000039 time 0.2911 (0.2977) loss 2.1098 (3.0341) grad_norm 2.9710 (3.0362) [2022-10-02 20:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][500/1251] eta 0:03:42 lr 0.000039 time 0.2924 (0.2958) loss 2.5475 (3.0256) grad_norm 3.5027 (3.0396) [2022-10-02 20:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][600/1251] eta 0:03:11 lr 0.000038 time 0.2899 (0.2946) loss 3.5433 (3.0496) grad_norm 2.8185 (3.0378) [2022-10-02 20:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][700/1251] eta 0:02:42 lr 0.000038 time 0.2888 (0.2942) loss 3.2175 (3.0573) grad_norm 2.7418 (3.0357) [2022-10-02 20:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][800/1251] eta 0:02:12 lr 0.000038 time 0.2923 (0.2938) loss 3.4286 (3.0526) grad_norm 2.6149 (3.0379) [2022-10-02 20:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][900/1251] eta 0:01:42 lr 0.000038 time 0.2905 (0.2932) loss 3.2047 (3.0463) grad_norm 2.4470 (3.0354) [2022-10-02 20:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1000/1251] eta 0:01:13 lr 0.000038 time 0.2876 (0.2929) loss 3.3046 (3.0473) grad_norm 2.6089 (3.0534) [2022-10-02 21:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1100/1251] eta 0:00:44 lr 0.000038 time 0.2862 (0.2925) loss 3.0040 (3.0501) grad_norm 2.9505 (3.0452) [2022-10-02 21:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1200/1251] eta 0:00:14 lr 0.000038 time 0.2892 (0.2922) loss 3.1741 (3.0485) grad_norm 3.5786 (3.0484) [2022-10-02 21:01:05 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 267 training takes 0:06:05 [2022-10-02 21:01:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.236 (3.236) Loss 0.8933 (0.8933) Acc@1 78.711 (78.711) Acc@5 95.020 (95.020) [2022-10-02 21:01:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.714 Acc@5 95.406 [2022-10-02 21:01:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-10-02 21:01:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.75% [2022-10-02 21:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][0/1251] eta 1:18:48 lr 0.000038 time 3.7800 (3.7800) loss 3.4562 (3.4562) grad_norm 2.8541 (2.8541) [2022-10-02 21:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][100/1251] eta 0:06:14 lr 0.000037 time 0.2879 (0.3250) loss 2.4205 (3.0092) grad_norm 2.5657 (3.0583) [2022-10-02 21:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][200/1251] eta 0:05:22 lr 0.000037 time 0.2895 (0.3065) loss 2.4938 (3.0187) grad_norm 2.7344 (3.0497) [2022-10-02 21:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][300/1251] eta 0:04:45 lr 0.000037 time 0.2897 (0.3004) loss 2.2038 (3.0246) grad_norm 2.5684 (3.0486) [2022-10-02 21:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][400/1251] eta 0:04:12 lr 0.000037 time 0.2917 (0.2972) loss 3.5736 (3.0379) grad_norm 2.6974 (3.0606) [2022-10-02 21:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][500/1251] eta 0:03:41 lr 0.000037 time 0.2895 (0.2953) loss 3.1980 (3.0269) grad_norm 3.6382 (3.0726) [2022-10-02 21:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][600/1251] eta 0:03:11 lr 0.000037 time 0.2874 (0.2941) loss 3.5895 (3.0229) grad_norm 2.9105 (3.0765) [2022-10-02 21:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][700/1251] eta 0:02:41 lr 0.000037 time 0.2895 (0.2932) loss 3.3463 (3.0286) grad_norm 2.9743 (3.0695) [2022-10-02 21:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][800/1251] eta 0:02:11 lr 0.000036 time 0.2881 (0.2925) loss 3.4179 (3.0289) grad_norm 2.9034 (3.0674) [2022-10-02 21:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][900/1251] eta 0:01:42 lr 0.000036 time 0.2887 (0.2919) loss 3.2491 (3.0265) grad_norm 2.9316 (3.0757) [2022-10-02 21:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1000/1251] eta 0:01:13 lr 0.000036 time 0.2850 (0.2915) loss 2.5559 (3.0183) grad_norm 2.7229 (3.0802) [2022-10-02 21:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1100/1251] eta 0:00:43 lr 0.000036 time 0.2898 (0.2912) loss 3.2033 (3.0187) grad_norm 3.2260 (3.0917) [2022-10-02 21:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1200/1251] eta 0:00:14 lr 0.000036 time 0.2886 (0.2909) loss 3.0535 (3.0185) grad_norm 2.9782 (3.0938) [2022-10-02 21:07:22 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 268 training takes 0:06:04 [2022-10-02 21:07:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.771 (2.771) Loss 0.8991 (0.8991) Acc@1 79.199 (79.199) Acc@5 94.727 (94.727) [2022-10-02 21:07:34 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.786 Acc@5 95.420 [2022-10-02 21:07:34 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:07:34 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.79% [2022-10-02 21:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][0/1251] eta 1:10:19 lr 0.000036 time 3.3729 (3.3729) loss 3.3546 (3.3546) grad_norm 2.8592 (2.8592) [2022-10-02 21:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][100/1251] eta 0:06:10 lr 0.000036 time 0.2934 (0.3215) loss 3.3171 (3.0015) grad_norm 3.0395 (3.0576) [2022-10-02 21:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][200/1251] eta 0:05:21 lr 0.000036 time 0.2842 (0.3056) loss 2.8163 (3.0144) grad_norm 2.7997 (3.0613) [2022-10-02 21:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][300/1251] eta 0:04:45 lr 0.000035 time 0.2902 (0.3003) loss 3.2349 (2.9967) grad_norm 2.7953 (3.0776) [2022-10-02 21:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][400/1251] eta 0:04:13 lr 0.000035 time 0.2882 (0.2978) loss 1.9773 (2.9773) grad_norm 2.6666 (3.0695) [2022-10-02 21:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][500/1251] eta 0:03:42 lr 0.000035 time 0.2925 (0.2962) loss 2.9071 (2.9916) grad_norm 3.1618 (3.0676) [2022-10-02 21:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][600/1251] eta 0:03:12 lr 0.000035 time 0.2871 (0.2951) loss 3.1066 (2.9960) grad_norm 3.1291 (3.0752) [2022-10-02 21:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][700/1251] eta 0:02:42 lr 0.000035 time 0.2903 (0.2943) loss 3.0769 (3.0094) grad_norm 2.5824 (3.0742) [2022-10-02 21:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][800/1251] eta 0:02:12 lr 0.000035 time 0.2884 (0.2937) loss 2.6115 (3.0124) grad_norm 3.4647 (3.0722) [2022-10-02 21:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][900/1251] eta 0:01:42 lr 0.000035 time 0.2901 (0.2933) loss 2.1373 (3.0156) grad_norm 2.6122 (3.0756) [2022-10-02 21:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1000/1251] eta 0:01:13 lr 0.000035 time 0.2873 (0.2928) loss 2.4981 (3.0113) grad_norm 3.7147 (3.0751) [2022-10-02 21:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1100/1251] eta 0:00:44 lr 0.000034 time 0.2915 (0.2925) loss 2.7161 (3.0133) grad_norm 2.5160 (3.0772) [2022-10-02 21:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1200/1251] eta 0:00:14 lr 0.000034 time 0.2876 (0.2922) loss 3.2364 (3.0097) grad_norm 2.9869 (3.0794) [2022-10-02 21:13:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 269 training takes 0:06:05 [2022-10-02 21:13:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.625 (2.625) Loss 0.7921 (0.7921) Acc@1 81.543 (81.543) Acc@5 96.094 (96.094) [2022-10-02 21:13:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.844 Acc@5 95.466 [2022-10-02 21:13:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:13:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.84% [2022-10-02 21:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][0/1251] eta 0:50:15 lr 0.000034 time 2.4104 (2.4104) loss 3.4756 (3.4756) grad_norm 3.3151 (3.3151) [2022-10-02 21:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][100/1251] eta 0:06:03 lr 0.000034 time 0.2920 (0.3154) loss 3.1632 (3.0441) grad_norm 2.8561 (3.0899) [2022-10-02 21:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][200/1251] eta 0:05:17 lr 0.000034 time 0.2867 (0.3019) loss 2.9266 (3.0209) grad_norm 2.8876 (3.1485) [2022-10-02 21:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][300/1251] eta 0:04:42 lr 0.000034 time 0.2869 (0.2975) loss 2.0422 (3.0044) grad_norm 2.9354 (3.1163) [2022-10-02 21:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][400/1251] eta 0:04:11 lr 0.000034 time 0.2898 (0.2953) loss 3.1895 (3.0124) grad_norm 3.3117 (3.1104) [2022-10-02 21:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][500/1251] eta 0:03:40 lr 0.000034 time 0.2877 (0.2939) loss 2.9026 (3.0084) grad_norm 2.8463 (3.0929) [2022-10-02 21:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][600/1251] eta 0:03:10 lr 0.000033 time 0.2876 (0.2930) loss 3.1594 (3.0047) grad_norm 2.5316 (3.0783) [2022-10-02 21:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][700/1251] eta 0:02:41 lr 0.000033 time 0.2881 (0.2922) loss 2.4724 (3.0104) grad_norm 3.0496 (3.0695) [2022-10-02 21:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][800/1251] eta 0:02:11 lr 0.000033 time 0.2865 (0.2917) loss 2.9963 (3.0189) grad_norm 2.7779 (3.0723) [2022-10-02 21:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][900/1251] eta 0:01:42 lr 0.000033 time 0.2864 (0.2913) loss 3.0710 (3.0200) grad_norm 2.5382 (3.0828) [2022-10-02 21:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1000/1251] eta 0:01:13 lr 0.000033 time 0.2883 (0.2910) loss 2.1297 (3.0215) grad_norm 2.8370 (3.0852) [2022-10-02 21:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1100/1251] eta 0:00:43 lr 0.000033 time 0.2878 (0.2907) loss 2.8649 (3.0221) grad_norm 3.7086 (3.0827) [2022-10-02 21:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1200/1251] eta 0:00:14 lr 0.000033 time 0.2878 (0.2904) loss 2.9298 (3.0230) grad_norm 3.4097 (3.1004) [2022-10-02 21:19:57 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 270 training takes 0:06:03 [2022-10-02 21:19:57 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saving...... [2022-10-02 21:19:57 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saved !!! [2022-10-02 21:20:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.723 (2.723) Loss 0.8286 (0.8286) Acc@1 81.836 (81.836) Acc@5 95.312 (95.312) [2022-10-02 21:20:10 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.836 Acc@5 95.488 [2022-10-02 21:20:10 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:20:10 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.84% [2022-10-02 21:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][0/1251] eta 1:09:53 lr 0.000033 time 3.3519 (3.3519) loss 3.4025 (3.4025) grad_norm 3.4837 (3.4837) [2022-10-02 21:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][100/1251] eta 0:06:08 lr 0.000033 time 0.2881 (0.3203) loss 3.2702 (2.9980) grad_norm 3.3395 (3.0786) [2022-10-02 21:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][200/1251] eta 0:05:20 lr 0.000032 time 0.2903 (0.3048) loss 3.1727 (2.9921) grad_norm 2.8799 (3.0763) [2022-10-02 21:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][300/1251] eta 0:04:45 lr 0.000032 time 0.2890 (0.2997) loss 3.4404 (2.9841) grad_norm 2.7473 (3.1127) [2022-10-02 21:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][400/1251] eta 0:04:12 lr 0.000032 time 0.2891 (0.2971) loss 2.3083 (2.9962) grad_norm 4.3440 (3.1229) [2022-10-02 21:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][500/1251] eta 0:03:41 lr 0.000032 time 0.2866 (0.2955) loss 3.3699 (2.9876) grad_norm 3.0951 (3.1261) [2022-10-02 21:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][600/1251] eta 0:03:11 lr 0.000032 time 0.2890 (0.2945) loss 3.1517 (3.0042) grad_norm 3.0854 (3.1169) [2022-10-02 21:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][700/1251] eta 0:02:41 lr 0.000032 time 0.2881 (0.2937) loss 2.1128 (3.0148) grad_norm 2.8749 (3.1194) [2022-10-02 21:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][800/1251] eta 0:02:12 lr 0.000032 time 0.2896 (0.2931) loss 3.1035 (3.0127) grad_norm 3.0309 (3.1237) [2022-10-02 21:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][900/1251] eta 0:01:42 lr 0.000032 time 0.2883 (0.2927) loss 3.4320 (3.0099) grad_norm 2.7168 (3.1218) [2022-10-02 21:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1000/1251] eta 0:01:13 lr 0.000031 time 0.2860 (0.2924) loss 2.3906 (3.0010) grad_norm 2.7722 (3.1253) [2022-10-02 21:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1100/1251] eta 0:00:44 lr 0.000031 time 0.2891 (0.2920) loss 3.8289 (3.0035) grad_norm 3.8310 (3.1253) [2022-10-02 21:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1200/1251] eta 0:00:14 lr 0.000031 time 0.2903 (0.2918) loss 3.4293 (2.9999) grad_norm 3.4707 (3.1270) [2022-10-02 21:26:15 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 271 training takes 0:06:05 [2022-10-02 21:26:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.200 (3.200) Loss 0.9171 (0.9171) Acc@1 77.832 (77.832) Acc@5 94.922 (94.922) [2022-10-02 21:26:28 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.826 Acc@5 95.500 [2022-10-02 21:26:28 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:26:28 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.84% [2022-10-02 21:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][0/1251] eta 0:49:11 lr 0.000031 time 2.3593 (2.3593) loss 3.0539 (3.0539) grad_norm 3.2035 (3.2035) [2022-10-02 21:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][100/1251] eta 0:06:03 lr 0.000031 time 0.2885 (0.3157) loss 3.7538 (3.0283) grad_norm 3.8977 (3.1930) [2022-10-02 21:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][200/1251] eta 0:05:17 lr 0.000031 time 0.2842 (0.3020) loss 2.7709 (3.0303) grad_norm 2.9615 (3.1432) [2022-10-02 21:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][300/1251] eta 0:04:42 lr 0.000031 time 0.2886 (0.2972) loss 3.2966 (3.0368) grad_norm 3.4726 (3.1308) [2022-10-02 21:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][400/1251] eta 0:04:10 lr 0.000031 time 0.2902 (0.2948) loss 3.1340 (3.0403) grad_norm 2.9814 (3.1193) [2022-10-02 21:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][500/1251] eta 0:03:40 lr 0.000031 time 0.2861 (0.2933) loss 2.9146 (3.0423) grad_norm 2.8136 (3.0962) [2022-10-02 21:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][600/1251] eta 0:03:10 lr 0.000030 time 0.2875 (0.2923) loss 1.8565 (3.0347) grad_norm 3.4558 (3.0941) [2022-10-02 21:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][700/1251] eta 0:02:40 lr 0.000030 time 0.2839 (0.2917) loss 2.5812 (3.0254) grad_norm 3.7804 (3.0985) [2022-10-02 21:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][800/1251] eta 0:02:11 lr 0.000030 time 0.2860 (0.2911) loss 3.4510 (3.0144) grad_norm 3.1114 (3.1060) [2022-10-02 21:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][900/1251] eta 0:01:42 lr 0.000030 time 0.2856 (0.2906) loss 3.1415 (3.0092) grad_norm 2.8732 (3.1084) [2022-10-02 21:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1000/1251] eta 0:01:12 lr 0.000030 time 0.2894 (0.2903) loss 1.7949 (3.0091) grad_norm 3.1956 (3.1155) [2022-10-02 21:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1100/1251] eta 0:00:43 lr 0.000030 time 0.2890 (0.2899) loss 1.8584 (3.0077) grad_norm 3.2153 (3.1185) [2022-10-02 21:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1200/1251] eta 0:00:14 lr 0.000030 time 0.2930 (0.2897) loss 3.0979 (3.0020) grad_norm 3.3820 (3.1286) [2022-10-02 21:32:31 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 272 training takes 0:06:02 [2022-10-02 21:32:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.785 (2.785) Loss 0.7857 (0.7857) Acc@1 80.762 (80.762) Acc@5 95.801 (95.801) [2022-10-02 21:32:43 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.798 Acc@5 95.448 [2022-10-02 21:32:43 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:32:43 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.84% [2022-10-02 21:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][0/1251] eta 0:57:07 lr 0.000030 time 2.7399 (2.7399) loss 3.2900 (3.2900) grad_norm 2.8383 (2.8383) [2022-10-02 21:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][100/1251] eta 0:06:06 lr 0.000030 time 0.2871 (0.3188) loss 2.8205 (2.9894) grad_norm 3.4190 (3.0680) [2022-10-02 21:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][200/1251] eta 0:05:19 lr 0.000029 time 0.2915 (0.3044) loss 3.7402 (2.9671) grad_norm 3.1278 (3.1466) [2022-10-02 21:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][300/1251] eta 0:04:44 lr 0.000029 time 0.2890 (0.2996) loss 3.4719 (2.9800) grad_norm 3.3395 (3.1535) [2022-10-02 21:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][400/1251] eta 0:04:12 lr 0.000029 time 0.2930 (0.2971) loss 3.0346 (2.9811) grad_norm 2.8149 (3.1560) [2022-10-02 21:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][500/1251] eta 0:03:42 lr 0.000029 time 0.2901 (0.2957) loss 2.6814 (2.9745) grad_norm 3.2319 (3.1501) [2022-10-02 21:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][600/1251] eta 0:03:11 lr 0.000029 time 0.2900 (0.2947) loss 3.0220 (2.9724) grad_norm 2.9683 (3.1571) [2022-10-02 21:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][700/1251] eta 0:02:41 lr 0.000029 time 0.2886 (0.2939) loss 2.7549 (2.9752) grad_norm 2.6720 (3.1505) [2022-10-02 21:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][800/1251] eta 0:02:12 lr 0.000029 time 0.2902 (0.2935) loss 3.3112 (2.9726) grad_norm 3.2706 (3.1513) [2022-10-02 21:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][900/1251] eta 0:01:42 lr 0.000029 time 0.2890 (0.2931) loss 2.5510 (2.9818) grad_norm 3.3906 (3.1597) [2022-10-02 21:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1000/1251] eta 0:01:13 lr 0.000029 time 0.2901 (0.2927) loss 3.0454 (2.9845) grad_norm 2.8143 (3.1657) [2022-10-02 21:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1100/1251] eta 0:00:44 lr 0.000028 time 0.2891 (0.2925) loss 3.1564 (2.9858) grad_norm 2.9431 (3.1673) [2022-10-02 21:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1200/1251] eta 0:00:14 lr 0.000028 time 0.2859 (0.2922) loss 3.7030 (2.9860) grad_norm 2.9745 (3.1633) [2022-10-02 21:38:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 273 training takes 0:06:05 [2022-10-02 21:38:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.631 (2.631) Loss 0.8466 (0.8466) Acc@1 80.762 (80.762) Acc@5 95.605 (95.605) [2022-10-02 21:39:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.832 Acc@5 95.470 [2022-10-02 21:39:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-10-02 21:39:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.84% [2022-10-02 21:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][0/1251] eta 0:48:53 lr 0.000028 time 2.3446 (2.3446) loss 3.5689 (3.5689) grad_norm 3.7064 (3.7064) [2022-10-02 21:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][100/1251] eta 0:06:08 lr 0.000028 time 0.2885 (0.3200) loss 3.4492 (2.9319) grad_norm 2.8659 (3.1905) [2022-10-02 21:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][200/1251] eta 0:05:21 lr 0.000028 time 0.2941 (0.3060) loss 3.0844 (2.9006) grad_norm 4.4751 (3.2493) [2022-10-02 21:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][300/1251] eta 0:04:46 lr 0.000028 time 0.2901 (0.3013) loss 3.4621 (2.9210) grad_norm 3.2371 (3.1956) [2022-10-02 21:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][400/1251] eta 0:04:14 lr 0.000028 time 0.2864 (0.2987) loss 2.8542 (2.9625) grad_norm 3.1487 (3.1737) [2022-10-02 21:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][500/1251] eta 0:03:43 lr 0.000028 time 0.2934 (0.2972) loss 2.9832 (2.9786) grad_norm 4.2683 (3.1854) [2022-10-02 21:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][600/1251] eta 0:03:12 lr 0.000028 time 0.2892 (0.2961) loss 2.8295 (2.9753) grad_norm 3.0654 (3.1820) [2022-10-02 21:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][700/1251] eta 0:02:42 lr 0.000027 time 0.2918 (0.2952) loss 3.2809 (2.9790) grad_norm 3.3410 (3.1740) [2022-10-02 21:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][800/1251] eta 0:02:12 lr 0.000027 time 0.2892 (0.2946) loss 3.2129 (2.9925) grad_norm 2.9088 (3.1727) [2022-10-02 21:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][900/1251] eta 0:01:43 lr 0.000027 time 0.2902 (0.2941) loss 2.6843 (2.9954) grad_norm 2.6104 (3.1827) [2022-10-02 21:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1000/1251] eta 0:01:13 lr 0.000027 time 0.2886 (0.2937) loss 3.0713 (2.9967) grad_norm 3.2262 (3.1800) [2022-10-02 21:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1100/1251] eta 0:00:44 lr 0.000027 time 0.2921 (0.2933) loss 2.0363 (2.9924) grad_norm 2.8696 (3.1743) [2022-10-02 21:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1200/1251] eta 0:00:14 lr 0.000027 time 0.2917 (0.2931) loss 3.2720 (2.9907) grad_norm 3.0669 (3.1639) [2022-10-02 21:45:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 274 training takes 0:06:06 [2022-10-02 21:45:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.779 (2.779) Loss 0.7554 (0.7554) Acc@1 83.301 (83.301) Acc@5 95.703 (95.703) [2022-10-02 21:45:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.904 Acc@5 95.450 [2022-10-02 21:45:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-02 21:45:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.90% [2022-10-02 21:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][0/1251] eta 0:50:40 lr 0.000027 time 2.4306 (2.4306) loss 3.4251 (3.4251) grad_norm 2.7857 (2.7857) [2022-10-02 21:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][100/1251] eta 0:06:04 lr 0.000027 time 0.2870 (0.3170) loss 3.1200 (3.0432) grad_norm 2.9568 (3.2335) [2022-10-02 21:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][200/1251] eta 0:05:18 lr 0.000027 time 0.2874 (0.3031) loss 3.2691 (3.0008) grad_norm 13.1755 (3.3024) [2022-10-02 21:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][300/1251] eta 0:04:43 lr 0.000027 time 0.2913 (0.2984) loss 2.7783 (2.9916) grad_norm 3.6621 (3.2664) [2022-10-02 21:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][400/1251] eta 0:04:11 lr 0.000026 time 0.2870 (0.2961) loss 3.2876 (2.9897) grad_norm 2.8049 (3.2547) [2022-10-02 21:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][500/1251] eta 0:03:41 lr 0.000026 time 0.2912 (0.2946) loss 2.3329 (2.9884) grad_norm 3.3597 (3.2240) [2022-10-02 21:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][600/1251] eta 0:03:11 lr 0.000026 time 0.2867 (0.2937) loss 2.9185 (2.9897) grad_norm 3.2951 (3.2249) [2022-10-02 21:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][700/1251] eta 0:02:41 lr 0.000026 time 0.2920 (0.2929) loss 1.8820 (2.9805) grad_norm 2.8091 (3.2341) [2022-10-02 21:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][800/1251] eta 0:02:11 lr 0.000026 time 0.2896 (0.2924) loss 2.1937 (2.9912) grad_norm 2.8930 (3.2286) [2022-10-02 21:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][900/1251] eta 0:01:42 lr 0.000026 time 0.2902 (0.2920) loss 3.4099 (2.9883) grad_norm 2.8769 (3.2202) [2022-10-02 21:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1000/1251] eta 0:01:13 lr 0.000026 time 0.2891 (0.2916) loss 2.9813 (2.9899) grad_norm 3.0424 (3.2142) [2022-10-02 21:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1100/1251] eta 0:00:43 lr 0.000026 time 0.2925 (0.2913) loss 2.9549 (2.9949) grad_norm 2.9970 (3.2189) [2022-10-02 21:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1200/1251] eta 0:00:14 lr 0.000026 time 0.2877 (0.2911) loss 2.7123 (2.9984) grad_norm 3.7964 (3.2198) [2022-10-02 21:51:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 275 training takes 0:06:04 [2022-10-02 21:51:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.828 (2.828) Loss 0.8248 (0.8248) Acc@1 80.566 (80.566) Acc@5 95.410 (95.410) [2022-10-02 21:51:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.880 Acc@5 95.448 [2022-10-02 21:51:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-10-02 21:51:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.90% [2022-10-02 21:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][0/1251] eta 1:10:15 lr 0.000026 time 3.3695 (3.3695) loss 3.9062 (3.9062) grad_norm 2.9635 (2.9635) [2022-10-02 21:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][100/1251] eta 0:06:11 lr 0.000025 time 0.2869 (0.3228) loss 3.4765 (3.0061) grad_norm 2.8357 (3.1946) [2022-10-02 21:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][200/1251] eta 0:05:22 lr 0.000025 time 0.2902 (0.3073) loss 3.6084 (3.0162) grad_norm 3.1075 (3.2646) [2022-10-02 21:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][300/1251] eta 0:04:47 lr 0.000025 time 0.2912 (0.3022) loss 3.5628 (3.0089) grad_norm 3.3052 (3.2417) [2022-10-02 21:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][400/1251] eta 0:04:14 lr 0.000025 time 0.2881 (0.2995) loss 3.3546 (3.0215) grad_norm 2.8447 (3.2502) [2022-10-02 21:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][500/1251] eta 0:03:43 lr 0.000025 time 0.2902 (0.2979) loss 3.6395 (3.0170) grad_norm 3.7676 (3.2546) [2022-10-02 21:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][600/1251] eta 0:03:13 lr 0.000025 time 0.2886 (0.2968) loss 3.0062 (3.0224) grad_norm 3.1806 (3.2522) [2022-10-02 21:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][700/1251] eta 0:02:43 lr 0.000025 time 0.2931 (0.2961) loss 2.4717 (3.0104) grad_norm 3.2068 (3.2413) [2022-10-02 21:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][800/1251] eta 0:02:13 lr 0.000025 time 0.2867 (0.2954) loss 1.8965 (3.0164) grad_norm 4.4172 (3.2456) [2022-10-02 21:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][900/1251] eta 0:01:43 lr 0.000025 time 0.2927 (0.2950) loss 3.3687 (3.0177) grad_norm 3.1187 (3.2432) [2022-10-02 21:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1000/1251] eta 0:01:13 lr 0.000025 time 0.2884 (0.2946) loss 2.0365 (3.0140) grad_norm 3.3182 (3.2388) [2022-10-02 21:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1100/1251] eta 0:00:44 lr 0.000024 time 0.2865 (0.2942) loss 3.3037 (3.0134) grad_norm 3.0251 (3.2445) [2022-10-02 21:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1200/1251] eta 0:00:14 lr 0.000024 time 0.2891 (0.2939) loss 3.3952 (3.0118) grad_norm 3.3871 (3.2449) [2022-10-02 21:57:47 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 276 training takes 0:06:07 [2022-10-02 21:57:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.202 (3.202) Loss 0.7514 (0.7514) Acc@1 82.227 (82.227) Acc@5 96.094 (96.094) [2022-10-02 21:58:00 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.968 Acc@5 95.422 [2022-10-02 21:58:00 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 21:58:00 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.97% [2022-10-02 21:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][0/1251] eta 0:57:59 lr 0.000024 time 2.7817 (2.7817) loss 2.4298 (2.4298) grad_norm 2.7163 (2.7163) [2022-10-02 21:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][100/1251] eta 0:06:03 lr 0.000024 time 0.2927 (0.3156) loss 2.9557 (2.9734) grad_norm 3.0998 (3.2214) [2022-10-02 21:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][200/1251] eta 0:05:17 lr 0.000024 time 0.2853 (0.3026) loss 2.4902 (2.9625) grad_norm 3.7889 (3.2475) [2022-10-02 21:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][300/1251] eta 0:04:43 lr 0.000024 time 0.2934 (0.2981) loss 2.7658 (2.9807) grad_norm 3.4154 (3.2363) [2022-10-02 21:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][400/1251] eta 0:04:11 lr 0.000024 time 0.2886 (0.2958) loss 3.4716 (2.9947) grad_norm 3.2437 (3.2486) [2022-10-02 22:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][500/1251] eta 0:03:41 lr 0.000024 time 0.2916 (0.2944) loss 3.0924 (3.0071) grad_norm 4.3188 (3.2431) [2022-10-02 22:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][600/1251] eta 0:03:11 lr 0.000024 time 0.2868 (0.2935) loss 3.0028 (3.0062) grad_norm 3.5630 (3.2274) [2022-10-02 22:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][700/1251] eta 0:02:41 lr 0.000024 time 0.2884 (0.2929) loss 2.6887 (3.0058) grad_norm 2.9856 (3.2188) [2022-10-02 22:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][800/1251] eta 0:02:11 lr 0.000024 time 0.2844 (0.2925) loss 3.3831 (3.0063) grad_norm 2.8265 (3.2278) [2022-10-02 22:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][900/1251] eta 0:01:42 lr 0.000023 time 0.2906 (0.2921) loss 2.6282 (3.0102) grad_norm 3.8526 (3.2280) [2022-10-02 22:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1000/1251] eta 0:01:13 lr 0.000023 time 0.2836 (0.2919) loss 3.0397 (3.0028) grad_norm 3.1379 (3.2360) [2022-10-02 22:03:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1100/1251] eta 0:00:44 lr 0.000023 time 0.2925 (0.2917) loss 2.6527 (3.0005) grad_norm 3.0267 (3.2360) [2022-10-02 22:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1200/1251] eta 0:00:14 lr 0.000023 time 0.2891 (0.2915) loss 2.5389 (2.9983) grad_norm 3.1900 (3.2296) [2022-10-02 22:04:04 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 277 training takes 0:06:04 [2022-10-02 22:04:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.891 (2.891) Loss 0.7673 (0.7673) Acc@1 82.422 (82.422) Acc@5 95.312 (95.312) [2022-10-02 22:04:17 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 80.984 Acc@5 95.448 [2022-10-02 22:04:17 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 22:04:17 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 80.98% [2022-10-02 22:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][0/1251] eta 1:10:43 lr 0.000023 time 3.3922 (3.3922) loss 3.1972 (3.1972) grad_norm 3.0480 (3.0480) [2022-10-02 22:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][100/1251] eta 0:06:13 lr 0.000023 time 0.2881 (0.3242) loss 3.1258 (3.0023) grad_norm 3.2169 (3.1224) [2022-10-02 22:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][200/1251] eta 0:05:23 lr 0.000023 time 0.2981 (0.3081) loss 3.6065 (2.9977) grad_norm 3.1673 (3.2246) [2022-10-02 22:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][300/1251] eta 0:04:47 lr 0.000023 time 0.2883 (0.3026) loss 3.7637 (2.9973) grad_norm 4.0912 (3.2518) [2022-10-02 22:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][400/1251] eta 0:04:15 lr 0.000023 time 0.2946 (0.2998) loss 3.0768 (3.0126) grad_norm 2.9384 (3.2437) [2022-10-02 22:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][500/1251] eta 0:03:43 lr 0.000023 time 0.2857 (0.2981) loss 3.1682 (3.0198) grad_norm 3.9369 (3.2167) [2022-10-02 22:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][600/1251] eta 0:03:13 lr 0.000023 time 0.2910 (0.2969) loss 3.4502 (3.0186) grad_norm 2.9089 (3.2168) [2022-10-02 22:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][700/1251] eta 0:02:43 lr 0.000022 time 0.2906 (0.2961) loss 3.2130 (3.0150) grad_norm 3.4901 (3.2115) [2022-10-02 22:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][800/1251] eta 0:02:13 lr 0.000022 time 0.2909 (0.2954) loss 3.3704 (3.0113) grad_norm 3.5260 (3.2132) [2022-10-02 22:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][900/1251] eta 0:01:43 lr 0.000022 time 0.2853 (0.2948) loss 2.8107 (3.0182) grad_norm 3.1060 (3.2166) [2022-10-02 22:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1000/1251] eta 0:01:13 lr 0.000022 time 0.2891 (0.2943) loss 3.0611 (3.0144) grad_norm 2.9062 (3.2145) [2022-10-02 22:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1100/1251] eta 0:00:44 lr 0.000022 time 0.2859 (0.2939) loss 2.8943 (3.0122) grad_norm 3.4388 (3.2218) [2022-10-02 22:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1200/1251] eta 0:00:14 lr 0.000022 time 0.2900 (0.2936) loss 2.3939 (3.0088) grad_norm 3.5104 (3.2269) [2022-10-02 22:10:25 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 278 training takes 0:06:07 [2022-10-02 22:10:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.232 (3.232) Loss 0.7789 (0.7789) Acc@1 81.641 (81.641) Acc@5 96.289 (96.289) [2022-10-02 22:10:38 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.014 Acc@5 95.470 [2022-10-02 22:10:38 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 22:10:38 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.01% [2022-10-02 22:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][0/1251] eta 1:09:32 lr 0.000022 time 3.3356 (3.3356) loss 2.9297 (2.9297) grad_norm 3.1281 (3.1281) [2022-10-02 22:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][100/1251] eta 0:06:09 lr 0.000022 time 0.2949 (0.3208) loss 1.8484 (3.0165) grad_norm 2.8172 (3.3127) [2022-10-02 22:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][200/1251] eta 0:05:21 lr 0.000022 time 0.2891 (0.3056) loss 3.3460 (2.9869) grad_norm 2.9987 (3.3271) [2022-10-02 22:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][300/1251] eta 0:04:45 lr 0.000022 time 0.2854 (0.3003) loss 3.3399 (2.9875) grad_norm 2.4671 (3.3195) [2022-10-02 22:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][400/1251] eta 0:04:13 lr 0.000022 time 0.2909 (0.2976) loss 3.2799 (2.9808) grad_norm 3.3607 (3.3348) [2022-10-02 22:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][500/1251] eta 0:03:42 lr 0.000021 time 0.2873 (0.2960) loss 3.2982 (2.9691) grad_norm 3.2279 (3.3108) [2022-10-02 22:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][600/1251] eta 0:03:12 lr 0.000021 time 0.2928 (0.2950) loss 2.6428 (2.9836) grad_norm 3.4128 (3.3034) [2022-10-02 22:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][700/1251] eta 0:02:42 lr 0.000021 time 0.2858 (0.2942) loss 3.0307 (3.0021) grad_norm 5.9913 (3.2992) [2022-10-02 22:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][800/1251] eta 0:02:12 lr 0.000021 time 0.2926 (0.2935) loss 3.2504 (2.9966) grad_norm 3.4207 (3.3035) [2022-10-02 22:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][900/1251] eta 0:01:42 lr 0.000021 time 0.2853 (0.2930) loss 2.8750 (3.0002) grad_norm 3.5602 (3.3057) [2022-10-02 22:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1000/1251] eta 0:01:13 lr 0.000021 time 0.2922 (0.2925) loss 2.9161 (2.9952) grad_norm 4.0373 (3.2944) [2022-10-02 22:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1100/1251] eta 0:00:44 lr 0.000021 time 0.2871 (0.2922) loss 2.6754 (2.9940) grad_norm 3.4839 (3.2904) [2022-10-02 22:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1200/1251] eta 0:00:14 lr 0.000021 time 0.2881 (0.2919) loss 3.3335 (2.9934) grad_norm 2.9134 (3.2834) [2022-10-02 22:16:43 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 279 training takes 0:06:05 [2022-10-02 22:16:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.556 (2.556) Loss 0.8249 (0.8249) Acc@1 80.664 (80.664) Acc@5 95.508 (95.508) [2022-10-02 22:16:56 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.044 Acc@5 95.470 [2022-10-02 22:16:56 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 22:16:56 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.04% [2022-10-02 22:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][0/1251] eta 0:52:03 lr 0.000021 time 2.4969 (2.4969) loss 2.7202 (2.7202) grad_norm 3.4874 (3.4874) [2022-10-02 22:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][100/1251] eta 0:06:02 lr 0.000021 time 0.2868 (0.3147) loss 3.1789 (2.9513) grad_norm 3.0193 (3.2017) [2022-10-02 22:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][200/1251] eta 0:05:17 lr 0.000021 time 0.2891 (0.3017) loss 3.1013 (2.9561) grad_norm 2.7942 (3.2311) [2022-10-02 22:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][300/1251] eta 0:04:42 lr 0.000021 time 0.2861 (0.2974) loss 2.6531 (2.9760) grad_norm 3.5754 (3.2314) [2022-10-02 22:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][400/1251] eta 0:04:11 lr 0.000020 time 0.2891 (0.2951) loss 3.5409 (2.9859) grad_norm 3.5973 (3.2428) [2022-10-02 22:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][500/1251] eta 0:03:40 lr 0.000020 time 0.2892 (0.2938) loss 3.0425 (2.9716) grad_norm 3.1955 (3.2443) [2022-10-02 22:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][600/1251] eta 0:03:10 lr 0.000020 time 0.2888 (0.2929) loss 3.2683 (2.9814) grad_norm 3.0387 (3.2449) [2022-10-02 22:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][700/1251] eta 0:02:41 lr 0.000020 time 0.2848 (0.2923) loss 3.1651 (2.9730) grad_norm 3.1590 (3.2383) [2022-10-02 22:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][800/1251] eta 0:02:11 lr 0.000020 time 0.2867 (0.2918) loss 2.7557 (2.9704) grad_norm 3.2791 (3.2355) [2022-10-02 22:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][900/1251] eta 0:01:42 lr 0.000020 time 0.2898 (0.2914) loss 2.3842 (2.9781) grad_norm 3.2893 (3.2401) [2022-10-02 22:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1000/1251] eta 0:01:13 lr 0.000020 time 0.2956 (0.2911) loss 2.8663 (2.9791) grad_norm 2.4647 (3.2363) [2022-10-02 22:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1100/1251] eta 0:00:43 lr 0.000020 time 0.2853 (0.2909) loss 3.0454 (2.9813) grad_norm 2.8537 (3.2423) [2022-10-02 22:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1200/1251] eta 0:00:14 lr 0.000020 time 0.2873 (0.2907) loss 3.0697 (2.9820) grad_norm 3.1552 (3.2446) [2022-10-02 22:23:00 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 280 training takes 0:06:03 [2022-10-02 22:23:00 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saving...... [2022-10-02 22:23:00 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saved !!! [2022-10-02 22:23:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.895 (2.895) Loss 0.7905 (0.7905) Acc@1 81.445 (81.445) Acc@5 96.191 (96.191) [2022-10-02 22:23:13 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.068 Acc@5 95.502 [2022-10-02 22:23:13 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 22:23:13 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-02 22:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][0/1251] eta 0:55:59 lr 0.000020 time 2.6852 (2.6852) loss 3.1728 (3.1728) grad_norm 3.2165 (3.2165) [2022-10-02 22:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][100/1251] eta 0:06:05 lr 0.000020 time 0.2896 (0.3179) loss 3.1388 (3.0293) grad_norm 3.4627 (3.3073) [2022-10-02 22:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][200/1251] eta 0:05:19 lr 0.000020 time 0.2882 (0.3039) loss 3.1189 (3.0048) grad_norm 3.1884 (3.2947) [2022-10-02 22:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][300/1251] eta 0:04:44 lr 0.000020 time 0.2894 (0.2992) loss 3.1484 (2.9912) grad_norm 3.1000 (3.2886) [2022-10-02 22:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][400/1251] eta 0:04:12 lr 0.000019 time 0.2871 (0.2968) loss 3.2364 (2.9884) grad_norm 2.6763 (3.2617) [2022-10-02 22:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][500/1251] eta 0:03:41 lr 0.000019 time 0.2849 (0.2952) loss 2.9212 (2.9756) grad_norm 3.7963 (3.2669) [2022-10-02 22:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][600/1251] eta 0:03:11 lr 0.000019 time 0.2894 (0.2941) loss 3.6104 (2.9728) grad_norm 3.7036 (3.2790) [2022-10-02 22:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][700/1251] eta 0:02:41 lr 0.000019 time 0.2897 (0.2934) loss 3.3992 (2.9717) grad_norm 3.6050 (3.2736) [2022-10-02 22:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][800/1251] eta 0:02:12 lr 0.000019 time 0.2857 (0.2928) loss 2.9473 (2.9794) grad_norm 3.4064 (3.2644) [2022-10-02 22:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][900/1251] eta 0:01:42 lr 0.000019 time 0.2885 (0.2922) loss 1.9973 (2.9783) grad_norm 4.2271 (3.2604) [2022-10-02 22:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1000/1251] eta 0:01:13 lr 0.000019 time 0.2895 (0.2918) loss 2.9307 (2.9813) grad_norm 2.8970 (3.2564) [2022-10-02 22:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1100/1251] eta 0:00:44 lr 0.000019 time 0.2846 (0.2914) loss 2.5842 (2.9796) grad_norm 3.2959 (3.2649) [2022-10-02 22:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1200/1251] eta 0:00:14 lr 0.000019 time 0.2865 (0.2911) loss 2.4330 (2.9797) grad_norm 3.6376 (3.2660) [2022-10-02 22:29:17 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 281 training takes 0:06:04 [2022-10-02 22:29:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.402 (3.402) Loss 0.8314 (0.8314) Acc@1 81.738 (81.738) Acc@5 95.215 (95.215) [2022-10-02 22:29:30 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.052 Acc@5 95.456 [2022-10-02 22:29:30 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 22:29:30 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-02 22:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][0/1251] eta 0:52:22 lr 0.000019 time 2.5116 (2.5116) loss 3.2810 (3.2810) grad_norm 2.6861 (2.6861) [2022-10-02 22:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][100/1251] eta 0:06:01 lr 0.000019 time 0.2857 (0.3142) loss 3.0371 (2.9524) grad_norm 3.5661 (3.2189) [2022-10-02 22:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][200/1251] eta 0:05:16 lr 0.000019 time 0.2905 (0.3011) loss 2.4981 (2.9523) grad_norm 2.9662 (3.2724) [2022-10-02 22:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][300/1251] eta 0:04:42 lr 0.000019 time 0.2875 (0.2969) loss 3.0446 (2.9556) grad_norm 4.1899 (3.2558) [2022-10-02 22:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][400/1251] eta 0:04:10 lr 0.000018 time 0.2881 (0.2949) loss 3.5661 (2.9733) grad_norm 2.7408 (3.2662) [2022-10-02 22:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][500/1251] eta 0:03:40 lr 0.000018 time 0.2868 (0.2936) loss 2.7136 (2.9690) grad_norm 4.3166 (3.2661) [2022-10-02 22:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][600/1251] eta 0:03:10 lr 0.000018 time 0.2881 (0.2928) loss 3.3057 (2.9701) grad_norm 3.3785 (3.2559) [2022-10-02 22:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][700/1251] eta 0:02:40 lr 0.000018 time 0.2877 (0.2921) loss 3.5570 (2.9669) grad_norm 3.2977 (3.2536) [2022-10-02 22:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][800/1251] eta 0:02:11 lr 0.000018 time 0.2842 (0.2917) loss 3.3422 (2.9745) grad_norm 2.7770 (3.2602) [2022-10-02 22:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][900/1251] eta 0:01:42 lr 0.000018 time 0.2853 (0.2913) loss 3.3660 (2.9820) grad_norm 3.0277 (3.2528) [2022-10-02 22:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1000/1251] eta 0:01:13 lr 0.000018 time 0.2892 (0.2909) loss 2.5281 (2.9799) grad_norm 2.9819 (3.2513) [2022-10-02 22:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1100/1251] eta 0:00:43 lr 0.000018 time 0.2872 (0.2906) loss 2.9846 (2.9834) grad_norm 3.2772 (3.2620) [2022-10-02 22:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1200/1251] eta 0:00:14 lr 0.000018 time 0.2878 (0.2904) loss 2.8376 (2.9855) grad_norm 3.3274 (3.2660) [2022-10-02 22:35:34 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 282 training takes 0:06:03 [2022-10-02 22:35:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.220 (2.220) Loss 0.8075 (0.8075) Acc@1 81.152 (81.152) Acc@5 95.801 (95.801) [2022-10-02 22:35:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.026 Acc@5 95.476 [2022-10-02 22:35:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 22:35:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-02 22:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][0/1251] eta 0:50:54 lr 0.000018 time 2.4420 (2.4420) loss 1.9146 (1.9146) grad_norm 2.8411 (2.8411) [2022-10-02 22:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][100/1251] eta 0:06:01 lr 0.000018 time 0.2842 (0.3140) loss 1.9363 (2.9301) grad_norm 3.1889 (3.2918) [2022-10-02 22:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][200/1251] eta 0:05:16 lr 0.000018 time 0.2908 (0.3009) loss 3.4296 (2.9704) grad_norm 2.7116 (3.3183) [2022-10-02 22:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][300/1251] eta 0:04:41 lr 0.000018 time 0.2882 (0.2965) loss 2.8186 (2.9921) grad_norm 2.9939 (3.3134) [2022-10-02 22:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][400/1251] eta 0:04:10 lr 0.000018 time 0.2883 (0.2942) loss 3.2795 (2.9932) grad_norm 3.5222 (3.3183) [2022-10-02 22:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][500/1251] eta 0:03:39 lr 0.000017 time 0.2838 (0.2928) loss 3.3743 (2.9832) grad_norm 3.2939 (3.3104) [2022-10-02 22:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][600/1251] eta 0:03:10 lr 0.000017 time 0.2869 (0.2919) loss 3.0253 (2.9860) grad_norm 2.7504 (3.3128) [2022-10-02 22:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][700/1251] eta 0:02:40 lr 0.000017 time 0.2856 (0.2913) loss 2.4995 (2.9848) grad_norm 2.9394 (3.3075) [2022-10-02 22:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][800/1251] eta 0:02:11 lr 0.000017 time 0.2880 (0.2908) loss 3.1997 (2.9870) grad_norm 3.3471 (3.3074) [2022-10-02 22:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][900/1251] eta 0:01:41 lr 0.000017 time 0.2870 (0.2904) loss 3.5004 (2.9831) grad_norm 3.2033 (3.3057) [2022-10-02 22:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1000/1251] eta 0:01:12 lr 0.000017 time 0.2896 (0.2902) loss 2.8485 (2.9792) grad_norm 3.1232 (3.3054) [2022-10-02 22:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1100/1251] eta 0:00:43 lr 0.000017 time 0.2878 (0.2899) loss 1.9256 (2.9783) grad_norm 3.1616 (3.3047) [2022-10-02 22:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1200/1251] eta 0:00:14 lr 0.000017 time 0.2912 (0.2897) loss 3.3782 (2.9803) grad_norm 3.3090 (3.3052) [2022-10-02 22:41:49 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 283 training takes 0:06:02 [2022-10-02 22:41:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.948 (2.948) Loss 0.8659 (0.8659) Acc@1 78.418 (78.418) Acc@5 95.020 (95.020) [2022-10-02 22:42:02 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.048 Acc@5 95.500 [2022-10-02 22:42:02 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 22:42:02 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.07% [2022-10-02 22:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][0/1251] eta 1:10:24 lr 0.000017 time 3.3766 (3.3766) loss 3.6577 (3.6577) grad_norm 2.8440 (2.8440) [2022-10-02 22:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][100/1251] eta 0:06:08 lr 0.000017 time 0.2909 (0.3199) loss 2.9222 (2.9397) grad_norm 3.2604 (3.2981) [2022-10-02 22:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][200/1251] eta 0:05:20 lr 0.000017 time 0.2883 (0.3045) loss 2.0700 (2.9929) grad_norm 2.7619 (3.3586) [2022-10-02 22:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][300/1251] eta 0:04:44 lr 0.000017 time 0.2879 (0.2992) loss 2.8045 (2.9708) grad_norm 3.5663 (3.3581) [2022-10-02 22:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][400/1251] eta 0:04:12 lr 0.000017 time 0.2851 (0.2966) loss 2.1749 (2.9689) grad_norm 3.6153 (3.3616) [2022-10-02 22:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][500/1251] eta 0:03:41 lr 0.000017 time 0.2914 (0.2949) loss 2.0160 (2.9775) grad_norm 2.6295 (3.3343) [2022-10-02 22:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][600/1251] eta 0:03:11 lr 0.000017 time 0.2891 (0.2937) loss 3.1659 (2.9878) grad_norm 2.5689 (3.3286) [2022-10-02 22:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][700/1251] eta 0:02:41 lr 0.000016 time 0.2899 (0.2929) loss 2.9514 (2.9882) grad_norm 3.3378 (3.3245) [2022-10-02 22:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][800/1251] eta 0:02:11 lr 0.000016 time 0.2840 (0.2924) loss 2.3769 (2.9835) grad_norm 3.7089 (3.3222) [2022-10-02 22:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][900/1251] eta 0:01:42 lr 0.000016 time 0.2906 (0.2919) loss 3.1288 (2.9900) grad_norm 2.6540 (3.3258) [2022-10-02 22:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1000/1251] eta 0:01:13 lr 0.000016 time 0.2934 (0.2916) loss 3.2268 (2.9898) grad_norm 3.0940 (3.3405) [2022-10-02 22:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1100/1251] eta 0:00:43 lr 0.000016 time 0.2898 (0.2913) loss 3.4573 (2.9868) grad_norm 4.4277 (3.3562) [2022-10-02 22:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1200/1251] eta 0:00:14 lr 0.000016 time 0.2896 (0.2910) loss 3.2695 (2.9808) grad_norm 3.0864 (3.3506) [2022-10-02 22:48:07 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 284 training takes 0:06:04 [2022-10-02 22:48:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.863 (2.863) Loss 0.8797 (0.8797) Acc@1 79.883 (79.883) Acc@5 95.215 (95.215) [2022-10-02 22:48:19 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.106 Acc@5 95.524 [2022-10-02 22:48:19 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 22:48:19 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.11% [2022-10-02 22:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][0/1251] eta 1:06:54 lr 0.000016 time 3.2091 (3.2091) loss 3.0918 (3.0918) grad_norm 3.6002 (3.6002) [2022-10-02 22:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][100/1251] eta 0:06:09 lr 0.000016 time 0.2916 (0.3212) loss 3.4667 (2.9684) grad_norm 3.4455 (3.4087) [2022-10-02 22:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][200/1251] eta 0:05:21 lr 0.000016 time 0.2889 (0.3059) loss 2.6493 (2.9513) grad_norm 3.5296 (3.4078) [2022-10-02 22:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][300/1251] eta 0:04:46 lr 0.000016 time 0.2899 (0.3010) loss 2.3546 (2.9377) grad_norm 3.3486 (3.3810) [2022-10-02 22:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][400/1251] eta 0:04:13 lr 0.000016 time 0.2934 (0.2984) loss 3.0187 (2.9688) grad_norm 2.9838 (3.3408) [2022-10-02 22:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][500/1251] eta 0:03:42 lr 0.000016 time 0.2903 (0.2968) loss 2.5993 (2.9606) grad_norm 3.4186 (3.3574) [2022-10-02 22:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][600/1251] eta 0:03:12 lr 0.000016 time 0.2929 (0.2958) loss 3.5386 (2.9637) grad_norm 3.4350 (3.3610) [2022-10-02 22:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][700/1251] eta 0:02:42 lr 0.000016 time 0.2896 (0.2950) loss 3.4254 (2.9613) grad_norm 3.1417 (3.3531) [2022-10-02 22:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][800/1251] eta 0:02:12 lr 0.000016 time 0.2916 (0.2945) loss 3.2248 (2.9737) grad_norm 5.1768 (3.3373) [2022-10-02 22:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][900/1251] eta 0:01:43 lr 0.000016 time 0.2893 (0.2940) loss 3.2799 (2.9692) grad_norm 3.5232 (3.3406) [2022-10-02 22:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1000/1251] eta 0:01:13 lr 0.000015 time 0.2919 (0.2937) loss 2.7346 (2.9759) grad_norm 4.5242 (3.3478) [2022-10-02 22:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1100/1251] eta 0:00:44 lr 0.000015 time 0.2907 (0.2933) loss 2.1274 (2.9786) grad_norm 3.8525 (3.3421) [2022-10-02 22:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1200/1251] eta 0:00:14 lr 0.000015 time 0.2917 (0.2931) loss 3.3367 (2.9773) grad_norm 3.0738 (3.3416) [2022-10-02 22:54:26 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 285 training takes 0:06:06 [2022-10-02 22:54:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.822 (2.822) Loss 0.7862 (0.7862) Acc@1 80.859 (80.859) Acc@5 95.605 (95.605) [2022-10-02 22:54:39 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.090 Acc@5 95.466 [2022-10-02 22:54:39 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 22:54:39 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.11% [2022-10-02 22:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][0/1251] eta 0:49:01 lr 0.000015 time 2.3516 (2.3516) loss 3.3312 (3.3312) grad_norm 4.2303 (4.2303) [2022-10-02 22:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][100/1251] eta 0:06:06 lr 0.000015 time 0.2887 (0.3180) loss 2.1413 (3.0320) grad_norm 2.8075 (3.3515) [2022-10-02 22:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][200/1251] eta 0:05:19 lr 0.000015 time 0.2923 (0.3041) loss 3.2281 (3.0043) grad_norm 3.5680 (3.3063) [2022-10-02 22:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][300/1251] eta 0:04:44 lr 0.000015 time 0.2900 (0.2996) loss 3.2783 (3.0030) grad_norm 3.2964 (3.3098) [2022-10-02 22:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][400/1251] eta 0:04:13 lr 0.000015 time 0.2900 (0.2973) loss 3.2630 (2.9818) grad_norm 3.2760 (3.3224) [2022-10-02 22:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][500/1251] eta 0:03:42 lr 0.000015 time 0.2917 (0.2959) loss 3.2422 (2.9858) grad_norm 2.8306 (3.3160) [2022-10-02 22:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][600/1251] eta 0:03:12 lr 0.000015 time 0.2896 (0.2950) loss 3.1423 (2.9803) grad_norm 3.2051 (3.3221) [2022-10-02 22:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][700/1251] eta 0:02:42 lr 0.000015 time 0.2916 (0.2944) loss 3.3177 (2.9774) grad_norm 3.0089 (3.3358) [2022-10-02 22:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][800/1251] eta 0:02:12 lr 0.000015 time 0.2892 (0.2939) loss 2.3312 (2.9836) grad_norm 3.4908 (3.3367) [2022-10-02 22:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][900/1251] eta 0:01:43 lr 0.000015 time 0.2882 (0.2935) loss 3.1252 (2.9919) grad_norm 3.2041 (3.3382) [2022-10-02 22:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1000/1251] eta 0:01:13 lr 0.000015 time 0.2863 (0.2931) loss 2.7061 (2.9912) grad_norm 4.0452 (3.3395) [2022-10-02 23:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1100/1251] eta 0:00:44 lr 0.000015 time 0.2870 (0.2928) loss 2.3009 (2.9895) grad_norm 2.9289 (3.3343) [2022-10-02 23:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1200/1251] eta 0:00:14 lr 0.000015 time 0.2879 (0.2926) loss 3.1670 (2.9884) grad_norm 3.5581 (3.3343) [2022-10-02 23:00:45 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 286 training takes 0:06:06 [2022-10-02 23:00:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.835 (2.835) Loss 0.7356 (0.7356) Acc@1 82.520 (82.520) Acc@5 96.094 (96.094) [2022-10-02 23:00:58 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.036 Acc@5 95.516 [2022-10-02 23:00:58 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-10-02 23:00:58 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.11% [2022-10-02 23:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][0/1251] eta 0:49:31 lr 0.000015 time 2.3751 (2.3751) loss 2.5653 (2.5653) grad_norm 4.1885 (4.1885) [2022-10-02 23:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][100/1251] eta 0:06:05 lr 0.000015 time 0.2940 (0.3173) loss 3.5917 (2.9831) grad_norm 3.2649 (3.3287) [2022-10-02 23:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][200/1251] eta 0:05:18 lr 0.000014 time 0.2917 (0.3033) loss 3.3302 (2.9928) grad_norm 3.2455 (3.3317) [2022-10-02 23:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][300/1251] eta 0:04:44 lr 0.000014 time 0.2894 (0.2988) loss 3.1585 (2.9899) grad_norm 3.3895 (3.3507) [2022-10-02 23:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][400/1251] eta 0:04:12 lr 0.000014 time 0.2922 (0.2965) loss 3.6172 (3.0005) grad_norm 3.4784 (3.3670) [2022-10-02 23:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][500/1251] eta 0:03:41 lr 0.000014 time 0.2899 (0.2952) loss 3.2395 (3.0065) grad_norm 3.8437 (3.3634) [2022-10-02 23:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][600/1251] eta 0:03:11 lr 0.000014 time 0.2911 (0.2941) loss 3.1405 (3.0039) grad_norm 3.0768 (3.3342) [2022-10-02 23:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][700/1251] eta 0:02:41 lr 0.000014 time 0.2898 (0.2935) loss 3.3002 (3.0078) grad_norm 4.0895 (3.3322) [2022-10-02 23:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][800/1251] eta 0:02:12 lr 0.000014 time 0.2930 (0.2929) loss 3.0945 (3.0144) grad_norm 4.0944 (3.3270) [2022-10-02 23:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][900/1251] eta 0:01:42 lr 0.000014 time 0.2883 (0.2924) loss 2.9156 (3.0069) grad_norm 2.8968 (3.3216) [2022-10-02 23:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1000/1251] eta 0:01:13 lr 0.000014 time 0.2883 (0.2921) loss 3.3485 (2.9981) grad_norm 3.1086 (3.3164) [2022-10-02 23:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1100/1251] eta 0:00:44 lr 0.000014 time 0.2922 (0.2917) loss 3.2666 (2.9918) grad_norm 3.6099 (3.3198) [2022-10-02 23:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1200/1251] eta 0:00:14 lr 0.000014 time 0.2867 (0.2915) loss 2.9067 (2.9965) grad_norm 3.6101 (3.3253) [2022-10-02 23:07:03 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 287 training takes 0:06:04 [2022-10-02 23:07:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.183 (3.183) Loss 0.8295 (0.8295) Acc@1 80.859 (80.859) Acc@5 95.020 (95.020) [2022-10-02 23:07:16 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.164 Acc@5 95.482 [2022-10-02 23:07:16 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:07:16 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.16% [2022-10-02 23:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][0/1251] eta 0:45:52 lr 0.000014 time 2.2002 (2.2002) loss 2.9121 (2.9121) grad_norm 3.6169 (3.6169) [2022-10-02 23:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][100/1251] eta 0:05:58 lr 0.000014 time 0.2864 (0.3116) loss 2.6654 (2.9872) grad_norm 2.9482 (3.3827) [2022-10-02 23:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][200/1251] eta 0:05:15 lr 0.000014 time 0.2864 (0.2997) loss 2.9833 (2.9744) grad_norm 3.2524 (3.3335) [2022-10-02 23:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][300/1251] eta 0:04:41 lr 0.000014 time 0.2910 (0.2959) loss 2.8170 (2.9710) grad_norm 2.8274 (3.3361) [2022-10-02 23:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][400/1251] eta 0:04:10 lr 0.000014 time 0.2896 (0.2938) loss 3.3603 (2.9752) grad_norm 3.0567 (3.3353) [2022-10-02 23:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][500/1251] eta 0:03:39 lr 0.000014 time 0.2862 (0.2927) loss 2.1277 (2.9805) grad_norm 3.3121 (3.3456) [2022-10-02 23:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][600/1251] eta 0:03:09 lr 0.000014 time 0.2865 (0.2918) loss 3.1511 (2.9853) grad_norm 3.2894 (3.3403) [2022-10-02 23:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][700/1251] eta 0:02:40 lr 0.000014 time 0.2840 (0.2912) loss 3.3972 (2.9824) grad_norm 2.8617 (3.3428) [2022-10-02 23:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][800/1251] eta 0:02:11 lr 0.000013 time 0.2888 (0.2908) loss 2.0460 (2.9828) grad_norm 2.8543 (3.3418) [2022-10-02 23:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][900/1251] eta 0:01:41 lr 0.000013 time 0.2879 (0.2904) loss 3.3492 (2.9874) grad_norm 3.8759 (3.3379) [2022-10-02 23:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1000/1251] eta 0:01:12 lr 0.000013 time 0.2887 (0.2901) loss 2.6217 (2.9818) grad_norm 3.2747 (3.3453) [2022-10-02 23:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1100/1251] eta 0:00:43 lr 0.000013 time 0.2877 (0.2899) loss 3.2353 (2.9824) grad_norm 3.0553 (3.3430) [2022-10-02 23:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1200/1251] eta 0:00:14 lr 0.000013 time 0.2853 (0.2897) loss 2.8802 (2.9799) grad_norm 3.2725 (3.3550) [2022-10-02 23:13:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 288 training takes 0:06:02 [2022-10-02 23:13:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.320 (3.320) Loss 0.8458 (0.8458) Acc@1 80.371 (80.371) Acc@5 95.801 (95.801) [2022-10-02 23:13:32 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.218 Acc@5 95.534 [2022-10-02 23:13:32 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:13:32 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.22% [2022-10-02 23:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][0/1251] eta 0:58:11 lr 0.000013 time 2.7908 (2.7908) loss 2.8635 (2.8635) grad_norm 3.0265 (3.0265) [2022-10-02 23:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][100/1251] eta 0:06:10 lr 0.000013 time 0.2900 (0.3222) loss 2.3588 (2.8888) grad_norm 3.1164 (3.3533) [2022-10-02 23:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][200/1251] eta 0:05:23 lr 0.000013 time 0.2861 (0.3074) loss 1.9087 (2.9044) grad_norm 3.2619 (3.3451) [2022-10-02 23:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][300/1251] eta 0:04:47 lr 0.000013 time 0.2924 (0.3026) loss 3.1715 (2.9491) grad_norm 2.7161 (3.3208) [2022-10-02 23:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][400/1251] eta 0:04:15 lr 0.000013 time 0.2916 (0.3002) loss 3.3106 (2.9495) grad_norm 3.6245 (3.3234) [2022-10-02 23:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][500/1251] eta 0:03:44 lr 0.000013 time 0.2931 (0.2986) loss 3.2994 (2.9479) grad_norm 3.5173 (3.3210) [2022-10-02 23:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][600/1251] eta 0:03:13 lr 0.000013 time 0.2929 (0.2975) loss 2.9951 (2.9562) grad_norm 3.5220 (3.3398) [2022-10-02 23:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][700/1251] eta 0:02:43 lr 0.000013 time 0.2898 (0.2966) loss 3.8777 (2.9565) grad_norm 3.2538 (3.3601) [2022-10-02 23:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][800/1251] eta 0:02:13 lr 0.000013 time 0.2853 (0.2959) loss 3.2294 (2.9587) grad_norm 2.8428 (3.3660) [2022-10-02 23:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][900/1251] eta 0:01:43 lr 0.000013 time 0.2922 (0.2954) loss 2.4558 (2.9638) grad_norm 3.2366 (3.3630) [2022-10-02 23:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1000/1251] eta 0:01:14 lr 0.000013 time 0.2875 (0.2951) loss 2.9203 (2.9667) grad_norm 3.1571 (3.3629) [2022-10-02 23:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1100/1251] eta 0:00:44 lr 0.000013 time 0.2901 (0.2948) loss 2.7956 (2.9685) grad_norm 3.2296 (3.3636) [2022-10-02 23:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1200/1251] eta 0:00:15 lr 0.000013 time 0.2896 (0.2945) loss 3.5842 (2.9701) grad_norm 3.7190 (3.3568) [2022-10-02 23:19:40 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 289 training takes 0:06:08 [2022-10-02 23:19:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.466 (3.466) Loss 0.7492 (0.7492) Acc@1 82.520 (82.520) Acc@5 96.191 (96.191) [2022-10-02 23:19:53 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.158 Acc@5 95.528 [2022-10-02 23:19:53 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:19:53 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.22% [2022-10-02 23:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][0/1251] eta 1:09:02 lr 0.000013 time 3.3116 (3.3116) loss 2.9360 (2.9360) grad_norm 3.3682 (3.3682) [2022-10-02 23:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][100/1251] eta 0:06:07 lr 0.000013 time 0.2923 (0.3193) loss 2.2887 (2.9506) grad_norm 3.2370 (3.4052) [2022-10-02 23:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][200/1251] eta 0:05:20 lr 0.000013 time 0.2886 (0.3045) loss 1.8513 (2.9768) grad_norm 3.2880 (3.4171) [2022-10-02 23:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][300/1251] eta 0:04:44 lr 0.000013 time 0.2881 (0.2994) loss 2.6227 (2.9648) grad_norm 2.9660 (3.4415) [2022-10-02 23:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][400/1251] eta 0:04:12 lr 0.000013 time 0.2886 (0.2968) loss 2.0467 (2.9768) grad_norm 2.6822 (3.4256) [2022-10-02 23:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][500/1251] eta 0:03:41 lr 0.000012 time 0.2876 (0.2953) loss 2.9793 (2.9841) grad_norm 3.7905 (3.4377) [2022-10-02 23:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][600/1251] eta 0:03:11 lr 0.000012 time 0.2876 (0.2943) loss 3.3387 (2.9801) grad_norm 3.4770 (3.4349) [2022-10-02 23:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][700/1251] eta 0:02:41 lr 0.000012 time 0.2866 (0.2935) loss 2.2832 (2.9739) grad_norm 3.3663 (3.4328) [2022-10-02 23:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][800/1251] eta 0:02:12 lr 0.000012 time 0.2881 (0.2929) loss 3.0520 (2.9687) grad_norm 3.3678 (3.4408) [2022-10-02 23:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][900/1251] eta 0:01:42 lr 0.000012 time 0.2845 (0.2924) loss 3.6282 (2.9701) grad_norm 3.5418 (3.4322) [2022-10-02 23:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1000/1251] eta 0:01:13 lr 0.000012 time 0.2885 (0.2921) loss 3.2496 (2.9731) grad_norm 3.0769 (3.4248) [2022-10-02 23:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1100/1251] eta 0:00:44 lr 0.000012 time 0.2877 (0.2917) loss 2.3853 (2.9690) grad_norm 3.1919 (3.4204) [2022-10-02 23:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1200/1251] eta 0:00:14 lr 0.000012 time 0.2879 (0.2914) loss 2.5931 (2.9666) grad_norm 3.6156 (3.4112) [2022-10-02 23:25:58 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 290 training takes 0:06:04 [2022-10-02 23:25:58 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saving...... [2022-10-02 23:25:59 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saved !!! [2022-10-02 23:26:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.477 (2.477) Loss 0.7735 (0.7735) Acc@1 82.031 (82.031) Acc@5 95.508 (95.508) [2022-10-02 23:26:11 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.092 Acc@5 95.502 [2022-10-02 23:26:11 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 23:26:11 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.22% [2022-10-02 23:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][0/1251] eta 0:46:49 lr 0.000012 time 2.2458 (2.2458) loss 3.0323 (3.0323) grad_norm 3.5251 (3.5251) [2022-10-02 23:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][100/1251] eta 0:06:06 lr 0.000012 time 0.2869 (0.3182) loss 2.5481 (2.9728) grad_norm 3.1642 (3.6313) [2022-10-02 23:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][200/1251] eta 0:05:19 lr 0.000012 time 0.2893 (0.3039) loss 2.9836 (2.9882) grad_norm 3.3503 (3.6088) [2022-10-02 23:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][300/1251] eta 0:04:44 lr 0.000012 time 0.2872 (0.2992) loss 2.2863 (3.0078) grad_norm 3.2321 (3.5305) [2022-10-02 23:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][400/1251] eta 0:04:12 lr 0.000012 time 0.2911 (0.2967) loss 3.1200 (2.9874) grad_norm 3.1452 (3.5220) [2022-10-02 23:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][500/1251] eta 0:03:41 lr 0.000012 time 0.2880 (0.2952) loss 3.4520 (2.9869) grad_norm 3.8350 (3.4900) [2022-10-02 23:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][600/1251] eta 0:03:11 lr 0.000012 time 0.2912 (0.2942) loss 2.1515 (2.9753) grad_norm 3.3236 (3.4795) [2022-10-02 23:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][700/1251] eta 0:02:41 lr 0.000012 time 0.2883 (0.2935) loss 3.6480 (2.9795) grad_norm 3.4384 (3.4736) [2022-10-02 23:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][800/1251] eta 0:02:12 lr 0.000012 time 0.2926 (0.2930) loss 2.5426 (2.9810) grad_norm 3.3740 (3.4584) [2022-10-02 23:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][900/1251] eta 0:01:42 lr 0.000012 time 0.2869 (0.2925) loss 3.0817 (2.9843) grad_norm 3.7958 (3.4463) [2022-10-02 23:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1000/1251] eta 0:01:13 lr 0.000012 time 0.2956 (0.2920) loss 3.2075 (2.9895) grad_norm 3.9023 (3.4374) [2022-10-02 23:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1100/1251] eta 0:00:44 lr 0.000012 time 0.2861 (0.2917) loss 2.4696 (2.9945) grad_norm 3.6665 (3.4404) [2022-10-02 23:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1200/1251] eta 0:00:14 lr 0.000012 time 0.2928 (0.2914) loss 3.3955 (2.9942) grad_norm 3.2264 (3.4312) [2022-10-02 23:32:16 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 291 training takes 0:06:04 [2022-10-02 23:32:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.381 (3.381) Loss 0.8115 (0.8115) Acc@1 81.836 (81.836) Acc@5 95.410 (95.410) [2022-10-02 23:32:29 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.258 Acc@5 95.548 [2022-10-02 23:32:29 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.3% [2022-10-02 23:32:29 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-02 23:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][0/1251] eta 0:57:21 lr 0.000012 time 2.7509 (2.7509) loss 3.2407 (3.2407) grad_norm 3.3671 (3.3671) [2022-10-02 23:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][100/1251] eta 0:06:04 lr 0.000012 time 0.2855 (0.3169) loss 3.3263 (3.0250) grad_norm 3.1968 (3.3812) [2022-10-02 23:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][200/1251] eta 0:05:18 lr 0.000012 time 0.2889 (0.3030) loss 3.2190 (3.0577) grad_norm 3.9621 (3.4149) [2022-10-02 23:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][300/1251] eta 0:04:43 lr 0.000012 time 0.2891 (0.2984) loss 3.0823 (3.0469) grad_norm 2.6971 (3.3902) [2022-10-02 23:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][400/1251] eta 0:04:11 lr 0.000012 time 0.2918 (0.2960) loss 3.3734 (3.0267) grad_norm 3.9254 (3.3821) [2022-10-02 23:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][500/1251] eta 0:03:41 lr 0.000012 time 0.2864 (0.2945) loss 3.1364 (3.0341) grad_norm 3.4838 (3.3734) [2022-10-02 23:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][600/1251] eta 0:03:11 lr 0.000012 time 0.2894 (0.2936) loss 3.1173 (3.0289) grad_norm 3.3984 (3.3835) [2022-10-02 23:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][700/1251] eta 0:02:41 lr 0.000012 time 0.2877 (0.2930) loss 3.1958 (3.0068) grad_norm 2.9845 (3.3812) [2022-10-02 23:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][800/1251] eta 0:02:11 lr 0.000011 time 0.2885 (0.2924) loss 2.6497 (2.9915) grad_norm 3.1101 (3.3758) [2022-10-02 23:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2861 (0.2920) loss 3.2673 (2.9885) grad_norm 3.4587 (3.3881) [2022-10-02 23:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2856 (0.2916) loss 3.6099 (2.9781) grad_norm 2.7628 (3.3819) [2022-10-02 23:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1100/1251] eta 0:00:43 lr 0.000011 time 0.2879 (0.2913) loss 3.1544 (2.9710) grad_norm 2.7537 (3.3813) [2022-10-02 23:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2876 (0.2911) loss 2.7144 (2.9686) grad_norm 3.4155 (3.3869) [2022-10-02 23:38:33 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 292 training takes 0:06:04 [2022-10-02 23:38:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.020 (3.020) Loss 0.8154 (0.8154) Acc@1 81.348 (81.348) Acc@5 95.508 (95.508) [2022-10-02 23:38:46 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.156 Acc@5 95.516 [2022-10-02 23:38:46 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:38:46 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-02 23:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][0/1251] eta 1:11:58 lr 0.000011 time 3.4518 (3.4518) loss 3.0607 (3.0607) grad_norm 3.4370 (3.4370) [2022-10-02 23:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][100/1251] eta 0:06:10 lr 0.000011 time 0.2885 (0.3221) loss 3.7865 (2.9881) grad_norm 3.1609 (3.3460) [2022-10-02 23:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][200/1251] eta 0:05:21 lr 0.000011 time 0.2895 (0.3060) loss 3.3322 (2.9498) grad_norm 2.9063 (3.3697) [2022-10-02 23:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][300/1251] eta 0:04:45 lr 0.000011 time 0.2868 (0.3007) loss 3.5145 (2.9570) grad_norm 3.9846 (3.3718) [2022-10-02 23:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][400/1251] eta 0:04:13 lr 0.000011 time 0.2881 (0.2981) loss 3.4048 (2.9776) grad_norm 3.6011 (3.3713) [2022-10-02 23:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][500/1251] eta 0:03:42 lr 0.000011 time 0.2891 (0.2964) loss 3.1429 (2.9812) grad_norm 2.8255 (3.3718) [2022-10-02 23:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][600/1251] eta 0:03:12 lr 0.000011 time 0.2874 (0.2953) loss 3.1954 (2.9776) grad_norm 3.1841 (3.3926) [2022-10-02 23:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][700/1251] eta 0:02:42 lr 0.000011 time 0.2928 (0.2945) loss 1.9415 (2.9676) grad_norm 3.2575 (3.3986) [2022-10-02 23:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][800/1251] eta 0:02:12 lr 0.000011 time 0.2882 (0.2939) loss 1.8843 (2.9626) grad_norm 3.8227 (3.3953) [2022-10-02 23:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2906 (0.2933) loss 3.3220 (2.9606) grad_norm 3.2774 (3.3955) [2022-10-02 23:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2874 (0.2929) loss 1.9413 (2.9623) grad_norm 3.5673 (3.3973) [2022-10-02 23:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1100/1251] eta 0:00:44 lr 0.000011 time 0.2864 (0.2925) loss 3.1581 (2.9552) grad_norm 3.7457 (3.3946) [2022-10-02 23:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2881 (0.2922) loss 3.3277 (2.9500) grad_norm 4.0949 (3.4009) [2022-10-02 23:44:52 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 293 training takes 0:06:05 [2022-10-02 23:44:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.876 (2.876) Loss 0.8667 (0.8667) Acc@1 80.762 (80.762) Acc@5 94.922 (94.922) [2022-10-02 23:45:04 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.168 Acc@5 95.526 [2022-10-02 23:45:04 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:45:04 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-02 23:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][0/1251] eta 0:47:39 lr 0.000011 time 2.2861 (2.2861) loss 3.1769 (3.1769) grad_norm 3.2763 (3.2763) [2022-10-02 23:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][100/1251] eta 0:06:02 lr 0.000011 time 0.2908 (0.3152) loss 2.6470 (2.9979) grad_norm 3.0673 (3.4920) [2022-10-02 23:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][200/1251] eta 0:05:18 lr 0.000011 time 0.2912 (0.3026) loss 2.6430 (3.0009) grad_norm 2.7200 (3.4605) [2022-10-02 23:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][300/1251] eta 0:04:43 lr 0.000011 time 0.2925 (0.2982) loss 1.7603 (2.9635) grad_norm 4.1039 (3.4548) [2022-10-02 23:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][400/1251] eta 0:04:11 lr 0.000011 time 0.2898 (0.2960) loss 3.1413 (2.9713) grad_norm 3.5856 (3.4627) [2022-10-02 23:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][500/1251] eta 0:03:41 lr 0.000011 time 0.2881 (0.2947) loss 2.3637 (2.9546) grad_norm 3.0117 (3.4399) [2022-10-02 23:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][600/1251] eta 0:03:11 lr 0.000011 time 0.2887 (0.2938) loss 3.2645 (2.9556) grad_norm 3.0258 (3.4227) [2022-10-02 23:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][700/1251] eta 0:02:41 lr 0.000011 time 0.2882 (0.2932) loss 3.3301 (2.9604) grad_norm 3.7321 (3.4300) [2022-10-02 23:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][800/1251] eta 0:02:11 lr 0.000011 time 0.2864 (0.2927) loss 2.4636 (2.9575) grad_norm 4.0539 (3.4544) [2022-10-02 23:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][900/1251] eta 0:01:42 lr 0.000011 time 0.2955 (0.2923) loss 2.5632 (2.9606) grad_norm 3.3307 (3.4444) [2022-10-02 23:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1000/1251] eta 0:01:13 lr 0.000011 time 0.2899 (0.2920) loss 1.8091 (2.9609) grad_norm 3.8690 (3.4463) [2022-10-02 23:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1100/1251] eta 0:00:44 lr 0.000011 time 0.2890 (0.2917) loss 3.2057 (2.9589) grad_norm 3.6190 (3.4462) [2022-10-02 23:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1200/1251] eta 0:00:14 lr 0.000011 time 0.2896 (0.2915) loss 3.3919 (2.9599) grad_norm 3.1617 (3.4439) [2022-10-02 23:51:09 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 294 training takes 0:06:04 [2022-10-02 23:51:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.756 (2.756) Loss 0.7295 (0.7295) Acc@1 83.691 (83.691) Acc@5 96.582 (96.582) [2022-10-02 23:51:22 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.194 Acc@5 95.516 [2022-10-02 23:51:22 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-02 23:51:22 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-02 23:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][0/1251] eta 0:58:48 lr 0.000011 time 2.8208 (2.8208) loss 2.0410 (2.0410) grad_norm 3.4861 (3.4861) [2022-10-02 23:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][100/1251] eta 0:06:02 lr 0.000011 time 0.2864 (0.3154) loss 3.1735 (2.9645) grad_norm 3.7301 (3.4662) [2022-10-02 23:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][200/1251] eta 0:05:17 lr 0.000011 time 0.2911 (0.3023) loss 3.4203 (2.9853) grad_norm 3.2216 (3.4203) [2022-10-02 23:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][300/1251] eta 0:04:43 lr 0.000011 time 0.2868 (0.2980) loss 2.7755 (2.9707) grad_norm 3.0939 (3.4320) [2022-10-02 23:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][400/1251] eta 0:04:11 lr 0.000011 time 0.2899 (0.2958) loss 1.9845 (2.9626) grad_norm 3.3783 (3.4090) [2022-10-02 23:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][500/1251] eta 0:03:41 lr 0.000011 time 0.2853 (0.2945) loss 3.2431 (2.9639) grad_norm 3.2503 (3.4134) [2022-10-02 23:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][600/1251] eta 0:03:11 lr 0.000011 time 0.2900 (0.2936) loss 2.3742 (2.9711) grad_norm 3.4634 (3.4118) [2022-10-02 23:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][700/1251] eta 0:02:41 lr 0.000011 time 0.2863 (0.2930) loss 3.3028 (2.9692) grad_norm 2.6623 (3.4062) [2022-10-02 23:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][800/1251] eta 0:02:11 lr 0.000011 time 0.2885 (0.2925) loss 3.4240 (2.9641) grad_norm 3.6360 (3.4102) [2022-10-02 23:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2871 (0.2922) loss 3.1409 (2.9625) grad_norm 3.2542 (3.4083) [2022-10-02 23:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2904 (0.2919) loss 3.4637 (2.9671) grad_norm 3.0573 (3.4432) [2022-10-02 23:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2850 (0.2916) loss 3.6410 (2.9702) grad_norm 3.5435 (3.4443) [2022-10-02 23:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2881 (0.2914) loss 2.5614 (2.9687) grad_norm 3.0197 (3.4421) [2022-10-02 23:57:27 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 295 training takes 0:06:04 [2022-10-02 23:57:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.231 (3.231) Loss 0.8260 (0.8260) Acc@1 80.664 (80.664) Acc@5 95.312 (95.312) [2022-10-02 23:57:40 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.118 Acc@5 95.530 [2022-10-02 23:57:40 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-02 23:57:40 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-02 23:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][0/1251] eta 0:45:52 lr 0.000010 time 2.2003 (2.2003) loss 2.9586 (2.9586) grad_norm 3.1112 (3.1112) [2022-10-02 23:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][100/1251] eta 0:06:04 lr 0.000010 time 0.2925 (0.3171) loss 2.3103 (2.9833) grad_norm 3.2141 (3.4009) [2022-10-02 23:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][200/1251] eta 0:05:18 lr 0.000010 time 0.2886 (0.3033) loss 2.3916 (2.9537) grad_norm 3.5113 (3.4253) [2022-10-02 23:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][300/1251] eta 0:04:43 lr 0.000010 time 0.2912 (0.2986) loss 3.2515 (2.9846) grad_norm 3.2923 (3.4343) [2022-10-02 23:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2879 (0.2961) loss 3.1230 (2.9703) grad_norm 3.5170 (3.4314) [2022-10-03 00:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2916 (0.2946) loss 2.4853 (2.9765) grad_norm 3.4248 (3.4338) [2022-10-03 00:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2899 (0.2936) loss 2.6334 (2.9679) grad_norm 3.2265 (3.4250) [2022-10-03 00:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2891 (0.2929) loss 2.5333 (2.9687) grad_norm 3.3917 (3.4273) [2022-10-03 00:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][800/1251] eta 0:02:11 lr 0.000010 time 0.2904 (0.2924) loss 3.0321 (2.9676) grad_norm 2.8011 (3.4345) [2022-10-03 00:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2894 (0.2920) loss 3.2143 (2.9587) grad_norm 3.1381 (3.4477) [2022-10-03 00:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2906 (0.2917) loss 3.1714 (2.9631) grad_norm 3.6562 (3.4304) [2022-10-03 00:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2913 (0.2915) loss 1.8950 (2.9635) grad_norm 3.6499 (3.4351) [2022-10-03 00:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2884 (0.2913) loss 2.3489 (2.9596) grad_norm 3.4446 (3.4367) [2022-10-03 00:03:44 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 296 training takes 0:06:04 [2022-10-03 00:03:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.688 (2.688) Loss 0.8076 (0.8076) Acc@1 80.859 (80.859) Acc@5 96.094 (96.094) [2022-10-03 00:03:57 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.142 Acc@5 95.566 [2022-10-03 00:03:57 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-10-03 00:03:57 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-03 00:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][0/1251] eta 1:13:20 lr 0.000010 time 3.5175 (3.5175) loss 2.9871 (2.9871) grad_norm 3.6028 (3.6028) [2022-10-03 00:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][100/1251] eta 0:06:09 lr 0.000010 time 0.2869 (0.3209) loss 2.9498 (3.0745) grad_norm 3.3507 (3.4551) [2022-10-03 00:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][200/1251] eta 0:05:20 lr 0.000010 time 0.2864 (0.3047) loss 3.0779 (3.0237) grad_norm 3.0138 (3.4294) [2022-10-03 00:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][300/1251] eta 0:04:44 lr 0.000010 time 0.2893 (0.2993) loss 3.1292 (2.9927) grad_norm 3.3597 (3.4759) [2022-10-03 00:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2886 (0.2966) loss 2.5998 (2.9875) grad_norm 3.8558 (3.4409) [2022-10-03 00:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2864 (0.2950) loss 2.9549 (2.9864) grad_norm 4.0370 (3.4413) [2022-10-03 00:06:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2820 (0.2940) loss 3.2901 (2.9942) grad_norm 3.3432 (3.4497) [2022-10-03 00:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2876 (0.2933) loss 2.1868 (2.9822) grad_norm 3.3431 (3.4532) [2022-10-03 00:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2867 (0.2927) loss 2.3381 (2.9792) grad_norm 3.4667 (3.4397) [2022-10-03 00:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2872 (0.2923) loss 2.5883 (2.9637) grad_norm 3.7138 (3.4425) [2022-10-03 00:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2870 (0.2919) loss 2.3406 (2.9670) grad_norm 3.8793 (3.4445) [2022-10-03 00:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2865 (0.2916) loss 3.1499 (2.9692) grad_norm 3.2526 (3.4468) [2022-10-03 00:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2862 (0.2913) loss 2.9862 (2.9645) grad_norm 3.0302 (3.4468) [2022-10-03 00:10:02 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 297 training takes 0:06:04 [2022-10-03 00:10:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.688 (2.688) Loss 0.7011 (0.7011) Acc@1 83.301 (83.301) Acc@5 97.266 (97.266) [2022-10-03 00:10:15 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.190 Acc@5 95.544 [2022-10-03 00:10:15 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-03 00:10:15 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-03 00:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][0/1251] eta 0:57:07 lr 0.000010 time 2.7401 (2.7401) loss 2.8524 (2.8524) grad_norm 3.6739 (3.6739) [2022-10-03 00:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][100/1251] eta 0:06:06 lr 0.000010 time 0.2873 (0.3182) loss 3.0957 (2.9582) grad_norm 3.0942 (3.4068) [2022-10-03 00:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][200/1251] eta 0:05:19 lr 0.000010 time 0.2865 (0.3039) loss 3.4452 (2.9851) grad_norm 4.2725 (3.4566) [2022-10-03 00:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][300/1251] eta 0:04:44 lr 0.000010 time 0.2888 (0.2991) loss 2.9834 (2.9846) grad_norm 4.0014 (3.4781) [2022-10-03 00:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2854 (0.2968) loss 3.4120 (2.9997) grad_norm 2.9795 (3.4626) [2022-10-03 00:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2875 (0.2954) loss 2.8605 (2.9765) grad_norm 2.9642 (3.4750) [2022-10-03 00:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2870 (0.2942) loss 2.5438 (2.9713) grad_norm 3.3192 (3.4761) [2022-10-03 00:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][700/1251] eta 0:02:41 lr 0.000010 time 0.2853 (0.2934) loss 2.6085 (2.9513) grad_norm 3.4869 (3.4775) [2022-10-03 00:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2914 (0.2928) loss 3.5447 (2.9595) grad_norm 2.9384 (3.4744) [2022-10-03 00:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2833 (0.2923) loss 3.5949 (2.9489) grad_norm 3.7845 (3.4730) [2022-10-03 00:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2860 (0.2920) loss 2.6811 (2.9446) grad_norm 3.9074 (3.4746) [2022-10-03 00:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2854 (0.2916) loss 3.3989 (2.9489) grad_norm 4.1961 (3.4705) [2022-10-03 00:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2867 (0.2914) loss 3.5052 (2.9507) grad_norm 3.4686 (3.4687) [2022-10-03 00:16:19 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 298 training takes 0:06:04 [2022-10-03 00:16:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 3.373 (3.373) Loss 0.8327 (0.8327) Acc@1 79.688 (79.688) Acc@5 96.387 (96.387) [2022-10-03 00:16:33 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.196 Acc@5 95.520 [2022-10-03 00:16:33 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-03 00:16:33 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-03 00:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][0/1251] eta 0:55:46 lr 0.000010 time 2.6753 (2.6753) loss 3.3920 (3.3920) grad_norm 3.9622 (3.9622) [2022-10-03 00:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][100/1251] eta 0:06:03 lr 0.000010 time 0.2864 (0.3162) loss 2.4242 (2.9940) grad_norm 3.5175 (3.4798) [2022-10-03 00:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][200/1251] eta 0:05:18 lr 0.000010 time 0.2875 (0.3035) loss 1.7507 (2.9879) grad_norm 4.7098 (3.4704) [2022-10-03 00:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][300/1251] eta 0:04:44 lr 0.000010 time 0.2868 (0.2992) loss 2.9515 (2.9583) grad_norm 3.3175 (3.4643) [2022-10-03 00:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][400/1251] eta 0:04:12 lr 0.000010 time 0.2899 (0.2969) loss 3.4837 (2.9679) grad_norm 3.2589 (3.4560) [2022-10-03 00:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][500/1251] eta 0:03:41 lr 0.000010 time 0.2906 (0.2956) loss 2.6915 (2.9653) grad_norm 3.3425 (3.4564) [2022-10-03 00:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][600/1251] eta 0:03:11 lr 0.000010 time 0.2875 (0.2947) loss 3.2398 (2.9729) grad_norm 3.7391 (3.4614) [2022-10-03 00:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][700/1251] eta 0:02:42 lr 0.000010 time 0.2859 (0.2941) loss 3.4343 (2.9699) grad_norm 3.9132 (3.4625) [2022-10-03 00:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][800/1251] eta 0:02:12 lr 0.000010 time 0.2869 (0.2935) loss 3.4484 (2.9624) grad_norm 3.8875 (3.4513) [2022-10-03 00:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][900/1251] eta 0:01:42 lr 0.000010 time 0.2868 (0.2930) loss 3.2234 (2.9640) grad_norm 3.2535 (3.4574) [2022-10-03 00:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1000/1251] eta 0:01:13 lr 0.000010 time 0.2889 (0.2927) loss 2.8968 (2.9573) grad_norm 3.8812 (3.4674) [2022-10-03 00:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1100/1251] eta 0:00:44 lr 0.000010 time 0.2871 (0.2925) loss 3.4052 (2.9574) grad_norm 3.3258 (3.4698) [2022-10-03 00:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1200/1251] eta 0:00:14 lr 0.000010 time 0.2883 (0.2923) loss 1.8930 (2.9550) grad_norm 3.4176 (3.4758) [2022-10-03 00:22:38 swin_tiny_patch4_window7_224] (main.py 201): INFO EPOCH 299 training takes 0:06:05 [2022-10-03 00:22:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saving...... [2022-10-03 00:22:39 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saved !!! [2022-10-03 00:22:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 2.843 (2.843) Loss 0.8138 (0.8138) Acc@1 82.227 (82.227) Acc@5 95.215 (95.215) [2022-10-03 00:22:52 swin_tiny_patch4_window7_224] (main.py 246): INFO * Acc@1 81.162 Acc@5 95.554 [2022-10-03 00:22:52 swin_tiny_patch4_window7_224] (main.py 133): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-10-03 00:22:52 swin_tiny_patch4_window7_224] (main.py 135): INFO Max accuracy: 81.26% [2022-10-03 00:22:52 swin_tiny_patch4_window7_224] (main.py 139): INFO Training time 1 day, 7:31:58